Commit fa23f9d

clean table of contents for ipynb and update readme
1 parent 10ed38f commit fa23f9d

4 files changed: 29 additions, 13 deletions

1. Supervised Learning on California Test Scores.ipynb renamed to 1. Supervised Learning.ipynb

Lines changed: 3 additions & 3 deletions
@@ -7,14 +7,14 @@
 },
 "source": [
 "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
-
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Supervised-Learning-on-California-Test-Scores\" data-toc-modified-id=\"Supervised-Learning-on-California-Test-Scores-1\">Supervised Learning on California Test Scores</a></span><ul class=\"toc-item\"><li><span><a href=\"#Part-1:-Regression-on-California-Test-Scores\" data-toc-modified-id=\"Part-1:-Regression-on-California-Test-Scores-1.1\">Part 1: Regression on California Test Scores</a></span><ul class=\"toc-item\"><li><span><a href=\"#1.1-Visualize-the-univariate-distribution\" data-toc-modified-id=\"1.1-Visualize-the-univariate-distribution-1.1.1\">1.1 Visualize the univariate distribution</a></span></li><li><span><a href=\"#1.2-Visualize-the-dependency-of-the-target-on-each-feature-from-1.1.\" data-toc-modified-id=\"1.2-Visualize-the-dependency-of-the-target-on-each-feature-from-1.1.-1.1.2\">1.2 Visualize the dependency of the target on each feature from 1.1.</a></span></li><li><span><a href=\"#1.3-Modelling-to-evaluate-relationship\" data-toc-modified-id=\"1.3-Modelling-to-evaluate-relationship-1.1.3\">1.3 Modelling to evaluate relationship</a></span><ul class=\"toc-item\"><li><span><a href=\"#1.3.1-Train-test-split\" data-toc-modified-id=\"1.3.1-Train-test-split-1.1.3.1\">1.3.1 Train test split</a></span></li><li><span><a href=\"#1.3.2-Build-Models\" data-toc-modified-id=\"1.3.2-Build-Models-1.1.3.2\">1.3.2 Build Models</a></span></li><li><span><a href=\"#1.3.3-Evaluate-Models-Using-Cross-Validation\" data-toc-modified-id=\"1.3.3-Evaluate-Models-Using-Cross-Validation-1.1.3.3\">1.3.3 Evaluate Models Using Cross-Validation</a></span></li><li><span><a href=\"#1.3.4-Does-scaling-the-data-with-the-StandardScaler-help?\" data-toc-modified-id=\"1.3.4-Does-scaling-the-data-with-the-StandardScaler-help?-1.1.3.4\">1.3.4 Does scaling the data with the StandardScaler help?</a></span></li></ul></li><li><span><a href=\"#1.4-Tune-the-parameters-of-the-models-where-possible-using-GridSearchCV.\" 
data-toc-modified-id=\"1.4-Tune-the-parameters-of-the-models-where-possible-using-GridSearchCV.-1.1.4\">1.4 Tune the parameters of the models where possible using GridSearchCV.</a></span><ul class=\"toc-item\"><li><span><a href=\"#KNN-for-regression\" data-toc-modified-id=\"KNN-for-regression-1.1.4.1\">KNN for regression</a></span></li><li><span><a href=\"#Ridge-Regression\" data-toc-modified-id=\"Ridge-Regression-1.1.4.2\">Ridge Regression</a></span></li><li><span><a href=\"#Lasso-Regression\" data-toc-modified-id=\"Lasso-Regression-1.1.4.3\">Lasso Regression</a></span></li></ul></li><li><span><a href=\"#1.5-Compare-the-coefficients-of-your-two-best-linear-models-(not-knn)\" data-toc-modified-id=\"1.5-Compare-the-coefficients-of-your-two-best-linear-models-(not-knn)-1.1.5\">1.5 Compare the coefficients of your two best linear models (not knn)</a></span></li><li><span><a href=\"#1.6-Discuss-which-final-model-you-would-choose-to-predict-new-data\" data-toc-modified-id=\"1.6-Discuss-which-final-model-you-would-choose-to-predict-new-data-1.1.6\">1.6 Discuss which final model you would choose to predict new data</a></span></li></ul></li><li><span><a href=\"#Part-2:-Classification-on-red-and-white-wine-characteristics\" data-toc-modified-id=\"Part-2:-Classification-on-red-and-white-wine-characteristics-1.2\">Part 2: Classification on red and white wine characteristics</a></span><ul class=\"toc-item\"><li><span><a href=\"#2.1-Visualize-the-univariate-distribution\" data-toc-modified-id=\"2.1-Visualize-the-univariate-distribution-1.2.1\">2.1 Visualize the univariate distribution</a></span></li><li><span><a href=\"#Visualize-the-dependency-of-the-target-on-each-feature-from-2.1.\" data-toc-modified-id=\"Visualize-the-dependency-of-the-target-on-each-feature-from-2.1.-1.2.2\">Visualize the dependency of the target on each feature from 2.1.</a></span></li><li><span><a href=\"#2.2-Modelling-on-Relationships\" data-toc-modified-id=\"2.2-Modelling-on-Relationships-1.2.3\">2.2 
Modelling on Relationships</a></span><ul class=\"toc-item\"><li><span><a href=\"#Split-Training-and-Test\" data-toc-modified-id=\"Split-Training-and-Test-1.2.3.1\">Split Training and Test</a></span></li><li><span><a href=\"#Scaling\" data-toc-modified-id=\"Scaling-1.2.3.2\">Scaling</a></span></li><li><span><a href=\"#Build-Models-and-Evaluate\" data-toc-modified-id=\"Build-Models-and-Evaluate-1.2.3.3\">Build Models and Evaluate</a></span></li><li><span><a href=\"#Logistic-Regression\" data-toc-modified-id=\"Logistic-Regression-1.2.3.4\">Logistic Regression</a></span></li><li><span><a href=\"#Penalized-Logistic-Regression\" data-toc-modified-id=\"Penalized-Logistic-Regression-1.2.3.5\">Penalized Logistic Regression</a></span></li></ul></li><li><span><a href=\"#2.3-Tune-the-parameters-where-possible-using-GridSearchCV.\" data-toc-modified-id=\"2.3-Tune-the-parameters-where-possible-using-GridSearchCV.-1.2.4\">2.3 Tune the parameters where possible using GridSearchCV.</a></span><ul class=\"toc-item\"><li><span><a href=\"#KNN\" data-toc-modified-id=\"KNN-1.2.4.1\">KNN</a></span></li><li><span><a href=\"#Penalized-Logistic-Regression\" data-toc-modified-id=\"Penalized-Logistic-Regression-1.2.4.2\">Penalized Logistic Regression</a></span></li></ul></li><li><span><a href=\"#2.4-Change-the-cross-validation-strategy\" data-toc-modified-id=\"2.4-Change-the-cross-validation-strategy-1.2.5\">2.4 Change the cross-validation strategy</a></span></li></ul></li><li><span><a href=\"#2.5-Compare-the-coefficients\" data-toc-modified-id=\"2.5-Compare-the-coefficients-1.3\">2.5 Compare the coefficients</a></span></li></ul></li></ul></div>"
+
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Supervised-Learning\" data-toc-modified-id=\"Supervised-Learning-1\">Supervised Learning</a></span><ul class=\"toc-item\"><li><span><a href=\"#Part-1:-Regression-on-California-Test-Scores\" data-toc-modified-id=\"Part-1:-Regression-on-California-Test-Scores-1.1\">Part 1: Regression on California Test Scores</a></span><ul class=\"toc-item\"><li><span><a href=\"#1.1-Visualize-the-univariate-distribution\" data-toc-modified-id=\"1.1-Visualize-the-univariate-distribution-1.1.1\">1.1 Visualize the univariate distribution</a></span></li><li><span><a href=\"#1.2-Visualize-the-dependency-of-the-target-on-each-feature-from-1.1.\" data-toc-modified-id=\"1.2-Visualize-the-dependency-of-the-target-on-each-feature-from-1.1.-1.1.2\">1.2 Visualize the dependency of the target on each feature from 1.1.</a></span></li><li><span><a href=\"#1.3-Modelling-to-evaluate-relationship\" data-toc-modified-id=\"1.3-Modelling-to-evaluate-relationship-1.1.3\">1.3 Modelling to evaluate relationship</a></span><ul class=\"toc-item\"><li><span><a href=\"#1.3.1-Train-test-split\" data-toc-modified-id=\"1.3.1-Train-test-split-1.1.3.1\">1.3.1 Train test split</a></span></li><li><span><a href=\"#1.3.2-Build-Models\" data-toc-modified-id=\"1.3.2-Build-Models-1.1.3.2\">1.3.2 Build Models</a></span></li><li><span><a href=\"#1.3.3-Evaluate-Models-Using-Cross-Validation\" data-toc-modified-id=\"1.3.3-Evaluate-Models-Using-Cross-Validation-1.1.3.3\">1.3.3 Evaluate Models Using Cross-Validation</a></span></li><li><span><a href=\"#1.3.4-Does-scaling-the-data-with-the-StandardScaler-help?\" data-toc-modified-id=\"1.3.4-Does-scaling-the-data-with-the-StandardScaler-help?-1.1.3.4\">1.3.4 Does scaling the data with the StandardScaler help?</a></span></li></ul></li><li><span><a href=\"#1.4-Tune-the-parameters-of-the-models-where-possible-using-GridSearchCV.\" 
data-toc-modified-id=\"1.4-Tune-the-parameters-of-the-models-where-possible-using-GridSearchCV.-1.1.4\">1.4 Tune the parameters of the models where possible using GridSearchCV.</a></span><ul class=\"toc-item\"><li><span><a href=\"#KNN-for-regression\" data-toc-modified-id=\"KNN-for-regression-1.1.4.1\">KNN for regression</a></span></li><li><span><a href=\"#Ridge-Regression\" data-toc-modified-id=\"Ridge-Regression-1.1.4.2\">Ridge Regression</a></span></li><li><span><a href=\"#Lasso-Regression\" data-toc-modified-id=\"Lasso-Regression-1.1.4.3\">Lasso Regression</a></span></li></ul></li><li><span><a href=\"#1.5-Compare-the-coefficients-of-your-two-best-linear-models-(not-knn)\" data-toc-modified-id=\"1.5-Compare-the-coefficients-of-your-two-best-linear-models-(not-knn)-1.1.5\">1.5 Compare the coefficients of your two best linear models (not knn)</a></span></li><li><span><a href=\"#1.6-Discuss-which-final-model-you-would-choose-to-predict-new-data\" data-toc-modified-id=\"1.6-Discuss-which-final-model-you-would-choose-to-predict-new-data-1.1.6\">1.6 Discuss which final model you would choose to predict new data</a></span></li></ul></li><li><span><a href=\"#Part-2:-Classification-on-red-and-white-wine-characteristics\" data-toc-modified-id=\"Part-2:-Classification-on-red-and-white-wine-characteristics-1.2\">Part 2: Classification on red and white wine characteristics</a></span><ul class=\"toc-item\"><li><span><a href=\"#2.1-Visualize-the-univariate-distribution\" data-toc-modified-id=\"2.1-Visualize-the-univariate-distribution-1.2.1\">2.1 Visualize the univariate distribution</a></span></li><li><span><a href=\"#Visualize-the-dependency-of-the-target-on-each-feature-from-2.1.\" data-toc-modified-id=\"Visualize-the-dependency-of-the-target-on-each-feature-from-2.1.-1.2.2\">Visualize the dependency of the target on each feature from 2.1.</a></span></li><li><span><a href=\"#2.2-Modelling-on-Relationships\" data-toc-modified-id=\"2.2-Modelling-on-Relationships-1.2.3\">2.2 
Modelling on Relationships</a></span><ul class=\"toc-item\"><li><span><a href=\"#Split-Training-and-Test\" data-toc-modified-id=\"Split-Training-and-Test-1.2.3.1\">Split Training and Test</a></span></li><li><span><a href=\"#Scaling\" data-toc-modified-id=\"Scaling-1.2.3.2\">Scaling</a></span></li><li><span><a href=\"#Build-Models-and-Evaluate\" data-toc-modified-id=\"Build-Models-and-Evaluate-1.2.3.3\">Build Models and Evaluate</a></span></li><li><span><a href=\"#Logistic-Regression\" data-toc-modified-id=\"Logistic-Regression-1.2.3.4\">Logistic Regression</a></span></li><li><span><a href=\"#Penalized-Logistic-Regression\" data-toc-modified-id=\"Penalized-Logistic-Regression-1.2.3.5\">Penalized Logistic Regression</a></span></li></ul></li><li><span><a href=\"#2.3-Tune-the-parameters-where-possible-using-GridSearchCV.\" data-toc-modified-id=\"2.3-Tune-the-parameters-where-possible-using-GridSearchCV.-1.2.4\">2.3 Tune the parameters where possible using GridSearchCV.</a></span><ul class=\"toc-item\"><li><span><a href=\"#KNN\" data-toc-modified-id=\"KNN-1.2.4.1\">KNN</a></span></li><li><span><a href=\"#Penalized-Logistic-Regression\" data-toc-modified-id=\"Penalized-Logistic-Regression-1.2.4.2\">Penalized Logistic Regression</a></span></li></ul></li><li><span><a href=\"#2.4-Change-the-cross-validation-strategy\" data-toc-modified-id=\"2.4-Change-the-cross-validation-strategy-1.2.5\">2.4 Change the cross-validation strategy</a></span></li><li><span><a href=\"#2.5-Compare-the-coefficients\" data-toc-modified-id=\"2.5-Compare-the-coefficients-1.2.6\">2.5 Compare the coefficients</a></span></li></ul></li></ul></li></ul></div>"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Supervised Learning on California Test Scores"
+"# Supervised Learning"
 ]
 },
 {
@@ -2925,7 +2925,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## 2.5 Compare the coefficients \n",
+"### 2.5 Compare the coefficients \n",
 "\n",
 "Lastly, compare the coefficients for Logistic Regression and Penalized Logistic Regression and discuss which final model you would choose to predict new data."
 ]

3. Classification on Text Data.ipynb

Lines changed: 4 additions & 4 deletions
@@ -7,14 +7,14 @@
 },
 "source": [
 "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
-
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Build-a-classification-model-using-text-data\" data-toc-modified-id=\"Build-a-classification-model-using-text-data-1\">Build a classification model using text data</a></span><ul class=\"toc-item\"><li><span><a href=\"#Import-the-text-data\" data-toc-modified-id=\"Import-the-text-data-1.1\">Import the text data</a></span></li><li><span><a href=\"#Vectorize-the-review-column-into-an-X-matrix.\" data-toc-modified-id=\"Vectorize-the-review-column-into-an-X-matrix.-1.2\">Vectorize the review column into an X matrix.</a></span></li><li><span><a href=\"#Run-three-models-and-Select\" data-toc-modified-id=\"Run-three-models-and-Select-1.3\">Run three models and Select</a></span><ul class=\"toc-item\"><li><span><a href=\"#a)-KNN\" data-toc-modified-id=\"a)-KNN-1.3.1\">a) KNN</a></span></li><li><span><a href=\"#b)-Penalized-Logistic-regression\" data-toc-modified-id=\"b)-Penalized-Logistic-regression-1.3.2\">b) Penalized Logistic regression</a></span></li><li><span><a href=\"#c)-Bagged-Tree\" data-toc-modified-id=\"c)-Bagged-Tree-1.3.3\">c) Bagged Tree</a></span></li><li><span><a href=\"#I-choose-the-Penalized-Logistic-regression-because-this-model-perform-best-in-terms-of-positive-and-negtive-case-classification-(highest-F1-score)\" data-toc-modified-id=\"I-choose-the-Penalized-Logistic-regression-because-this-model-perform-best-in-terms-of-positive-and-negtive-case-classification-(highest-F1-score)-1.3.4\">I choose the Penalized Logistic regression because this model perform best in terms of positive and negtive case classification (highest F1 score)</a></span></li></ul></li><li><span><a href=\"#Inspect-all-models-by-visualizing-the-coefficients.\" data-toc-modified-id=\"Inspect-all-models-by-visualizing-the-coefficients.-1.4\">Inspect all models by visualizing the coefficients.</a></span></li></ul></li></ul></div>"
+
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Classification-on-text-data\" data-toc-modified-id=\"Classification-on-text-data-1\">Classification on text data</a></span><ul class=\"toc-item\"><li><span><a href=\"#Import-the-text-data\" data-toc-modified-id=\"Import-the-text-data-1.1\">Import the text data</a></span></li><li><span><a href=\"#Vectorize\" data-toc-modified-id=\"Vectorize-1.2\">Vectorize</a></span></li><li><span><a href=\"#Run-three-models-and-Select\" data-toc-modified-id=\"Run-three-models-and-Select-1.3\">Run three models and Select</a></span><ul class=\"toc-item\"><li><span><a href=\"#a)-KNN\" data-toc-modified-id=\"a)-KNN-1.3.1\">a) KNN</a></span></li><li><span><a href=\"#b)-Penalized-Logistic-regression\" data-toc-modified-id=\"b)-Penalized-Logistic-regression-1.3.2\">b) Penalized Logistic regression</a></span></li><li><span><a href=\"#c)-Bagged-Tree\" data-toc-modified-id=\"c)-Bagged-Tree-1.3.3\">c) Bagged Tree</a></span></li></ul></li><li><span><a href=\"#Inspect-all-models-by-visualizing-the-coefficients.\" data-toc-modified-id=\"Inspect-all-models-by-visualizing-the-coefficients.-1.4\">Inspect all models by visualizing the coefficients.</a></span></li></ul></li></ul></div>"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Build a classification model using text data"
+"# Classification on text data"
 ]
 },
 {
@@ -168,7 +168,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Vectorize the review column into an X matrix.  "
+"## Vectorize"
 ]
 },
 {
@@ -729,7 +729,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### I choose the Penalized Logistic regression because this model perform best in terms of positive and negtive case classification (highest F1 score)"
+"*I choose the Penalized Logistic regression because this model perform best in terms of positive and negtive case classification (highest F1 score)*"
 ]
 },
 {

4. Neural Network using Keras.ipynb

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,14 +7,14 @@
77
},
88
"source": [
99
"<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
10-
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Build-a-predictive-neural-network-using-Keras\" data-toc-modified-id=\"Build-a-predictive-neural-network-using-Keras-1\">Build a predictive neural network using Keras</a></span><ul class=\"toc-item\"><li><span><a href=\"#Run-a-multilayer-perceptron-with-two-hidden-layers\" data-toc-modified-id=\"Run-a-multilayer-perceptron-with-two-hidden-layers-1.1\">Run a multilayer perceptron with two hidden layers</a></span></li><li><span><a href=\"#selecting-the-number-of-hidden-units-using-GridSearchCV-and-evaluation-on-a-test-set.\" data-toc-modified-id=\"selecting-the-number-of-hidden-units-using-GridSearchCV-and-evaluation-on-a-test-set.-1.2\">selecting the number of hidden units using GridSearchCV and evaluation on a test-set.</a></span></li><li><span><a href=\"#Describe-the-differences-in-the-predictive-accuracy-of-models-with-different-numbers-of-hidden-units.\" data-toc-modified-id=\"Describe-the-differences-in-the-predictive-accuracy-of-models-with-different-numbers-of-hidden-units.-1.3\">Describe the differences in the predictive accuracy of models with different numbers of hidden units.</a></span></li><li><span><a href=\"#Describe-the-predictive-strength-of-your-best-model.\" data-toc-modified-id=\"Describe-the-predictive-strength-of-your-best-model.-1.4\">Describe the predictive strength of your best model.</a></span></li></ul></li></ul></div>"
+
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Predictive-neural-network-using-Keras\" data-toc-modified-id=\"Predictive-neural-network-using-Keras-1\">Predictive neural network using Keras</a></span><ul class=\"toc-item\"><li><span><a href=\"#Run-a-multilayer-perceptron-with-two-hidden-layers\" data-toc-modified-id=\"Run-a-multilayer-perceptron-with-two-hidden-layers-1.1\">Run a multilayer perceptron with two hidden layers</a></span></li><li><span><a href=\"#selecting-the-number-of-hidden-units-using-GridSearchCV-and-evaluation-on-a-test-set.\" data-toc-modified-id=\"selecting-the-number-of-hidden-units-using-GridSearchCV-and-evaluation-on-a-test-set.-1.2\">selecting the number of hidden units using GridSearchCV and evaluation on a test-set.</a></span></li><li><span><a href=\"#Describe-the-differences-in-the-predictive-accuracy-of-models-with-different-numbers-of-hidden-units.\" data-toc-modified-id=\"Describe-the-differences-in-the-predictive-accuracy-of-models-with-different-numbers-of-hidden-units.-1.3\">Describe the differences in the predictive accuracy of models with different numbers of hidden units.</a></span></li><li><span><a href=\"#Describe-the-predictive-strength-of-your-best-model.\" data-toc-modified-id=\"Describe-the-predictive-strength-of-your-best-model.-1.4\">Describe the predictive strength of your best model.</a></span></li></ul></li></ul></div>"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Build a predictive neural network using Keras"
+"# Predictive neural network using Keras"
 ]
 },
 {

README.md

Lines changed: 20 additions & 4 deletions
@@ -13,11 +13,27 @@ Reference books:
 
 **Software**: `Python`, `sklearn`, `numpy`, `pandas`, `matplotlib`, `keras`
 
-## Supervised Learning
+## [Supervised Learning](https://github.com/YiAlpha/machine-learning-python/blob/main/1.%20Supervised%20Learning.ipynb)
 
-## Unsupervised Learning
+1. Regression on California Test Scores
+2. Classification on red and white wine characteristics
 
-## Modelling on Text Data
+## [Unsupervised Learning](https://github.com/YiAlpha/machine-learning-python/blob/main/2.%20Unsupervised%20Learning%20on%20Wine%20Quality%20Dataset.ipynb)
 
-## Neural Network
+1. K Means Cluster
+2. Hierarchical Cluster Analysis
+3. Principal Component Analysis
 
+## [Classification on Text Data](https://github.com/YiAlpha/machine-learning-python/blob/main/3.%20Classification%20on%20Text%20Data.ipynb)
+
+1. Import the text data
+2. Vectorize
+3. Run three models and Select
+4. Inspect all models by visualizing the coefficients
+
+## [Neural Network](https://github.com/YiAlpha/machine-learning-python/blob/main/4.%20Neural%20Network%20using%20Keras.ipynb)
+
+1. Run a multilayer perceptron with two hidden layers
+2. selecting the number of hidden units using `GridSearchCV` and evaluation on a test-set.
+3. Describe the differences in the predictive accuracy of models with different numbers of hidden units.
+4. Describe the predictive strength of your best model.

