clean table of contents for ipynb and update readme

yintellect · yintellect · commit fa23f9d704ac · 2021-03-29T00:49:46.000-05:00
diff --git a/1. Supervised Learning.ipynb b/1. Supervised Learning.ipynb
@@ -7,14 +7,14 @@
    },
    "source": [
     "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
-    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Supervised-Learning-on-California-Test-Scores\" data-toc-modified-id=\"Supervised-Learning-on-California-Test-Scores-1\">Supervised Learning on California Test Scores</a></span><ul class=\"toc-item\"><li><span><a href=\"#Part-1:-Regression-on-California-Test-Scores\" data-toc-modified-id=\"Part-1:-Regression-on-California-Test-Scores-1.1\">Part 1: Regression on California Test Scores</a></span><ul class=\"toc-item\"><li><span><a href=\"#1.1-Visualize-the-univariate-distribution\" data-toc-modified-id=\"1.1-Visualize-the-univariate-distribution-1.1.1\">1.1 Visualize the univariate distribution</a></span></li><li><span><a href=\"#1.2-Visualize-the-dependency-of-the-target-on-each-feature-from-1.1.\" data-toc-modified-id=\"1.2-Visualize-the-dependency-of-the-target-on-each-feature-from-1.1.-1.1.2\">1.2 Visualize the dependency of the target on each feature from 1.1.</a></span></li><li><span><a href=\"#1.3-Modelling-to-evaluate-relationship\" data-toc-modified-id=\"1.3-Modelling-to-evaluate-relationship-1.1.3\">1.3 Modelling to evaluate relationship</a></span><ul class=\"toc-item\"><li><span><a href=\"#1.3.1-Train-test-split\" data-toc-modified-id=\"1.3.1-Train-test-split-1.1.3.1\">1.3.1 Train test split</a></span></li><li><span><a href=\"#1.3.2-Build-Models\" data-toc-modified-id=\"1.3.2-Build-Models-1.1.3.2\">1.3.2 Build Models</a></span></li><li><span><a href=\"#1.3.3-Evaluate-Models-Using-Cross-Validation\" data-toc-modified-id=\"1.3.3-Evaluate-Models-Using-Cross-Validation-1.1.3.3\">1.3.3 Evaluate Models Using Cross-Validation</a></span></li><li><span><a href=\"#1.3.4-Does-scaling-the-data-with-the-StandardScaler-help?\" data-toc-modified-id=\"1.3.4-Does-scaling-the-data-with-the-StandardScaler-help?-1.1.3.4\">1.3.4 Does scaling the data with the StandardScaler help?</a></span></li></ul></li><li><span><a href=\"#1.4-Tune-the-parameters-of-the-models-where-possible-using-GridSearchCV.\" 
data-toc-modified-id=\"1.4-Tune-the-parameters-of-the-models-where-possible-using-GridSearchCV.-1.1.4\">1.4 Tune the parameters of the models where possible using GridSearchCV.</a></span><ul class=\"toc-item\"><li><span><a href=\"#KNN-for-regression\" data-toc-modified-id=\"KNN-for-regression-1.1.4.1\">KNN for regression</a></span></li><li><span><a href=\"#Ridge-Regression\" data-toc-modified-id=\"Ridge-Regression-1.1.4.2\">Ridge Regression</a></span></li><li><span><a href=\"#Lasso-Regression\" data-toc-modified-id=\"Lasso-Regression-1.1.4.3\">Lasso Regression</a></span></li></ul></li><li><span><a href=\"#1.5-Compare-the-coefficients-of-your-two-best-linear-models-(not-knn)\" data-toc-modified-id=\"1.5-Compare-the-coefficients-of-your-two-best-linear-models-(not-knn)-1.1.5\">1.5 Compare the coefficients of your two best linear models (not knn)</a></span></li><li><span><a href=\"#1.6-Discuss-which-final-model-you-would-choose-to-predict-new-data\" data-toc-modified-id=\"1.6-Discuss-which-final-model-you-would-choose-to-predict-new-data-1.1.6\">1.6 Discuss which final model you would choose to predict new data</a></span></li></ul></li><li><span><a href=\"#Part-2:-Classification-on-red-and-white-wine-characteristics\" data-toc-modified-id=\"Part-2:-Classification-on-red-and-white-wine-characteristics-1.2\">Part 2: Classification on red and white wine characteristics</a></span><ul class=\"toc-item\"><li><span><a href=\"#2.1-Visualize-the-univariate-distribution\" data-toc-modified-id=\"2.1-Visualize-the-univariate-distribution-1.2.1\">2.1 Visualize the univariate distribution</a></span></li><li><span><a href=\"#Visualize-the-dependency-of-the-target-on-each-feature-from-2.1.\" data-toc-modified-id=\"Visualize-the-dependency-of-the-target-on-each-feature-from-2.1.-1.2.2\">Visualize the dependency of the target on each feature from 2.1.</a></span></li><li><span><a href=\"#2.2-Modelling-on-Relationships\" data-toc-modified-id=\"2.2-Modelling-on-Relationships-1.2.3\">2.2 
Modelling on Relationships</a></span><ul class=\"toc-item\"><li><span><a href=\"#Split-Training-and-Test\" data-toc-modified-id=\"Split-Training-and-Test-1.2.3.1\">Split Training and Test</a></span></li><li><span><a href=\"#Scaling\" data-toc-modified-id=\"Scaling-1.2.3.2\">Scaling</a></span></li><li><span><a href=\"#Build-Models-and-Evaluate\" data-toc-modified-id=\"Build-Models-and-Evaluate-1.2.3.3\">Build Models and Evaluate</a></span></li><li><span><a href=\"#Logistic-Regression\" data-toc-modified-id=\"Logistic-Regression-1.2.3.4\">Logistic Regression</a></span></li><li><span><a href=\"#Penalized-Logistic-Regression\" data-toc-modified-id=\"Penalized-Logistic-Regression-1.2.3.5\">Penalized Logistic Regression</a></span></li></ul></li><li><span><a href=\"#2.3-Tune-the-parameters-where-possible-using-GridSearchCV.\" data-toc-modified-id=\"2.3-Tune-the-parameters-where-possible-using-GridSearchCV.-1.2.4\">2.3 Tune the parameters where possible using GridSearchCV.</a></span><ul class=\"toc-item\"><li><span><a href=\"#KNN\" data-toc-modified-id=\"KNN-1.2.4.1\">KNN</a></span></li><li><span><a href=\"#Penalized-Logistic-Regression\" data-toc-modified-id=\"Penalized-Logistic-Regression-1.2.4.2\">Penalized Logistic Regression</a></span></li></ul></li><li><span><a href=\"#2.4-Change-the-cross-validation-strategy\" data-toc-modified-id=\"2.4-Change-the-cross-validation-strategy-1.2.5\">2.4 Change the cross-validation strategy</a></span></li></ul></li><li><span><a href=\"#2.5-Compare-the-coefficients\" data-toc-modified-id=\"2.5-Compare-the-coefficients-1.3\">2.5 Compare the coefficients</a></span></li></ul></li></ul></div>"
+    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Supervised-Learning\" data-toc-modified-id=\"Supervised-Learning-1\">Supervised Learning</a></span><ul class=\"toc-item\"><li><span><a href=\"#Part-1:-Regression-on-California-Test-Scores\" data-toc-modified-id=\"Part-1:-Regression-on-California-Test-Scores-1.1\">Part 1: Regression on California Test Scores</a></span><ul class=\"toc-item\"><li><span><a href=\"#1.1-Visualize-the-univariate-distribution\" data-toc-modified-id=\"1.1-Visualize-the-univariate-distribution-1.1.1\">1.1 Visualize the univariate distribution</a></span></li><li><span><a href=\"#1.2-Visualize-the-dependency-of-the-target-on-each-feature-from-1.1.\" data-toc-modified-id=\"1.2-Visualize-the-dependency-of-the-target-on-each-feature-from-1.1.-1.1.2\">1.2 Visualize the dependency of the target on each feature from 1.1.</a></span></li><li><span><a href=\"#1.3-Modelling-to-evaluate-relationship\" data-toc-modified-id=\"1.3-Modelling-to-evaluate-relationship-1.1.3\">1.3 Modelling to evaluate relationship</a></span><ul class=\"toc-item\"><li><span><a href=\"#1.3.1-Train-test-split\" data-toc-modified-id=\"1.3.1-Train-test-split-1.1.3.1\">1.3.1 Train test split</a></span></li><li><span><a href=\"#1.3.2-Build-Models\" data-toc-modified-id=\"1.3.2-Build-Models-1.1.3.2\">1.3.2 Build Models</a></span></li><li><span><a href=\"#1.3.3-Evaluate-Models-Using-Cross-Validation\" data-toc-modified-id=\"1.3.3-Evaluate-Models-Using-Cross-Validation-1.1.3.3\">1.3.3 Evaluate Models Using Cross-Validation</a></span></li><li><span><a href=\"#1.3.4-Does-scaling-the-data-with-the-StandardScaler-help?\" data-toc-modified-id=\"1.3.4-Does-scaling-the-data-with-the-StandardScaler-help?-1.1.3.4\">1.3.4 Does scaling the data with the StandardScaler help?</a></span></li></ul></li><li><span><a href=\"#1.4-Tune-the-parameters-of-the-models-where-possible-using-GridSearchCV.\" 
data-toc-modified-id=\"1.4-Tune-the-parameters-of-the-models-where-possible-using-GridSearchCV.-1.1.4\">1.4 Tune the parameters of the models where possible using GridSearchCV.</a></span><ul class=\"toc-item\"><li><span><a href=\"#KNN-for-regression\" data-toc-modified-id=\"KNN-for-regression-1.1.4.1\">KNN for regression</a></span></li><li><span><a href=\"#Ridge-Regression\" data-toc-modified-id=\"Ridge-Regression-1.1.4.2\">Ridge Regression</a></span></li><li><span><a href=\"#Lasso-Regression\" data-toc-modified-id=\"Lasso-Regression-1.1.4.3\">Lasso Regression</a></span></li></ul></li><li><span><a href=\"#1.5-Compare-the-coefficients-of-your-two-best-linear-models-(not-knn)\" data-toc-modified-id=\"1.5-Compare-the-coefficients-of-your-two-best-linear-models-(not-knn)-1.1.5\">1.5 Compare the coefficients of your two best linear models (not knn)</a></span></li><li><span><a href=\"#1.6-Discuss-which-final-model-you-would-choose-to-predict-new-data\" data-toc-modified-id=\"1.6-Discuss-which-final-model-you-would-choose-to-predict-new-data-1.1.6\">1.6 Discuss which final model you would choose to predict new data</a></span></li></ul></li><li><span><a href=\"#Part-2:-Classification-on-red-and-white-wine-characteristics\" data-toc-modified-id=\"Part-2:-Classification-on-red-and-white-wine-characteristics-1.2\">Part 2: Classification on red and white wine characteristics</a></span><ul class=\"toc-item\"><li><span><a href=\"#2.1-Visualize-the-univariate-distribution\" data-toc-modified-id=\"2.1-Visualize-the-univariate-distribution-1.2.1\">2.1 Visualize the univariate distribution</a></span></li><li><span><a href=\"#Visualize-the-dependency-of-the-target-on-each-feature-from-2.1.\" data-toc-modified-id=\"Visualize-the-dependency-of-the-target-on-each-feature-from-2.1.-1.2.2\">Visualize the dependency of the target on each feature from 2.1.</a></span></li><li><span><a href=\"#2.2-Modelling-on-Relationships\" data-toc-modified-id=\"2.2-Modelling-on-Relationships-1.2.3\">2.2 
Modelling on Relationships</a></span><ul class=\"toc-item\"><li><span><a href=\"#Split-Training-and-Test\" data-toc-modified-id=\"Split-Training-and-Test-1.2.3.1\">Split Training and Test</a></span></li><li><span><a href=\"#Scaling\" data-toc-modified-id=\"Scaling-1.2.3.2\">Scaling</a></span></li><li><span><a href=\"#Build-Models-and-Evaluate\" data-toc-modified-id=\"Build-Models-and-Evaluate-1.2.3.3\">Build Models and Evaluate</a></span></li><li><span><a href=\"#Logistic-Regression\" data-toc-modified-id=\"Logistic-Regression-1.2.3.4\">Logistic Regression</a></span></li><li><span><a href=\"#Penalized-Logistic-Regression\" data-toc-modified-id=\"Penalized-Logistic-Regression-1.2.3.5\">Penalized Logistic Regression</a></span></li></ul></li><li><span><a href=\"#2.3-Tune-the-parameters-where-possible-using-GridSearchCV.\" data-toc-modified-id=\"2.3-Tune-the-parameters-where-possible-using-GridSearchCV.-1.2.4\">2.3 Tune the parameters where possible using GridSearchCV.</a></span><ul class=\"toc-item\"><li><span><a href=\"#KNN\" data-toc-modified-id=\"KNN-1.2.4.1\">KNN</a></span></li><li><span><a href=\"#Penalized-Logistic-Regression\" data-toc-modified-id=\"Penalized-Logistic-Regression-1.2.4.2\">Penalized Logistic Regression</a></span></li></ul></li><li><span><a href=\"#2.4-Change-the-cross-validation-strategy\" data-toc-modified-id=\"2.4-Change-the-cross-validation-strategy-1.2.5\">2.4 Change the cross-validation strategy</a></span></li><li><span><a href=\"#2.5-Compare-the-coefficients\" data-toc-modified-id=\"2.5-Compare-the-coefficients-1.2.6\">2.5 Compare the coefficients</a></span></li></ul></li></ul></li></ul></div>"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Supervised Learning on California Test Scores"
+    "# Supervised Learning"
    ]
   },
   {
@@ -2925,7 +2925,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## 2.5 Compare the coefficients \n",
+    "### 2.5 Compare the coefficients \n",
     "\n",
     "Lastly, compare the coefficients for Logistic Regression and Penalized Logistic Regression and discuss which final model you would choose to predict new data."
    ]
diff --git a/3. Classification on Text Data.ipynb b/3. Classification on Text Data.ipynb
@@ -7,14 +7,14 @@
    },
    "source": [
     "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
-    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Build-a-classification-model-using-text-data\" data-toc-modified-id=\"Build-a-classification-model-using-text-data-1\">Build a classification model using text data</a></span><ul class=\"toc-item\"><li><span><a href=\"#Import-the-text-data\" data-toc-modified-id=\"Import-the-text-data-1.1\">Import the text data</a></span></li><li><span><a href=\"#Vectorize-the-review-column-into-an-X-matrix.\" data-toc-modified-id=\"Vectorize-the-review-column-into-an-X-matrix.-1.2\">Vectorize the review column into an X matrix.</a></span></li><li><span><a href=\"#Run-three-models-and-Select\" data-toc-modified-id=\"Run-three-models-and-Select-1.3\">Run three models and Select</a></span><ul class=\"toc-item\"><li><span><a href=\"#a)-KNN\" data-toc-modified-id=\"a)-KNN-1.3.1\">a) KNN</a></span></li><li><span><a href=\"#b)-Penalized-Logistic-regression\" data-toc-modified-id=\"b)-Penalized-Logistic-regression-1.3.2\">b) Penalized Logistic regression</a></span></li><li><span><a href=\"#c)-Bagged-Tree\" data-toc-modified-id=\"c)-Bagged-Tree-1.3.3\">c) Bagged Tree</a></span></li><li><span><a href=\"#I-choose-the-Penalized-Logistic-regression-because-this-model-perform-best-in-terms-of-positive-and-negtive-case-classification-(highest-F1-score)\" data-toc-modified-id=\"I-choose-the-Penalized-Logistic-regression-because-this-model-perform-best-in-terms-of-positive-and-negtive-case-classification-(highest-F1-score)-1.3.4\">I choose the Penalized Logistic regression because this model perform best in terms of positive and negtive case classification (highest F1 score)</a></span></li></ul></li><li><span><a href=\"#Inspect-all-models-by-visualizing-the-coefficients.\" data-toc-modified-id=\"Inspect-all-models-by-visualizing-the-coefficients.-1.4\">Inspect all models by visualizing the coefficients.</a></span></li></ul></li></ul></div>"
+    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Classification-on-text-data\" data-toc-modified-id=\"Classification-on-text-data-1\">Classification on text data</a></span><ul class=\"toc-item\"><li><span><a href=\"#Import-the-text-data\" data-toc-modified-id=\"Import-the-text-data-1.1\">Import the text data</a></span></li><li><span><a href=\"#Vectorize\" data-toc-modified-id=\"Vectorize-1.2\">Vectorize</a></span></li><li><span><a href=\"#Run-three-models-and-Select\" data-toc-modified-id=\"Run-three-models-and-Select-1.3\">Run three models and Select</a></span><ul class=\"toc-item\"><li><span><a href=\"#a)-KNN\" data-toc-modified-id=\"a)-KNN-1.3.1\">a) KNN</a></span></li><li><span><a href=\"#b)-Penalized-Logistic-regression\" data-toc-modified-id=\"b)-Penalized-Logistic-regression-1.3.2\">b) Penalized Logistic regression</a></span></li><li><span><a href=\"#c)-Bagged-Tree\" data-toc-modified-id=\"c)-Bagged-Tree-1.3.3\">c) Bagged Tree</a></span></li></ul></li><li><span><a href=\"#Inspect-all-models-by-visualizing-the-coefficients.\" data-toc-modified-id=\"Inspect-all-models-by-visualizing-the-coefficients.-1.4\">Inspect all models by visualizing the coefficients.</a></span></li></ul></li></ul></div>"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Build a classification model using text data"
+    "# Classification on text data"
    ]
   },
   {
@@ -168,7 +168,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Vectorize the review column into an X matrix.  "
+    "## Vectorize"
    ]
   },
   {
@@ -729,7 +729,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### I choose the Penalized Logistic regression because this model perform best in terms of positive and negtive case classification (highest F1 score)"
+    "*I choose the Penalized Logistic regression because this model perform best in terms of positive and negtive case classification (highest F1 score)*"
    ]
   },
   {
diff --git a/4. Neural Network using Keras.ipynb b/4. Neural Network using Keras.ipynb
@@ -7,14 +7,14 @@
    },
    "source": [
     "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
-    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Build-a-predictive-neural-network-using-Keras\" data-toc-modified-id=\"Build-a-predictive-neural-network-using-Keras-1\">Build a predictive neural network using Keras</a></span><ul class=\"toc-item\"><li><span><a href=\"#Run-a-multilayer-perceptron-with-two-hidden-layers\" data-toc-modified-id=\"Run-a-multilayer-perceptron-with-two-hidden-layers-1.1\">Run a multilayer perceptron with two hidden layers</a></span></li><li><span><a href=\"#selecting-the-number-of-hidden-units-using-GridSearchCV-and-evaluation-on-a-test-set.\" data-toc-modified-id=\"selecting-the-number-of-hidden-units-using-GridSearchCV-and-evaluation-on-a-test-set.-1.2\">selecting the number of hidden units using GridSearchCV and evaluation on a test-set.</a></span></li><li><span><a href=\"#Describe-the-differences-in-the-predictive-accuracy-of-models-with-different-numbers-of-hidden-units.\" data-toc-modified-id=\"Describe-the-differences-in-the-predictive-accuracy-of-models-with-different-numbers-of-hidden-units.-1.3\">Describe the differences in the predictive accuracy of models with different numbers of hidden units.</a></span></li><li><span><a href=\"#Describe-the-predictive-strength-of-your-best-model.\" data-toc-modified-id=\"Describe-the-predictive-strength-of-your-best-model.-1.4\">Describe the predictive strength of your best model.</a></span></li></ul></li></ul></div>"
+    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Predictive-neural-network-using-Keras\" data-toc-modified-id=\"Predictive-neural-network-using-Keras-1\">Predictive neural network using Keras</a></span><ul class=\"toc-item\"><li><span><a href=\"#Run-a-multilayer-perceptron-with-two-hidden-layers\" data-toc-modified-id=\"Run-a-multilayer-perceptron-with-two-hidden-layers-1.1\">Run a multilayer perceptron with two hidden layers</a></span></li><li><span><a href=\"#selecting-the-number-of-hidden-units-using-GridSearchCV-and-evaluation-on-a-test-set.\" data-toc-modified-id=\"selecting-the-number-of-hidden-units-using-GridSearchCV-and-evaluation-on-a-test-set.-1.2\">selecting the number of hidden units using GridSearchCV and evaluation on a test-set.</a></span></li><li><span><a href=\"#Describe-the-differences-in-the-predictive-accuracy-of-models-with-different-numbers-of-hidden-units.\" data-toc-modified-id=\"Describe-the-differences-in-the-predictive-accuracy-of-models-with-different-numbers-of-hidden-units.-1.3\">Describe the differences in the predictive accuracy of models with different numbers of hidden units.</a></span></li><li><span><a href=\"#Describe-the-predictive-strength-of-your-best-model.\" data-toc-modified-id=\"Describe-the-predictive-strength-of-your-best-model.-1.4\">Describe the predictive strength of your best model.</a></span></li></ul></li></ul></div>"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Build a predictive neural network using Keras"
+    "# Predictive neural network using Keras"
    ]
   },
   {
diff --git a/README.md b/README.md
@@ -13,11 +13,27 @@ Reference books:
 
 **Software**: `Python`, `sklearn`, `numpy`, `pandas`, `matplotlib`, `keras`
 
-## Supervised Learning
+## [Supervised Learning](https://github.com/YiAlpha/machine-learning-python/blob/main/1.%20Supervised%20Learning.ipynb)
 
-## Unsupervised Learning
+1. Regression on California Test Scores
+2. Classification on red and white wine characteristics
 
-## Modelling on Text Data
+## [Unsupervised Learning](https://github.com/YiAlpha/machine-learning-python/blob/main/2.%20Unsupervised%20Learning%20on%20Wine%20Quality%20Dataset.ipynb)
 
-## Neural Network 
+1. K Means Cluster
+2. Hierarchical Cluster Analysis
+3. Principal Component Analysis
 
+## [Classification on Text Data](https://github.com/YiAlpha/machine-learning-python/blob/main/3.%20Classification%20on%20Text%20Data.ipynb)
+
+1. Import the text data
+2. Vectorize
+3. Run three models and Select
+4. Inspect all models by visualizing the coefficients
+
+## [Neural Network](https://github.com/YiAlpha/machine-learning-python/blob/main/4.%20Neural%20Network%20using%20Keras.ipynb) 
+
+1. Run a multilayer perceptron with two hidden layers 
+2. selecting the number of hidden units using `GridSearchCV` and evaluation on a test-set. 
+3. Describe the differences in the predictive accuracy of models with different numbers of hidden units. 
+4. Describe the predictive strength of your best model. 
