Commit

working on response
adamw523 committed Oct 18, 2016
1 parent f977469 commit ea637ce
Showing 1 changed file with 46 additions and 37 deletions.
83 changes: 46 additions & 37 deletions projects/student_intervention/student_intervention.ipynb
@@ -245,7 +245,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 11,
"metadata": {
"collapsed": false,
"scrolled": false
@@ -296,7 +296,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 12,
"metadata": {
"collapsed": true
},
@@ -338,16 +338,17 @@
" * Easy to understand classification. Best suited when some features play big roles in separating large parts of the data\n",
" * Can determine whether a person will like a restaurant, based on some inputs like \"type of food\", \"atmosphere\", \"price\", \"popularity\"\n",
" * Used to create the game \"20 Questions\" where the game automatically chooses what object a person is thinking of by asking up to 20 questions.\n",
"* [strengths / weaknesses](https://classroom.udacity.com/nanodegrees/nd009/parts/0091345404/modules/544698886575460/lessons/5450810003/concepts/24497085540923)\n",
"* strengths / weaknesses\n",
" * prone to overfitting when the data contains many features, which produces an overly complicated decision tree\n",
" * Response from previous review feedback: [reference to the above point](https://classroom.udacity.com/nanodegrees/nd009/parts/0091345404/modules/544698886575460/lessons/5450810003/concepts/24497085540923)\n",
" * Very fast at prediction time\n",
" * Simple, can be printed out and interpreted by a human\n",
"* Why? Some features may be strong predictors of the students' graduation rate and would create large splits in the tree.\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 13,
"metadata": {
"collapsed": false
},
@@ -383,7 +384,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 14,
"metadata": {
"collapsed": false
},
@@ -394,7 +395,7 @@
"text": [
"Predicting labels using DecisionTreeClassifier...\n",
"Done!\n",
"Prediction time (secs): 0.001\n",
"Prediction time (secs): 0.003\n",
"F1 score for training set: 1.0\n"
]
}
@@ -417,7 +418,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 15,
"metadata": {
"collapsed": false
},
@@ -428,8 +429,8 @@
"text": [
"Predicting labels using DecisionTreeClassifier...\n",
"Done!\n",
"Prediction time (secs): 0.001\n",
"F1 score for test set: 0.692913385827\n"
"Prediction time (secs): 0.002\n",
"F1 score for test set: 0.706766917293\n"
]
}
],
@@ -440,7 +441,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 16,
"metadata": {
"collapsed": false
},
@@ -453,15 +454,15 @@
"Training set size: 100\n",
"Training DecisionTreeClassifier...\n",
"Done!\n",
"Training time (secs): 0.005\n",
"Training time (secs): 0.003\n",
"Predicting labels using DecisionTreeClassifier...\n",
"Done!\n",
"Prediction time (secs): 0.001\n",
"F1 score for training set: 1.0\n",
"Predicting labels using DecisionTreeClassifier...\n",
"Done!\n",
"Prediction time (secs): 0.001\n",
"F1 score for test set: 0.672413793103\n",
"Prediction time (secs): 0.000\n",
"F1 score for test set: 0.637168141593\n",
"------------------------------------------\n",
"Training set size: 200\n",
"Training DecisionTreeClassifier...\n",
@@ -474,7 +475,7 @@
"Predicting labels using DecisionTreeClassifier...\n",
"Done!\n",
"Prediction time (secs): 0.000\n",
"F1 score for test set: 0.802919708029\n",
"F1 score for test set: 0.776119402985\n",
"------------------------------------------\n",
"Training set size: 300\n",
"Training DecisionTreeClassifier...\n",
@@ -487,7 +488,7 @@
"Predicting labels using DecisionTreeClassifier...\n",
"Done!\n",
"Prediction time (secs): 0.000\n",
"F1 score for test set: 0.707692307692\n"
"F1 score for test set: 0.746268656716\n"
]
}
],
@@ -519,21 +520,21 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<table><thead><tr><td><strong>class</strong></td><td><strong> 100 train </strong></td> <td><strong> 100 test </strong></td><td><strong> 200 train </strong></td> <td><strong> 200 test </strong></td><td><strong> 300 train </strong></td> <td><strong> 300 test </strong></td></tr></thead><tbody><tr><td><strong> DecisionTreeClassifier </strong></td><td> 1.0000 </td><td> 0.6724 </td><td> 1.0000 </td><td> 0.8029 </td><td> 1.0000 </td><td> 0.7077 </td></tr></tbody></table>"
"<table><thead><tr><td><strong>class</strong></td><td><strong> 100 train </strong></td> <td><strong> 100 test </strong></td><td><strong> 200 train </strong></td> <td><strong> 200 test </strong></td><td><strong> 300 train </strong></td> <td><strong> 300 test </strong></td></tr></thead><tbody><tr><td><strong> DecisionTreeClassifier </strong></td><td> 1.0000 </td><td> 0.6372 </td><td> 1.0000 </td><td> 0.7761 </td><td> 1.0000 </td><td> 0.7463 </td></tr></tbody></table>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"execution_count": 12,
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
@@ -550,7 +551,12 @@
"metadata": {},
"source": [
"### SVM\n",
"* General Applications: Great for classification where a good margin exists between classes of data.\n",
"* General Applications\n",
" * Great for classification where a good margin exists between classes of data.\n",
" * Typical uses include\n",
" * text classification\n",
" * image classification\n",
" * handwritten character recognition\n",
"* [strengths / weaknesses](https://classroom.udacity.com/nanodegrees/nd009/parts/0091345404/modules/544698886575460/lessons/5447009165/concepts/23841887100923)\n",
" * work really well in complicated domains with clear margin of separation\n",
" * not great with too much noise\n",
@@ -560,7 +566,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 18,
"metadata": {
"collapsed": false
},
@@ -571,7 +577,7 @@
"text": [
"Training SVC...\n",
"Done!\n",
"Training time (secs): 0.010\n"
"Training time (secs): 0.014\n"
]
}
],
@@ -596,7 +602,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 19,
"metadata": {
"collapsed": false
},
@@ -607,7 +613,7 @@
"text": [
"Predicting labels using SVC...\n",
"Done!\n",
"Prediction time (secs): 0.007\n",
"Prediction time (secs): 0.009\n",
"F1 score for training set: 0.858387799564\n"
]
}
@@ -630,7 +636,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 20,
"metadata": {
"collapsed": false
},
@@ -641,7 +647,7 @@
"text": [
"Predicting labels using SVC...\n",
"Done!\n",
"Prediction time (secs): 0.003\n",
"Prediction time (secs): 0.009\n",
"F1 score for test set: 0.846153846154\n"
]
}
@@ -653,7 +659,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 21,
"metadata": {
"collapsed": false
},
@@ -666,10 +672,10 @@
"Training set size: 100\n",
"Training SVC...\n",
"Done!\n",
"Training time (secs): 0.002\n",
"Training time (secs): 0.004\n",
"Predicting labels using SVC...\n",
"Done!\n",
"Prediction time (secs): 0.001\n",
"Prediction time (secs): 0.003\n",
"F1 score for training set: 0.859060402685\n",
"Predicting labels using SVC...\n",
"Done!\n",
@@ -679,10 +685,10 @@
"Training set size: 200\n",
"Training SVC...\n",
"Done!\n",
"Training time (secs): 0.004\n",
"Training time (secs): 0.005\n",
"Predicting labels using SVC...\n",
"Done!\n",
"Prediction time (secs): 0.003\n",
"Prediction time (secs): 0.004\n",
"F1 score for training set: 0.858064516129\n",
"Predicting labels using SVC...\n",
"Done!\n",
@@ -732,7 +738,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 22,
"metadata": {
"collapsed": false
},
@@ -746,7 +752,7 @@
"<IPython.core.display.HTML object>"
]
},
"execution_count": 17,
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
@@ -764,11 +770,14 @@
"source": [
"### K-NN\n",
"\n",
"* General Applications: Great for classification of data with complex decision boundaries, when there is not too much training data.\n",
"* General Applications:\n",
" * Great for classification of data with complex decision boundaries, when there is not too much training data.\n",
" * Very popular in recommender systems\n",
" * Used in Concept Search - searching for documents containing similar topics\n",
"* strengths / weaknesses\n",
" * good when training data is large\n",
" * need to figure out parameter K\n",
" * predictions are memory and computationally expensive with lots of training data\n",
" * good when training data is large\n",
" * need to figure out parameter K\n",
" * predictions are memory and computationally expensive with lots of training data\n",
"* Why? We have a small enough training set to use in the prediction step.\n"
]
},
@@ -1190,7 +1199,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.11"
"version": "2.7.12"
}
},
"nbformat": 4,
Expand Down
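
Note on the reported metrics: the cell outputs in this diff print a prediction time and an F1 score for each classifier, but the notebook's helper code is not part of the changed lines. The following is only a minimal, dependency-free sketch of how such numbers could be produced. The names `f1_score_binary`, `timed_predict`, and `always_yes`, and the positive label `"yes"`, are assumptions made for illustration, not the notebook's actual functions or data.

```python
import time


def f1_score_binary(y_true, y_pred, pos_label="yes"):
    """F1 = 2 * precision * recall / (precision + recall) for one positive label.

    pos_label="yes" is an assumed label, not taken from the notebook.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / float(tp + fp)
    recall = tp / float(tp + fn)
    return 2 * precision * recall / (precision + recall)


def timed_predict(predict, features):
    """Call predict(features) and return (labels, elapsed seconds)."""
    start = time.time()
    labels = predict(features)
    return labels, time.time() - start


def always_yes(rows):
    """Hypothetical stand-in for a trained classifier's predict method."""
    return ["yes"] * len(rows)


y_true = ["yes", "yes", "no", "yes", "no"]
y_pred, secs = timed_predict(always_yes, [None] * len(y_true))
print("Prediction time (secs): {:.3f}".format(secs))
print("F1 score: {:.4f}".format(f1_score_binary(y_true, y_pred)))
```

With this toy input, precision is 3/5 and recall is 3/3, so F1 is about 0.75. A training-set F1 of 1.0 next to a test-set F1 near 0.7, as in the DecisionTreeClassifier output above, is the pattern the overfitting bullet describes.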
