Update the structure of the package
adler-j committed Dec 5, 2017
1 parent f1cc02f commit 1e4d1f7
Showing 7 changed files with 230 additions and 48 deletions.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
210 changes: 162 additions & 48 deletions code/classify_mnist.ipynb → code/part2_classification.ipynb
@@ -46,6 +46,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"scrolled": true
},
"outputs": [],
@@ -94,6 +95,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"scrolled": true
},
"outputs": [],
@@ -116,6 +118,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"scrolled": false
},
"outputs": [],
@@ -165,28 +168,6 @@
" return np.mean(result == test_labels)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create placeholders. Placeholders are needed in tensorflow since tensorflow is a lazy language,\n",
"and hence we first define the computational graph with placeholders as input, and later we evaluate it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"scrolled": true
},
"outputs": [],
"source": [
"with tf.name_scope('placeholders'):\n",
" images = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])\n",
" true_labels = tf.placeholder(tf.int32, shape=[None])"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -208,7 +189,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"toh = tf.one_hot([0, 1, 2], depth=3)\n",
@@ -268,25 +251,40 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"with tf.name_scope('elementary'):\n",
"with tf.name_scope('elementary_network'):\n",
" # Create a placeholder for our input data (no computation is done here)\n",
" X = tf.placeholder(shape=(None, 784), dtype=tf.float32, name=\"X\")\n",
" \n",
" # Create the parameters (weight, bias) of the model\n",
" weights = tf.Variable(tf.random_normal((784, 10)), name=\"weights\")\n",
" bias = tf.Variable(tf.zeros((10)), name=\"bias\")\n",
" lin = tf.matmul(X, weights)\n",
" lin_ = lin + bias\n",
" elin_ = tf.exp(lin_)\n",
" Z = tf.reduce_sum(tf.exp(lin_), axis=1, keep_dims=True)\n",
" prob = elin_ / Z\n",
" \n",
" # Compute the probabilities (this is all lazy, no computations are actually performed)\n",
" lin = tf.matmul(X, weights) + bias\n",
" elin = tf.exp(lin)\n",
" Z = tf.reduce_sum(elin, axis=1, keep_dims=True)\n",
" prob = elin / Z\n",
" log_prob = tf.log(prob)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define the loss function which measures how good our parameters are"
]
},
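The quantity built in the next cell is the average cross-entropy between the one-hot labels and the predicted probabilities. As a rough sketch (the code averages over all entries rather than summing over classes, which only rescales the loss by a constant):

$$L(w, b) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{10} y_{i,c} \log p_{i,c}$$

where $y_{i,c}$ is the one-hot encoding of the label of example $i$ and $p_{i,c}$ is the probability the model assigns to class $c$.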
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"with tf.name_scope(\"elementary_loss\"):\n",
@@ -295,40 +293,73 @@
" loss = -tf.reduce_mean(determ*log_prob)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define the gradient descent update step, i.e.\n",
"\n",
"$$w_i \\leftarrow w_i - \\omega \\nabla_{w_i} L(w, b)$$\n",
"\n",
"where $\\omega$ is the *learning rate*, or step size.\n",
"\n",
"Note that in machine learning, we typically use *stochastic* gradient descent (SGD). In these methods we don't use all of the data to compute the gradient, only a small subset called a mini-batch. Here we use 128 images in each training step.\n",
"\n",
"Further, while for this case computing the gradient would be quite simple, once we move to harder and mroe complicated models doing so would be basically impossible to do by hand. To work around this, all major deep learning frameworks implement [automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation). This may sound fancy, but automatic differentiation is simply the chain rule for the derivative. Tensorflow implements it using the `tf.gradients` command."
]
},
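To make `tf.gradients` concrete, here is a minimal, self-contained sketch (a hypothetical toy example, not part of the notebook) that uses automatic differentiation and the same assign-style update to minimize a simple quadratic:

```python
import tensorflow as tf

# Toy problem: minimize f(w) = (w - 3)^2 by gradient descent.
w = tf.Variable(0.0, name='w')
loss = (w - 3.0) ** 2

# Automatic differentiation: TensorFlow applies the chain rule through the graph.
grad = tf.gradients(loss, [w])[0]

learning_rate = 0.1
update_op = w.assign(w - learning_rate * grad)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(50):
        sess.run(update_op)
    print(sess.run(w))  # close to 3.0
```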
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"with tf.name_scope(\"elementary_training\"): \n",
"with tf.name_scope(\"elementary_training\"):\n",
" learning_rate = .1\n",
" batch_size = 2**7\n",
" batch_size = 128\n",
"\n",
" variables = [weights, bias]\n",
" gradients = tf.gradients(loss, variables)\n",
" update_ops = [var.assign(var - learning_rate*grad) \n",
" for var, grad in zip(variables, gradients)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since all the code above was lazy, nothing has actually happened. Before we start we need to initialize the variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"init = tf.global_variables_initializer().run()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We train the network by feeding data from the training set and occationally evalute the performance on our test set, this is the first point we actually start doing computations"
]
},
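The diff only shows part of the training loop below. As a rough sketch of what such a feed-and-evaluate loop typically looks like, assuming the running session, the placeholders `X` and `labels`, the `update_ops`, `prob`, and the `evaluate` helper defined in the surrounding cells (a sketch, not the notebook's exact code):

```python
# Sketch only: feed mini-batches, occasionally report test accuracy.
for i in range(100000):
    images_, labels_ = mnist.train.next_batch(batch_size)
    session.run(update_ops, feed_dict={X: images_, labels: labels_})

    if i % 1000 == 0:
        accuracy = evaluate(tf.argmax(prob, axis=1), X)
        print('step {}: test accuracy {:.3f}'.format(i, accuracy))
```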
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"scrolled": false
},
"outputs": [],
"source": [
"feed_dict={labels:mnist.train.labels[:batch_size], X:mnist.train.images[:batch_size]}\n",
"for i in range(100000):\n",
" images_, labels_ = mnist.train.next_batch(batch_size)\n",
" session.run(update_ops, \n",
@@ -342,33 +373,115 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using TensorFlow libraries"
"### Using TensorFlow libraries\n",
"\n",
"While the above code solves our problem, it involved several small and perhaps obscure steps. Once we start moving to more complicated neural networks the code would become very repetetive.\n",
"\n",
"Since all of the steps are standardized, we can (and should) instead use built in tensorflow functions, this example does that, and all following examples will do the same."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Placeholders"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"scrolled": true
},
"outputs": [],
"source": [
"with tf.name_scope('placeholders'):\n",
" images = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])\n",
" true_labels = tf.placeholder(tf.int32, shape=[None])"
]
},
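As a reminder of how a placeholder receives data at run time, here is a minimal, self-contained sketch (a toy example, independent of the notebook's variables):

```python
import numpy as np
import tensorflow as tf

# A placeholder is a graph input with no value until the graph is run.
x = tf.placeholder(tf.float32, shape=[None, 3])
doubled = 2.0 * x

with tf.Session() as sess:
    # The value is supplied through feed_dict at run time.
    print(sess.run(doubled, feed_dict={x: np.ones((2, 3), dtype=np.float32)}))
```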
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Network\n",
"\n",
"The \"network\" can be computed using the `tf.contrib.layers.fully_connected` function, which computes\n",
"\n",
"$$\\rho(Ax + b)$$\n",
"\n",
"where $\\rho$ is the activation function, $A$ the weights and $b$ the bias. Note that here we never explicitly construct these, they are hidden inside tensorflow."
]
},
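As a sketch of roughly what `fully_connected(x, 10, activation_fn=None)` builds internally (illustrative only, not the library's actual implementation, and assuming `x` is the flattened image batch from the cell below):

```python
# Illustrative equivalent of a fully connected layer with no activation.
num_inputs = 28 * 28   # flattened image size
num_outputs = 10

A = tf.Variable(tf.random_normal((num_inputs, num_outputs)), name='A')  # weights
b = tf.Variable(tf.zeros([num_outputs]), name='b')                      # bias
logits_manual = tf.matmul(x, A) + b  # rho(Ax + b) with rho = identity
```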
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"with tf.name_scope('logistic_regression'):\n",
" x = tf.contrib.layers.flatten(images)\n",
" logits = tf.contrib.layers.fully_connected(x, 10,\n",
" activation_fn=None)\n",
" pred = tf.argmax(logits, axis=1)\n",
" \n",
" activation_fn=None)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Loss and optimization\n",
"\n",
"The loss function defined above should be done using the `tf.nn.softmax_cross_entropy_with_logits` function, which not only is easier to use, it is also more numerically stable"
]
},
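A small sketch of why the fused op is preferred numerically (toy tensors; the shift-by-max trick illustrates the idea and is not claimed to be TensorFlow's exact implementation):

```python
# Naive softmax + log overflows for large logits; shifting by the max does not.
z = tf.constant([[1000.0, 0.0, -1000.0]])   # logits
y = tf.constant([[1.0, 0.0, 0.0]])          # one-hot label

# Naive: exp(1000) overflows to inf, giving nan.
p_naive = tf.exp(z) / tf.reduce_sum(tf.exp(z), axis=1, keep_dims=True)
loss_naive = -tf.reduce_sum(y * tf.log(p_naive), axis=1)

# Stable: subtract the max logit before exponentiating.
z_shift = z - tf.reduce_max(z, axis=1, keep_dims=True)
log_prob = z_shift - tf.log(tf.reduce_sum(tf.exp(z_shift), axis=1, keep_dims=True))
loss_stable = -tf.reduce_sum(y * log_prob, axis=1)

# The fused op handles this internally.
loss_fused = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=z)
# session.run(...) gives [nan] for loss_naive but [0.] for loss_stable and loss_fused.
```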
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"with tf.name_scope('optimizer'):\n",
" one_hot_labels = tf.one_hot(true_labels, depth=10)\n",
" \n",
" loss = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_labels,\n",
" logits=logits)\n",
" optimizer = tf.train.AdamOptimizer().minimize(loss)\n",
" optimizer = tf.train.AdamOptimizer().minimize(loss)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"session.run(tf.global_variables_initializer())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train the network\n",
"\n",
"Training the network looks about the same as above"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"scrolled": true
},
"outputs": [],
"source": [
"# Initialize all TF variables\n",
"session.run(tf.global_variables_initializer())\n",
"\n",
"for i in range(10000):\n",
" batch = mnist.train.next_batch(128)\n",
" train_images = batch[0].reshape([-1, 28, 28, 1])\n",
Expand All @@ -377,9 +490,8 @@
" session.run(optimizer, feed_dict={images: train_images, \n",
" true_labels: train_labels})\n",
"\n",
" if i % 100 == 0:\n",
" print('{} Average correct: {}'.format(\n",
" i, evaluate(pred, images)))"
" if i % 1000 == 0:\n",
" print(\"{:.1f}%, \".format(evaluate(tf.argmax(logits, axis=1), X)*100), end=\"\")"
]
},
{
@@ -401,6 +513,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"scrolled": false
},
"outputs": [],
@@ -451,6 +564,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"scrolled": true
},
"outputs": [],
@@ -505,7 +619,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.5.3"
}
},
"nbformat": 4,