chapter 11 Question 9.2 #283

Closed

nitml opened this issue Aug 16, 2018 · 2 comments

Comments

@nitml

nitml commented Aug 16, 2018

In question 9.2, to reuse and freeze the hidden layers from question 8 (the DNN trained on MNIST digits 0-4) and create a new DNN for digits 5-9, the code was:

import time
import numpy as np
import tensorflow as tf

reset_graph()  # notebook helper that resets the default graph and the random seeds
restore_saver = tf.train.import_meta_graph("./my_best_mnist_model_0_to_4.meta")

# Get handles to the tensors we need from the restored graph
X = tf.get_default_graph().get_tensor_by_name("X:0")
y = tf.get_default_graph().get_tensor_by_name("y:0")
loss = tf.get_default_graph().get_tensor_by_name("loss:0")
Y_proba = tf.get_default_graph().get_tensor_by_name("Y_proba:0")
logits = Y_proba.op.inputs[0]
accuracy = tf.get_default_graph().get_tensor_by_name("accuracy:0")

learning_rate = 0.01

# Freeze the pretrained layers: only the output layer's variables get trained
output_layer_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="logits")
optimizer = tf.train.AdamOptimizer(learning_rate, name="Adam2")
training_op = optimizer.minimize(loss, var_list=output_layer_vars)

correct = tf.nn.in_top_k(logits, y, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32), name="accuracy")

init = tf.global_variables_initializer()
five_frozen_saver = tf.train.Saver()

n_epochs = 1000
batch_size = 20
max_checks_without_progress = 20
checks_without_progress = 0
best_loss = np.infty

with tf.Session() as sess:
    init.run()
    restore_saver.restore(sess, "./my_best_mnist_model_0_to_4")
    for var in output_layer_vars:
        var.initializer.run()

    t0 = time.time()

    for epoch in range(n_epochs):
        pass  # rest of the training loop omitted in this excerpt

To freeze the pretrained layers, the idea was to exclude their variables from the optimizer's list of trainable variables:

output_layer_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="logits")
optimizer = tf.train.AdamOptimizer(learning_rate, name="Adam2")
training_op = optimizer.minimize(loss, var_list=output_layer_vars)

My question is: why have we used the following lines of code (line 25)?

for var in output_layer_vars:
    var.initializer.run()
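
For reference, as far as I understand each iteration of this loop just runs one output-layer variable's initializer op, so a grouped initializer over the same output_layer_vars (hypothetical, not the notebook's code) would do the same thing:

init_output_layer_vars = tf.variables_initializer(output_layer_vars)

with tf.Session() as sess:
    init_output_layer_vars.run()  # initializes only the output layer's variables
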
@ageron
Owner

ageron commented Aug 16, 2018

Hi @nitml ,

Good question, I think it's a mistake (fortunately with no consequence).

Suppose there are 4 variables in the previous model ("my_best_mnist_model_0_to_4"), let's call them A, B, C, D. And suppose there are 2 variables in the new output layer, let's call them E, F.

  • init.run() initializes all variables, A, B, C, D, E, F.
  • restore_saver.restore() restores all variables (except those in the output layer): A, B, C, D.
  • the for loop initializes all variables in the output layer: E, F.

You can certainly remove the for loop; it is not useful, since all the variables have already been initialized by init.run().

Alternatively, you could remove the init.run() step, and everything should work fine since A, B, C, D would get initialized and restored by the restore_saver, and the for loop would initialize the remaining variables E, F.

The first option is simpler, but it is intellectually not very satisfying since A, B, C, D get initialized for nothing by the init.run() step (since the restore_saver will restore them anyway).
The second option is more complicated, and it does not add much value (perhaps a tiny performance gain since we initialize A, B, C, D only once).
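
For concreteness, here is a minimal sketch of the first option (not the exact notebook code): keep init.run() and the restore, and simply drop the for loop:

with tf.Session() as sess:
    init.run()                                                   # initializes A, B, C, D, E, F
    restore_saver.restore(sess, "./my_best_mnist_model_0_to_4")  # restores A, B, C, D over the fresh values
    # ... training loop as before, with no per-variable initializer loop ...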

I think I must have tried both options, and I forgot to choose between the two. ;-)

Hope this helps,
Aurélien

@nitml
Author

nitml commented Aug 16, 2018

Great explanation, thanks for your time and support!

@nitml nitml closed this as completed Aug 16, 2018
@nitml nitml reopened this Aug 16, 2018
@ageron ageron closed this as completed in c81f7ba Aug 20, 2018