Pretrained Kth Model checkpoint tensor shapes does not match #22

Open
brandonhuo opened this issue Mar 27, 2019 · 11 comments

Comments

@brandonhuo

brandonhuo commented Mar 27, 2019

Hi Alex (@alexlee-gk),

When I try to load any of the pretrained KTH models (savp, gan, and all the others) for testing, I get the following error:

Traceback (most recent call last):
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [1] rhs shape= [3]
	 [[{{node save/Assign_77}}]]
	 [[{{node save/RestoreV2}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Users\huoqi\PyCharm Community Edition 2018.3.4\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Users\huoqi\PyCharm Community Edition 2018.3.4\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "F:/video_prediction/scripts/generate.py", line 193, in <module>
    main()
  File "F:/video_prediction/scripts/generate.py", line 152, in main
    model.restore(sess, args.checkpoint)

It seems the checkpoint cannot be restored for testing. I would very much appreciate your help!

The entire error message is below:
error.txt

@itstaby

itstaby commented Apr 4, 2019

Getting a very similar error, would appreciate help!

EDIT: Before this we ran into a dimension issue. With the batch size set to 8, the program raised an error saying that the batch size must evenly divide the dataset size (819), so we changed the batch size to 9.

@alexlee-gk

@ShreyasKolpe

ShreyasKolpe commented Apr 4, 2019

Hi @alexlee-gk, I am having a similar problem with the KTH dataset and the ours_savp model. This is the output of running:
python scripts/generate.py --input_dir data/kth --dataset_hparams sequence_length=30 --checkpoint pretrained_models/kth/ours_savp --mode test --results_dir results_test_samples/kth --batch_size 9

error.txt

@alexlee-gk
Owner

The errors from @brandonhuo and @ShreyasKolpe are the same. The problem is that the pre-trained model was trained with grayscale images that had 3 channels (same values tiled for the RGB channels) whereas now the grayscale images have 1 channel. I have new KTH models that were trained with the newer dataset format (and also perform better), but I haven't uploaded them yet. I'll do that soon. In the meantime, if you want to use the old pre-trained models, you can use grayscale images with 3 channels by commenting out this line: https://github.com/alexlee-gk/video_prediction/blob/master/video_prediction/datasets/kth_dataset.py#L118
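
To make the channel mismatch concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The helper names `tile_gray_to_rgb` and `shape` and the toy 2x2 frame are made up for illustration and are not part of the repository; with NumPy the same tiling would be `np.tile(frame, (1, 1, 3))`.

```python
def tile_gray_to_rgb(frame):
    """Repeat the single gray value of every pixel three times.

    `frame` is a nested list of shape [height][width][1]; the result has
    shape [height][width][3] with identical R, G and B values.
    Hypothetical helper, for illustration only.
    """
    return [[pixel * 3 for pixel in row] for row in frame]


def shape(nested):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# A toy 2x2 grayscale frame with one channel per pixel.
gray = [[[0], [64]], [[128], [255]]]
rgb = tile_gray_to_rgb(gray)

print(shape(gray))  # (2, 2, 1)
print(shape(rgb))   # (2, 2, 3)
```

A variable whose size follows the channel count would then have 3 entries in a checkpoint trained on the tiled images and 1 entry in a graph built for the newer format, which is consistent with the `lhs shape= [1] rhs shape= [3]` message above.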

The issue pointed out by @itstaby is different and that's just a limitation of the current implementation: the size of the evaluation dataset should be divisible by the batch size.
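
As a quick way to pick a batch size that satisfies this divisibility constraint, the following sketch lists every exact divisor of a dataset size. The function name `valid_batch_sizes` is hypothetical, and 819 is the dataset size reported earlier in this thread.

```python
def valid_batch_sizes(dataset_size):
    """Return every batch size that divides dataset_size exactly."""
    return [b for b in range(1, dataset_size + 1) if dataset_size % b == 0]


sizes = valid_batch_sizes(819)
print(sizes)    # [1, 3, 7, 9, 13, 21, 39, 63, 91, 117, 273, 819]
print(819 % 8)  # 3, so a batch size of 8 leaves a remainder
print(819 % 9)  # 0, so a batch size of 9 works
```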

@brandonhuo
Author

brandonhuo commented Apr 15, 2019

@alexlee-gk Thank you very much for the reply! After commenting out L118 in the kth_dataset.py script, I still get the following error:

Traceback (most recent call last):
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [3,3,39,7] rhs shape= [3,3,53,7]
	 [[{{node save/Assign_76}}]]
	 [[{{node save/RestoreV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:/video_prediction/scripts/generate.py", line 193, in <module>
    main()
  File "F:/video_prediction/scripts/generate.py", line 152, in main
    model.restore(sess, args.checkpoint)
  File "F:\video_prediction\video_prediction\models\savp_model.py", line 855, in restore
    super(SAVPVideoPredictionModel, self).restore(sess, checkpoints, restore_to_checkpoint_mapping)
  File "F:\video_prediction\video_prediction\models\base_model.py", line 246, in restore
    sess.run(restore_op)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [3,3,39,7] rhs shape= [3,3,53,7]
	 [[node save/Assign_76 (defined at F:\video_prediction\video_prediction\utils\tf_utils.py:542) ]]
	 [[node save/RestoreV2 (defined at F:\video_prediction\video_prediction\utils\tf_utils.py:542) ]]

Caused by op 'save/Assign_76', defined at:
  File "F:/video_prediction/scripts/generate.py", line 193, in <module>
    main()
  File "F:/video_prediction/scripts/generate.py", line 152, in main
    model.restore(sess, args.checkpoint)
  File "F:\video_prediction\video_prediction\models\savp_model.py", line 855, in restore
    super(SAVPVideoPredictionModel, self).restore(sess, checkpoints, restore_to_checkpoint_mapping)
  File "F:\video_prediction\video_prediction\models\base_model.py", line 243, in restore
    restore_to_checkpoint_mapping=restore_to_checkpoint_mapping)
  File "F:\video_prediction\video_prediction\utils\tf_utils.py", line 542, in get_checkpoint_restore_saver
    restore_saver = tf.train.Saver(max_to_keep=1, var_list=restore_and_checkpoint_vars, filename=checkpoint)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\training\saver.py", line 832, in __init__
    self.build()
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\training\saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\training\saver.py", line 881, in _build
    build_save=build_save, build_restore=build_restore)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\training\saver.py", line 513, in _build_internal
    restore_sequentially, reshape)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\training\saver.py", line 354, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\training\saving\saveable_object_util.py", line 73, in restore
    self.op.get_shape().is_fully_defined())
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\ops\state_ops.py", line 223, in assign
    validate_shape=validate_shape)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 68, in assign
    use_locking=use_locking, name=name)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [3,3,39,7] rhs shape= [3,3,53,7]
	 [[node save/Assign_76 (defined at F:\video_prediction\video_prediction\utils\tf_utils.py:542) ]]
	 [[node save/RestoreV2 (defined at F:\video_prediction\video_prediction\utils\tf_utils.py:542) ]]

Where could this error come from? Thanks again for the help!
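
When reading errors like this one, it can help to see exactly which axis disagrees. The sketch below parses the message text copied from the traceback above and reports the differing axes. The function names are hypothetical, and the regular expression assumes the exact `lhs shape= [...] rhs shape= [...]` wording shown in this thread.

```python
import re


def parse_shape_mismatch(message):
    """Extract (lhs, rhs) shapes from an 'Assign requires shapes...' message.

    Returns two lists of ints, or None if the message has a different form.
    """
    match = re.search(
        r"lhs shape= \[([\d,\s]*)\] rhs shape= \[([\d,\s]*)\]", message
    )
    if match is None:
        return None
    lhs, rhs = (
        [int(dim) for dim in group.split(",") if dim.strip()]
        for group in match.groups()
    )
    return lhs, rhs


def differing_axes(lhs, rhs):
    """List (axis, lhs_dim, rhs_dim) for every axis where the shapes differ."""
    if len(lhs) != len(rhs):
        raise ValueError("shapes have different ranks")
    return [(i, a, b) for i, (a, b) in enumerate(zip(lhs, rhs)) if a != b]


message = (
    "Assign requires shapes of both tensors to match. "
    "lhs shape= [3,3,39,7] rhs shape= [3,3,53,7]"
)
lhs, rhs = parse_shape_mismatch(message)
print(differing_axes(lhs, rhs))  # [(2, 39, 53)]
```

Only axis 2 differs here. Assuming the usual [height, width, in_channels, out_channels] layout of a convolution kernel, that points to a layer whose input depth differs between the graph being built and the checkpoint. The sketch only locates the mismatch; it does not identify its cause.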

@Chuckie-He

Hi @alexlee-gk, I am having a similar problem with the KTH dataset and the ours_savp model.
I preprocessed the KTH dataset in two ways (grayscale images with 1 channel and with 3 channels), but the problem still exists:
Traceback (most recent call last):
  File "scripts/evaluate.py", line 315, in <module>
    main()
  File "scripts/evaluate.py", line 252, in main
    model.build_graph(input_phs)
  File "/home/hechujing/demo/SAVP/video_prediction/models/base_model.py", line 478, in build_graph
    outputs_tuple, losses_tuple, loss_tuple, metrics_tuple = self.tower_fn(self.inputs)
  File "/home/hechujing/demo/SAVP/video_prediction/models/base_model.py", line 412, in tower_fn
    gen_outputs = self.generator_fn(inputs)
  File "/home/hechujing/demo/SAVP/video_prediction/models/savp_model.py", line 730, in generator_fn
    gen_outputs_posterior = generator_given_z_fn(inputs_posterior, mode, hparams)
  File "/home/hechujing/demo/SAVP/video_prediction/models/savp_model.py", line 694, in generator_given_z_fn
    outputs, _ = tf_utils.unroll_rnn(cell, inputs)
  File "/home/hechujing/demo/SAVP/video_prediction/utils/tf_utils.py", line 139, in unroll_rnn
    swap_memory=False, time_major=True, scope=scope)
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py", line 618, in dynamic_rnn
    dtype=dtype)
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py", line 815, in _dynamic_rnn_loop
    swap_memory=swap_memory)
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3209, in while_loop
    result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2941, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2878, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3179, in <lambda>
    body = lambda i, lv: (i + 1, orig_body(*lv))
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py", line 786, in _time_step
    (output, new_state) = call_cell()
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py", line 772, in <lambda>
    call_cell = lambda: cell(input_t, state)
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 232, in __call__
    return super(RNNCell, self).__call__(inputs, state)
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 329, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 703, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/hechujing/demo/SAVP/video_prediction/models/savp_model.py", line 549, in call
    kernels = dense(flatten(smallest_layer), np.prod(kernel_shape))
  File "/home/hechujing/demo/SAVP/video_prediction/ops.py", line 7, in dense
    input_shape = inputs.get_shape().as_list()
  File "/home/hechujing/anaconda3/envs/SAVP/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 903, in as_list
    raise ValueError("as_list() is not defined on an unknown TensorShape.")
ValueError: as_list() is not defined on an unknown TensorShape.

Whenever I try to train or evaluate, I get "ValueError: as_list() is not defined on an unknown TensorShape." at the line "input_shape = inputs.get_shape().as_list()". Where could this error come from, and how can I solve it? Thanks for the help!

@ry85

ry85 commented Apr 25, 2019

@brandonhuo @Chuckie-He After commenting out line 118 in the kth_dataset.py file, you need to run the download_and_preprocess_dataset.sh script again, and only then run the code. It will work.

https://github.com/alexlee-gk/video_prediction/blob/master/video_prediction/datasets/kth_dataset.py#L118

@wangwen39

I also ran into the "ValueError: as_list() is not defined on an unknown TensorShape." error at "input_shape = inputs.get_shape().as_list()". Commenting out line 118 in the kth_dataset.py file seems to have no effect on this problem.

@Chuckie-He

Me too. Whether I comment out line 118 or not, it doesn't work. I am very puzzled.

@wangwen39

Yes, I ran into this problem in both the training and prediction steps.

@AvaniPitre

Hello @wangwen39 @Chuckie-He,
were you able to resolve this error?
I am facing the same issue. Is there any way to sort it out?
Please help.

@malalejandra

Hi, I switched to TensorFlow 1.10 and at least testing is working.
