Failed to allocate memory #1093
Comments
That branch is 6 years old. You should try the main branch. We are also close to merging a new branch that upgrades to JetPack 4.6 and TensorFlow 2.9.
Thank you, @Ezward. How do I try the new branch?
@Ezward Failed to allocate memory using donkey v4.4.4-main
...
Layer (type)           Output Shape            Param #     Connected to
img_in (InputLayer)    [(None, 600, 800, 3)    0
conv2d_1 (Conv2D)      (None, 298, 398, 24)    1824        img_in[0][0]
dropout (Dropout)      (None, 298, 398, 24)    0           conv2d_1[0][0]
conv2d_2 (Conv2D)      (None, 147, 197, 32)    19232       dropout[0][0]
dropout_1 (Dropout)    (None, 147, 197, 32)    0           conv2d_2[0][0]
conv2d_3 (Conv2D)      (None, 72, 97, 64)      51264       dropout_1[0][0]
dropout_2 (Dropout)    (None, 72, 97, 64)      0           conv2d_3[0][0]
conv2d_4 (Conv2D)      (None, 70, 95, 64)      36928       dropout_2[0][0]
dropout_3 (Dropout)    (None, 70, 95, 64)      0           conv2d_4[0][0]
conv2d_5 (Conv2D)      (None, 68, 93, 64)      36928       dropout_3[0][0]
dropout_4 (Dropout)    (None, 68, 93, 64)      0           conv2d_5[0][0]
flattened (Flatten)    (None, 404736)          0           dropout_4[0][0]
dense_1 (Dense)        (None, 100)             40473700    flattened[0][0]
dropout_5 (Dropout)    (None, 100)             0           dense_1[0][0]
dense_2 (Dense)        (None, 50)              5050        dropout_5[0][0]
dropout_6 (Dropout)    (None, 50)              0           dense_2[0][0]
n_outputs0 (Dense)     (None, 1)               51          dropout_6[0][0]
n_outputs1 (Dense)     (None, 1)               51          dropout_6[0][0]
Total params: 40,625,028
None
(1) Resource exhausted: Failed to allocate memory for the batch of component 0
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
Function call stack:
What machine are you running training on? Maybe you are trying to train on a Jetson Nano? I have heard of others who did this, but it is not a supported training machine. I guess I would try to reduce the batch size by changing
@Ezward Yes, I am trying to train on a Jetson Nano B01. I changed BATCH_SIZE to 1, but it still failed.
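To put the batch-size suggestion in numbers, here is a rough sketch of how much memory one input batch needs. It assumes images are fed as 3-channel float32 tensors and uses 128 as an example of a large batch size; both are assumptions for illustration, not values confirmed in this thread. The helper `batch_bytes` is hypothetical, not part of donkeycar.

```python
def batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Raw size in bytes of one input batch.

    ASSUMPTION: float32 pixels (4 bytes each) and 3 colour channels.
    """
    return batch_size * height * width * channels * bytes_per_value


if __name__ == "__main__":
    # 128 is only an example of a large batch size (assumption).
    for batch_size in (128, 1):
        for height, width in ((600, 800), (120, 160)):
            mib = batch_bytes(batch_size, height, width) / 2**20
            print(f"batch={batch_size:3d} image={height}x{width}: {mib:8.2f} MiB")
```

This counts only the input tensor. Model weights, activations, and optimizer state come on top of it, so a batch size of 1 does not by itself guarantee that training fits in memory.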
@Ahrovan you seem to be using an image size of 600x800. The linear model that you are trying to train is geared towards an image size of 120x160, and has about 500k parameters for that size. For 600x800 you would need a model with higher compression, i.e. more layers or larger strides. You can see that your model now has about 40,000,000 parameters. Look at the flattened layer in your model summary: it outputs a roughly 400k-dimensional vector that feeds a 100-unit dense layer, and that connection alone accounts for about 40M parameters. This model is far too big to fit into the RAM of the Nano. So either set the image size to the standard 120x160 or modify the model architecture.
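The arithmetic behind this can be checked with a short sketch. The kernel sizes and strides below are assumptions inferred from the output shapes and parameter counts in the posted summary (unpadded convolutions), not read from the donkeycar source; `conv_out` and `count_params` are hypothetical helper names.

```python
def conv_out(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution along one axis."""
    return (size - kernel) // stride + 1


# (filters, kernel, stride) per conv layer.
# ASSUMPTION: inferred from the shapes and param counts in the summary above.
CONV_LAYERS = [(24, 5, 2), (32, 5, 2), (64, 5, 2), (64, 3, 1), (64, 3, 1)]
DENSE_UNITS = [100, 50]   # dense_1, dense_2
N_OUTPUTS = 2             # n_outputs0, n_outputs1 (one unit each)


def count_params(height, width, channels=3):
    """Return (flattened_size, total_trainable_params) for a given input size."""
    total = 0
    h, w, c = height, width, channels
    for filters, kernel, stride in CONV_LAYERS:
        total += kernel * kernel * c * filters + filters  # weights + biases
        h, w = conv_out(h, kernel, stride), conv_out(w, kernel, stride)
        c = filters
    flat = h * w * c
    units_in = flat
    for units in DENSE_UNITS:
        total += units_in * units + units
        units_in = units
    total += N_OUTPUTS * (units_in + 1)  # two single-unit output heads
    return flat, total


if __name__ == "__main__":
    for size in ((600, 800), (120, 160)):
        flat, total = count_params(*size)
        print(f"{size[0]}x{size[1]}: flattened={flat:,} params={total:,}")
```

For 600x800 this reproduces the 404,736-element flattened vector and the 40,625,028 total from the summary, with dense_1 holding almost all of it. Stored as float32, those weights alone take roughly 160 MB before gradients, optimizer state, and activations are counted. Under the same assumed layers, a 120x160 input gives a flattened vector of only 6,656 elements and well under one million parameters.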
@DocGarbanzo Thank you
Note that the image should be 120 pixels high and 160 pixels wide. @Ahrovan have you retried using the new image size?
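As a sketch, the corresponding settings might look like the following in the car's Python config file. The names IMAGE_H, IMAGE_W, and IMAGE_DEPTH are assumed here to match the usual donkeycar config; only BATCH_SIZE is actually named in this thread, so check your own config before relying on them.

```python
# ASSUMPTION: these names follow the usual donkeycar config file;
# verify them against your own myconfig.py.
IMAGE_H = 120      # image height in pixels
IMAGE_W = 160      # image width in pixels
IMAGE_DEPTH = 3    # colour channels (RGB)
```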
@Ahrovan I am going to close this; if you have more info please do add a comment to the closed issue.
donkey train --tub ./data --model ./models/myModel.h5
2 root error(s)
(0) Resource exhausted: Failed to allocate memory for the batch of component 0
[[node IteratorGetNext (defined at /projects/donkeycar/donkeycar/parts/keras.py:183) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[IteratorGetNext/_6]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: Failed to allocate memory for the batch of component 0
[[node IteratorGetNext (defined at /projects/donkeycar/donkeycar/parts/keras.py:183) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Hardware/Software Details: