Issue running single gpu training script #229
-
Solved: the issue was caused by Python 3.7. Running the script with Python 3.10 fixed it.
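For anyone hitting the same thing, here is a minimal sketch of a version guard that could go at the top of the training script. It assumes the interpreter version really is the cause, as it was in my case; `MIN_PYTHON` and `python_is_supported` are names I made up for illustration and are not part of MedSAM.

```python
import sys

# Assumption: 3.10 is simply the version that worked in this thread,
# not an officially documented minimum.
MIN_PYTHON = (3, 10)


def python_is_supported(current=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(current) >= tuple(minimum)


if not python_is_supported():
    # Warn instead of exiting, since older versions may still work for others.
    print(
        f"Warning: running Python {sys.version_info.major}.{sys.version_info.minor}; "
        f"this script was only confirmed to work on {MIN_PYTHON[0]}.{MIN_PYTHON[1]}."
    )
```

Running `python --version` inside the active environment before launching training gives the same information without touching the script.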
-
@Shidhanta95 With batch = 2, I reckon the problem could be coming from mask_decoder: with But @JunMa11 Sorry for tagging you out of the blue. I have spent hours on this matter with no success. Since you are the contributor of branch 0.1, perhaps you have encountered this problem before?
-
Hi, I am new to deep learning, so apologies if the question is trivial. I am using a modified version of the train_one_gpu script to train the MedSAM model on a dataset. The first time I ran the script I had no issues, but the second time I ran it, without making any changes, I got the following error:
"RuntimeError: The size of tensor a (4) must match the size of tensor b (2) at non-singleton dimension 0"
I passed the code and tensor dimensions to ChatGPT and asked it to output the tensor sizes. According to its answer, there should be no mismatch in the tensor dimensions.
I am clueless as to why the script runs the first time and then doesn't run again. I have attached screenshots of the first run and the error. If required, I can share my script as well.
![medsam single gpu training](https://private-user-images.githubusercontent.com/147026855/317648938-1ec63fbc-cb90-4e4b-81e8-5e542cb94008.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjIwNzQxMTEsIm5iZiI6MTcyMjA3MzgxMSwicGF0aCI6Ii8xNDcwMjY4NTUvMzE3NjQ4OTM4LTFlYzYzZmJjLWNiOTAtNGU0Yi04MWU4LTVlNTQyY2I5NDAwOC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzI3JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcyN1QwOTUwMTFaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT02ZjFjMGE1ODExODBhZmQ2NTYxNjM2NWI4Yzg2ZjM0YmJiNDM0MDI3NWJmNGZmMjJmMGU2YzZmNDYwMDI1NDFmJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.XUIAS6xoRfbb-GOI6tDJosQdFZs-dz27rA2rZMpVdWY)
![medsam single gpu training error (2)](https://private-user-images.githubusercontent.com/147026855/317648971-6d870c70-3d55-4107-b0e7-39390ab4606b.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjIwNzQxMTEsIm5iZiI6MTcyMjA3MzgxMSwicGF0aCI6Ii8xNDcwMjY4NTUvMzE3NjQ4OTcxLTZkODcwYzcwLTNkNTUtNDEwNy1iMGU3LTM5MzkwYWI0NjA2Yi5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzI3JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcyN1QwOTUwMTFaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT05MDQ3MzlmNzMwZGQxZmUzOTJhMzU0MTQ4OWQ3ZGY5ODU0MTFhY2ZlM2Q3NjM1OGNjYTQ1NjAzZTFlNzJmZGVlJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9._3fmjUnB-m0UMQshGputH2ZXEzjSnQKf2e54uQmN9VQ)
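To make the error message above easier to read, here is a small sketch in plain Python of the broadcasting rule that the message refers to: shapes are aligned from the trailing dimension, and two sizes are compatible only if they are equal or one of them is 1. The helper name `find_broadcast_mismatch` and the example shapes are hypothetical; they are not taken from the script or from MedSAM.

```python
def find_broadcast_mismatch(shape_a, shape_b):
    """Return the index of a dimension where two shapes cannot broadcast.

    Shapes are aligned on the right, and the search starts from the
    trailing dimension. Returns None when the shapes are compatible.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same length.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    for dim in range(ndim - 1, -1, -1):
        if a[dim] != b[dim] and a[dim] != 1 and b[dim] != 1:
            return dim
    return None


# Hypothetical shapes: one tensor built for 4 items, the other for 2.
print(find_broadcast_mismatch((4, 1, 256, 256), (2, 1, 256, 256)))
# A leading size of 1 broadcasts against any batch size.
print(find_broadcast_mismatch((1, 1, 256, 256), (2, 1, 256, 256)))
```

A mismatch at dimension 0 usually means two tensors disagree on the batch size, so printing `.shape` for both operands just before the failing line is a quick way to see which one has the unexpected first dimension.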