hellokan.ipynb returns NaN instead of formula. #179

Closed
Stealeristaken opened this issue May 13, 2024 · 5 comments

Comments

@Stealeristaken

Hi.

I was trying the hellokan.ipynb notebook. I increased the number of training steps, e.g. from 50 to 150. In the end, train_loss started returning NaN instead of a value.

I thought it might be a kernel error, so I re-downloaded the original hellokan.ipynb and reran it without any edits. It returned NaN again. A screenshot of the problem is below.
[Screenshot: 2024-05-13 15:08:12]

@KindXiaoming
Owner

KindXiaoming commented May 13, 2024

Hi, the problem was caused by the appearance of the log function (which is unexpected behavior). This means that the pruning step did not work well. Could you show the plot you get after pruning? Based on feedback from others, you may try model.prune(threshold=5e-2) instead of just model.prune().
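
One plausible way a log function leads to a NaN loss is sketched below. It assumes the log is evaluated on a negative input and returns NaN there, as IEEE-style tensor math typically does. The helper `log_like_tensor_lib` and the input and target values are hypothetical and are not taken from the notebook.

```python
import math


def log_like_tensor_lib(x: float) -> float:
    """Mimic IEEE-style log: NaN for negative input, -inf at zero."""
    if x < 0:
        return math.nan
    if x == 0:
        return -math.inf
    return math.log(x)


def mse(preds, targets):
    """Mean squared error over two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


# Hypothetical values fed into the log; one of them is negative.
inputs = [2.0, 0.5, -0.3]
# Hypothetical regression targets.
targets = [0.7, -0.7, 0.1]

preds = [log_like_tensor_lib(x) for x in inputs]
loss = mse(preds, targets)

# A single NaN prediction is enough to make the averaged loss NaN.
loss_is_nan = math.isnan(loss)
print(loss_is_nan)
```

If this is what happens, the NaN does not come from the optimizer itself but from one symbolic function being evaluated outside its domain.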

@Stealeristaken
Author

Sorry for the late response. I tried both prune options (with and without a specified threshold), and the result was the same. I am adding pictures.

With a specified threshold:

[Screenshot: 2024-05-16 20:46:51]

Without a specified threshold:

[Screenshot: 2024-05-16 20:45:17]

@ShuleiCao

I encountered a similar issue, but I found that increasing the step size helped.

@KindXiaoming
Owner

@Stealeristaken, in block [8], the threshold should be specified again: model = model.prune(threshold=5e-2)
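
The effect of a higher threshold can be illustrated with a toy sketch. This is not pykan's implementation: it only assumes that pruning keeps the nodes whose importance score exceeds the threshold. The function `prune_by_threshold`, the default of 1e-2, and the scores are all hypothetical.

```python
def prune_by_threshold(scores, threshold=1e-2):
    """Return the indices of nodes whose score exceeds the threshold.

    The default threshold here is an assumption made for illustration.
    """
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical importance scores for four hidden nodes.
scores = [0.9, 0.03, 0.004, 0.2]

kept_default = prune_by_threshold(scores)                 # lenient threshold
kept_strict = prune_by_threshold(scores, threshold=5e-2)  # value from the thread

print(kept_default)
print(kept_strict)
```

With these made-up scores, the stricter threshold also drops the weakly active node that the lenient one keeps, leaving a smaller network for the later symbolic step.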

@KindXiaoming
Owner

@ShuleiCao Thanks, yes, the pruning results can depend on quite a few factors. Training longer will usually result in a sparser network.
