Runtime Error in hellokan.ipynb #173

Merged: 1 commit, May 12, 2024

AlessandroFlati
Contributor

Closes #170
Thanks to @wkqian06

@AlessandroFlati
Contributor Author

This should also close #36, given that the initial model and the other model are both initialized with device='cuda'. If they have different dtypes, though, they should all be cast with float(), for now.
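
As a minimal sketch of the cast mentioned above (not the code in this PR): torch.linalg.lstsq expects both operands to share a dtype, so a mixed float64/float32 pair can be brought to float32 first. The helper name lstsq_cast_to_float and the tensor shapes are assumptions made only for illustration.

```python
import torch


def lstsq_cast_to_float(mat: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper (not part of pykan): cast both operands to
    float32 so they share a dtype, then solve the least-squares problem."""
    mat = mat.float()
    y = y.float()
    return torch.linalg.lstsq(mat, y).solution


# Illustrative shapes only: 5 independent problems, 20 samples, 8 coefficients.
mat = torch.randn(5, 20, 8, dtype=torch.float64)
y = torch.randn(5, 20, 1, dtype=torch.float32)

coef = lstsq_cast_to_float(mat, y)
print(coef.shape, coef.dtype)
```

Casting down to float32 loses precision for float64 inputs, which is why this is only a stopgap "for now".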

torch.Size([5, 13])
'''
# x_eval: (size, batch); y_eval: (size, batch); grid: (size, grid); k: scalar
mat = B_batch(x_eval, grid, k, device=device).permute(0, 2, 1).to(y_eval.dtype)
coef = torch.linalg.lstsq(mat.to('cpu'), y_eval.unsqueeze(dim=2).to('cpu')).solution[:, :, 0] # sometimes 'cuda' version may diverge
mat = B_batch(x_eval, grid, k, device=device).permute(0, 2, 1)
Contributor

Could you please confirm if it is compatible with float64? Please check the context of the changes made in PR #148.

It works with float64 inputs. I agree with Ziming's reply to #146: the singularity problem matters. In fact, the CUDA version of torch.linalg.lstsq may raise errors on a singular mat, because it uses gels as its driver and gels only works with a full-rank mat.
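
To illustrate the full-rank point, here is a small CPU-only sketch, assuming a toy matrix whose third column duplicates the first. On CPU, torch.linalg.lstsq accepts driver="gelsd", which tolerates rank deficiency and reports the detected rank; the toy data and variable names are illustrative and not taken from the PR.

```python
import torch

torch.manual_seed(0)

# Toy rank-deficient system (assumption for illustration):
# the third column is an exact copy of the first, so rank(A) == 2.
A = torch.randn(10, 3, dtype=torch.float64)
A[:, 2] = A[:, 0]
b = torch.randn(10, 1, dtype=torch.float64)

# 'gelsd' is an SVD-based driver available for CPU inputs; it returns the
# minimum-norm least-squares solution even when A is not full rank.
result = torch.linalg.lstsq(A, b, driver="gelsd")
x_gelsd = result.solution

# The pseudoinverse gives the same minimum-norm solution, as a cross-check.
x_pinv = torch.linalg.pinv(A) @ b

print("detected rank:", int(result.rank))
print("max difference vs pinv:", (x_gelsd - x_pinv).abs().max().item())
```

The gels driver has no such safeguard, so the same matrix is outside what it is designed to handle.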

My other concern about this modification is that I just noticed the author's comment on the original code: # sometimes 'cuda' version may diverge. I don't know whether this modification would bring the 'diverge' situation back when running on 'cuda'. For now, it at least fixes the Runtime Error.
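
One possible way to guard against that concern, sketched under assumptions and not what this PR implements: solve on the tensor's own device first, and fall back to a CPU solve if the call raises or returns non-finite values. The helper name lstsq_with_cpu_fallback is hypothetical.

```python
import torch


def lstsq_with_cpu_fallback(mat: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper (not part of pykan): try the solve on the
    tensors' device, then retry on CPU if it fails or is non-finite."""
    try:
        sol = torch.linalg.lstsq(mat, y).solution
        if torch.isfinite(sol).all():
            return sol
    except RuntimeError:
        pass  # fall through to the CPU path below
    # 'gelsd' is a CPU-only, SVD-based driver that tolerates rank deficiency.
    sol_cpu = torch.linalg.lstsq(mat.cpu(), y.cpu(), driver="gelsd").solution
    return sol_cpu.to(mat.device)


device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)
mat = torch.randn(4, 30, 6, device=device)  # illustrative shapes only
y = torch.randn(4, 30, 1, device=device)

coef = lstsq_with_cpu_fallback(mat, y)[:, :, 0]
print(coef.shape)
```

A finiteness check only catches NaN or inf results; a solution that is finite but inaccurate would still pass, so this is a partial safeguard at best.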

Contributor Author

I actually tested it thoroughly against all the singularity problems I could think of. Of course I don't expect this to work in all situations, but at least now it works in most :)
We'll have to understand better when a non-full-rank mat actually appears and how to mitigate the issue. From a theoretical point of view, I can only see this kind of problem arising when a feature is totally excluded from the graph, which would indeed be valuable information in itself.

@KindXiaoming merged commit 10c456a into KindXiaoming:master on May 12, 2024