ggml : implementation of xPos RoPE (#441); also extends ggml_rope_bac… #442

Merged
merged 1 commit into from
Aug 22, 2023


jploski
Contributor

@jploski jploski commented Aug 9, 2023

This is my implementation of xPos RoPE, based on the implementation at https://github.com/syncdoth/RetNet (which differs slightly from Microsoft's torchscale implementation in order to support iterative/recurrent use).
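
For readers unfamiliar with xPos, the following is a minimal Python sketch of the per-position transform. It is not the ggml code from this PR. It assumes the formulation used in torchscale: each adjacent pair is rotated as in standard RoPE and additionally multiplied by ζ = ((i + 0.4·d) / (1.4·d))^(pos / scale_base), with keys using 1/ζ. The function name `xpos_rotate` and the defaults `scale_base=512` and `theta_base=10000` are illustrative assumptions.

```python
import math

def xpos_rotate(x, pos, scale_base=512.0, theta_base=10000.0, downscale=False):
    """Apply an xPos-style scaled rotation to one head vector x at position pos.

    Adjacent elements (x[i], x[i+1]) form a pair, as in the non-NeoX RoPE layout.
    All parameter names and defaults here are assumptions for illustration.
    """
    d = len(x)
    assert d % 2 == 0, "head dimension must be even"
    out = [0.0] * d
    for i in range(0, d, 2):
        # Standard RoPE angle for the pair starting at index i.
        theta = pos * theta_base ** (-i / d)
        # xPos magnitude factor; queries use zeta, keys use 1/zeta.
        zeta = ((i + 0.4 * d) / (1.4 * d)) ** (pos / scale_base)
        if downscale:
            zeta = 1.0 / zeta
        c = math.cos(theta) * zeta
        s = math.sin(theta) * zeta
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out
```

With queries scaled up and keys scaled down, the query-key dot product depends only on the position difference, which is what makes the scheme usable step by step in a recurrent setting.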

Note that I extended the existing !is_neox implementation by adding the required parameters rather than introducing a new flag. The combination of !is_neox and xpos_scale > 0 distinguishes xPos from original RoPE, and an additional API function, ggml_rope_xpos_inplace, is provided for the xPos case.
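
The idea of letting a parameter value double as the mode switch can be sketched as follows. This is a hypothetical Python illustration, not the ggml source: `pair_scale` is an invented helper, the scaling formula is the torchscale one, and the fallback to plain RoPE when `is_neox` is set is an assumption, since the description above only covers the `!is_neox` path.

```python
def pair_scale(i, d, pos, is_neox, xpos_scale, downscale=False):
    """Return the magnitude factor applied to the pair starting at index i.

    Hypothetical helper: a positive xpos_scale on the non-NeoX path selects
    xPos; anything else behaves like original RoPE (factor 1.0).
    """
    # Assumption: the NeoX path ignores xpos_scale entirely.
    if is_neox or xpos_scale <= 0:
        return 1.0
    zeta = ((i + 0.4 * d) / (1.4 * d)) ** (pos / xpos_scale)
    return 1.0 / zeta if downscale else zeta
```

Existing callers that never set the scale keep the original RoPE behaviour without any new flag.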

I also extended ggml_rope_back with the xPos-specific parameters, as well as with the two freq parameters that had already been added to the forward version (ggml_rope_custom) but were missing from the backward pass.
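
Why the backward pass needs the same parameters as the forward pass can be seen from a single pair. The forward step multiplies the pair by ζ·R(θ), so the gradient is obtained by multiplying with the transpose, ζ·R(−θ), which requires reconstructing the same θ and ζ. The sketch below is a hedged Python illustration with hypothetical helper names, not the ggml implementation.

```python
import math

def rotate_pair(x0, x1, theta, zeta=1.0):
    """Forward: scale by zeta and rotate the pair (x0, x1) by angle theta."""
    c = math.cos(theta) * zeta
    s = math.sin(theta) * zeta
    return x0 * c - x1 * s, x0 * s + x1 * c

def rotate_pair_back(dy0, dy1, theta, zeta=1.0):
    """Backward: apply the transpose of the forward matrix to the gradient.

    The transpose of zeta * R(theta) is zeta * R(-theta), so the scale is
    kept and only the sign of the sine terms flips.
    """
    c = math.cos(theta) * zeta
    s = math.sin(theta) * zeta
    return dy0 * c + dy1 * s, -dy0 * s + dy1 * c
```

If the backward function computed θ or ζ from different frequency or scale settings than the forward call, the resulting gradients would be silently wrong.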

Code review and help with the missing CUDA port would be appreciated.

…_rope_back with additional parameters (breaking API change); does not include CUDA version
Owner

@ggerganov ggerganov left a comment

Thanks!

@ggerganov ggerganov merged commit 896b089 into ggerganov:master Aug 22, 2023
2 checks passed
2 participants