feat: add support for packing tokenized datasets #2011
d7a4ca9 to f6dedb5
If anyone has bandwidth, I would appreciate a review. Thank you, @qgallouedec @lewtun @kashif @lvwerra, or others from the community. The discussion can be seen in #1848.
Thanks for the contribution @kmehant! Overall it looks good to me - would you mind adding an integration test for this scenario?
f6dedb5 to 004f128
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@lewtun Thank you for your review. I have addressed the review comments and also added the test cases.
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
What does this PR do?
Fixes #1848
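For readers unfamiliar with the feature, here is a minimal sketch of what packing a pre-tokenized dataset generally means: the `input_ids` of consecutive examples are concatenated, separated by an end-of-sequence token, and cut into fixed-length chunks. This is an illustration only, not the code in this PR. The function name `pack_tokenized` and the parameters `seq_length` and `eos_token_id` are hypothetical, and dropping the final incomplete chunk is an assumption of the sketch.

```python
from typing import Dict, Iterable, Iterator, List


def pack_tokenized(
    examples: Iterable[Dict[str, List[int]]],
    seq_length: int,
    eos_token_id: int,
) -> Iterator[Dict[str, List[int]]]:
    """Concatenate pre-tokenized examples and yield fixed-length chunks.

    Hypothetical helper for illustration; not part of any library API.
    """
    buffer: List[int] = []
    for example in examples:
        # The data is already tokenized, so no tokenizer call is needed here.
        buffer.extend(example["input_ids"])
        buffer.append(eos_token_id)  # mark the boundary between examples
        while len(buffer) >= seq_length:
            chunk, buffer = buffer[:seq_length], buffer[seq_length:]
            yield {"input_ids": chunk, "labels": list(chunk)}
    # Assumption: leftover tokens shorter than seq_length are dropped.


packed = list(
    pack_tokenized(
        [{"input_ids": [1, 2, 3]}, {"input_ids": [4, 5]}, {"input_ids": [6, 7, 8, 9]}],
        seq_length=4,
        eos_token_id=0,
    )
)
print(packed)
```

With `seq_length=4`, the three short examples above are packed into three full chunks, and the second chunk spans two source examples.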
Before submitting
Linked issue: Support packing for pretokenized datasets #1848
Who can review?
@qgallouedec
Anyone from the community!