Add pytests for semantic dedup #141

Open
ayushdg opened this issue Jul 5, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

Collaborator

ayushdg commented Jul 5, 2024

Is your feature request related to a problem? Please describe.
If possible, we should add a few small pytests for semantic dedup that run on a small dataset with a very small model and exercise the end-to-end APIs, similar to some of the tests here: https://github.com/NVIDIA/NeMo-Curator/blob/main/tests/test_fuzzy_dedup.py#L210.

This would help catch breakages in minor functionality and API calls when running in GPU-enabled local dev environments.
Follow-up to #130.
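
As a rough illustration of the shape such a test could take, here is a minimal, self-contained sketch. It does not use the NeMo Curator API: `embed`, `cosine`, and `semantic_dedup` are hypothetical stand-ins (a bag-of-words "model" and a greedy cosine-similarity filter) that only show the pattern of a tiny dataset, a tiny model, and an assertion on which documents survive. A real test would presumably call the project's end-to-end semantic dedup entry points instead.

```python
import math

# Hypothetical stand-in for a "very small model": bag-of-words counts
# over a fixed vocabulary. Not part of NeMo Curator.
VOCAB = ["cat", "dog", "sat", "mat", "stock", "market", "fell"]


def embed(text):
    """Map a document to a count vector over VOCAB."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_dedup(docs, threshold=0.9):
    """Hypothetical greedy dedup: keep a document only if its similarity
    to every previously kept document is below the threshold."""
    kept = []
    for doc_id, text in docs:
        vec = embed(text)
        if all(cosine(vec, kept_vec) < threshold for _, kept_vec in kept):
            kept.append((doc_id, vec))
    return [doc_id for doc_id, _ in kept]


def test_semantic_dedup_removes_near_duplicates():
    docs = [
        (1, "the cat sat on the mat"),
        (2, "a cat sat on a mat"),      # same content words as doc 1
        (3, "the stock market fell"),   # unrelated topic
        (4, "the dog sat on the mat"),  # similar, but below 0.9
    ]
    assert semantic_dedup(docs, threshold=0.9) == [1, 3, 4]
    # A looser threshold also drops the partially similar document.
    assert semantic_dedup(docs, threshold=0.6) == [1, 3]
```

pytest collects the `test_` function automatically, so no fixtures are needed for a case this small. Asserting on the exact surviving IDs keeps the test sensitive to small behavior changes, which is the kind of breakage described above.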

ayushdg added the enhancement (New feature or request) label Jul 5, 2024