Changes for orcacode experiment #3612

Merged
shahules786 merged 7 commits into main from prepare_orcacode
Jul 29, 2023

Conversation

andreaskoepf
Collaborator

No description provided.

@shahules786
Collaborator

Besides this, I think there is a bug in this line: the config cannot be updated. I removed it from my version. Can you check that in this PR too? @andreaskoepf

@@ -199,45 +198,37 @@ def __getitem__(self, idx):
 class DolphinMix(Dataset):
     name = "dophin-mix"

-    def __init__(self, cache_dir, num_samples=100000, max_char_len=8000, seed=42):
+    def __init__(self, cache_dir, num_samples: Optional[int] = None, max_char_len: int = 8000, seed: int = 42):

Wouldn't it be better to add data_files as an argument with the default value "flan-5m..."? That way, we could switch to the GPT-4 version via the config if required.
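To make the suggestion concrete, here is a minimal sketch of what a configurable `data_files` argument might look like. It is not the code in this PR: `DolphinMixSketch` is a hypothetical stand-in that only records its arguments, the default stays as the truncated `"flan-5m..."` placeholder from the comment above, and the override file name is invented for illustration.

```python
from typing import Optional

# Placeholder only: the review comment truncates the file name as "flan-5m...",
# so the real default is deliberately not filled in here.
DEFAULT_DATA_FILES = "flan-5m..."


class DolphinMixSketch:
    """Hypothetical stand-in for DolphinMix; it stores arguments and loads nothing."""

    name = "dophin-mix"

    def __init__(
        self,
        cache_dir: str,
        num_samples: Optional[int] = None,
        max_char_len: int = 8000,
        seed: int = 42,
        data_files: str = DEFAULT_DATA_FILES,  # proposed new argument
    ):
        self.cache_dir = cache_dir
        self.num_samples = num_samples
        self.max_char_len = max_char_len
        self.seed = seed
        # The real class would pass this on to whatever loads the dataset.
        self.data_files = data_files


# Keyword arguments coming from a dataset config could then select another file
# without a code change. The file name below is an invented example.
config_kwargs = {"data_files": "hypothetical-gpt4-file.jsonl"}
ds = DolphinMixSketch(cache_dir=".cache", **config_kwargs)
```

With this shape, omitting `data_files` in the config keeps the current behaviour, and only experiments that need the other variant have to set it.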


Can you remove the update_config function definition too since it's not used?

@shahules786
Collaborator

Nice, thanks.

@shahules786 shahules786 merged commit c2c1318 into main Jul 29, 2023
1 check passed
@shahules786 shahules786 deleted the prepare_orcacode branch July 29, 2023 09:16