
Missing model card / data sheet with info on pretraining and RLHF datasets #9

Open · mdingemanse opened this issue Sep 28, 2023 · 4 comments

@mdingemanse commented Sep 28, 2023

At opening-up-chatgpt.github.io we're documenting data sources and degrees of openness along several dimensions for instruction-tuned LLMs. I am looking for information about (1) the pretraining dataset and (2) the RLHF datasets, but have not found any details. The Hugging Face model card says:

For full details of this model please read our release blog post

The release blog post provides no information on this at present.

@aakosm commented Sep 28, 2023

Information on the language composition of the pretraining dataset would also be welcome, as there is no mention of the model's multilingual capabilities in the linked blog post.

@149189 commented Sep 29, 2023

I would like to work on this project!

@AlexWortega commented

Upvote thread

@mdingemanse (Author) commented

FWIW Mistral currently sits in the bottom 5 of the live tracker of LLM openness:

[screenshot: openness tracker leaderboard]

diegolascasas added a commit that referenced this issue Dec 12, 2023
* Add MoE and Pipelining support

* Update readme

* Update requirements

* Add faster loading 

* Make sliding window optional and add rope_theta with smart default

---------

Co-authored-by: devendrachaplot <devendrachaplot@gmail.com>
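
As background on the `rope_theta` mentioned in the commit above: in rotary position embedding (RoPE) implementations, `rope_theta` is the base of the geometric progression of rotation frequencies, and raising it is a common way to support longer contexts. The sketch below assumes the standard RoPE formulation; it is illustrative, not the actual mistral-src code, and the function name `rope_inv_freq` is hypothetical.

```python
# Minimal sketch of RoPE inverse frequencies, assuming the standard
# formulation; `rope_theta` is the base of the frequency progression.
# Illustrative only -- not the repository's implementation.
import torch

def rope_inv_freq(head_dim: int, rope_theta: float = 10000.0) -> torch.Tensor:
    # Frequencies decay geometrically with the dimension index; a larger
    # rope_theta slows the per-position rotation, which stretches the
    # range of positions the embedding can distinguish.
    return 1.0 / (rope_theta ** (torch.arange(0, head_dim, 2).float() / head_dim))

# Example: 128-dim heads with a larger base. A "smart default" would pick
# this value from the model config rather than hard-coding it.
inv_freq = rope_inv_freq(head_dim=128, rope_theta=1_000_000.0)
```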