
Give users the ability to compare models' outputs for a given task. #56

dynamicwebpaige opened this issue Mar 6, 2022 · 1 comment


πŸ—’οΈ Motivation

When a user selects a specific task on the Hugging Face Hub - for example, image-to-text:

[Screenshot: the Hub models page filtered by the image-to-text task, 2022-03-05]

That user is shown a list of models, with no guidance as to which is state of the art or which might be the most performant for their use case.

To test the capabilities and behavior of each model, the user must:

  • Open the link to each model in a new browser tab.
  • Read through each model's model card.
  • Test out each model with an image, if the model has a Space or a Colab notebook available (not every model does).
  • Cross-reference each model with the state-of-the-art leaderboards on Papers with Code.

πŸ™ Desired Behavior

The user should be able to:

  • Select a given task (for example, image-to-text).
  • Select one or more models to test, for that use case.
  • Input a piece of data, to test (an image, in the case of image-to-text).
  • View the output of each model, given that input data, side by side, to compare performance and behavior (a rough sketch of this flow follows the list).
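
For concreteness, here is a minimal sketch of that side-by-side flow using the transformers image-to-text pipeline. The model IDs and image path are illustrative assumptions, not part of the proposal, and the Hub UI would presumably run the equivalent computation server-side.

```python
# Minimal sketch of the desired comparison, assuming the transformers
# "image-to-text" pipeline and the (illustrative) model IDs below.
from transformers import pipeline

# Models the user selected for the image-to-text task.
candidate_models = [
    "nlpconnect/vit-gpt2-image-captioning",
    "Salesforce/blip-image-captioning-base",
]

image_path = "example.jpg"  # the single input provided by the user

# Run the same input through every selected model and collect the outputs.
results = {}
for model_id in candidate_models:
    captioner = pipeline("image-to-text", model=model_id)
    results[model_id] = captioner(image_path)[0]["generated_text"]

# Show the outputs side by side for comparison.
for model_id, caption in results.items():
    print(f"{model_id:45s} -> {caption}")
```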

gary149 commented Mar 7, 2022

This would be so cool; I really like the user story you made with the side-by-side benchmark!

We were talking about integrating something like this but into tasks pages, so the workflow would be:

  1. Select a task
  2. Read a bit about the task and discover some SOTA models (we tried to editorialize that a bit by writing explanatory text for each task and attaching a note to each hand-curated model).
  3. Run multiple inference widgets with a single input to compare the outputs (today you only get one curated model to test the task); a rough sketch of this fan-out is below.
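
As a sketch of that "one input, many widgets" fan-out, the snippet below posts the same image to several models through the hosted Inference API. The endpoint format, placeholder token, model IDs, and the assumption that image-to-text models return a JSON list with a `generated_text` field are illustrative, not confirmed in this thread.

```python
# Rough sketch: fan one input out to several models via the hosted
# Inference API (endpoint format and response shape are assumptions).
import requests

API_URL = "https://api-inference.huggingface.co/models/{model_id}"
HEADERS = {"Authorization": "Bearer hf_xxx"}  # placeholder token

models_to_compare = [
    "nlpconnect/vit-gpt2-image-captioning",
    "Salesforce/blip-image-captioning-base",
]

with open("example.jpg", "rb") as f:
    image_bytes = f.read()

# Send the same payload to each model's endpoint and print the raw outputs.
for model_id in models_to_compare:
    response = requests.post(
        API_URL.format(model_id=model_id), headers=HEADERS, data=image_bytes
    )
    response.raise_for_status()
    print(model_id, "->", response.json())
```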

And maybe when you select a particular task on /models we could add a link to the task page:

[Screenshot: mockup of a link from the /models task filter to the corresponding task page]

This is probably the simplest way of doing it but I understand that it's not the same as having it directly integrated into the /models page.

So maybe we want to go further and do it exactly like you said: integrate it directly into the /models page. You drag/input a picture/audio/text, all the visible models on the page compute, and it switches to a "benchmark mode" (that could be a gamechanger 🤯). That will of course be a lot of work, and I'm not even sure we can handle that many computations at the same time 👀 (edit: we will find a way 👍).

@LysandreJik LysandreJik transferred this issue from huggingface/huggingface_hub Mar 16, 2022
@osanseviero osanseviero changed the title from "[huggingface.co] Give users the ability to compare models' outputs for a given task." to "Give users the ability to compare models' outputs for a given task." Mar 17, 2022