Motivation
When a user selects a specific task on the Hugging Face Hub - for example, image-to-text:
That user is shown a series of models, with no guidance as to which model might be state of the art, or which might be the most performant for their use case.
To test the capabilities and behavior of each model, the user must:
Open the link to each model in a new browser tab.
Read through each model's model card.
Test out each model with an image, if the model has a Space or a Colab notebook available (not every model does).
Cross-reference each model with the state-of-the-art leaderboards on Papers with Code.
Desired Behavior
The user should be able to:
Select a given task (for example, image-to-text).
Select one or more models to test for that use case.
Input a piece of data to test (an image, in the case of image-to-text).
View the output of each model for that input, side by side, to compare their performance and behavior (a rough sketch of this flow follows the list).
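A minimal sketch of that comparison flow, assuming the transformers pipeline API; the model IDs and image path are illustrative placeholders, not part of the proposal:

```python
# Minimal sketch of the desired side-by-side comparison, assuming the
# transformers `pipeline` API. The model IDs and image path below are
# illustrative placeholders; any image-to-text models from the Hub would do.
from transformers import pipeline

candidate_models = [
    "nlpconnect/vit-gpt2-image-captioning",   # example model ID
    "Salesforce/blip-image-captioning-base",  # example model ID
]

image_path = "cat.jpg"  # the single input the user provides

for model_id in candidate_models:
    captioner = pipeline("image-to-text", model=model_id)  # load each candidate
    outputs = captioner(image_path)                        # run the same input through it
    print(f"{model_id}: {outputs[0]['generated_text']}")   # print outputs for side-by-side comparison
```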
This would be so cool, I really like the user story you made with the side-by-side benchmark!
We were talking about integrating something like this but into tasks pages, so the workflow would be:
Select a task
Read a bit about the task and discover some SOTA models (we tried to editorialize that a bit by writing an explanatory text about each task and associating a note to each hand-curated model).
Compute multiple widgets with a single input to compare the outputs (today you only get one curated model to test the task).
And maybe when you select a particular task on /models we could add a link to the task page.
This is probably the simplest way of doing it but I understand that it's not the same as having it directly integrated into the /models page.
So maybe we want to go further and do it exactly like you said, integrating it directly into the /models page: you drag or input a picture/audio/text, all the visible models on the page compute, and it switches to a "benchmark mode" (that could be a game changer). That will of course be a lot of work, and I'm not even sure we can handle that many computations at the same time (edit: we will find a way).
osanseviero changed the title from "[huggingface.co] Give users the ability to compare models' outputs for a given task." to "Give users the ability to compare models' outputs for a given task." on Mar 17, 2022