Decide what metrics to capture to ASV for merlin-models notebooks #235
Comments
@bschifferer can you please review the notebooks and share what their standard outputs are? Can we standardize all of the notebooks on a single set of metrics? If not, what are the differences?
I took a look, and Merlin Models has only a small set of metrics. Merlin Systems and Merlin/Merlin are based on the Merlin Models examples and use the same metrics. Can we discuss this based on the use cases/models? My proposal is: All Notebooks:
Training:
Inference:
@sararb @gabrielspmoreira do you want to add additional metrics for training? Would other metrics help us understand whether everything is running correctly?
After our last CI meeting, we want to track only a few metrics for some specific notebooks. @jperez999 is our CI a single-GPU or multi-GPU environment? Currently, we use the following datasets in our repositories:
The proposal is to use Criteo: it is a large dataset and is used in the research community and in performance benchmarks, so there are reference examples of good AUC scores. Proposal:
How about HugeCTR?
I'm not quite sure I understand the Triton part: how do we intend to measure p95 latency? Do we have a way to generate realistic requests that would make p95 meaningful? Or do we just intend to allow for some variance in the latency of serving the same request over and over?
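To make the question concrete, here is a minimal sketch of the "same request over and over" option: time repeated calls and take the nearest-rank 95th percentile. The names `measure_latencies`, `percentile`, and `send_request` are hypothetical and not part of any Merlin or Triton API; `send_request` stands in for whatever client call would actually hit the server.

```python
import math
import time
from typing import Callable, List, Sequence


def measure_latencies(
    send_request: Callable[[], object],  # hypothetical: wraps one inference call
    n_requests: int = 200,
    n_warmup: int = 10,
) -> List[float]:
    """Time repeated calls and return per-request latencies in milliseconds."""
    # Warm-up calls are executed but not timed.
    for _ in range(n_warmup):
        send_request()
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        send_request()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(values)
    # Nearest-rank definition: smallest value with at least q% of samples at or below it.
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]
```

With this setup, `percentile(measure_latencies(send_request), 95)` would only capture run-to-run variance for a single fixed payload; it says nothing about latency under a realistic request mix, which is the concern raised above.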
I think we decided to collect the metrics here: https://nvidia.slack.com/archives/CVBDJUPEZ/p1679617586380369 and we will continue the ticket in NVIDIA-Merlin/models#1047.