Motivation

Our algorithm colleagues often run SFT on industry-leading SOTA models and then deploy them for online serving in vertical domains, expecting improvements in business metrics. Sometimes the metrics do not improve, or even regress, and they then investigate every stage of the pipeline, including the model itself.
Their usual approach is to compare against the output of transformers. The general process is to feed both sides the same input and sampling parameters (in particular greedy search, which yields deterministic output) and then check whether LMDeploy itself has introduced any differences.
I wrote a diff tool for this process; maybe I can submit a PR. Do you have any suggestions? @lzhangzz @grimoire @lvhan028 Because inference with the transformers library itself is slow, the version I am implementing does not support comparison over whole datasets; users typically specify a single input via command-line parameters. More comprehensive features, such as reading and comparing an entire dataset, could be considered in the future.
Because of the current split-k implementation in LMDeploy TurboMind, some requests produce results that differ from transformers when the batch size changes dynamically under greedy search. Since we currently compare only a single request, this is not a problem for now.
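The core of such a diff tool is simple: decode the same prompt greedily on both backends and report the first token position where the outputs diverge. The sketch below shows only that comparison step as a pure function; `first_divergence` is a hypothetical helper name, and the calls that would produce the two token-ID lists (e.g. `model.generate(..., do_sample=False)` on the transformers side and the equivalent LMDeploy call) are indicated only in comments, since they depend on the model being loaded.

```python
def first_divergence(ref_ids, test_ids):
    """Return the index of the first token where two greedy decodes
    differ, or -1 if the sequences are identical.

    ref_ids:  token IDs from the reference backend (e.g. transformers
              with do_sample=False)
    test_ids: token IDs from the backend under test (e.g. LMDeploy)
    """
    for i, (a, b) in enumerate(zip(ref_ids, test_ids)):
        if a != b:
            return i
    # One sequence may be a strict prefix of the other (e.g. different
    # stopping behavior); treat the first missing position as divergent.
    if len(ref_ids) != len(test_ids):
        return min(len(ref_ids), len(test_ids))
    return -1


# Usage sketch (IDs would come from the two backends in practice):
# ref_ids  = model.generate(input_ids, do_sample=False, ...)   # transformers
# test_ids = <LMDeploy greedy decode of the same input>
pos = first_divergence([11, 42, 7, 9], [11, 42, 8, 9])
print(pos)  # first mismatch is at index 2
```

Reporting the divergence position (rather than just pass/fail) matters in practice: an early divergence usually points at tokenization or prompt-template differences, while a late one points at numerical differences in the decode loop.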
Related resources
No response
Additional context
No response
In my experience working with algorithm colleagues, it is best to have a document that explains exactly what temperature, top_k, and top_p do when they are set simultaneously. Users should set these values explicitly rather than relying on defaults: the defaults differ across frameworks and libraries such as transformers, which can make issues hard to troubleshoot.
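To make the interaction concrete, here is a simplified, pure-Python sketch of how these three parameters are commonly chained (temperature scaling first, then top_k truncation, then top_p/nucleus truncation, which matches the default warper order in transformers). The function name and the exact renormalization details are illustrative, not any library's actual API.

```python
import math

def apply_sampling_filters(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Illustrative chain: temperature -> top_k -> top_p.

    Returns a dict mapping surviving token indices to renormalized
    probabilities. top_k=0 means "no top_k truncation"; top_p=1.0
    means "no nucleus truncation". top_k=1 degenerates to greedy.
    """
    # 1. Temperature scaling, then softmax (numerically stabilized).
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # 2. top_k: keep only the k highest-probability tokens.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]

    # 3. top_p: keep the smallest prefix (by descending probability)
    #    whose cumulative mass reaches top_p.
    nucleus, cum = [], 0.0
    for i in order:
        nucleus.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # Renormalize over the surviving tokens.
    z = sum(probs[i] for i in nucleus)
    return {i: probs[i] / z for i in nucleus}
```

This also shows why explicit settings matter for debugging: `top_k=1` forces a single surviving token regardless of temperature or top_p, so any framework whose default differs (e.g. a nonzero default top_k or a top_p below 1.0) will sample from a different candidate set even for the "same" request.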
Since this is relatively trivial, I will use it internally for now; it cannot yet be called a tool. I will close this issue and reopen it if anyone needs it later. ref https://huggingface.co/blog/how-to-generate