
Refactor perplexity implementations to be usable with evaluators #240

Open
mathemakitten opened this issue Aug 9, 2022 · 0 comments · May be fixed by #252

Currently the perplexity metric and measurement both instantiate an entire model object within the _compute() function and run inference. This breaks the pattern where only predictions, references, and other metadata are passed in and only the metric computation is performed (i.e. inference has already been run before the metric reaches its _compute() function, which is the case for most other metrics).

[Screenshot: perplexity's _compute() instantiating a model and running inference inside the metric]
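
For concreteness, here is a rough sketch of the two calling conventions (the accuracy example is purely illustrative, and the perplexity argument names follow the current implementation and may differ between evaluate versions):

```python
import evaluate

# Conventional pattern: inference happens upstream; the metric only scores
# predictions against references.
accuracy = evaluate.load("accuracy")
accuracy.compute(predictions=[0, 1, 1], references=[0, 1, 0])

# Current perplexity pattern: _compute() itself loads the model via
# AutoModelForCausalLM.from_pretrained(model_id) and runs inference.
perplexity = evaluate.load("perplexity", module_type="metric")
perplexity.compute(
    model_id="gpt2",
    add_start_token=False,
    input_texts=["lorem ipsum", "Happy Birthday!"],
)
```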

Evaluators already instantiate model-like objects via model_or_pipeline, which makes it redundant to instantiate another copy of the model/pipeline within the _compute() function. It seems this is done because perplexity can be used in a standalone manner to calculate the perplexity of some text with respect to a pretrained model (e.g. perplexity.compute(model_id='gpt2', add_start_token=False, input_texts=input_texts)), trading developer flexibility for user-friendliness.
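
For comparison, the existing text-classification evaluator already handles model instantiation through model_or_pipeline; roughly like this (the checkpoint and dataset below are just placeholders for illustration):

```python
import evaluate
from datasets import load_dataset

# The evaluator instantiates the model/pipeline once and runs inference itself;
# the metric only sees the resulting predictions and references.
task_evaluator = evaluate.evaluator("text-classification")
results = task_evaluator.compute(
    model_or_pipeline="distilbert-base-uncased-finetuned-sst-2-english",
    data=load_dataset("imdb", split="test[:100]"),
    metric="accuracy",
    label_mapping={"NEGATIVE": 0, "POSITIVE": 1},
)
```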

For background, this is relevant because I'm in the middle of implementing a "text-generation" evaluator, and it makes sense for perplexity to be the default metric for text generation. We could alternatively write one-off logic in the text-generation evaluator to compute perplexity with the model instantiated in the evaluator, instead of using the metric implemented in evaluate, but that seems suboptimal and out-of-pattern. Additionally, the perplexity implementations are currently limited to models which can easily be instantiated via AutoModelForCausalLM.from_pretrained(model_id); they do not support a generic call to a language model. A rough sketch of one possible direction follows below.
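
One possible direction (just a sketch, not a concrete proposal): have the evaluator compute per-example negative log-likelihoods with whatever model object it already holds, and let the metric's _compute() do only the exponentiation and aggregation. The helper name and signature below are hypothetical:

```python
import numpy as np
import torch

def example_nlls(model, tokenizer, texts, device="cpu"):
    # Hypothetical evaluator-side helper: works with any model whose forward()
    # accepts `labels` and returns a language-modeling loss, not only models
    # loaded via AutoModelForCausalLM.from_pretrained().
    nlls = []
    for text in texts:
        enc = tokenizer(text, return_tensors="pt").to(device)
        with torch.no_grad():
            out = model(**enc, labels=enc["input_ids"])
        nlls.append(out.loss.item())  # mean NLL per token for this example
    return nlls

def _compute(predictions, references=None):
    # Hypothetical metric-side computation: `predictions` are the per-example
    # mean NLLs computed upstream; no model is instantiated here.
    perplexities = np.exp(np.asarray(predictions, dtype=np.float64))
    return {
        "perplexities": perplexities.tolist(),
        "mean_perplexity": float(perplexities.mean()),
    }
```

The standalone model_id code path could remain as a thin convenience wrapper on top, while evaluators pass in outputs from the model they have already instantiated.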
