This repository has been archived by the owner on Feb 25, 2022. It is now read-only.

The conversion script doesn’t work #174

Closed
StellaAthena opened this issue Mar 26, 2021 · 2 comments
Labels
bug Something isn't working.

Comments

@StellaAthena
Member

Describe the bug
After running the conversion script and loading the result into the HuggingFace transformers library, generation quality degrades sharply once the sequence length passes roughly 500 tokens.

To Reproduce
Steps to reproduce the behavior:

  1. Run the conversion script
  2. Load the result into the HuggingFace transformers library
  3. Feed it a context of 450 tokens and have it generate another 200
  4. Observe that around the 500th token the coherency falls off a cliff (a reproduction sketch follows this list)
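
A minimal reproduction sketch, assuming the converted checkpoint was saved to a local directory; the path `converted-gpt-neo`, the prompt file, and the sampling settings are placeholders, not part of the original report:

```python
# Reproduction sketch: generate ~200 tokens past a ~450-token prompt and
# inspect the output around the 500-token mark. The checkpoint path is a
# placeholder for wherever the conversion script wrote its output.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "converted-gpt-neo"  # hypothetical output directory
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Any passage long enough to tokenize to roughly 450 tokens will do.
prompt = open("long_passage.txt").read()
inputs = tokenizer(prompt, return_tensors="pt")
print("prompt length in tokens:", inputs["input_ids"].shape[1])

output = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
)
# Coherency visibly degrades once the sequence passes ~500 tokens.
print(tokenizer.decode(output[0], skip_special_tokens=True))
```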

Expected behavior
Generation quality should stay consistent rather than falling off a cliff around the 500th token.

Proposed solution
It appears that the problem is a lack of compatibility between the local attention function used in GPT-Neo and the transformers library. While the transformers library does include models with local attention (Longformer, for example), that implementation is not consistent with how the GPT-2 model is defined in the transformers library, so the converted model does not reproduce GPT-Neo's local attention.
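
For illustration, here is a small sketch (not EleutherAI's or HuggingFace's actual code) of how a windowed local causal mask differs from the full causal mask used by GPT-2-style attention; the default window size of 256 is an assumption about GPT-Neo's local attention configuration:

```python
import torch

def causal_mask(seq_len):
    # Full causal mask (GPT-2 style): each position attends to itself
    # and every earlier position.
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

def local_causal_mask(seq_len, window_size=256):
    # Local (windowed) causal mask: each position attends only to itself
    # and the previous window_size - 1 positions.
    full = causal_mask(seq_len)
    too_far = torch.tril(
        torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=-window_size
    )
    return full & ~too_far

# The two masks diverge once positions are more than window_size tokens
# apart; a converted model that applies the full mask where the original
# used the local one sees a different context, which is consistent with the
# quality drop past ~500 tokens (450-token prompt plus generated continuation).
print(causal_mask(6).int())
print(local_causal_mask(6, window_size=3).int())
```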

Screenshots
n/a

Environment (please complete the following information):

  • GPUs: v3-8s, 1080 Tis, A100s
  • Configs: any config that has local attention

Additional context
n/a

@StellaAthena
Member Author

StellaAthena commented Mar 26, 2021

The amazing @patil-suraj and @LysandreJik have a preliminary PR for a HF implementation!

huggingface/transformers#10848

@StellaAthena StellaAthena reopened this Mar 28, 2021
@StellaAthena
Member Author

It's live on HF!

https://huggingface.co/EleutherAI/gpt-neo-2.7B
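
A minimal usage sketch, assuming a transformers version that includes the GPT-Neo architecture; the prompt text and sampling settings are arbitrary:

```python
from transformers import pipeline

# Text generation with the released 2.7B checkpoint from the Hugging Face Hub.
generator = pipeline("text-generation", model="EleutherAI/gpt-neo-2.7B")
print(generator("EleutherAI has", max_new_tokens=50, do_sample=True)[0]["generated_text"])
```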
