This repository has been archived by the owner on Feb 12, 2022. It is now read-only.
I have trained the model, with some adjustments, on a corpus very similar to WikiText-103. Using the same hyperparameters (args) given for WikiText-103 in the paper/README and training for 12 epochs, I get a training perplexity (PP) of 8-15 and a validation PP of 53, but a test PP above 150. When I generate text, consecutive words turn out to be unrelated. I've checked that the model doesn't need any more training and that the corpus was shuffled correctly before splitting. Any ideas as to what might cause this gap in PP?
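One check that may help narrow down a gap like this is to compare how many tokens in each evaluation split never occur in the training split, and to confirm that perplexity is computed the same way for every split (the exponential of the mean per-token negative log-likelihood). The following is only a minimal sketch under assumptions: it treats each split as a whitespace-tokenized list, and `perplexity` and `oov_rate` are hypothetical helper names, not functions from this repository.

```python
import math


def perplexity(total_nll, n_tokens):
    """Perplexity from a summed negative log-likelihood (in nats) over n_tokens."""
    return math.exp(total_nll / n_tokens)


def oov_rate(train_tokens, eval_tokens):
    """Fraction of eval tokens that never appear in the training tokens."""
    if not eval_tokens:
        return 0.0
    vocab = set(train_tokens)
    unseen = sum(1 for tok in eval_tokens if tok not in vocab)
    return unseen / len(eval_tokens)


# Toy data standing in for the real splits (assumption: whitespace tokenization).
train = "the cat sat on the mat".split()
valid = "the cat sat".split()
test = "a dog sat on the rug".split()

print("valid OOV rate:", oov_rate(train, valid))
print("test OOV rate:", oov_rate(train, test))
```

If the test split shows a much higher out-of-vocabulary rate than the validation split, the two splits are probably not drawn from the same distribution, which would be one possible explanation for a validation/test gap of this size.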