
High Test Perplexity #98

Closed · Sketchizer opened this issue Apr 15, 2019 · 0 comments

@Sketchizer:
I have trained the model, with some adjustments, on a corpus closely resembling Wikitext-103. Using the same hyperparameters (args) as for Wikitext-103 in the paper/readme and training for 12 epochs, the training perplexity is 8-15 and the validation perplexity is 53, but the test perplexity is above 150. When I try to generate text, consecutive words come out unrelated. I've checked that the model doesn't need any more training and that the corpus was shuffled correctly before splitting. Any ideas as to what may cause this difference in perplexity? For reference, the way I compute perplexity on each split is sketched below.
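One thing I've tried to rule out is an inconsistency in how the splits are scored (different tokenization, vocabulary/OOV handling, or loss averaging between valid and test). This is a minimal sketch of the check, not this repo's actual evaluation code; `model`, `batches`, and the tensor layout are my assumptions:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def perplexity(model, batches, device="cpu"):
    """Perplexity = exp(total NLL / total token count).

    Pooling the summed loss over all tokens avoids a common pitfall:
    averaging per-batch perplexities, which inflates the result when
    batch lengths differ.
    """
    model.eval()
    total_nll, total_tokens = 0.0, 0
    for inputs, targets in batches:  # assumed pairs of LongTensors
        inputs, targets = inputs.to(device), targets.to(device)
        logits = model(inputs)  # assumed shape (..., vocab_size)
        nll = F.cross_entropy(
            logits.view(-1, logits.size(-1)),
            targets.view(-1),
            reduction="sum",  # sum, so sequence-length differences can't skew the mean
        )
        total_nll += nll.item()
        total_tokens += targets.numel()
    return math.exp(total_nll / total_tokens)
```

Running this same function on both the validation and test batches gives the numbers above, so the gap seems to come from the data splits themselves rather than from the evaluation code.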
