
High Test Perplexity #98

Closed · Sketchizer opened this issue Apr 15, 2019 · 0 comments

@Sketchizer:
I have trained the model, with some adjustments, on a corpus closely resembling Wikitext-103. Using the same hyperparameters (args) as for Wikitext-103 in the paper/readme and training for 12 epochs, the training perplexity is 8-15 and the validation perplexity is 53, but the test perplexity is above 150. When I try to generate text, consecutive words come out unrelated. I've checked that the model doesn't need any more training and that the corpus was shuffled correctly before splitting. Any ideas as to what may cause this difference in perplexity? For reference, the way I compute perplexity on each split is sketched below.
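One thing I've tried to rule out is an inconsistency in how the splits are scored (different tokenization, vocabulary/OOV handling, or loss averaging between valid and test). This is a minimal sketch of the check, not this repo's actual evaluation code; `model`, `batches`, and the tensor layout are my assumptions:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def perplexity(model, batches, device="cpu"):
    """Perplexity = exp(total NLL / total token count).

    Pooling the summed loss over all tokens avoids a common pitfall:
    averaging per-batch perplexities, which inflates the result when
    batch lengths differ.
    """
    model.eval()
    total_nll, total_tokens = 0.0, 0
    for inputs, targets in batches:  # assumed pairs of LongTensors
        inputs, targets = inputs.to(device), targets.to(device)
        logits = model(inputs)  # assumed shape (..., vocab_size)
        nll = F.cross_entropy(
            logits.view(-1, logits.size(-1)),
            targets.view(-1),
            reduction="sum",  # sum, so sequence-length differences can't skew the mean
        )
        total_nll += nll.item()
        total_tokens += targets.numel()
    return math.exp(total_nll / total_tokens)
```

Running this same function on both the validation and test batches gives the numbers above, so the gap seems to come from the data splits themselves rather than from the evaluation code.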
