This repository has been archived by the owner on Feb 12, 2022. It is now read-only.

finetune & pointer bugs? #26

Open
aykutfirat opened this issue Mar 23, 2018 · 3 comments

Comments

@aykutfirat

python finetune.py --epochs 750 --data data/wikitext-2 --save WT2.pt --dropouth 0.2 --seed 1882
python pointer.py --save WT2.pt --lambdasm 0.1279 --theta 0.662 --window 3785 --bptt 2000 --data data/wikitext-2

Traceback (most recent call last):
  File "finetune.py", line 183, in <module>
    stored_loss = evaluate(val_data)
  File "finetune.py", line 108, in evaluate
    model.eval()

It looks like the model loading (and more) needs to be modified.
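For what it's worth, a crash at model.eval() would be consistent with finetune.py doing a bare torch.load on a checkpoint that the updated main.py saves as a bundle rather than a single model object. Below is a minimal sketch of the kind of loading shim I mean; the helper name model_load and the (model, criterion, optimizer) layout are guesses on my part rather than what main.py actually saves:

import torch

def model_load(fn):
    # Assumption: the updated main.py saves (model, criterion, optimizer)
    # as a single tuple; older checkpoints hold only the model object.
    with open(fn, 'rb') as f:
        checkpoint = torch.load(f)
    if isinstance(checkpoint, (tuple, list)):
        model, criterion, optimizer = checkpoint
    else:
        model, criterion, optimizer = checkpoint, None, None
    return model, criterion, optimizer

# In finetune.py / pointer.py, replace the bare `model = torch.load(f)` with:
model, criterion, optimizer = model_load('WT2.pt')
model.eval()  # now called on the model, not on a tuple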

Also, I no longer get the reported perplexities from main.py: the LSTM gets stuck in the 80s and the QRNN in the 90s.

@Smerity
Contributor

Smerity commented Mar 23, 2018

Hey @aykutfirat,

We've replicated the issue you're seeing with the initial training performance of the ASGD-based WT2 model, in our case using the QRNN as it's faster to test. This happened because I patched our changes for the Adam-based model we used for WT-103, PTBC, and enwik8 on top of AWD-LSTM-LM but failed to do full regression testing.

We're hunting down the issue now, initially to fix the standard training and then later to fix the finetune and pointer steps.

@xsway

xsway commented Apr 20, 2018

It is probably a related issue, so I thought I would report it here.

When running python main.py --batch_size 20 --data data/penn --dropouti 0.4 --dropouth 0.25 --seed 141 --epoch 500 --save PTB.pt, instead of the reported perplexities 61.2/58.8 I got 70.1 (?!)/58.6. The last lines of the training log are below.

| end of epoch 498 | time: 159.11s | valid loss  4.25 | valid ppl    70.08 | valid bpc    6.131
-----------------------------------------------------------------------------------------
| epoch 499 |   200/  663 batches | lr 30.00000 | ms/batch 217.91 | loss  3.69 | ppl    39.95 | bpc    5.320
| epoch 499 |   400/  663 batches | lr 30.00000 | ms/batch 217.03 | loss  3.66 | ppl    38.88 | bpc    5.281
| epoch 499 |   600/  663 batches | lr 30.00000 | ms/batch 218.92 | loss  3.67 | ppl    39.39 | bpc    5.300
-----------------------------------------------------------------------------------------
| end of epoch 499 | time: 159.08s | valid loss  4.25 | valid ppl    70.08 | valid bpc    6.131
-----------------------------------------------------------------------------------------
| epoch 500 |   200/  663 batches | lr 30.00000 | ms/batch 216.38 | loss  3.70 | ppl    40.25 | bpc    5.331
| epoch 500 |   400/  663 batches | lr 30.00000 | ms/batch 216.45 | loss  3.66 | ppl    38.98 | bpc    5.285
| epoch 500 |   600/  663 batches | lr 30.00000 | ms/batch 220.70 | loss  3.68 | ppl    39.60 | bpc    5.308
-----------------------------------------------------------------------------------------
| end of epoch 500 | time: 158.92s | valid loss  4.25 | valid ppl    70.08 | valid bpc    6.131
-----------------------------------------------------------------------------------------
=========================================================================================
| End of training | test loss  4.07 | test ppl    58.56 | test bpc    5.872
=========================================================================================

@keskarnitish
Contributor

@xsway I think your issue is linked to #32.
I think everything is working as expected, but we're printing the wrong validation loss/perplexity. Could you try patching in that change and re-running? I think it should work. I will run it myself before I merge the changes.
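To make the printing point concrete, here's a rough sketch of the end-of-epoch validation path as I understand it; the variable names and the 't0'/'ax' ASGD check are my recollection of main.py rather than the exact lines touched by #32. The idea is that once ASGD averaging has kicked in, the printed number should be the loss evaluated under the averaged weights, which is also the loss used for checkpointing:

import math

# Sketch only: `model`, `optimizer`, `evaluate`, `val_data` and `epoch` are the
# objects already defined inside main.py's training loop.
if 't0' in optimizer.param_groups[0]:
    # ASGD has been triggered: temporarily swap in the averaged weights ('ax'),
    # evaluate, then restore the raw training weights.
    tmp = {}
    for prm in model.parameters():
        tmp[prm] = prm.data.clone()
        prm.data = optimizer.state[prm]['ax'].clone()
    val_loss2 = evaluate(val_data)
    print('| end of epoch {:3d} | valid loss {:5.2f} | valid ppl {:8.2f} |'.format(
        epoch, val_loss2, math.exp(val_loss2)))  # report the averaged-weight loss
    for prm in model.parameters():
        prm.data = tmp[prm].clone()
else:
    val_loss = evaluate(val_data)
    print('| end of epoch {:3d} | valid loss {:5.2f} | valid ppl {:8.2f} |'.format(
        epoch, val_loss, math.exp(val_loss)))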
