SRL elmo 5b #1310

DeNeutoy · 2018-05-30T20:59:31Z

New SRL model using ELMo - this moves our F1 from 78.9(boo) to 84.97 (84.93 on test).

I used a bit of a hacky repackaging method here so that we can distribute it - it basically involves loading in the GPU specific model, copying the config and weights into a new model which doesn't have the GPU only LSTM implementation.

Full results below for posterity:

              corr.  excess  missed    prec.    rec.      F1
------------------------------------------------------------
   Overall    70191   12778   12062    84.60   85.34   84.97
----------
      ARG0    16309    1586    1327    91.14   92.48   91.80
      ARG1    25106    3566    3123    87.56   88.94   88.24
      ARG2     7735    1642    1795    82.49   81.16   81.82
      ARG3      402     186     274    68.37   59.47   63.61
      ARG4      446     121     137    78.66   76.50   77.57
      ARG5        7       4       9    63.64   43.75   51.85
      ARGA        0       0       4     0.00    0.00    0.00
  ARGM-ADJ      162     131      92    55.29   63.78   59.23
  ARGM-ADV     2085    1135    1083    64.75   65.81   65.28
  ARGM-CAU      401     145     151    73.44   72.64   73.04
  ARGM-COM       19      27      15    41.30   55.88   47.50
  ARGM-DIR      316     217     233    59.29   57.56   58.41
  ARGM-DIS     2445     580     555    80.83   81.50   81.16
  ARGM-DSP        0       0       2     0.00    0.00    0.00
  ARGM-EXT      126     120     147    51.22   46.15   48.55
  ARGM-GOL       16      49      81    24.62   16.49   19.75
  ARGM-LOC     1572     718     605    68.65   72.21   70.38
  ARGM-LVB       47      10       8    82.46   85.45   83.93
  ARGM-MNR     1389     608     604    69.55   69.69   69.62
  ARGM-MOD     2776      72      48    97.47   98.30   97.88
  ARGM-NEG     1501      83      60    94.76   96.16   95.45
  ARGM-PNC       23      79      90    22.55   20.35   21.40
  ARGM-PRD       72     176     344    29.03   17.31   21.69
  ARGM-PRP      336     244     192    57.93   63.64   60.65
  ARGM-REC        4       2       9    66.67   30.77   42.11
  ARGM-TMP     5186    1011     814    83.69   86.43   85.04
    R-ARG0      917      80      85    91.98   91.52   91.75
    R-ARG1      632      99      82    86.46   88.52   87.47
    R-ARG2       34      16      23    68.00   59.65   63.55
    R-ARG3        0       0       7     0.00    0.00    0.00
    R-ARG4        1       0       1   100.00   50.00   66.67
R-ARGM-ADV        0       1       3     0.00    0.00    0.00
R-ARGM-CAU        1       3       1    25.00   50.00   33.33
R-ARGM-COM        0       1       0     0.00    0.00    0.00
R-ARGM-DIR        0       0       1     0.00    0.00    0.00
R-ARGM-EXT        0       0       2     0.00    0.00    0.00
R-ARGM-LOC       68      42      21    61.82   76.40   68.34
R-ARGM-MNR        9       3       5    75.00   64.29   69.23
R-ARGM-PRP        0       0       1     0.00    0.00    0.00
R-ARGM-TMP       48      21      28    69.57   63.16   66.21
------------------------------------------------------------
         V    34783       0       0   100.00  100.00  100.00
------------------------------------------------------------

DeNeutoy · 2018-05-30T23:06:06Z

allennlp/tests/models/sniff_test.py

                {"verb": "will",
-                 "description": "[ARGM-ADV: If you liked the music we were playing last night] , [ARG0: you] [V: will] [ARG1: absolutely love what we 're playing tomorrow] !",
-                 "tags": ["B-ARGM-ADV", "I-ARGM-ADV", "I-ARGM-ADV", "I-ARGM-ADV", "I-ARGM-ADV", "I-ARGM-ADV", "I-ARGM-ADV", "I-ARGM-ADV", "I-ARGM-ADV", "I-ARGM-ADV", "O", "B-ARG0", "B-V", "B-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "O"]},
+                 "description": "If you liked the music we were playing last night , you [V: will] absolutely love what we 're playing tomorrow !",


It's pretty cool that the new model picks up this case - I hadn't thought that the previous model was incorrect, but in this sentence, "will" is a light verb and doesn't really hold any semantic meaning (I think). Progress 🎉

matt-gardner

Nice, great work! I had some minor comments that aren't a big deal, as this is an infrequently used script. Feel free to ignore them if you want.

matt-gardner · 2018-05-31T03:43:43Z