Does an LSTM forget more than a CNN? An empirical study of catastrophic forgetting in NLP

This repository contains the code for the experiments evaluating catastrophic forgetting in neural networks from our ALTA paper.

If you want the exact version used in the ALTA 2019 paper, check out the branch alta_paper, as shown below.
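
A one-line sketch, assuming you have already cloned this repository and are in its root directory:

git checkout alta_paper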

Requirements

Installation Steps

  1. Create a virtual environment.
  2. Install allennlp in the virtual environment by running pip install --editable .
  3. Clone the svcca repository into the root directory of this project.
  4. Install the latest PyTorch; last tested with 1.3 (stable). See the sketch below for the full sequence.
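
A minimal sketch of the steps above, assuming a Unix shell and that the svcca code comes from Google's public repository (adjust the URL or paths if you use a different copy):

# 1. create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate

# 2. install allennlp in editable mode from this project's root
pip install --editable .

# 3. clone svcca into the project's root directory
git clone https://github.com/google/svcca.git

# 4. install the latest PyTorch (last tested with 1.3 stable)
pip install torch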

Tasks

Currently, we support running four tasks:

  1. TREC (code: trec)
  2. SST (code: sst)
  3. CoLA (code: cola)
  4. Subjectivity (code: subjectivity)

Passing the option --task trec adds that task to the run. The order of the --task options determines the order in which the tasks are trained (see the example below).
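
For example, assuming the training entry point is a script called train.py (a hypothetical name; check this repository for the actual entry point), the following would train on SST first and then on TREC:

python train.py --task sst --task trec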

Architectures

We support running tasks on the following architectures:

  1. CNN (--cnn)
  2. LSTM (--seq2vec)
  3. GRU (--seq2vec --gru)
  4. Transformer encoder from PyTorch (--seq2vec --transformer)
  5. Deep Pyramid CNN (--pyramid)
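
For instance, to train the GRU variant on two tasks in sequence (again assuming the hypothetical train.py entry point from above):

python train.py --seq2vec --gru --task trec --task sst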

Embeddings

You can select embeddings with the --embeddings option. It currently supports default (trained from scratch), bert, and elmo. For ELMo you will have to download the embedding files yourself and store them in the data folder:

elmo_2x4096_512_2048cnn_2xhighway_options.json  
elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5
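
Putting it together, a hedged end-to-end example (same hypothetical train.py entry point as above, with the two ELMo files placed in data/):

python train.py --cnn --task sst --embeddings elmo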

There are a few other options; feel free to explore them, or create an issue on GitHub if you get stuck.