Learning Rationales behind Predictions

About

This directory contains the code and resources of the following paper:

"Rationalizing Neural Predictions". Tao Lei, Regina Barzilay and Tommi Jaakkola. EMNLP 2016. [PDF] [Slides]

The method learns to provide justifications, i.e. rationales, as supporting evidence for a neural network's predictions. The following figure illustrates the rationales and the associated predictions for multi-aspect sentiment analysis on product reviews:

A PyTorch implementation is available: https://github.com/yala/text_nn. This version uses Gumbel softmax instead of REINFORCE for training.

A TensorFlow implementation is available: https://github.com/RiaanZoetmulder/Master-Thesis/tree/master/rationale (by Riaan Zoetmulder)

Overview of the Model

We optimize two modular neural components, a generator and an encoder, to produce rationales and predictions respectively. The framework is generic: the generator and encoder can be implemented in various ways, for example with RNNs or CNNs. We train the model in an RL style using policy gradient (specifically REINFORCE), as illustrated below.
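To make the training signal concrete, here is a minimal pure-Python sketch of a single-sample REINFORCE estimate for an independent-selection generator. It is an illustration under stated assumptions, not the code in rationale.py: the function names are hypothetical, the generator is reduced to a list of per-word selection probabilities, the encoder is reduced to a given prediction loss, and the two regularizers are combined with separate weights as in the paper's cost (how rationale.py combines --sparsity and --coherent internally is not shown here).

```python
import random


def sample_rationale(probs, rng):
    """Sample a binary mask z, with z[t] = 1 with probability probs[t] (independent selection)."""
    return [1 if rng.random() < p else 0 for p in probs]


def rationale_cost(pred_loss, z, sparsity, coherent):
    """Prediction loss plus a penalty on the number of selected words
    and a penalty on the number of switches between selected and unselected."""
    length = sum(z)
    transitions = sum(abs(a - b) for a, b in zip(z[1:], z[:-1]))
    return pred_loss + sparsity * length + coherent * transitions


def reinforce_grad_logits(z, probs, cost):
    """Single-sample REINFORCE estimate of d E[cost] / d logits.

    Assumes probs[t] = sigmoid(logit[t]), so d log p(z) / d logit[t] = z[t] - probs[t].
    """
    return [cost * (z_t - p_t) for z_t, p_t in zip(z, probs)]


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.9, 0.8, 0.1, 0.7]          # hypothetical generator output for 4 words
    z = sample_rationale(probs, rng)      # sampled rationale mask
    cost = rationale_cost(pred_loss=0.5, z=z, sparsity=0.1, coherent=0.2)
    print(z, cost, reinforce_grad_logits(z, probs, cost))
```

The generator is updated by descending this estimate, so selections that led to a high cost become less likely. In practice a baseline is often subtracted from the cost to reduce variance; that is omitted here for brevity.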

Sub-directories

This root directory contains the implementation of the rationale model used for the beer review data. rationale.py implements the independent selection version and rationale_dependent.py implements the sequential selection version. See the paper for details.
example_rationales contains rationales generated for the beer review data.
ubuntu contains an alternative implementation for the AskUbuntu data.
medical contains an alternative implementation for medical report classification.
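
For reference, the difference between the independent and sequential selection versions mentioned above can be written as two factorizations of the generator's distribution. This is a paraphrase of the paper's formulation rather than something read off the code; the symbols are chosen for this sketch, with x the input text and z = (z_1, ..., z_T) the binary selection over its T words.

```latex
% Independent selection (rationale.py): each word is selected on its own, given x.
p(z \mid x) = \prod_{t=1}^{T} p(z_t \mid x)

% Sequential selection (rationale_dependent.py): each selection also conditions
% on the selections made for earlier words.
p(z \mid x) = \prod_{t=1}^{T} p(z_t \mid x, z_1, \dots, z_{t-1})
```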

Data

Product reviews: We provide subsets of reviews and pre-trained word embeddings here. This should be sufficient for reproducing our results. Please contact the author of the dataset, Prof. McAuley, for the full set (1.5 million reviews).
AskUbuntu data: AskUbuntu question data is available in this repo.
Pathology data: This data is not available due to patient privacy. We only provide the code and an example snapshot in the /medical directory.

Important Note: all data is for research purposes only.

Code Usage

To run the code, you need NumPy and Theano installed (I used a Theano version > 0.7.0.dev-8d3a67). Next:

Clone the rcnn repo
Use "export PYTHONPATH=/path/to/rcnn/code" to add the rcnn/code directory to the Python module search path
Run python rationale.py --help or python rationale_dependent.py --help to see all available options

Example run on the beer review data:

THEANO_FLAGS='device=gpu,floatX=float32'        # use GPU and 32-bit float
python rationale.py                             # independent selection version
      --embedding /path/to/vectors              # path to load word vectors (required)
      --train reviews.aspect0.train.txt.gz      # path to training set (required)
      --dev reviews.aspect0.heldout.txt.gz      # path to development set (required)        
      --load_rationale annotations.json         # path to rationale annotation for testing (required)
      --aspect 0                                # which aspect (-1 means all aspects)
      --dump outputs.json                       # dump selected rationales and predictions
      --sparsity 0.0003 --coherent 2.0          # regularizations
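
The annotated listing above cannot be pasted into a shell as is, because the inline comments interrupt the command. As one way around that, the following sketch assembles the same invocation from Python. The helper names build_command, build_env and run are hypothetical and not part of this repo; the flags, file names and THEANO_FLAGS value are copied from the example above, and /path/to/... placeholders are left for you to fill in.

```python
import os
import subprocess


def build_command(embedding, train, dev, load_rationale, aspect=0,
                  dump="outputs.json", sparsity=0.0003, coherent=2.0,
                  script="rationale.py"):
    """Return the argument list for one run, using only the flags shown in the example."""
    return [
        "python", script,
        "--embedding", embedding,            # word vectors (required)
        "--train", train,                    # training set (required)
        "--dev", dev,                        # development set (required)
        "--load_rationale", load_rationale,  # rationale annotations for testing
        "--aspect", str(aspect),             # which aspect (-1 means all aspects)
        "--dump", dump,                      # where to dump rationales and predictions
        "--sparsity", str(sparsity),         # regularization weights
        "--coherent", str(coherent),
    ]


def build_env(rcnn_code_dir, theano_flags="device=gpu,floatX=float32", base_env=None):
    """Copy the environment and add PYTHONPATH and THEANO_FLAGS for the run."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = (rcnn_code_dir + os.pathsep + existing) if existing else rcnn_code_dir
    env["THEANO_FLAGS"] = theano_flags
    return env


def run(cmd, env):
    """Launch the training script and raise if it exits with an error."""
    return subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    cmd = build_command("/path/to/vectors",
                        "reviews.aspect0.train.txt.gz",
                        "reviews.aspect0.heldout.txt.gz",
                        "annotations.json")
    print(" ".join(cmd))  # inspect the command first
    # run(cmd, build_env("/path/to/rcnn/code"))  # uncomment once the paths are real
```

Pass script="rationale_dependent.py" to build_command for the sequential selection version, assuming it accepts the same flags; check its --help output first.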

To-do

better documentation of the code
more example usage of the code
put trained models in the repo??
