Skip to content

lansatiankong/type-supervised-tagging-2012emnlp

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Type-Supervised Tagging: EMNLP 2012

This repository contains the code, scripts, and instructions needed to reproduce the results in the paper

Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries
Dan Garrette and Jason Baldridge
In Proceedings of EMNLP 2012

This code is frozen as of the version used to obtain the results in the paper. It will not be maintained.

To see the most up-to-date version of the code, visit this repository.

Running the experiments

Set up English data

The English experiments rely on Penn Treebank data. This script prepares that data for use by the experiments. The treebank directory referenced when running the script should contain a folder combined containing files wsj_0000.mrg through wsj_2454.mrg.

sh run.sh "en-data /path/to/treebank"

Run English experiments on sections 00-15

sh run.sh en-run16

Run English experiments on sections 00-07

sh run.sh en-run8

Run Italian experiments

The Italian data is already located in the data directory, so this experiment can be launched immediately without need for data setup.

sh run.sh it-run

Questions

If you have any questions, please contact Dan Garrette (dhg@cs.utexas.edu).

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Scala 99.9%
  • Shell 0.1%