This repository contains the code for training mCOLT, a multilingual NMT training framework implemented on top of fairseq.

mRASP2/mCOLT, short for multilingual Contrastive Learning for Transformer, is a multilingual neural machine translation model that supports complete many-to-many multilingual machine translation. It employs both parallel corpora and multilingual corpora in a unified training framework. For details, please refer to the paper.
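To make the contrastive idea concrete, here is a minimal, framework-free sketch of an InfoNCE-style objective: the encoder representation of a sentence is pulled toward that of its translation (the positive) and pushed away from unrelated sentences (the negatives). All names below are illustrative and do not come from the mCOLT codebase, which implements this inside fairseq.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(src, pos, negatives, temperature=0.1):
    """InfoNCE-style loss: score the positive (a translation of src)
    against in-batch negatives (representations of unrelated sentences)."""
    logits = [cosine(src, pos) / temperature] + [
        cosine(src, n) / temperature for n in negatives
    ]
    # Numerically stable -log softmax of the positive's logit.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]
```

The loss is small when the positive pair is much more similar than any negative pair, which is what drives representations of mutual translations toward a shared space.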
```bash
pip install -r requirements.txt
```
We release our preprocessed training data and checkpoints below.

We merge 32 English-centric language pairs, resulting in 64 directed translation pairs in total. The original corpus of these 32 language pairs contains about 197M sentence pairs. After applying RAS we obtain about 262M sentence pairs, since we keep both the original and the substituted sentences. We release both the original dataset and the dataset after applying RAS.
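Random Aligned Substitution (RAS) augments the data by replacing source words with their dictionary translations in other languages, so that words meaning the same thing land in similar contexts. A toy sketch, assuming a simple token-level bilingual dictionary (the real dictionaries cover many languages and the real pipeline operates on subword data):

```python
import random

# Toy bilingual dictionary for illustration only.
TOY_DICT = {"hello": "bonjour", "world": "monde"}

def ras(tokens, dictionary, prob=0.3, rng=None):
    """Random Aligned Substitution sketch: each token that has a
    dictionary translation is replaced with probability `prob`."""
    rng = rng or random.Random(0)
    out = []
    for tok in tokens:
        if tok in dictionary and rng.random() < prob:
            out.append(dictionary[tok])
        else:
            out.append(tok)
    return out
```

Because both the original sentence and its substituted variant are kept, the corpus grows from roughly 197M to roughly 262M pairs, as reflected in the table below.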
Dataset | #Pairs |
---|---|
32-lang-pairs-TRAIN | 197603294 |
32-lang-pairs-RAS-TRAIN | 262662792 |
mono-split-a | - |
mono-split-b | - |
mono-split-c | - |
mono-split-d | - |
mono-split-e | - |
mono-split-de-fr-en | - |
mono-split-nl-pl-pt | - |
32-lang-pairs-DEV-en-centric | - |
32-lang-pairs-DEV-many-to-many | - |
Vocab | - |
BPE Code | - |
Note that the provided checkpoint is slightly different from that in the paper.
```bash
bash train_w_mono.sh ${model_config}
```
- We give an example of `${model_config}` in `${PROJECT_REPO}/examples/configs/parallel_mono_12e12d_contrastive.yml`.
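The config file bundles the model and training hyperparameters consumed by the training script. The fragment below is a purely hypothetical sketch of the shape such a file can take; every key and value here is an illustrative assumption, not the contents of the shipped config:

```yaml
# Hypothetical sketch only -- consult the shipped
# examples/configs/parallel_mono_12e12d_contrastive.yml for the real keys.
model:
  encoder_layers: 12   # "12e12d" in the filename suggests 12 encoder layers
  decoder_layers: 12   # ... and 12 decoder layers
training:
  contrastive_temperature: 0.1   # assumed value for illustration
```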