Avg ~ PC1

"Average" Approximates "First Principal Component"? An Empirical Analysis on Representations from Neural Language Models

Our paper is available on arXiv (arXiv:2104.08673). It was accepted as a short paper at EMNLP'21.

Motivation

In progress.

Scripts

Environment

Packages can be installed via pip install -r requirements.txt.

Reproduce

First, run embed_corpus.py to obtain the embeddings of a given corpus under a given language model.
Then run calculate_properties.py to compute the absolute cosine similarity between the first principal component (PC) and the average of the embeddings.
Calculations for the other properties discussed in the paper are in progress.
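
For readers who want to see what the reported quantity looks like, here is a minimal sketch of the absolute cosine similarity between the average embedding and the first principal component. It is not the code in calculate_properties.py. The function name abs_cos_avg_pc1 is hypothetical, and the sketch assumes the embeddings are already loaded as an (n, d) NumPy array and that the first PC is taken from an SVD of the uncentered matrix (the script may handle centering differently).

```python
import numpy as np


def abs_cos_avg_pc1(embeddings: np.ndarray) -> float:
    """Absolute cosine similarity between the mean row and the first PC.

    Assumption: the first PC is the top right singular vector of the
    *uncentered* (n, d) embedding matrix. This is an illustrative choice,
    not necessarily what calculate_properties.py does.
    """
    X = np.asarray(embeddings, dtype=np.float64)
    if X.ndim != 2 or X.shape[0] == 0:
        raise ValueError("embeddings must be a non-empty (n, d) array")

    avg = X.mean(axis=0)
    avg_norm = np.linalg.norm(avg)
    if avg_norm == 0.0:
        raise ValueError("average embedding is the zero vector")

    # Rows of vt are right singular vectors, ordered by singular value,
    # so vt[0] is the direction of largest (uncentered) variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    pc1 = vt[0]

    # The sign of a singular vector is arbitrary, hence the absolute value.
    return float(abs(avg @ pc1) / (avg_norm * np.linalg.norm(pc1)))


# Synthetic stand-in for real embeddings: Gaussian noise plus a shared offset.
rng = np.random.default_rng(0)
demo = rng.normal(size=(200, 16)) + 5.0
print(abs_cos_avg_pc1(demo))
```

The synthetic data only illustrates the call; real inputs would be the embeddings produced by embed_corpus.py, loaded in whatever format that script writes.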

Citation

Please cite the following paper if you find our dataset or framework useful. Thanks!

Zihan Wang, Chengyu Dong, and Jingbo Shang. "'Average' Approximates 'First Principal Component'? An Empirical Analysis on Representations from Neural Language Models." arXiv preprint arXiv:2104.08673 (2021).

@misc{wang2020xclass,
      title={"Average" Approximates "First Principal Component"? An Empirical Analysis on Representations from Neural Language Models}, 
      author={Zihan Wang and Chengyu Dong and Jingbo Shang},
      year={2021},
      eprint={2104.08673},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
