EMNLP 2024: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis

Introduction

This repo is for the EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis

This work explores the mechanism behind arithmetic tasks in large language models and introduces the comparative neuron analysis (CNA) method to identify the important neurons.

The EMNLP 2024 paper builds on the findings and insights collected in this repo.

Running the code

First, use modeling_llama.py to replace the original file in your transformers installation, which is usually located in anaconda3/envs/YOUR_ENV_NAME/lib/python3.8/site-packages/transformers/models/llama. The modified file exposes the internal vectors during inference. Please remember to save the original file first.
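If you prefer to script this swap, here is a minimal sketch in Python. It assumes transformers is importable in the target environment; the path to the cloned repo is hypothetical, so adjust it to your checkout:

```python
# Minimal sketch of the file swap described above. The clone path below
# is hypothetical; adjust it to wherever you cloned this repo.
import shutil
from pathlib import Path

import transformers

# Locate the installed transformers package instead of hard-coding the
# anaconda3/envs/... path.
llama_dir = Path(transformers.__file__).parent / "models" / "llama"
original = llama_dir / "modeling_llama.py"

# Save the original file so it can be restored later.
shutil.copy2(original, llama_dir / "modeling_llama.py.bak")

# Overwrite with the modified version from this repo.
shutil.copy2("arithmetic-mechanism/modeling_llama.py", original)
```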

Then, run the code in Llama_view_arithmetic_head.ipynb and Llama_view_arithmetic_CNA.ipynb in Jupyter Notebook. These notebooks show how to identify and analyze the important heads/neurons in an arithmetic case.

- Llama_view_arithmetic_head.ipynb: identifies the important attention heads.

- Llama_view_arithmetic_CNA.ipynb: identifies the important neurons in deep FFN layers and shallow FFN layers (a hook-based sketch follows below).
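For orientation, here is a minimal, self-contained sketch of reading out FFN neuron activations with standard PyTorch forward hooks. This is not the paper's CNA method and does not use the modified modeling_llama.py; it only illustrates one way to inspect the internal FFN vectors the notebooks analyze, and the checkpoint name is an assumption:

```python
# Sketch: capture per-layer FFN neuron activations for an arithmetic
# prompt using plain PyTorch hooks (NOT the paper's CNA method; the
# checkpoint name below is an assumption).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumption: any LLaMA checkpoint
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

acts = {}  # layer index -> FFN neuron activations at the last token

def make_hook(idx):
    def hook(module, inputs, output):
        # inputs[0] is act_fn(gate_proj(x)) * up_proj(x), i.e. the neuron
        # coefficients fed into down_proj; shape (batch, seq, intermediate).
        acts[idx] = inputs[0][0, -1].detach().float()
    return hook

handles = [
    layer.mlp.down_proj.register_forward_hook(make_hook(i))
    for i, layer in enumerate(model.model.layers)
]

with torch.no_grad():
    model(**tok("3+5=", return_tensors="pt"))

# Rank neurons per layer by activation magnitude as a rough first pass.
for i in sorted(acts):
    top = acts[i].abs().topk(5)
    print(f"layer {i:2d}: top neurons {top.indices.tolist()}")

for h in handles:
    h.remove()
```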

transformers version: 4.37.1

torch version: 2.1.2+cu121

Cite us

@article{yu2024interpreting,
  title={Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis},
  author={Yu, Zeping and Ananiadou, Sophia},
  journal={arXiv preprint arXiv:2409.14144},
  year={2024}
}
