LLM4REC

This repository contains the Python code for the paper.

1. Task and Solution

Our task is recommendation with large language models (LLMs). Current methods cannot structurally integrate the edge information of a graph into LLMs. Our solution has two major parts. First, we add an edge measurement to the attention calculation. Second, we design a set of prompts for pre-training and fine-tuning. The details can be found in the paper.
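
As a rough illustration of what an edge term in attention can look like, here is a minimal sketch. It assumes the edge measurement enters as an additive bias on the attention scores before the softmax; the paper may define it differently. The function names, the edge_weights matrix, and the alpha scale are hypothetical and are not taken from this repository.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def edge_biased_attention(queries, keys, values, edge_weights, alpha=1.0):
    """Scaled dot-product attention with an additive edge term.

    ASSUMPTION: edge_weights[i][j] is a scalar describing the graph edge
    between token i and token j, added to the score before the softmax.
    """
    d = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        # Standard scaled dot-product score plus the (assumed) edge bias.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            + alpha * edge_weights[i][j]
            for j, k in enumerate(keys)
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs
```

With all edge weights set to zero this reduces to ordinary scaled dot-product attention, so the edge term only shifts attention toward token pairs that are connected in the graph.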

2. Dataset

We use the Amazon Review Dataset for experiments. The raw data can be found here.

3. Code Description

The code has two parts. The first part is the modified attention code. The second part is the pipeline of the proposed method.

3.1. Modified Attention Code

The modified attention code is in the modified_transformer folder. You can copy its two files into your installed Transformers library. The target path may look like this:

'/home/local/ASURITE/xwang735/anaconda3/envs/LLM/lib/python3.12/site-packages/transformers/models/gpt2'

Alternatively, you can create a new library containing this code and name it 'newTransformers'.
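
The copy step can be scripted. The sketch below is one possible way to do it, not part of the repository: it looks up the gpt2 folder of the installed transformers package and copies every .py file from a source folder into it, keeping a .bak copy of each file it overwrites. The function names are hypothetical, and it assumes the modified files are plain .py files whose names match the originals.

```python
import importlib.util
import shutil
from pathlib import Path


def find_gpt2_dir():
    """Return the gpt2 model folder of the installed transformers package.

    Returns None when transformers is not installed in this environment.
    """
    spec = importlib.util.find_spec("transformers")
    if spec is None or not spec.submodule_search_locations:
        return None
    package_root = Path(list(spec.submodule_search_locations)[0])
    return package_root / "models" / "gpt2"


def install_modified_files(source_dir, target_dir):
    """Copy every .py file from source_dir into target_dir.

    A file that would be overwritten is first saved as <name>.bak.
    Returns the names of the copied files.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    copied = []
    for src in sorted(source_dir.glob("*.py")):
        dst = target_dir / src.name
        if dst.exists():
            shutil.copy2(dst, dst.with_name(dst.name + ".bak"))
        shutil.copy2(src, dst)
        copied.append(dst.name)
    return copied
```

Calling install_modified_files with the folder that holds the modified code and the result of find_gpt2_dir() would patch the active environment. Keeping the .bak files makes it easy to restore the stock Transformers code later.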

3.2. Main Code

The src/ folder contains the data preprocessing, pre-training, fine-tuning, and prediction code.

3.2.1. Data Preprocessing Code

The data preprocessing code consists of data_preprocess_amazon.py, data_preprocessing.py, and data_pkl.py.

data_preprocess_amazon.py converts the raw data into the format the pipeline expects. The processed data can be found at this link.

data_preprocessing.py builds the relationship matrix for each dataset.

data_pkl.py computes the second-order connections among items.
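
To make the two graph steps concrete, here is a small sketch. It is not the code in data_preprocessing.py or data_pkl.py. It assumes that the relationship matrix is a binary user-item interaction matrix and that two items are second-order connected when at least one user interacted with both (an item-user-item path). Both function names are hypothetical.

```python
def build_relation_matrix(interactions, n_users, n_items):
    """Binary user-item matrix: entry [u][i] is 1 if user u interacted with item i."""
    matrix = [[0] * n_items for _ in range(n_users)]
    for user, item in interactions:
        matrix[user][item] = 1
    return matrix


def second_order_item_links(matrix):
    """Map each item to the set of items that share at least one user with it.

    ASSUMPTION: "second-order" means a two-hop path item -> user -> item.
    """
    n_items = len(matrix[0])
    links = {i: set() for i in range(n_items)}
    for row in matrix:
        # All items this user interacted with are pairwise two hops apart.
        items = [i for i, flag in enumerate(row) if flag]
        for i in items:
            for j in items:
                if i != j:
                    links[i].add(j)
    return links


# Toy example: user 0 bought items 0 and 1, user 1 bought items 1 and 2.
toy_matrix = build_relation_matrix([(0, 0), (0, 1), (1, 1), (1, 2)], 2, 3)
toy_links = second_order_item_links(toy_matrix)
```

In the toy example, item 1 ends up linked to items 0 and 2, while items 0 and 2 are not linked to each other because no user interacted with both.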

3.2.2. Useful Components

This code is in libs/. It implements the dataloader, the personalized models, and the tokenizer.

3.2.3. Scripts to Run

These scripts are used for pre-training and fine-tuning.

training.py runs the pre-training stage. You can run it like this:

python training.py --dataset 'dataset_name' --lambda_V 1

OR

accelerate launch training.py --dataset 'dataset_name' --lambda_V 1

finetuning.py runs the fine-tuning stage. You can run it like this:

python finetuning.py --dataset 'dataset_name' --lambda_V 1

OR

accelerate launch finetuning.py --dataset 'dataset_name' --lambda_V 1

Note: you may need to change the paths in the scripts to match your own environment.

You will also need a folder to store the model. It should have the following structure:

/'dataset_name'
  /collaborate
  /content
  /rec
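
The folder layout above can be created with a few lines of Python. This is a convenience sketch, not a script from the repository; the function name and the root argument are hypothetical, and the three subfolder names are taken from the structure shown above.

```python
from pathlib import Path


def make_model_dirs(root, dataset_name):
    """Create <root>/<dataset_name>/{collaborate,content,rec} and return the base path."""
    base = Path(root) / dataset_name
    for sub in ("collaborate", "content", "rec"):
        # exist_ok=True makes the call safe to repeat.
        (base / sub).mkdir(parents=True, exist_ok=True)
    return base
```

Running the function twice is harmless, so it can be called at the start of every training run.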

If you have any questions, please feel free to drop me an e-mail.
