LLM4REC

This repository contains the Python code for the paper.

1. Task and Solution

Our task is to use large language models (LLMs) for recommendation. Existing methods cannot structurally integrate the edge information of graphs into LLMs. Our solution has two major parts. First, we add an edge measurement to the attention calculation. Second, we design a set of prompts for pre-training and fine-tuning. The details can be found in the paper.
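
The exact formulation is in the paper; as a loose sketch of the first part only, an edge term can be added to the scaled dot-product logits before the softmax. Everything below (names, shapes) is illustrative, not the repository's actual API:

import torch
import torch.nn.functional as F

def attention_with_edge_bias(q, k, v, edge_bias):
    # q, k, v: (batch, heads, seq, dim); edge_bias: (batch, heads, seq, seq)
    # Usual scaled dot-product logits, plus a graph-derived edge measurement.
    logits = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return F.softmax(logits + edge_bias, dim=-1) @ v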

2. Dataset

We use the Amazon Review Dataset for our experiments. The raw data can be found here.

3. Code Description

The code has two parts: the modified attention code, and the main pipeline of the proposed method.

3.1. Modified Attention Code

The modified attention code sits in its own folder. You can place the two files inside the Transformers library; the destination path will look something like this:

'/home/local/ASURITE/xwang735/anaconda3/envs/LLM/lib/python3.12/site-packages/transformers/models/gpt2'

Alternatively, you can create a new library containing these files and name it 'newTransformers'.
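
If you overwrite the files in place, one quick sanity check (assuming the patch replaces the stock GPT-2 attention class) is to load a model and inspect a block's attention module:

from transformers import GPT2Model

# After patching transformers/models/gpt2, each block's attention module
# should be the modified class rather than the stock GPT2Attention.
model = GPT2Model.from_pretrained("gpt2")
print(type(model.h[0].attn))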

3.2. Main Code

The data preprocessing, pre-training, fine-tuning, and prediction code lives in src/.

3.2.1. Data Pre-processing Code

The data preprocessing code consists of data_preprocess_amazon.py, data_preprocessing.py, and data_pkl.py.

data_preprocess_amazon.py transforms the raw data into the format we need. The processed data can be found at this link.
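
The raw Amazon Review files are JSON-lines, usually gzipped; a minimal reader, assuming the strict-JSON release of the dataset (a sketch, not the script's actual code):

import gzip
import json

def parse_reviews(path):
    # Yield one review record per line of a gzipped JSON-lines file.
    with gzip.open(path, "rt") as f:
        for line in f:
            yield json.loads(line)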

data_preprocessing.py builds the relationship matrix for each dataset.
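
As a hedged illustration of what such a matrix can be (all names below are hypothetical), a sparse user-item matrix R with R[u, i] = 1 for every observed interaction:

import numpy as np
from scipy.sparse import csr_matrix

def build_relationship_matrix(interactions, n_users, n_items):
    # interactions: iterable of (user_index, item_index) pairs
    rows, cols = zip(*interactions)
    data = np.ones(len(rows), dtype=np.float32)
    return csr_matrix((data, (rows, cols)), shape=(n_users, n_items))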

data_pkl.py computes the 2-order (two-hop) connections among items.
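
Given such a matrix R, items that share at least one user can be read off R^T R; a sketch under that assumption, with the pickle output suggested by the script name:

import pickle

def save_second_order(R, path):
    # (R.T @ R)[i, j] counts the users shared by items i and j.
    item_item = (R.T @ R).tolil()
    item_item.setdiag(0)  # drop trivial self-connections
    with open(path, "wb") as f:
        pickle.dump(item_item.tocsr(), f)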

3.2.2. Useful Components

These components live in libs/ and implement the dataloader, the personalized models, and the tokenizer.
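
For instance, a common pattern for a personalized tokenizer, shown here only as a sketch (the actual libs/ code may differ), is to add one vocabulary token per user and item so that prompts can reference them directly:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

n_users, n_items = 100, 500  # hypothetical dataset sizes
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# One new token per user/item ID; the embedding table grows to match.
tokenizer.add_tokens([f"user_{u}" for u in range(n_users)]
                     + [f"item_{i}" for i in range(n_items)])
model.resize_token_embeddings(len(tokenizer))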

3.2.3. Code to Run

These scripts run the pre-training and fine-tuning stages.

training.py handles the pre-training stage. You can run it like this:

python training.py --dataset 'dataset_name' --lambda_V 1

OR

accelerate launch training.py --dataset 'dataset_name' --lambda_V 1

finetuning.py handles the fine-tuning stage. You can run it like this:

python finetuning.py --dataset 'dataset_name' --lambda_V 1

OR

accelerate launch finetuning.py --dataset 'dataset_name' --lambda_V 1

Be careful: you may need to adjust the paths for your own environment.

You will also need a folder to store the models, with a structure like this:

/'dataset_name'
  /collaborate
  /content
  /rec
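
You can create this layout with a few lines of Python (the dataset name below is a placeholder):

import os

dataset_name = "dataset_name"  # placeholder: use your actual dataset name
for sub in ("collaborate", "content", "rec"):
    os.makedirs(os.path.join(dataset_name, sub), exist_ok=True)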

If you have any questions, please feel free to drop me an e-mail.
