Skip to content

ZhaoyueSun/PHEE

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

UPDATES

Data Updates: Version 2.0 of PHEE (PHEE V2.0)

In the original dataset, we observed particularly low levels of annotation inconsistency for the ‘subject.disorder,’ ‘time_elapsed,’ and ‘duration’ arguments. To address this, we implemented an automatic revision for the ‘subject.disorder’ annotation and employed annotators to manually correct the ‘time_elapsed’ and ‘duration’ annotations. For the ‘subject.disorder’ correction, if a ‘treatment.disorder’ was present in the ‘subject’ argument but not annotated as ‘subject.disorder,’ we added it to the annotation. For ‘time_elapsed’ and ‘duration’ corrections, we provided detailed guidelines to the annotators to ensure consistent annotations. We evaluated annotation consistency using the EM_F1 score. The average EM_F1 scores for the ‘time_elapsed’ and ‘duration’ annotations were 75.3%.

The updated data can be downloaded at Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study Git Repo

The paper introduces “DrugWatch”, an easy-to-use and interactive multi-source information visualisation platform for drug safety study. It allows users to understand common side effects of drugs and their statistical information, flexibly retrieve relevant medical reports, or annotate their own medical texts with our automated annotation tool. Supported by NLP technology and enriched with interactive visual components, we are committed to providing researchers and practitioners with a one-stop information analysis, retrieval, and annotation service. The demo system is online.

The paper explores the potential of ChatGPT in pharmacovigilance event extraction. The research involves extensive experiments using various prompts and strategies to assess ChatGPT's performance. The paper also investigates using ChatGPT for data augmentation and filtering strategies to achieve more stable results.


Introduction

Data and Code for PHEE: A Dataset for Pharmacovigilance Event Extraction from Text

Data

Raw and processed data are under Data folder:

- clean: clean raw data
- json: processed structured data
- eeqa: EEQA format data transfered from json data
- gen_qa: GenQA format data transfered from json data
- ace: ACE format data transfered from json data

Compared Methods

gen_qa

We use a generative QA model to be our baseline method. The code is based on huggingface example code.

We modified the training code of seq2seq_qa to fit our data and evaluation setting. More details could be found in the folder README.

eeqa

EEQA is one of our comparison method from the paper: Event Extraction by Answering (Almost) Natural Questions.

The code under this folder is from the paper's released code. A few places are modified to fit our data.

ACE

ACE is a sequence labelling method from the paper: Automated Concatenation of Embeddings for Structured Prediction: The code under this folder is from the paper's released code. The configuration we used is under ACE/config

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published