
Towards Event-oriented Long Video Understanding



🔥 News

  • 2024.06.20 🌟 Benchmark, evaluation code, training data, and model are released!

👀 Overview

We introduce Event-Bench, an event-oriented long video understanding benchmark built on existing datasets and human annotations. Event-Bench covers three event understanding abilities and six event-related tasks, with 2,190 test instances that comprehensively evaluate a model's ability to understand video events.

Event-Bench enables a systematic comparison of existing video MLLMs across different capabilities and highlights the major shortcomings of open-source MLLMs.

🔍 Dataset

Download the raw videos of Event-Bench from the Google Drive link. Download the annotations from the Hugging Face link.

License:

Event-Bench may only be used for academic research. Commercial use in any form is prohibited.

🔮 Evaluation Pipeline

Prompt:

The common prompt used in our evaluation follows this format:

<QUESTION>
A. <OPTION1>
B. <OPTION2>
C. <OPTION3>
D. <OPTION4>
Answer with the option's letter from the given choices directly.
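Below is a minimal sketch of how a question and its options could be assembled into this prompt. The function name and input layout are illustrative assumptions, not part of the released code.

```python
# Sketch: format a multiple-choice question into the evaluation prompt above.
# build_prompt and its argument layout are assumptions for illustration only.
def build_prompt(question: str, options: list[str]) -> str:
    letters = ["A", "B", "C", "D"]
    lines = [question]
    lines += [f"{letter}. {option}" for letter, option in zip(letters, options)]
    lines.append("Answer with the option's letter from the given choices directly.")
    return "\n".join(lines)

print(build_prompt(
    "What happens after the man opens the door?",
    ["He sits down", "He leaves the room", "He answers the phone", "He turns off the light"],
))
```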

Evaluation:

We recommend saving the inference results in the same format as example_result.jsonl. Once the model responses are prepared in this format, run the evaluation script evaluate_em.py to obtain the accuracy scores.

python evaluate_em.py \
    --path $RESULTS_FILE
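For reference, the sketch below shows the kind of exact-match scoring this step performs. The JSONL field names ("response", "answer") are assumptions; check example_result.jsonl for the actual schema.

```python
# Sketch of exact-match accuracy over a JSONL results file.
# Field names are assumptions; see example_result.jsonl for the real schema.
import json

def exact_match_accuracy(results_path: str) -> float:
    correct, total = 0, 0
    with open(results_path) as f:
        for line in f:
            record = json.loads(line)
            # Compare the first letter of the model response with the gold option.
            prediction = record["response"].strip().upper()[:1]
            gold = record["answer"].strip().upper()[:1]
            correct += int(prediction == gold)
            total += 1
    return correct / total if total else 0.0

print(f"Accuracy: {exact_match_accuracy('example_result.jsonl'):.2%}")
```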

If you want to use GPT-4-Turbo for evaluation, use the script evaluate_gpt.py instead:

python evaluate_gpt.py \
    --input_file $INPUT_FILE \
    --output_file $OUTPUT_FILE 
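The sketch below illustrates how a GPT-based judge of this kind might be implemented, assuming the OpenAI Python client (openai>=1.0). The judging prompt, field handling, and model name are assumptions, not the actual logic of evaluate_gpt.py.

```python
# Hedged sketch of GPT-assisted answer judging; not the released script.
# Only the OpenAI client call itself is standard API usage.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def judge(question: str, gold: str, response: str) -> bool:
    prompt = (
        f"Question: {question}\n"
        f"Correct answer: {gold}\n"
        f"Model response: {response}\n"
        "Does the model response match the correct answer? Reply yes or no."
    )
    completion = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return completion.choices[0].message.content.strip().lower().startswith("yes")
```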

📈 Experimental Results

  • Evaluation results of different Video MLLMs.

Citation

If you find our work helpful for your research, please consider citing it:

@misc{du2024eventoriented,
    title={Towards Event-oriented Long Video Understanding},
    author={Yifan Du and Kun Zhou and Yuqi Huo and Yifan Li and Wayne Xin Zhao and Haoyu Lu and Zijia Zhao and Bingning Wang and Weipeng Chen and Ji-Rong Wen},
    year={2024},
    eprint={2406.14129},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
