- Key packages
- torch==1.8.0+cu111
- torchvision==0.9.0+cu111
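The pinned CUDA 11.1 builds above can be installed from PyTorch's wheel archive (the `-f` index URL is PyTorch's standard location for versioned CUDA wheels):

```shell
# Install the exact torch/torchvision builds this repo was tested with
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 \
    -f https://download.pytorch.org/whl/torch_stable.html
```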
Dataset | MultiMediate: Multi-modal Group Behaviour Analysis for Artificial Mediation
Download the dataset from the link above into the `noxi` folder.
├── Engagement Estimation/
│   ├── code/
│   │   ├── data
│   │   ├── src
│   │   └── output_model
│   ├── noxi/
│   │   ├── train
│   │   ├── val
│   │   └── test
- Windows10
- Ubuntu20.04
- macOS (CPU only)
- Single GPU Training
- DataParallel (single machine multi-gpus)
- DistributedDataParallel
(more information: https://pytorch.org/tutorials/intermediate/ddp_tutorial.html)
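For the single-machine multi-GPU mode, wrapping the model in `nn.DataParallel` is enough; DDP additionally needs a multi-process launcher, covered by the tutorial linked above. A minimal sketch (the `nn.Linear` toy model stands in for the real model; wrapping works the same way):

```python
import torch
import torch.nn as nn

# Toy model standing in for the actual engagement model.
model = nn.Linear(16, 1)

if torch.cuda.device_count() > 1:
    # Single machine, multiple GPUs: replicate the model and
    # split each input batch across the available devices.
    model = nn.DataParallel(model)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

x = torch.randn(8, 16, device=device)
out = model(x)  # shape: (8, 1)
```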
Preprocess, normalize, and merge the multi-modal features.
python process.py
Use a center-based sliding window to partition the multi-modal feature sequences for training, validation, and testing.
python CSW.py
├── code/
│   ├── data/
│   │   └── all_data
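The idea behind the center-based sliding window is that each frame's window is centered on that frame rather than ending at it. A minimal NumPy sketch under stated assumptions (the function name, edge padding, and stride are illustrative, not the repository's `CSW.py`):

```python
import numpy as np

def center_sliding_windows(features, window_size, stride=1):
    """Extract one window per (strided) frame, centered on that frame.

    features: (T, D) feature sequence; window_size should be odd so the
    center frame is exact. The sequence is edge-padded so boundary frames
    can also sit at a window center.
    Returns an array of shape (num_windows, window_size, D).
    """
    half = window_size // 2
    padded = np.pad(features, ((half, half), (0, 0)), mode="edge")
    windows = [padded[i:i + window_size]
               for i in range(0, len(features), stride)]
    return np.stack(windows)
```

With `window_size=3`, frame `t` appears at index 1 of its own window, flanked by frames `t-1` and `t+1` (or edge copies at the boundaries).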
python train.py \
--N 3 \
--M 0 \
--K 0 \
--save_dir CEAM
python train.py \
--N 1 \
--M 1 \
--K 2 \
--save_dir DCECEAM
python eval.py \
--N 3 \
--M 0 \
--K 0 \
--save_dir CEAM
python eval.py \
--N 1 \
--M 1 \
--K 2 \
--save_dir DCECEAM
| Model | Method | Val CCC | Test CCC | Inference speed (FPS) | Params (M) |
|---|---|---|---|---|---|
SA-based model [Yu et al.] | Sliding window | 0.796 | - | 4537 | 22.67 |
BiLSTM-based model [Yu et al.] | Sliding window | 0.818 | 0.689 | 1310 | 36.17 |
CEAM (Ours) | Center-based sliding window | 0.821 | 0.691 | 6455 | 23.98 |
Dialogue Cross-Enhanced CEAM (Ours) | Center-based sliding window | 0.835 | 0.704 | 6185 | 31.07 |
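CCC in the table is the Concordance Correlation Coefficient, which penalizes both low correlation and systematic bias/scale mismatch between predictions and labels. A minimal NumPy reference implementation (the repository's own metric code may differ in detail):

```python
import numpy as np

def ccc(y_true, y_pred):
    """Concordance Correlation Coefficient (Lin, 1989).

    Equals 1 only when predictions match labels exactly; drops when
    the means or variances disagree, even at perfect correlation.
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)
```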
The pretrained model can be downloaded from Google Drive and used directly for inference to reproduce the final results.
- Müller P, Balazia M, Baur T, et al. MultiMediate'23: Engagement Estimation and Bodily Behaviour Recognition in Social Interactions[C]//Proceedings of the 31st ACM International Conference on Multimedia. 2023: 9640-9645.
- Yu J, Lu K, Jing M, et al. Sliding Window Seq2seq Modeling for Engagement Estimation[C]//Proceedings of the 31st ACM International Conference on Multimedia. 2023: 9496-9500.