Code of the paper "IncepTR: Micro-Expression Recognition Integrating Inception-CBAM and Vision Transformer"

IncepTR

Implementation of the paper "IncepTR: Micro-Expression Recognition Integrating Inception-CBAM and Vision Transformer".

Haoliang Zhou, Shucheng Huang, Xuqiao Xu

Abstract

Micro-Expressions (MEs) are instantaneous, subtle facial movements that convey crucial emotional information. However, traditional neural networks face difficulties in accurately capturing the delicate features of MEs due to the limited amount of available data. To address this issue, a dual-branch attention network is proposed for ME recognition, called IncepTR, which can capture attention-aware local and global representations. The network takes optical flow features as input and performs feature extraction using a dual-branch network. First, an Inception model equipped with the Convolutional Block Attention Module (CBAM) attention mechanism is adopted for multi-scale local feature extraction. Second, the Vision Transformer (ViT) is employed to capture subtle motion features and robustly model global relationships among multiple local patches. Additionally, to enrich the relationships between different local patches in ViT, Multi-head Self-Attention Dropping (MSAD) is introduced to drop an attention map at random, effectively preventing overfitting to specific regions. Finally, the two types of features can be used to learn ME representations effectively through similarity comparison and feature fusion. With this combination, the model is forced to capture the most discriminative multi-scale local and global features while reducing the influence of affect-irrelevant features. Extensive experiments show that the proposed IncepTR achieves UF1 and UAR of 0.753 and 0.746 on the composite dataset MEGC2019-CD, demonstrating better or competitive performance compared to existing state-of-the-art methods for ME recognition.
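The MSAD idea described above (randomly dropping one head's attention map during training) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `msad_drop`, the `(num_heads, N, N)` tensor layout, and the drop probability `p` are all assumptions.

```python
import numpy as np

def msad_drop(attn, p=0.5, rng=None):
    """Multi-head Self-Attention Dropping (MSAD), sketched: with
    probability p, zero out one randomly chosen head's attention map
    during training, so the model cannot overfit to a single region.
    `attn` has shape (num_heads, N, N). Names are illustrative."""
    rng = np.random.default_rng() if rng is None else rng
    attn = attn.copy()
    if rng.random() < p:
        head = rng.integers(attn.shape[0])  # pick one head uniformly
        attn[head] = 0.0                    # drop its attention map
    return attn
```

At inference time the drop would simply be disabled (e.g. `p=0.0`), analogous to how standard dropout behaves.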

Data preparation

Following Dual-ATME and RCN, the data lists are reorganized as follows:

data/
├─ MEGC2019/
│  ├─ v_cde_flow/
│  │  ├─ 006_test.txt
│  │  ├─ 006_train.txt
│  │  ├─ 007_test.txt
│  │  ├─ ...
│  │  ├─ sub26_train.txt
│  │  ├─ subName.txt

1. There are 3 columns in each txt file:
/home/user/data/samm/flow/006_006_1_2_006_05588-006_05562_flow.png 0 1

In this example, the first column is the path to the optical flow image of an ME sample, the second column is the emotion label (0-2 for three emotion classes), and the third column is the database type (1-3 for three databases).
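A minimal parser for this three-column format might look like the following; the function name `parse_data_list` is illustrative and not part of the repo:

```python
def parse_data_list(path):
    """Read a MEGC2019 data-list txt file where each line is
    '<flow_image_path> <label 0-2> <database 1-3>' and return a list
    of (path, label, database) tuples. Illustrative helper only."""
    samples = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 3:  # skip blank or malformed lines
                continue
            samples.append((parts[0], int(parts[1]), int(parts[2])))
    return samples
```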

2. There are 68 rows in subName.txt, one subject name per row (see subName.txt):
006
...
037
s01
...
s20
sub01
...
sub26

These are the subject IDs of the ME samples in the MEGC2019 composite dataset, as described here and here.
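Since subName.txt enumerates the subjects, a leave-one-subject-out evaluation can iterate over it and pick up the matching `<subject>_train.txt` / `<subject>_test.txt` pairs. A sketch, where the helper name and the default directory are assumptions based on the layout shown above:

```python
def loso_splits(subject_names, list_dir="data/MEGC2019/v_cde_flow"):
    """For each subject listed in subName.txt, yield the paths of its
    train/test list files following the naming convention above.
    Helper name and default directory are illustrative."""
    for s in subject_names:
        yield (f"{list_dir}/{s}_train.txt", f"{list_dir}/{s}_test.txt")
```

For example, `loso_splits(["006"])` yields the pair of list files used when subject 006 is held out for testing and all other subjects are used for training.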

Citation

If you find this repo useful for your research, please consider citing the paper:

@article{zhou2023inceptr,
  title={{IncepTR}: micro-expression recognition integrating {Inception-CBAM} and vision transformer},
  author={Zhou, Haoliang and Huang, Shucheng and Xu, Yuqiao},
  journal={Multimedia Systems},
  volume={29},
  number={6},
  pages={3863--3876},
  year={2023},
  publisher={Springer}
}
