DualH: A Dual Hierarchical Model for Temporal Action Localization

zz202/DualH

DualH: code and pretrained models will be uploaded here.

DualH: A Dual Hierarchical Model for Temporal Action Localization

Temporal action localization aims to detect action boundaries and classify action labels in untrimmed videos. Recent work uses Transformers to encode extracted features into a bottom-up feature pyramid and localizes actions at every pyramid level using only that level's features. A limitation of this bottom-up encoding is that lower-level features lack broader context, while upper-level features lose local boundary information, which can limit localization performance. In this work, we propose a dual hierarchical model to mitigate this issue. The first hierarchy operates on the full temporal sequence to encode features at multiple scales. These features are fused so that every temporal location carries both local boundary information and broader context. Next, the fused feature is downsampled to a pyramid representation for localizing actions at multiple resolutions. Experimental results on THUMOS14, ActivityNet-1.3, and EPIC-KITCHENS-100 show that the dual hierarchical design improves performance over conventional bottom-up pyramid Transformer-based models.
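
The data flow described above (multi-scale encoding at full temporal resolution, fusion, then downsampling into a pyramid) can be illustrated with a toy sketch. This is not the authors' implementation, which has not been released here: it swaps the Transformer encoders for simple moving averages over a 1-D list of scalar features, and the function names `smooth`, `first_hierarchy`, and `second_hierarchy`, the window sizes, and the number of levels are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def smooth(seq: List[float], window: int) -> List[float]:
    """Centered moving average that keeps the sequence length (edges are clipped)."""
    half = window // 2
    out = []
    for t in range(len(seq)):
        lo, hi = max(0, t - half), min(len(seq), t + half + 1)
        out.append(sum(seq[lo:hi]) / (hi - lo))
    return out


def first_hierarchy(seq: List[float], windows: Sequence[int] = (1, 3, 5)) -> List[float]:
    """Encode the full-length sequence at several scales, then fuse by averaging.

    Assumption: a moving average stands in for each per-scale encoder, and
    fusion is a plain mean across scales.
    """
    scales = [smooth(seq, w) for w in windows]
    return [sum(vals) / len(vals) for vals in zip(*scales)]


def second_hierarchy(fused: List[float], levels: int = 3) -> List[List[float]]:
    """Downsample the fused sequence into a pyramid (average pooling, stride 2)."""
    pyramid = [fused]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        pooled = []
        for i in range(0, len(prev), 2):
            chunk = prev[i:i + 2]
            pooled.append(sum(chunk) / len(chunk))
        pyramid.append(pooled)
    return pyramid


# Toy per-frame features: an "action" occupies time steps 2 to 4.
features = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
fused = first_hierarchy(features)
pyramid = second_hierarchy(fused)
print([len(level) for level in pyramid])  # [8, 4, 2]
```

The point of the sketch is the ordering: every time step is fused across scales before any downsampling happens, so each pyramid level is derived from features that already mix local and broader context.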
