This is my GSoC 2020 project with Red Hen Lab.
The goal is to design a network that detects and recognizes hand gestures, and then to use it to annotate Red Hen Lab's dataset of news videos.
Extract information from the IEMOCAP dataset and save the data as a pickle file.
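The save step could look roughly like the sketch below. It is a minimal illustration, not the project's actual script: the helper names `save_records` and `load_records`, the output path, and the record fields (`utterance_id`, `label`, `audio_features`) are all hypothetical placeholders, and the real extraction from IEMOCAP is not shown.

```python
import pickle
from pathlib import Path


def save_records(records, path):
    """Write a list of per-utterance records to a pickle file.

    `records` is assumed to be a list of plain dicts; the field names
    used in the example below are hypothetical.
    """
    path = Path(path)
    # Create the output directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(records, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_records(path):
    """Read back the records written by save_records."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # Placeholder data standing in for whatever is extracted from IEMOCAP.
    example = [
        {"utterance_id": "utt_0001", "label": "neutral", "audio_features": [0.0, 0.1]},
        {"utterance_id": "utt_0002", "label": "happy", "audio_features": [0.3, 0.2]},
    ]
    save_records(example, "output/example_data.pkl")
    print(len(load_records("output/example_data.pkl")))
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.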
Audio-only, visual-only, and audio-visual models are compared to check whether combining the two modalities is actually useful.