Skip to content

taliapandans/SER_unimse

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Speech Emotion Recognition

This is a university project to build a speech emotion recognition with multiple modalities. In this case we sample UniMSE paper:

Datasets

  • MOSEI
  • MOSI
  • IEMOCAP

The dataset folders are to be found under UniMSE>Simcse>dataset folder

Installation and Environment

Unimse_Submission.ipynb contains a walkthrough of all the installation steps. In the end of the notebook, main.py is tested for three different datasets (MOSEI, MOSI, IEMOCAP)

Process

  • First, every single dataset (MOSI, MOSEI, IEMOCAP,MELD) is preprocessed by running preprocess.py. For every dataset, a train, test and validation pkl is created and saved to the according folders.
  • Under UniMSE>src, the paths to the created .pkl files must be set correctly for every dataset. The train dataset and number of epochs can be changed on the config.py.
  • Please also add t5-base to ./src folder.
  • Lastly, model is trained by running main.py.

Current results (will be updated soon)

MOSEI: | Train Loss: 0.0070 | Valid Loss 0.3053 | Test Loss 0.3024
MOSI: Train Loss: 0.0121 | Valid Loss 0.4356 | Test Loss 0.3878

Current Goals

Run on IEMOCAP
Fine Tune on Emodb

Releases

No releases published

Packages

No packages published