MultiBench: Multiscale Multimodal Benchmark

Large-Scale Benchmarks for Multimodal Representation Learning

Contributors

Correspondence to:

Datasets supported

  1. Affective computing: CMU-MOSI, CMU-MOSEI, POM, UR-FUNNY, Deception, MUStARD
  2. Healthcare: MIMIC
  3. Multimedia: AV-MNIST, MMIMDB, Kinetics (possible dataset-size issue)
  4. Finance: Stocks-food, Stocks-tech, Stocks-healthcare

TODO: add HCI and Robotics

To add a new dataset:

  1. see datasets/
  2. add a new folder if appropriate
  3. write a dataloader Python file following the existing examples (a minimal sketch follows this list)
  4. see examples/ and write an example training Python file following the existing examples
  5. check that calling the dataloader and running a simple training script works
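
For orientation, below is a minimal sketch of what such a dataloader file might look like. It follows the convention in the existing examples of yielding each sample as a list [modality_1, ..., modality_n, label] and returning train/valid/test loaders; every name, path, and the on-disk format here is a hypothetical stand-in, not part of the repository.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class MyMultimodalDataset(Dataset):
    """Hypothetical two-modality dataset; samples are [modality_a, modality_b, label]."""

    def __init__(self, modality_a, modality_b, labels):
        self.modality_a = modality_a  # e.g., a [num_samples, dim_a] tensor
        self.modality_b = modality_b  # e.g., a [num_samples, dim_b] tensor
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Pack modalities first and the label last, matching the existing dataloaders.
        return [self.modality_a[idx], self.modality_b[idx], self.labels[idx]]

def get_dataloader(path, batch_size=32):
    """Return (train, valid, test) DataLoaders; the .pt file layout is made up."""
    loaders = []
    for split in ("train", "valid", "test"):
        data = torch.load(f"{path}/{split}.pt")  # hypothetical on-disk format
        dataset = MyMultimodalDataset(data["a"], data["b"], data["labels"])
        loaders.append(DataLoader(dataset, batch_size=batch_size,
                                  shuffle=(split == "train")))
    return tuple(loaders)
```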

Algorithms supported

  1. unimodals: LSTM, Transformer, FCN, Random Forest
  2. fusions: early/late concatenation, attention, tensors (a late-fusion sketch follows this list)
  3. objective_functions: VAE, contrastive learning, mutual information maximization, CCA
  4. training_structures: balancing generalization, architecture search
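
To show how these pieces typically compose, here is a minimal, self-contained late-fusion sketch: one unimodal encoder per modality, concatenation as the fusion, and a linear classification head. The class name, dimensions, and encoders are illustrative assumptions, not the repository's actual modules.

```python
import torch
from torch import nn

class ConcatLateFusion(nn.Module):
    """Encode each modality separately, concatenate features, then classify."""

    def __init__(self, encoders, head):
        super().__init__()
        self.encoders = nn.ModuleList(encoders)
        self.head = head

    def forward(self, inputs):
        # inputs: list of per-modality tensors, one per encoder
        features = [enc(x) for enc, x in zip(self.encoders, inputs)]
        return self.head(torch.cat(features, dim=1))

# Hypothetical dims: modality A is 20-d, modality B is 35-d, 2 output classes.
model = ConcatLateFusion(
    encoders=[nn.Linear(20, 64), nn.Linear(35, 64)],
    head=nn.Linear(128, 2),  # 64 + 64 concatenated features in
)
logits = model([torch.randn(8, 20), torch.randn(8, 35)])  # batch of 8
```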

To add a new algorithm:

  1. Figure out which subfolder to add it into:
  • unimodals/ : unimodal architectures
  • fusions/ : multimodal fusion architectures (a toy example follows this list)
  • objective_functions/ : objective functions in addition to the supervised training loss (e.g., VAE loss, contrastive loss)
  • training_structures/ : training algorithms excluding objective functions (e.g., balancing generalization, the architecture-search outer RL loop)
  2. see examples/ and write an example training Python file following the existing examples
  3. check that calling the added functions and running a simple training script works
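
As a concrete toy example of step 1, a new fusion dropped into fusions/ could be as small as the module below; the sanity check at the end mirrors step 3. Everything here is a hypothetical illustration, not an existing module in the repository.

```python
import torch
from torch import nn

class ElementwiseSumFusion(nn.Module):
    """Toy fusion: project each modality to a shared size, then sum elementwise."""

    def __init__(self, input_dims, output_dim):
        super().__init__()
        self.projections = nn.ModuleList(
            [nn.Linear(d, output_dim) for d in input_dims]
        )

    def forward(self, modalities):
        # modalities: list of [batch, dim_i] tensors, one per modality
        projected = [proj(m) for proj, m in zip(self.projections, modalities)]
        return torch.stack(projected, dim=0).sum(dim=0)

# Quick sanity check in the spirit of step 3:
fusion = ElementwiseSumFusion([20, 35], 64)
out = fusion([torch.randn(8, 20), torch.randn(8, 35)])
assert out.shape == (8, 64)
```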

Experiments

References