A dataset of egocentric vision, eye tracking, and full-body kinematics from human locomotion in out-of-the-lab environments, along with example code for several use cases of the dataset.
Experiments with multi-modal causal attention combined with multi-grouped query attention.
A web service in which an AI poet looks at images and writes poems.
A Discord chatbot based on the Mistral and LLaVA language models.
Code for the paper "Multiomics dynamic learning enables personalized diagnosis and prognosis for pan-cancer and cancer subtypes".
Applied Deep Learning 深度學習之應用 by Vivian Chen 陳縕儂 at NTU CSIE
My master's thesis: Siamese multi-hop attention for cross-modal retrieval.
Repository for context-based emotion recognition.
[AINL 2023] IMAD: IMage Augmented multi-modal Dialogue
COMPSCI 696DS Industry Mentorship Program with Meta Reality Labs: Ambient AI: Multimodal Wearable Sensor Understanding (experiments in distilling knowledge in cross-modal contrastive learning).
Streamlit app demonstrating multi-modal (vision + language) modelling in PyTorch.
Multi-Modal Representational Learning for Social Media Popularity Prediction
MultiCLIP: A framework for multimodal, multi-label, multi-stage classification utilizing advanced pretrained models such as CLIP and BLIP.
Multimodal deep learning package that uses both categorical and text-based features in a single deep architecture for regression and binary classification use cases.
Deeplearning utils for multimodal research
Code for TGRS 2022 paper "Fine-scale Urban Informal Settlements Mapping by Fusing Remote Sensing Images and Building Data via a Transformer-based Multimodal Fusion Network"
Code and Models for Binding Text, Images, Graphs, and Audio for Music Representation Learning
VTC: Improving Video-Text Retrieval with User Comments
More to Less (M2L): Enhanced Health Recognition in the Wild with Reduced Modality of Wearable Sensors
Multimodal Emotion Recognition using ClipBERT.