Multimodal sentiment analysis: multiple fusion methods based on BERT+ResNet

Python 198 27 Updated Nov 20, 2022

MMSA is a unified framework for Multimodal Sentiment Analysis.

Python 627 103 Updated Dec 27, 2023

Meta-Transformer for Unified Multimodal Learning

Python 1,476 113 Updated Dec 5, 2023

Reading list for research topics in multimodal machine learning

5,735 831 Updated Jun 19, 2024

Code for COLING 2022 paper: Modeling Intra- and Inter-Modal Relations: Hierarchical Graph Contrastive Learning for Multimodal Sentiment Analysis

Python 6 4 Updated May 28, 2023

Multi-modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval, and image captioning.

Python 559 98 Updated May 7, 2023

This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.

OpenEdge ABL 695 144 Updated Mar 15, 2023

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Python 5,459 932 Updated May 25, 2024

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Python 1,385 133 Updated Jul 29, 2024

Code for the paper "Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis"

Python 178 34 Updated Jun 25, 2022

Code for NPLCC 2020 paper: A Multimodal Emotion Recognition Method Based on Multi-Task Learning

Python 2 1 Updated Jul 12, 2023

xLSTM as Generic Vision Backbone

Python 366 26 Updated Jul 21, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,617 99 Updated Jul 26, 2024

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 1,692 90 Updated Jul 31, 2024

✨✨Latest Advances on Multimodal Large Language Models

10,971 723 Updated Aug 1, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,111 277 Updated May 4, 2024

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Python 2,320 144 Updated Jul 31, 2024

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 1,914 181 Updated Apr 24, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance

Python 4,544 352 Updated Aug 1, 2024

Collection of AWESOME vision-language models for vision tasks

2,058 187 Updated Jul 24, 2024

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Python 8,146 573 Updated Aug 1, 2024

Set-of-Mark Prompting for LMMs

Python 1,057 82 Updated Jun 5, 2024

[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Python 105 2 Updated Jul 28, 2024

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

Python 230 11 Updated Jan 2, 2024

🎬 UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection (CVPR 2022)

7 Updated Mar 26, 2022

[ICCV 2023] The official PyTorch implementation of the paper: "Localizing Moments in Long Video Via Multimodal Guidance"

14 Updated Jul 19, 2023

Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)

Python 87 8 Updated Oct 21, 2023

Official PyTorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"

Python 99 11 Updated Jul 1, 2024

Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 Paper)

Python 187 13 Updated Nov 21, 2023

[CVPR 2024] Official implementation of the paper "Visual In-context Learning"

Python 335 11 Updated Apr 8, 2024