CserDu



Stars

QwenLM / Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,278 92 Updated Jul 5, 2024

pangxincheng / pTunnel

Go 2 Updated Jun 8, 2024

AILab-CVC / SEED

Official implementation of SEED-LLaMA (ICLR 2024).

Python 530 30 Updated Apr 11, 2024

facebookresearch / ToMe

A method to increase the speed and lower the memory footprint of existing vision transformers.

Python 908 68 Updated Jun 17, 2024

Zhao-Yian / GraCo

[CVPR 2024 Highlight] Official GraCo: Granularity-Controllable Interactive Segmentation.

Python 39 Updated Jul 19, 2024

mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 6,381 355 Updated Jul 11, 2024

QwenLM / Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 6,431 366 Updated Jul 18, 2024

FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Python 6,153 442 Updated Jul 14, 2024

EvolvingLMMs-Lab / LongVA

Long Context Transfer from Language to Vision

Python 242 12 Updated Jul 12, 2024

yunlong10 / Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

997 55 Updated Jul 23, 2024

ttengwang / Awesome_Long_Form_Video_Understanding

Awesome papers & datasets specifically focused on long-term videos.

122 4 Updated Jul 15, 2024

OpenMOSS / AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 663 49 Updated Jul 9, 2024

rese1f / MovieChat

[CVPR 2024] 🎬💭 chat with over 10K frames of video!

Python 464 38 Updated Jun 16, 2024

tatsu-lab / stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,184 4,020 Updated Jul 17, 2024

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 18,298 1,997 Updated Jul 14, 2024

dvlab-research / LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Python 1,664 113 Updated Jul 2, 2024

PKU-YuanGroup / LanguageBind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Python 629 48 Updated Mar 25, 2024

DAMO-NLP-SG / VideoLLaMA2

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 549 33 Updated Jul 19, 2024

X-PLUG / mPLUG-Owl

mPLUG-Owl & mPLUG-Owl2: Modularized Multimodal Large Language Model

Python 2,031 158 Updated Apr 5, 2024

boheumd / MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Python 191 24 Updated Jul 19, 2024

LLaVA-VL / LLaVA-NeXT

Python 1,325 72 Updated Jul 22, 2024

BradyFU / Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

315 11 Updated Jun 18, 2024

clashdownload / Clash

Download links and backup download links for all Clash versions from the official Clash site

218 17 Updated Jul 5, 2024

cyh-0 / CAVP

Official code for "A Closer Look at Audio-Visual Segmentation"

Python 105 18 Updated Jul 11, 2024

facebookresearch / Mask2Former

Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Python 2,369 366 Updated Feb 17, 2024

YuanGongND / ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Jupyter Notebook 1,075 205 Updated May 21, 2023

allenai / unified-io-2.pytorch

Python 53 1 Updated Jul 3, 2024

OpenNLPLab / AVSBench

[ECCV 2022] Official implementation of the paper: Audio-Visual Segmentation

Python 438 41 Updated Nov 28, 2023

allenai / unified-io-2

Python 542 25 Updated Feb 15, 2024

allenai / unified-io-inference

Jupyter Notebook 213 26 Updated Dec 18, 2023


Lists (9)

Adapter

Affect Computing

diffusion model

GANs

listening talking face

LLM

long-video

talking face

web
