- UoS -> NUS & BIT
- https://mortyzaigc.netlify.app/
Starred repositories
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Official repository for "Unveiling and Mitigating Bias in Audio Visual Segmentation" in ACM MM 2024
A high-throughput and memory-efficient inference and serving engine for LLMs
[TPAMI 2023] Local-Global Context Aware Transformer for Language-Guided Video Segmentation
[2023 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line
Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition"
The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024
The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
An official implementation of "Distribution-Consistent Modal Recovering for Incomplete Multimodal Learning" in PyTorch. (ICCV 2023)
Official implementation of "Moving Object Segmentation: All You Need Is SAM (and Flow)" by Junyu Xie, Charig Yang, Weidi Xie, Andrew Zisserman
Explore the Limits of Omni-modal Pretraining at Scale
This is a repository that includes papers, code, and datasets about domain generalization-based fault diagnosis and prognosis (continuously updated).
Faster Whisper transcription with CTranslate2
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model" (AVLIT)
An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits
Official code release for "RTFS-Net: Recurrent time-frequency modelling for efficient audio-visual speech separation", accepted ICLR 2024
Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?"
Collection of awesome parameter-efficient fine-tuning resources.
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
A comprehensive collection of awesome research and other items about video domain adaptation