Ataraxy
  • Computer of Science and Technology Beijing


Reading list for research topics in multimodal machine learning

5,779 836 Updated Jun 19, 2024

[INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation

Python 36 5 Updated Sep 1, 2023

Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".

Python 219 21 Updated Mar 20, 2024

This repository is an unofficial PyTorch implementation of Semi-Supervised Learning on Medical Segmentation in 2D and 3D.

Python 42 12 Updated May 14, 2022

A library for speech data augmentation in time-domain

Python 634 57 Updated Aug 30, 2021

✨✨Latest Advances on Multimodal Large Language Models

11,170 736 Updated Aug 14, 2024

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch

Python 1,184 59 Updated Oct 18, 2022

Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation

Python 131 11 Updated Jan 16, 2024

INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processin…

619 42 Updated Aug 9, 2024

FAIR Sequence Modeling Toolkit 2

Python 653 69 Updated Aug 17, 2024
Python 161 37 Updated Aug 14, 2024

Convert Machine Learning Code Between Frameworks

Python 14,020 5,797 Updated Aug 16, 2024

Learning audio concepts from natural language supervision

Python 446 36 Updated May 27, 2024

Rectified Rotary Position Embeddings

Python 327 27 Updated May 20, 2024

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

Python 303 25 Updated Feb 21, 2024

This repository is an unofficial PyTorch implementation of Medical segmentation in 2D and 3D.

Python 838 196 Updated Feb 29, 2024

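A personalized, customizable arXiv template for effectively tracking relevant content, authors, and academic conferences in specific fields.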

CSS 226 19 Updated Aug 16, 2024

Open source core of Synergy, the cross-platform keyboard and mouse sharing tool (Windows, macOS, Linux)

C++ 10,185 3,623 Updated Aug 16, 2024

Production First and Production Ready End-to-End Speech Recognition Toolkit

Python 4,014 1,052 Updated Aug 16, 2024

Foundation Architecture for (M)LLMs

Python 2,988 202 Updated Apr 11, 2024

An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"

Python 1,152 99 Updated Oct 22, 2023

Official release of InternLM2.5 base and chat models. 1M context support

Python 6,079 431 Updated Aug 14, 2024

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Python 2,383 248 Updated Apr 24, 2024

IFSeg: Image-free Semantic Segmentation via Vision-Language Model (CVPR 2023)

Python 79 9 Updated Sep 5, 2023

Source code for paper "Improving Task-Specific Generalization in Few-Shot Learning via Adaptive Vicinal Risk Minimization"

Python 4 Updated Mar 1, 2023

Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

Shell 971 100 Updated Jul 29, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 34,341 4,011 Updated Aug 16, 2024

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…

C 779 122 Updated Jul 18, 2024

A playbook for systematically maximizing the performance of deep learning models.

26,145 2,179 Updated Jun 18, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 19,374 2,464 Updated Aug 12, 2024