- Georgia Tech
- Atlanta, GA
Stars
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Utilities intended for use with Llama models.
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations
A generative speech model for daily dialogue.
Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, C, and Swift, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & A…
A lightweight, multi-purpose screen recorder for macOS based on ScreenCapture Kit
Inference and training library for high-quality TTS models.
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
Official code for "Learning Neural Acoustic Fields" (NeurIPS 2022)
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supportin…
[WIP] Layer Diffusion for WebUI (via Forge)
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
marcosiniscalchi / RobustGER
Forked from YUCHEN005/RobustGER
Code for paper "Large Language Models are Efficient Learners of Noise-Robust Speech Recognition"
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Foundational Models for State-of-the-Art Speech and Text Translation
This repo includes ChatGPT prompt curation to use ChatGPT better.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.