Stars
Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vi…
[NeurIPS'2024] Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps
[NeurIPS 2024] Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024)
Official Code for Stable Cascade
Official implementation of the paper "LangSplat: 3D Language Gaussian Splatting" [CVPR2024 Highlight]
pkuanjie/emu_trl (forked from huggingface/trl): Train transformer language models with reinforcement learning.
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation
Reference implementation for DPO (Direct Preference Optimization)
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
GPT-4V in Wonderland: LMMs as Smartphone Agents
A state-of-the-art-level open visual language model | Multimodal pre-trained model
Official repo for VGen: a holistic video generation ecosystem built on diffusion models
🚀🧠💬 Supercharged Custom Instructions for ChatGPT (non-coding) and ChatGPT Advanced Data Analysis (coding).
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.
QAEval Experiments: This repository will contain the code to reproduce the experiments from "Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary".
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
Reverse engineered API of Microsoft's Bing Chat AI
Node.js client for Bing's new AI-powered search. It's like ChatGPT on steroids 🔥