- ZJU-UIUC Institute
- Hangzhou, Zhejiang, China
- https://bruceyo.github.io/
- @bruceyo7
- https://scholar.google.com.hk/citations?user=o2VAejIAAAAJ
Stars
Heterogeneous Pre-trained Transformer (HPT) as Scalable Policy Learner.
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
[CVPR 2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
👾 A Python API wrapper for Poe.com. With this, you will have free access to GPT-4, Claude, Llama, Gemini, Mistral and more! 🚀
An Autonomous LLM Agent for Complex Task Solving
Official inference library for Mistral models
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
⛏💎 STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment
VMamba: Visual State Space Models; the code is based on Mamba
Official implementation of the CVPR paper Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent Transportation
🌐 Jekyll is a blog-aware static site generator in Ruby
Easy to maintain open source documentation websites.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Latte: Latent Diffusion Transformer for Video Generation.
Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models
FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment
A GUI client for Windows that supports Xray core, v2fly core, and others
Official implementation of the paper "MotionAGFormer: Enhancing 3D Pose Estimation with a Transformer-GCNFormer Network" (WACV 2024).
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
We write your reusable computer vision tools. 💜
Official implementation of Language Supervised Training for Skeleton-based Action Recognition
BEAR: a new BEnchmark on video Action Recognition
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"