Stars
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
OpenEQA: Embodied Question Answering in the Era of Foundation Models
An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Accelerating the development of large multimodal models (LMMs) with lmms-eval
Open-source evaluation toolkit for large vision-language models (LVLMs); supports GPT-4V, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Long Context Transfer from Language to Vision
This is the official implementation of the paper "Needle In A Multimodal Haystack"
CoTracker is a model for tracking any point (pixel) on a video.
This is a book for getting started with Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) avai…
ImageBind: One Embedding Space to Bind Them All
CVPR and NeurIPS poster examples and templates. May we have in-person poster sessions soon!
Lets you use your GoPro camera as a webcam on Linux
Monocular, one-stage regression of multiple 3D people and their 3D positions & trajectories in camera & global coordinates. ROMP [ICCV 2021], BEV [CVPR 2022], TRACE [CVPR 2023]
Bimanual Dexterous Teleoperation with Real-Time Retargeting using VisionPro
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Code Repository for Liquid Time-Constant Networks (LTCs)
An efficient, flexible, and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
[ECCV 2022] Code for the paper "DaViT: Dual Attention Vision Transformer"
💫 Industrial-strength Natural Language Processing (NLP) in Python
Flexible and powerful tensor operations for readable and reliable code (for PyTorch, JAX, TF, and others)
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
"MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction" (CVPRW 2022) & (Winner of NTIRE 2022 Spectral Recovery Challenge) and a toolbox for spectral reconstruction
[CVPR 2023 Best Paper] Planning-oriented Autonomous Driving
[Incl. GenAD, CVPR 2024 Highlight] Embracing Foundation Models into Autonomous Agent and System