- Tsinghua University
- Beijing
- https://qijimrc.github.io
Stars
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
P2P terminal game about space pirates playing basketball across the galaxy
A lightweight, terminal-based application to view and query delimiter-separated value documents, such as CSV or TSV files.
Your journal app if you live in a terminal
A feature-rich command-line audio/video downloader
SGLang is yet another fast serving framework for large language models and vision language models.
LVBench: An Extreme Long Video Understanding Benchmark
GLM-4 series: Open Multilingual Multimodal Chat LMs
GPT4V-level open-source multi-modal model based on Llama3-8B
Mixture-of-Experts for Large Vision-Language Models
MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts
MambaOut: Do We Really Need Mamba for Vision?
A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Code for paper "Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning"
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
A series of large language models trained from scratch by developers @01-ai
ImageBind: One Embedding Space to Bind Them All
The official implementation of "Relay Diffusion: Unifying diffusion process across resolutions for image synthesis" [ICLR 2024 Spotlight]
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
A curated list of reinforcement learning from human feedback (RLHF) resources (continually updated)
LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation
Scenic: A Jax Library for Computer Vision Research and Beyond
Reading list for research topics in multimodal machine learning