- HUST | Research intern at ByteDance
- Wuhan, China
- https://wjf5203.github.io/
Stars
[ECCV2024] PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects
Lumina-T2X is a unified framework for Text to Any Modality Generation
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation
The open-source tool for building high-quality datasets and computer vision models
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
A quick guide (especially) for trending instruction finetuning datasets
A framework for few-shot evaluation of language models.
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vi…
Awesome-LLM: a curated list of Large Language Models
📋 A list of open LLMs available for commercial use.
DataComp: In search of the next generation of multimodal datasets
[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"
A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
calflops is designed to calculate FLOPs, MACs, and parameters in various neural networks, such as Linear, CNN, RNN, GCN, and Transformer (BERT, LLaMA, and other large language models)