Stars
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
An open source implementation of CLIP.
Multi-modal learning toolkit based on PaddlePaddle and PyTorch, supporting applications such as multi-modal classification, cross-modal retrieval, and image captioning.
[CVPR 2022 - Demo Track] - Effective conditioned and composed image retrieval combining CLIP-based features
Generating captions on image datasets using MiniGPT-v2
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Run any open-source LLM, such as Llama 3.1 or Gemma, as an OpenAI-compatible API endpoint in the cloud.
Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model — a low-resource Chinese llama+lora approach, with a structure modeled on alpaca.
An end-to-end vision and language model incorporating explicit knowledge graphs and OOD-detection.
😎 An up-to-date, curated list of awesome papers, methods, and resources on LMM hallucinations.
Fine-tuning Large Language Models on One Consumer GPU in Under 4 Bits
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models
✨✨Latest Advances on Multimodal Large Language Models
Strong, open-source foundation models for image recognition.
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The official homepage of the COCO-Stuff dataset.
[ICLR 24] MaGIC: Multi-modality Guided Image Completion
The Ultimate Guide to Making Money with AI Side Hustles: Learn how to leverage AI for some cool side gigs and rake in some extra cash. Check out the English versi…
[ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting
[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"
This method uses Segment Anything and CLIP to ground and count any object that matches a custom text prompt, without requiring any point or box annotation.
[NeurIPS2023] Code release for "Hierarchical Open-vocabulary Universal Image Segmentation"