Northeastern University, SmileLab
- Boston
- https://dddraxxx.github.io/
- https://orcid.org/0000-0001-5125-7218
Starred repositories
Helper functions to create COCO datasets
High-Resolution Image Synthesis with Latent Diffusion Models
Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.
An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Evaluation code for Ref-L4, a new REC benchmark in the LMM era
[CVPR2023] The code for "Position-guided Text Prompt for Vision-Language Pre-training"
Hackable and optimized Transformers building blocks, supporting a composable construction.
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 30+ benchmarks
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal chat model with performance approaching GPT-4o.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
[ICCV2023] VLPart: Going Denser with Open-Vocabulary Part Segmentation
Emu Series: Generative Multimodal Models from BAAI
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"
We write your reusable computer vision tools. 💜
A Jupyter widget for annotating images with bounding boxes
The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access and store data in Amazon S3.
PyTorch re-implementation of DeepLab v2 on COCO-Stuff / PASCAL VOC datasets
An official PyTorch implementation of the CRIS paper
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
(CVPR2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"