Jianwei Yang jwyang

🏠

Principal Researcher @ MSR

1.8k followers · 31 following

Microsoft
Redmond, WA
https://jwyang.github.io

Stars

mu-cai / matryoshka-mm

Matryoshka Multimodal Models

Python · 67 stars · 4 forks · Updated Aug 22, 2024

zzxslp / SoM-LLaVA

[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Python · 112 stars · 2 forks · Updated Aug 23, 2024

myshell-ai / JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Python · 955 stars · 77 forks · Updated Jul 23, 2024

jzhang38 / EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Python · 604 stars · 41 forks · Updated Jul 26, 2024

FoundationVision / GLEE

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Python · 1,025 stars · 82 forks · Updated Aug 8, 2024

roboflow / multimodal-maestro

Streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, Phi-3.5 Vision

Python · 1,228 stars · 89 forks · Updated Sep 12, 2024

UX-Decoder / DINOv

[CVPR 2024] Official implementation of the paper "Visual In-context Learning"

Python · 363 stars · 15 forks · Updated Apr 8, 2024

ishan0102 / vimGPT

Browse the web with GPT-4V and Vimium

Python · 2,594 stars · 197 forks · Updated Aug 10, 2024

roboflow / awesome-openai-vision-api-experiments

Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥

Python · 1,626 stars · 126 forks · Updated Feb 22, 2024

ddupont808 / GPT-4V-Act

AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI

JavaScript · 951 stars · 86 forks · Updated Jan 31, 2024

microsoft / SoM

Set-of-Mark Prompting for GPT-4V and LMMs

Python · 1,093 stars · 85 forks · Updated Aug 19, 2024

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 26,701 stars · 3,911 forks · Updated Sep 14, 2024

microsoft / X-Decoder

[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language

Python · 1,282 stars · 132 forks · Updated Oct 5, 2023

TalalWasim / Video-FocalNets

Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023]

Python · 84 stars · 16 forks · Updated Apr 30, 2024

UX-Decoder / Semantic-SAM

[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"

Python · 2,254 stars · 107 forks · Updated Jul 19, 2024

Zhendong-Wang / Prompt-Diffusion

Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models"

Python · 370 stars · 9 forks · Updated Mar 25, 2024

UX-Decoder / Segment-Everything-Everywhere-All-At-Once

[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"

Python · 4,308 stars · 382 forks · Updated Aug 19, 2024

google-research / arxiv-latex-cleaner

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Python · 5,188 stars · 325 forks · Updated Jul 21, 2024

IDEA-Research / OpenSeeD

[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"

Python · 637 stars · 39 forks · Updated Jan 22, 2024

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

Python · 1,421 stars · 82 forks · Updated Jan 23, 2024

zjc062 / mind-vis

Code base for MinD-Vis

Python · 743 stars · 91 forks · Updated May 24, 2023

timothybrooks / instruct-pix2pix

Python · 6,249 stars · 529 forks · Updated Mar 3, 2024

givkashi / Focal-Unet

Focal-Unet: Unet-like Focal Modulation for Medical Image Segmentation

Python · 39 stars · 8 forks · Updated May 27, 2023

FocalNet / FocalNet-DINO

Forked from IDEA-Research/DINO

This repo contains the code and configuration files for reproducing the object detection results of FocalNets with DINO

Python · 64 stars · 10 forks · Updated Mar 10, 2023
