patrick-tssn

Yuxuan Wang patrick-tssn

No pride and no prejudice

55 followers · 347 following

Peking University
https://patrick-tssn.github.io

Stars

openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference

C++ 6,555 2,117 Updated Jul 26, 2024

segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Python 634 37 Updated Jul 12, 2024

mbodiai / embodied-agents

Seamlessly integrate state-of-the-art transformer models into robotics stacks

Python 135 17 Updated Jul 26, 2024

siyuyuan / evoagent

Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"

Python 67 7 Updated Jul 12, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,600 98 Updated Jul 26, 2024

EvolvingLMMs-Lab / LongVA

Long Context Transfer from Language to Vision

Python 251 12 Updated Jul 12, 2024

karpathy / LLM101n

LLM101n: Let's build a Storyteller

25,631 1,360 Updated Jul 21, 2024

voxel51 / fiftyone

The open-source tool for building high-quality datasets and computer vision models

Python 7,934 520 Updated Jul 26, 2024

facebookresearch / chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,591 96 Updated Jul 26, 2024

patrick-tssn / VideoHallucer

VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)

Python 14 Updated Jun 25, 2024

mshukor / ima-lmms

Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs

7 Updated May 28, 2024

mlfoundations / MINT-1T

MINT-1T: A one trillion token multimodal interleaved dataset.

450 6 Updated Jul 24, 2024

lucasjinreal / LLaVA-Magvit2

Python 30 2 Updated Jun 20, 2024

TencentARC / Open-MAGVIT2

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Python 345 10 Updated Jul 10, 2024

patrick-tssn / MM-NIAVH

Pressure Testing Large Video-Language Models (LVLMs): multimodal retrieval from LVLMs at any video length to measure accuracy

Python 3 1 Updated Jun 21, 2024

HITsz-TMG / UMOE-Scaling-Unified-Multimodal-LLMs

Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"

Python 738 33 Updated Jul 23, 2024

OpenGVLab / OmniCorpus

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

201 4 Updated Jun 16, 2024

FoundationVision / OmniTokenizer

OmniTokenizer: one model and one weight for image-video joint tokenization.

Python 199 4 Updated Jul 9, 2024

openvla / openvla

Forked from TRI-ML/prismatic-vlms

OpenVLA: An open-source vision-language-action model for robotic manipulation.

Python 736 85 Updated Jul 15, 2024

tencent-ailab / V-Express

ShihaoZhaoZSH / LaVi-Bridge

OpenGVLab / InternVL

2noise / ChatTTS

IDEA-Research / MotionLLM

RL4VLM / RL4VLM

MC-E / ReVideo

maitrix-org / Pandora

minyoungg / platonic-rep

robocasa / robocasa

fkodom / fft-conv-pytorch