iamxiaoyubei
💭
I may be slow to respond.
  • SYSU
  • China

Starred repositories

A prize winning solution for DFDC challenge

Python 745 198 Updated Feb 11, 2023

DeepFaceLab is the leading software for creating deepfakes.

Python 46,427 10,402 Updated Oct 24, 2023

Deepfakes Software For All

Python 49,978 13,012 Updated Jul 13, 2024

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 5,774 509 Updated May 31, 2024

A comprehensive benchmark of deepfake detection

Python 408 55 Updated Jul 12, 2024

Grounded Language-Image Pre-training

Python 2,101 188 Updated Jan 24, 2024

[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"

Python 679 50 Updated Mar 20, 2024

Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition"

Jupyter Notebook 132 16 Updated Jun 7, 2024

This is the official repository for M2UGen

Jupyter Notebook 426 39 Updated May 8, 2024

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Python 730 42 Updated Apr 15, 2024

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 4,718 496 Updated Jul 23, 2024

PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"

Python 480 31 Updated Jan 7, 2024

Recent LLM-based CV and related works. Welcome to comment/contribute!

804 33 Updated Jun 5, 2024

Implementation for "DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations" (NeurIPS 2022)

Python 47 6 Updated Oct 24, 2023
Python 81 6 Updated Sep 23, 2023

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Python 4,048 421 Updated Nov 29, 2023

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 23,878 3,148 Updated Jul 23, 2024

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Python 3,526 241 Updated Mar 5, 2024

X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages

Python 299 16 Updated Aug 10, 2023

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 18,393 2,013 Updated Jul 14, 2024
65 3 Updated Jul 1, 2023

ImaginaryNet: Learning Object Detectors without Real Images and Annotations

Jupyter Notebook 24 1 Updated Mar 11, 2023

Fast Segment Anything

Python 7,195 677 Updated Jul 26, 2024

Image to prompt with BLIP and CLIP

Python 2,599 429 Updated May 15, 2024

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models

Python 5,352 382 Updated Jul 28, 2024

Stable Diffusion web UI

Python 136,813 26,047 Updated Jul 28, 2024

A playbook for systematically maximizing the performance of deep learning models.

25,980 2,164 Updated Jun 18, 2024

GLIDE: a diffusion-based text-conditional image synthesis model

Python 3,500 493 Updated Mar 8, 2024

Multiple Stable Diffusion Projects.

Python 6 1 Updated Dec 19, 2022

Is synthetic data from generative models ready for image recognition?

Python 168 6 Updated Feb 16, 2023