FlyingRoastDuck

[CVPR 2024] Official repo for "InteractDiffusion: Interaction-Control for Text-to-Image Diffusion Model".

Python 86 6 Updated Jun 26, 2024

The MIG benchmark of CVPR2024 MIGC

Jupyter Notebook 7 Updated Mar 3, 2024

Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, T2I-Adapter, IP-Adapter.

329 19 Updated Aug 12, 2024

Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models

Python 38 2 Updated May 25, 2024

PartCraft: Crafting Creative Objects by Parts (ECCV2024)

Python 71 1 Updated Aug 3, 2024

Official code for paper: Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

Python 19 Updated Jul 1, 2024

[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"

Python 450 23 Updated Jul 16, 2024

Rich-Text-to-Image Generation

Python 749 63 Updated Oct 9, 2023

Official implementation of CVPR 2024 paper: "FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition"

Python 386 13 Updated Feb 27, 2024

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Python 368 10 Updated Aug 7, 2024

Official PyTorch Implementation for "Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation" (CVPR 2023)

Python 884 52 Updated Jun 19, 2023

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (PRG)

Jupyter Notebook 1,632 90 Updated Jul 31, 2024

Official PyTorch implementation of "A Unified Approach for Text- and Image-guided 4D Scene Generation", [CVPR 2024]

Python 52 3 Updated Apr 23, 2024

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 28,956 3,546 Updated Aug 14, 2024

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

2,986 182 Updated Aug 9, 2024

This project aims to reproduce Sora (OpenAI's T2V model). We hope the open-source community will contribute to this project.

Python 11,118 988 Updated Aug 14, 2024

A GPipe implementation in PyTorch

Python 791 97 Updated Jul 25, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 18,706 2,051 Updated Aug 12, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 21,217 2,020 Updated Aug 9, 2024

This project aims to share the technical principles behind large models, along with hands-on practical experience.

HTML 8,523 828 Updated Aug 11, 2024

ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation (ICCV 2023, Oral)

Python 498 28 Updated Jan 8, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda 971 88 Updated Aug 14, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 35,780 3,759 Updated Jul 28, 2024

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 5,449 501 Updated Aug 1, 2024

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

488 29 Updated Aug 13, 2024

A comprehensive guide to building RAG-based LLM applications for production.

Jupyter Notebook 1,645 209 Updated Aug 2, 2024

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.

Jupyter Notebook 4,798 312 Updated Jun 28, 2024

[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".

Python 610 43 Updated Jul 24, 2023

Recent LLM-based CV and related works. Welcome to comment/contribute!

813 33 Updated Jun 5, 2024

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 24,621 5,079 Updated Aug 14, 2024