zideliu
:octocat:
No Time To Die
  • Zhejiang University
  • Hangzhou Zhejiang


Depth Any Video with Scalable Synthetic Data

Python 194 7 Updated Oct 15, 2024

Rectified Flow Inversion (RF-Inversion)

122 4 Updated Oct 15, 2024

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 5,422 450 Updated Oct 13, 2024

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 3,661 304 Updated Oct 16, 2024

A collection of ZSH frameworks, plugins, themes and tutorials.

Shell 15,326 544 Updated Oct 16, 2024

Official implementations for paper: Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

Python 270 13 Updated Apr 25, 2024

Memory optimized finetuning scripts for CogVideoX using TorchAO and DeepSpeed

Python 233 19 Updated Oct 16, 2024

ControlNet++: All-in-one ControlNet for image generation and editing!

Python 1,702 39 Updated Sep 30, 2024

[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation

446 14 Updated Jul 1, 2024

Next-Token Prediction is All You Need

Python 972 26 Updated Oct 8, 2024

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Python 293 3 Updated Sep 26, 2024

(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis

Python 474 25 Updated Sep 27, 2024

🔨 Useful research tools for AI

2,338 348 Updated Jun 10, 2024

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Python 653 26 Updated Sep 27, 2024

[Large models] Train a small 26M-parameter GPT completely from scratch in 3 hours; inference and training both run on a personal GPU!

Python 2,330 277 Updated Oct 16, 2024

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,126 190 Updated Oct 9, 2024

Diffusion Illusions: Hiding Images in Plain Sight

Jupyter Notebook 186 10 Updated Oct 13, 2024

PT 助手 Plus (PT Assistant Plus), a browser extension (Web Extensions) for Microsoft Edge, Google Chrome, and Firefox, mainly used to assist in downloading torrents from PT sites.

JavaScript 6,813 845 Updated Oct 15, 2024

Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation

Python 165 3 Updated Sep 13, 2024

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Python 605 74 Updated Aug 26, 2024

Official code for "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps" (NeurIPS 2022 Oral)

Python 1,534 120 Updated Feb 6, 2024
Python 100 3 Updated Feb 26, 2024

[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation

Python 194 6 Updated Aug 21, 2024

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 757 37 Updated Jun 2, 2024

Automated management tool for NAS media libraries

Python 6,431 772 Updated Oct 16, 2024

High-resolution models for human tasks.

Python 4,260 228 Updated Oct 15, 2024

Evaluating text-to-image/video/3D models with VQAScore

Python 194 17 Updated Sep 9, 2024

MixTeX: multimodal LaTeX, ZhEn, and table OCR. It performs efficient CPU-based inference locally and offline on Windows.

Python 718 35 Updated Oct 1, 2024

Kolors Team

Python 3,717 248 Updated Sep 4, 2024
Python 1,490 109 Updated Sep 23, 2024