zideliu

No Time To Die

Lucky!

65 followers · 84 following

Zhejiang University
Hangzhou Zhejiang
05:20 (UTC +08:00)

Achievements

Highlights

Pro

Lists (8)

3D

🦁Agent

🤩customize

hallucination

🤪inversion

LLM

😍Lucky！

📹Video

Stars

Nightmare-n / DepthAnyVideo

Depth Any Video with Scalable Synthetic Data

Python 194 7 Updated Oct 15, 2024

LituRout / RF-Inversion

Rectified Flow Inversion (RF-Inversion)

122 4 Updated Oct 15, 2024

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 5,422 450 Updated Oct 13, 2024

SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 3,661 304 Updated Oct 16, 2024

unixorn / awesome-zsh-plugins

A collection of ZSH frameworks, plugins, themes and tutorials.

Shell 15,326 544 Updated Oct 16, 2024

zliucz / animate-your-word

Official implementations for paper: Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

Python 270 13 Updated Apr 25, 2024

a-r-r-o-w / cogvideox-factory

Memory optimized finetuning scripts for CogVideoX using TorchAO and DeepSpeed

Python 233 19 Updated Oct 16, 2024

xinsir6 / ControlNetPlus

ControlNet++: All-in-one ControlNet for image generations and editing!

Python 1,702 39 Updated Sep 30, 2024

NVlabs / DiffiT

[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation

446 14 Updated Jul 1, 2024

baaivision / Emu3

Next-Token Prediction is All You Need

Python 972 26 Updated Oct 8, 2024

VisualComputingInstitute / diffusion-e2e-ft

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Python 293 3 Updated Sep 26, 2024

tianweiy / DMD2

(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis

Python 474 25 Updated Sep 27, 2024

bighuang624 / AI-research-tools

🔨 Useful research tools for AI

2,338 348 Updated Jun 10, 2024

TencentARC / Open-MAGVIT2

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Python 653 26 Updated Sep 27, 2024

jingyaogong / minimind

[Large Model] Train a small 26M-parameter GPT completely from scratch in 3 hours; a personal GPU is enough for inference and training!

Python 2,330 277 Updated Oct 16, 2024

Yutong-Zhou-cv / Awesome-Text-to-Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,126 190 Updated Oct 9, 2024

RyannDaGreat / Diffusion-Illusions

Diffusion Illusions: Hiding Images in Plain Sight

Jupyter Notebook 186 10 Updated Oct 13, 2024

pt-plugins / PT-Plugin-Plus

PT Assistant Plus, a browser extension (Web Extensions) for Microsoft Edge, Google Chrome, and Firefox, mainly used to help download torrents from PT (private tracker) sites.

JavaScript 6,813 845 Updated Oct 15, 2024

basilevh / gcd

Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation

Python 165 3 Updated Sep 13, 2024

OSU-NLP-Group / SeeAct

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Python 605 74 Updated Aug 26, 2024

LuChengTHU / dpm-solver

Official code for "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps" (NeurIPS 2022 Oral)

Python 1,534 120 Updated Feb 6, 2024

ML-GSAI / SDE-Drag

Python 100 3 Updated Feb 26, 2024

Karine-Huang / T2I-CompBench

[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation

Python 194 6 Updated Aug 21, 2024

mbzuai-oryx / groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 757 37 Updated Jun 2, 2024

jxxghp / MoviePilot

Automated management tool for NAS media libraries

Python 6,431 772 Updated Oct 16, 2024

facebookresearch / sapiens

High-resolution models for human tasks.

Python 4,260 228 Updated Oct 15, 2024

linzhiqiu / t2v_metrics

Evaluating text-to-image/video/3D models with VQAScore

Python 194 17 Updated Sep 9, 2024

RQLuo / MixTeX-Latex-OCR

MixTeX multimodal LaTeX, ZhEn, and table OCR. It performs efficient CPU-based inference locally and offline on Windows.

Python 718 35 Updated Oct 1, 2024

Kwai-Kolors / Kolors

Kolors Team

Python 3,717 248 Updated Sep 4, 2024

XLabs-AI / x-flux

Python 1,490 109 Updated Sep 23, 2024