Stars
StoryMaker: Towards consistent characters in text-to-image generation
VideoSys: An easy and efficient system for video generation
A Prompt Enhancer for flux.1 in ComfyUI
Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>
Dead simple FLUX LoRA training UI with LOW VRAM support
An open source library for face detection in images. The face detection speed can reach 1000 FPS.
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
The swiss army knife of lossless video/audio editing
fbc_cv is an open source image processing library.
Semantic Search on Wikipedia with Upstash Vector
Real-time face swap and one-click video deepfake with only a single image
Code and pruned models for our paper: K. Gkrispanis, N. Gkalelis, V. Mezaris, "Filter-Pruning of Lightweight Face Detectors Using a Geometric Median Criterion", Proc. IEEE/CVF Winter Conference on …
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
A general fine-tuning kit geared toward diffusion models.
State-of-the-art 2D and 3D Face Analysis Project
A1111/SD Forge extension for IC-Light
AI-First Album: Chat with your gallery using plain language! LLM Vision + RAG + Album/Gallery.
I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models
Bring portraits to life via Monitor!