Stars
[NeurIPS 2024 D&B Spotlight🔥] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Based on GroundingDino and SAM, uses semantic strings to segment any element in an image. The ComfyUI version of sd-webui-segment-anything.
TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
Code for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
FlashTex: Fast Relightable Mesh Texturing with LightControlNet
Text-to-image generation: CogView3-Plus and CogView3 (ECCV 2024)
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
Official Implementation of weights2weights
The official code for the paper "LVCD: Reference-based Lineart Video Colorization with Diffusion Models"
24/7 local AI screen & mic recording. Build AI apps that have the full context. Works with Ollama. Alternative to Rewind.ai. Open. Secure. You own your data. Rust.
This repository demonstrates a browser-based implementation of the DepthAnything and DepthAnythingV2 models. It is powered by ONNX and does not require any web servers or APIs.
Official code repository for the paper "Neural Light Spheres for Implicit Image Stitching and View Synthesis"
Fast, reliable, and free document scanner app for iPhone
Official implementation of "Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance" (NeurIPS 2024)
Corrects for fisheye distortion in an image.
A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integration of powerful object detection and segmentation models, of…
A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
Official implementation of the paper "DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion".
Gaussian Haircut: Human Hair Reconstruction with Strand-Aligned 3D Gaussians
Uguu is a simple lightweight temporary file host with support for drop, paste, click and API uploading.
Official Implementation of Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
PyTorch implementation of MIMO, Controllable Character Video Synthesis with Spatial Decomposed Modeling, from Alibaba Intelligence Group
[ECCV 2024] Official Pytorch Implementation for "Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing"