- Stability.ai, Eleuther.ai
- Seattle, WA
- https://dmarx.github.io
- @DigThatData
TTIv2.0 wishlist
Huggingface-compatible SDXL Unet implementation that is readily hackable
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
The framework for building with WebAssembly (wasm). Easily load wasm modules, move data, call functions, and build extensible apps.
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Change Python code while it's running without losing state
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Refine high-quality datasets and visual AI models
Controlling diffusion-based image generation with just a few strokes
This is the official code for "Level-S2fM: Structure from Motion on Neural Level Set of Implicit Surfaces", accepted at CVPR 2023.
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
CoTracker is a model for tracking any point (pixel) on a video.
Fast Example-based Image Synthesis and Style Transfer
Motion Module fine tuner for AnimateDiff.
PatchMatch-based image inpainting for C++ and Python.
Low-rank adaptation for the Segment Anything Model (SAM)
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Implementation of HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models
A diffusers-based implementation of HyperDreamBooth
Interactive Jupyter notebook widget for visually comparing embedding spaces.
[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation
Faster LCM is a script that transfers image styles at 45 fps on an RTX 4090 and 33 fps on an A100.
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM (CVPR 2024)
[arXiv 2023] DreamGaussian4D: Generative 4D Gaussian Splatting
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Easily turn your Click CLI into a powerful terminal application