Stars
Google Research
Tool for robust segmentation of >100 important anatomical structures in CT and MR images
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
GUI for a Vocal Remover that uses Deep Neural Networks.
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
[ICLR 2024] Official implementation of DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Real time speech to text transcription app.
We write your reusable computer vision tools. 💜
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Foundational Models for State-of-the-Art Speech and Text Translation
This project aims to enhance the working environment on Windows
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
A natural language interface for computers
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io
Next generation face swapper and enhancer
3D Gaussian Splatting, reimagined: Unleashing unmatched speed with C++ and CUDA from the ground up!
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
ComfyUI Web Extension for saving views and navigating graphs
One-click launcher for Audiocraft MusicGen + AudioGen Gradio Web UI
StableSwarmUI, A Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible, high performance, and extensibility.
Generative Models by Stability AI
SoftVC VITS Singing Voice Conversion
The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.
Implementation of “DreamDiffusion: Generating High-Quality Images from Brain EEG Signals”
Generate 3D objects conditioned on text or images
Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies