- Los Angeles, CA
- https://pkmital.com
- @pkmital
Stars
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
idiap / coqui-ai-TTS
Forked from coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Muzic: Music Understanding and Generation with Artificial Intelligence
Instant voice cloning by MIT and MyShell.
A multi-voice TTS system trained with an emphasis on quality
[WIP] VoiceSmith makes training text to speech models easy.
A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8…
Official implementation of "Separate Anything You Describe"
A new timeline addon for openframeworks.
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
This repo includes ChatGPT prompt curation to use ChatGPT better.
Running large language models on a single GPU for throughput-oriented scenarios.
Collection of audio-focused loss functions in PyTorch
Tracking the state of the art and recent results (bibliography) on sound tasks.
The “Quite OK Audio Format” for fast, lossy audio compression
This toolbox aims to unify audio generation model evaluation for easier comparison.
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently.
Audio generation using diffusion models, in PyTorch.
A collection of pre-trained audio models, in PyTorch.
A collection of resources and papers on Diffusion Models
"Automatic Language-Agnostic Subtitle Synchronization"
Wavelet scattering transforms in Python with GPU acceleration
Trainer for audio-diffusion-pytorch
A generative network for animal vocalizations. For dimensionality reduction, sequencing, clustering, corpus-building, and generating novel 'stimulus spaces'. All with notebook examples using freely…
A collaboration friendly studio for NeRFs