Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 4,334 418 Updated Nov 7, 2024

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Jupyter Notebook 4,969 311 Updated Oct 18, 2023

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 543 56 Updated Nov 11, 2024

Muzic: Music Understanding and Generation with Artificial Intelligence

Python 4,538 448 Updated Oct 12, 2024

Instant voice cloning by MIT and MyShell.

Python 29,729 2,925 Updated Aug 21, 2024

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 13,191 1,822 Updated Aug 19, 2024

[WIP] VoiceSmith makes training text to speech models easy.

Python 222 32 Updated Oct 10, 2022

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8…

Python 3,603 573 Updated Nov 6, 2024

Official implementation of "Separate Anything You Describe"

Python 1,623 117 Updated Oct 25, 2024

A new timeline addon for openFrameworks.

C++ 40 3 Updated Jun 27, 2024

loaf: lua, osc, and openFrameworks

C++ 53 4 Updated Feb 3, 2024

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Jupyter Notebook 13,364 4,219 Updated Aug 19, 2024

This repo includes ChatGPT prompt curation to use ChatGPT better.

HTML 112,518 15,350 Updated Sep 26, 2024

Let us control diffusion models!

Python 30,327 2,725 Updated Feb 25, 2024

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,188 549 Updated Oct 28, 2024

Collection of audio-focused loss functions in PyTorch

Python 739 67 Updated Jul 30, 2024

Tracking the state of the art and recent results (bibliography) on sound tasks.

32 2 Updated Jan 10, 2023

The “Quite OK Audio Format” for fast, lossy audio compression

C 767 42 Updated Oct 3, 2024

This toolbox aims to unify audio generation model evaluation for easier comparison.

Python 301 31 Updated Sep 29, 2024

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 26,117 5,381 Updated Nov 11, 2024

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Python 3,054 388 Updated Sep 4, 2024

A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently.

Python 194 10 Updated Apr 27, 2023

Audio generation using diffusion models, in PyTorch.

Python 1,956 168 Updated Jun 12, 2023

A collection of pre-trained audio models, in PyTorch.

Python 110 4 Updated Jan 27, 2023

A collection of resources and papers on Diffusion Models

HTML 11,056 947 Updated Aug 1, 2024

"Automatic Language-Agnostic Subtitle Synchronization"

Rust 1,047 53 Updated Dec 28, 2023

Wavelet scattering transforms in Python with GPU acceleration

Python 760 138 Updated May 29, 2024

Trainer for audio-diffusion-pytorch

Python 127 22 Updated Jan 13, 2023

A generative network for animal vocalizations. For dimensionality reduction, sequencing, clustering, corpus-building, and generating novel 'stimulus spaces'. All with notebook examples using freely…

Jupyter Notebook 69 21 Updated Dec 27, 2022

A collaboration friendly studio for NeRFs

Python 9,521 1,297 Updated Nov 8, 2024