View EmreOzkose's full-sized avatar

PyTorch implementation of "Universal Source Separation with Weakly Labelled Data".

Python 325 15 Updated Sep 1, 2023

Collection of Open Source Speech Data

113 5 Updated Oct 1, 2024
Jupyter Notebook 133 14 Updated Mar 3, 2024

DUSTED: Spoken-Term Discovery using Discrete Speech Units

Jupyter Notebook 11 Updated Oct 2, 2024

Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models

Python 11 5 Updated Sep 20, 2024

ICASSP 2025: Dynamic Embedding Causal Target Speech Extraction

Python 26 3 Updated Sep 27, 2024

Target Speaker Extraction Toolkit

Python 81 11 Updated Sep 27, 2024

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis

Python 75 8 Updated Sep 24, 2024

GANs for time series generation in PyTorch

Python 271 78 Updated Sep 13, 2019

Lightweight wrapper for Silero VAD that uses an internal ONNX Runtime and has no Python package dependencies

Python 5 Updated Oct 7, 2024

Focus on prompting and generating

Python 40,576 5,663 Updated Aug 21, 2024

Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"

Python 32 2 Updated Dec 6, 2023

ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec

Python 185 6 Updated Sep 3, 2024
Python 6,075 454 Updated Oct 4, 2024

StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion

140 7 Updated Sep 27, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,215 131 Updated Sep 24, 2024
Python 89 6 Updated Jul 4, 2024
Python 59 6 Updated Aug 26, 2024
Python 11 Updated May 7, 2022

Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals

Python 14 1 Updated Aug 8, 2024

Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.

Python 9,490 1,657 Updated Oct 7, 2024

Official Implementation for "Only a Matter of Style: Age Transformation Using a Style-Based Regression Model" (SIGGRAPH 2021) https://arxiv.org/abs/2102.02754

Python 628 151 Updated Jan 7, 2024

A deep learning model to age faces in the wild, currently runs at 60+ fps on GPUs

Python 229 47 Updated May 18, 2024

The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"

Python 96 5 Updated Sep 3, 2024

A fast speech-to-any translation model that supports simultaneous decoding and offers 28× speedup.

Python 60 4 Updated Aug 12, 2024

Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".

Python 21 5 Updated Jul 2, 2024

Code for ACL 2024 findings paper "CTC-based Non-autoregressive Textless Speech-to-Speech Translation"

8 Updated Jun 11, 2024