DurFlex-EVC: Duration-Flexible Emotional Voice Conversion with Parallel Generation (Under review)

About the Project

Emotional voice conversion (EVC) modifies the emotional tone of a speaker’s voice while preserving the linguistic content and the speaker’s vocal characteristics. Recent EVC work models pitch and duration jointly with sequence-to-sequence (seq2seq) models. To make conversion more reliable and efficient, this study turns to parallel speech generation. We introduce Duration-Flexible EVC (DurFlex-EVC), which integrates a style autoencoder and a unit aligner. Previous models incorporate self-supervised learning (SSL) representations, which contain both linguistic and paralinguistic information, but neglect this dual nature, which reduces controllability. To address this, we use cross-attention to synchronize these representations with various emotions. In addition, a style autoencoder disentangles and manipulates style elements. Subjective and objective evaluations validate the approach and show that it outperforms existing models.
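
As a rough illustration of the cross-attention idea mentioned above, the sketch below implements generic scaled dot-product cross-attention in plain Python. It is not taken from this repository: the function names, the toy "frame" and "unit" vectors, and the choice of frames as queries with unit embeddings as keys and values are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention over plain Python lists.

    Each query attends over all keys; the output for a query is the
    attention-weighted sum of the values.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys
        ]
        w = softmax(scores)
        out = [
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy data: three frame-level feature vectors (queries)
# and two unit embeddings (keys and values). Not real SSL features.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
units = [[1.0, 0.0], [0.0, 1.0]]

outputs, weights = cross_attention(frames, units, units)
for frame, w in zip(frames, weights):
    print(frame, [round(x, 3) for x in w])
```

Each row of `weights` sums to 1, and a frame puts more weight on the unit embedding it is most similar to. A real model would use learned projections for queries, keys, and values and operate on batched tensors.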

Architecture

Tech Stack

Framework

Getting Started

Prerequisites

 pip install -r requirements.txt

Training

TBA

