madiistvan/Neural-audio-coding-for-speech-enhancement

Neural audio coding for speech enhancement

Poster

Description of classes

preprocessing:

  • DataEncoder: Responsible for creating and saving the pre-encoded versions of the audio files used for training.
  • EncoderDataset: Responsible for loading the raw audio files and noise files, applying preprocessing steps (such as cutting to equal length and applying a bandpass filter), and creating the noisy audio files.
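
The preprocessing steps named above (cutting to equal length, bandpass filtering, and creating noisy audio) might look roughly like the following. This is a minimal NumPy sketch under stated assumptions, not the repository's code: the function names `cut_to_length`, `bandpass_fft`, and `mix_at_snr`, the brick-wall FFT filter, and the SNR-based mixing rule are all hypothetical choices made for illustration.

```python
import numpy as np


def cut_to_length(signal, num_samples):
    """Trim or zero-pad a 1-D signal so it has exactly num_samples samples."""
    if len(signal) >= num_samples:
        return signal[:num_samples]
    return np.pad(signal, (0, num_samples - len(signal)))


def bandpass_fft(signal, sample_rate, low_hz, high_hz):
    """Brick-wall bandpass: zero every frequency bin outside [low_hz, high_hz].

    Assumption: the source does not say which filter design is used; an FFT
    mask is simply the shortest self-contained option.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))


def mix_at_snr(clean, noise, snr_db):
    """Scale the noise so the clean-to-noise power ratio equals snr_db, then add it.

    Assumption: mixing at a target SNR is a common way to build noisy inputs;
    the source only says that noisy audio files are created.
    """
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise
```

A typical call order would be `cut_to_length` on both the speech and the noise, then `bandpass_fft` on each, then `mix_at_snr` to obtain the noisy input, with the filtered clean signal kept as the target.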

train:

  • EncodedDataset: Responsible for loading the pre-encoded input and target files for training the model.
  • LatentNetwork: Defines the architecture of the latent network used for audio denoising.
  • LatentTrainer: Responsible for training the latent network with the pre-encoded dataset.
  • utils:
    • ModelSaveHandler: Utility class for saving model checkpoints.
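
To illustrate what a checkpoint utility of this kind can do, here is a minimal sketch that keeps only the best checkpoint seen so far. It is not the repository's `ModelSaveHandler`: the class name `BestCheckpointSaver`, its methods, the "save only on improved validation loss" policy, and the use of `pickle` are all assumptions for illustration. A real training loop would more likely serialize the model state with its deep learning framework's own save function.

```python
import pickle
from pathlib import Path


class BestCheckpointSaver:
    """Hypothetical checkpoint helper that keeps the lowest-loss state on disk."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.best_loss = float("inf")

    def maybe_save(self, state, epoch, val_loss):
        """Write the state only when val_loss improves; return the path or None."""
        if val_loss >= self.best_loss:
            return None
        self.best_loss = val_loss
        path = self.directory / "best.pkl"
        with open(path, "wb") as f:
            pickle.dump({"epoch": epoch, "val_loss": val_loss, "state": state}, f)
        return path

    def load_best(self):
        """Read back the most recently saved best checkpoint."""
        with open(self.directory / "best.pkl", "rb") as f:
            return pickle.load(f)
```

A trainer would call `maybe_save` once per epoch after validation, passing whatever dictionary of weights and optimizer state it wants to restore later.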

Demos

From the training data

normal_6sec.mp4

Real life setting

  1. Noise: Hitting table
original1.mp4
cleaned1.mp4
  2. Noise: Pressing plastic bag
original2.mp4
cleaned2.mp4
  3. Noise: Hitting glass with a fork
original3.mp4
cleaned3.mp4
  4. Noise: Playing pop music
original4.mp4
cleaned4.mp4
  5. Noise: Playing classical music
original5.mp4
cleaned5.mp4
  6. Noise: Playing dubstep music
original6.mp4
cleaned6.mp4
