Voice_Activity_Detection_V2

Voice Activity Detection Version 2.0

Because the test file "test05mi.wav" has a low SNR and low energy, this version cuts the file in the frequency domain. The first step is a Short-Time Fourier Transform (STFT), which shows the frequency content of each short frame along the time axis. After computing the average and its moving average, a threshold is chosen manually. Finally, 10 separate files are saved.
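The steps above can be sketched as follows. This is a minimal illustration on a synthetic signal, not the code in vad_v2.py: the function names, frame length, hop size, smoothing width, and the half-of-maximum threshold are all assumptions made for the example.

```python
import numpy as np

def frame_energy(signal, frame_len=512, hop=256):
    """Mean STFT magnitude of each Hann-windowed frame (assumed measure)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    energy = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        energy[i] = np.abs(np.fft.rfft(frame)).mean()
    return energy

def moving_average(x, width=5):
    """Smooth the per-frame curve with a simple box filter."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

def find_segments(active):
    """Return (start, end) frame indices for each run of True, end exclusive."""
    padded = np.concatenate(([False], active, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(changes[0::2], changes[1::2])]

# Synthetic test signal: 3 s of faint noise with a 440 Hz tone in the middle second.
rate = 16000
hop = 256
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(3 * rate)
t = np.arange(rate) / rate
signal[rate : 2 * rate] += np.sin(2 * np.pi * 440 * t)

smoothed = moving_average(frame_energy(signal, hop=hop))
threshold = 0.5 * smoothed.max()  # hand-picked for this example, like the manual threshold
segments = find_segments(smoothed > threshold)

for start, end in segments:
    print(f"speech from {start * hop / rate:.2f} s to {end * hop / rate:.2f} s")
```

On a real recording the threshold would need to be tuned by inspecting the plotted curve, since a fixed fraction of the maximum is only a convenient choice for this toy signal.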

Usage

python vad_v2.py

Tips

Make sure the .wav file is mono (single-channel). Change the file name and all other necessary parameters in the Python script.
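One way to confirm that a file is mono before running the script is to read its header with Python's standard `wave` module. The helper names `channel_count` and `require_mono` below are hypothetical and are not part of vad_v2.py.

```python
import wave

def channel_count(path):
    """Return the number of channels stored in the WAV header."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels()

def require_mono(path):
    """Raise ValueError if the WAV file has more than one channel."""
    channels = channel_count(path)
    if channels != 1:
        raise ValueError(f"{path} has {channels} channels; a mono file is expected")
```

Calling `require_mono("test05mi.wav")` before processing would fail early with a clear message if the input has more than one channel.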

The spectrogram and its moving-average figures are shown first. Check carefully that the boundary and threshold are set properly; otherwise, odd endpoints will produce broken chunks.

One second of zeros is padded at both ends of the frame.
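The padding step might look like the sketch below. The function name `pad_with_silence` and its parameters are assumptions for illustration; the zeros are created with the same dtype as the audio so that concatenation does not change the sample type.

```python
import numpy as np

def pad_with_silence(chunk, rate, seconds=1.0):
    """Return chunk with `seconds` of zero samples added before and after it."""
    silence = np.zeros(int(rate * seconds), dtype=chunk.dtype)
    return np.concatenate((silence, chunk, silence))

# Example: 100 samples of 16-bit audio at 16 kHz gain 16000 zeros on each side.
chunk = np.ones(100, dtype=np.int16)
padded = pad_with_silence(chunk, rate=16000)
```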

Example

Known Bugs

Some .pcm files may cause severe errors. I highly recommend using a .wav file as input.
When padding with zeros, in some rare cases harsh noise is padded instead of silence.
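If only raw .pcm data is available, one possible workaround is to wrap it in a WAV header first using the standard `wave` module. The sketch below is not part of vad_v2.py, and the defaults (16 kHz, mono, 16-bit samples) are assumptions: a raw PCM file carries no header, so the correct values must be known from wherever the recording came from.

```python
import wave

def pcm_to_wav(pcm_path, wav_path, rate=16000, channels=1, sample_width=2):
    """Copy headerless PCM bytes into a WAV container with the given format."""
    with open(pcm_path, "rb") as src:
        raw = src.read()
    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(channels)       # assumed: mono
        dst.setsampwidth(sample_width)   # assumed: 2 bytes = 16-bit samples
        dst.setframerate(rate)           # assumed: 16 kHz
        dst.writeframes(raw)
```

If the assumed format does not match the data, the resulting file will still open but will sound distorted or play at the wrong speed.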

