Vocalization segmentation

This is a set of simple algorithms for segmenting vocalizations without supervision, controlled by (many) tunable parameters. It is basically meant as an easy way to replicate by-eye segmentation, so that you can apply the same segmentation method to many different vocalizations at once.

There are a number of Jupyter notebook examples for European starlings, Bengalese finches, canaries, and mouse USVs.

There are two main segmentation algorithms (a minimal sketch of the first appears after this list):

  1. Dynamic thresholding: segments syllables in time by computing a spectral envelope and adjusting the segmentation threshold on that envelope according to a set of parameters.

  2. Continuous element segmentation: segments elements of song spectro-temporally, so that two elements can overlap in time but not in frequency.
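
To make the dynamic-thresholding idea concrete, here is a minimal sketch of the threshold-raising loop. It is not the package's actual implementation: it assumes a precomputed per-frame dB envelope (`envelope_db`, peaking near 0 dB) and a frame rate in Hz, and the function and helper names are illustrative.

    import numpy as np

    def _run_lengths(mask):
        """Lengths (in frames) of consecutive True runs in a boolean mask."""
        edges = np.flatnonzero(np.diff(np.r_[0, mask.astype(int), 0]))
        return edges[1::2] - edges[::2]

    def dynamic_threshold_sketch(envelope_db, frame_rate_hz,
                                 min_level_db=-80, min_level_db_floor=-40,
                                 db_delta=5, silence_threshold=0.05,
                                 min_silence_for_spec=0.1, max_vocal_for_spec=1.0):
        # Raise the dB floor from min_level_db toward min_level_db_floor in
        # db_delta steps until the envelope shows a long-enough silence and
        # no vocalization longer than max_vocal_for_spec.
        for level_db in np.arange(min_level_db, min_level_db_floor + db_delta, db_delta):
            # normalize the envelope to [0, 1] above the current dB floor
            env = np.clip((envelope_db - level_db) / -level_db, 0, 1)
            vocal = env > silence_threshold                   # frames with sound
            silence_s = _run_lengths(~vocal) / frame_rate_hz  # silence durations (s)
            vocal_s = _run_lengths(vocal) / frame_rate_hz     # vocalization durations (s)
            if (silence_s.size and silence_s.max() >= min_silence_for_spec
                    and (vocal_s.size == 0 or vocal_s.max() <= max_vocal_for_spec)):
                return vocal, level_db  # boolean vocal mask and the threshold used
        return None, None               # no threshold satisfied both constraints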

Parameters

The two algorithms take a large set of parameters, which you need to tune carefully to get the results you want. Here is a quick description of each, followed by a usage sketch:

Arguments:
        vocalization {[type]} -- waveform of song
        rate {[type]} -- sample rate of the data

    Keyword Arguments:
        min_level_db {int} -- default dB minimum of spectrogram (threshold anything below) (default: {-80})
        min_level_db_floor {int} -- highest number min_level_db is allowed to reach dynamically (default: {-40})
        db_delta {int} -- delta in setting min_level_db (default: {5})
        n_fft {int} -- FFT window size (default: {1024})
        hop_length_ms {int} -- hop between STFT columns (ms) (default: {1})
        win_length_ms {int} -- size of FFT window (ms) (default: {5})
        ref_level_db {int} -- reference level dB of audio (default: {20})
        pre {float} -- coefficient for preemphasis filter (default: {0.97})
        spectral_range {[type]} -- spectral range to care about for spectrogram (default: {None})
        verbose {bool} -- display output (default: {False})
        mask_thresh_std {int} -- standard deviations above median to threshold out noise (higher = threshold more noise) (default: {1})
        neighborhood_time_ms {int} -- size in time of neighborhood-continuity filter (default: {5})
        neighborhood_freq_hz {int} -- size in Hz of neighborhood-continuity filter (default: {500})
        neighborhood_thresh {float} -- threshold number of neighborhood time-frequency bins above 0 to consider a bin not noise (default: {0.5})
        min_syllable_length_s {float} -- shortest expected length of syllable (default: {0.1})
        min_silence_for_spec {float} -- shortest expected length of silence in a song (used to set dynamic threshold) (default: {0.1})
        silence_threshold {float} -- threshold for spectrogram to consider noise as silence (default: {0.05})
        max_vocal_for_spec {float} -- longest expected vocalization in seconds  (default: {1.0})
        temporal_neighbor_merge_distance_ms {float} -- longest distance at which two elements should be considered one (default: {0.0})
        overlapping_element_merge_thresh {float} -- proportion of temporal overlap to consider two elements one (default: {np.inf})
        min_element_size_ms_hz {list} -- smallest expected element size (in ms and Hz); everything smaller is removed (default: {[0, 0]})
        figsize {tuple} -- size of figure for displaying output (default: {(20, 5)})
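
To show how these parameters fit together, here is a hypothetical usage sketch for the dynamic-thresholding algorithm. The positional arguments and keyword names come from the docstring above; the import path and the onset/offset result keys follow the example notebooks, so verify them against the version you have installed.

    import librosa
    from vocalseg.dynamic_thresholding import dynamic_threshold_segmentation

    # load a recording at its native sample rate (librosa resamples to
    # 22050 Hz unless sr=None is passed)
    vocalization, rate = librosa.load("example_song.wav", sr=None)

    results = dynamic_threshold_segmentation(
        vocalization,
        rate,
        n_fft=1024,
        hop_length_ms=1,
        win_length_ms=5,
        min_level_db=-80,
        min_level_db_floor=-40,
        db_delta=5,
        silence_threshold=0.05,
        min_silence_for_spec=0.1,
        max_vocal_for_spec=1.0,
        min_syllable_length_s=0.1,
        verbose=True,
    )

    # "onsets"/"offsets" are the key names used in the example notebooks
    for onset, offset in zip(results["onsets"], results["offsets"]):
        print(f"syllable: {onset:.3f}s - {offset:.3f}s")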


Project based on the cookiecutter data science project template. #cookiecutterdatascience
