PyHa logo

PyHa

A tool designed to convert audio-based "weak" labels to "strong" moment-to-moment (intraclip) labels, with a pipeline for comparing the automated moment-to-moment labels to human labels. Methods range from DSP-based foreground-background separation and cross-correlation-based template matching to deep learning models for bird presence sound event detection. Current proof-of-concept work focuses on bird audio clips, with BirdNET-Lite, Microfaune, and TweetyNET as the supported models.

This package is being developed and maintained by the Engineers for Exploration Acoustic Species Identification Team in collaboration with the San Diego Zoo Wildlife Alliance.

PyHa = Python + Piha (referring to a bird species of interest to us, the Screaming Piha).

Contents

- Installation and Setup
- Functions
- Examples

Installation and Setup

  1. Navigate to a desired folder and clone the repository onto your local machine: git clone https://github.com/UCSD-E4E/PyHa.git
  • To reduce the size of the repository on your local machine, you can instead run git clone https://github.com/UCSD-E4E/PyHa.git --depth 1, which clones only the most recent version of the repo without its history.
  2. Install Python 3.8, 3.9, or 3.10.
  3. Create a venv by running python3.x -m venv .venv, where python3.x is the appropriate Python version.
  4. Activate the venv with the appropriate command:
  • Windows: .venv\Scripts\activate
  • macOS/Linux: source .venv/bin/activate
  5. Install the build tools: python -m pip install --upgrade pip poetry
  6. Install the environment: poetry install
  7. Download the Xeno-canto Screaming Piha test set used in our demos: https://drive.google.com/drive/u/0/folders/1lIweB8rF9JZhu6imkuTg_No0i04ClDh1
  8. Run jupyter notebook from the repository folder and open the PyHa_Tutorial.ipynb notebook to verify that PyHa is running properly. Make sure the paths in the notebook and in ScreamingPiha_Manual_Labels.csv point to the TEST folder.
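
As a minimal smoke test, assuming a Python session launched from the repository root and that the modules are importable under the PyHa package as suggested by the file names documented below:

from PyHa.IsoAutio import generate_automated_labels
from PyHa.statistics import automated_labeling_statistics
from PyHa.visualizations import spectrogram_visualization

print("PyHa imports OK")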

Functions

Design diagram of the automated audio labeling system.

isolation_parameters

Many of the functions take in the isolation_parameters argument, so it is defined globally here.

The isolation_parameters dictionary definition depends on the model used. The currently supported models are BirdNET-Lite, Microfaune, and TweetyNET.

The BirdNET-Lite isolation_parameters dictionary is as follows:

isolation_parameters = {
    "model" : "birdnet",
    "output_path" : "",
    "lat" : 0.0,
    "lon" : 0.0,
    "week" : 0,
    "overlap" : 0.0,
    "sensitivity" : 0.0,
    "min_conf" : 0.0,
    "custom_list" : "",
    "filetype" : "",
    "num_predictions" : 0,
    "write_to_csv" : False,
    "verbose" : True
}

The Microfaune isolation_parameters dictionary is as follows:

isolation_parameters = {
    "model" : "microfaune",
    "technique" : "",
    "threshold_type" : "",
    "threshold_const" : 0.0,
    "threshold_min" : 0.0,
    "window_size" : 0.0,
    "chunk_size" : 0.0,
    "verbose" : True
}

The technique parameter can be one of "simple", "stack", "steinberg", or "chunk". This input must be a string in all lowercase.
The threshold_type parameter can be one of "median", "mean", "average", "standard deviation", or "pure". This input must be a string in all lowercase.

The remaining parameters are floats representing their respective values.


The TweetyNET isolation_parameters dictionary is as follows:

isolation_parameters = {
    "model" : "tweetynet",
    "tweety_output": False,
    "technique" : "",
    "threshold_type" : "",
    "threshold_const" : 0.0,
    "threshold_min" : 0.0,
    "window_size" : 0.0,
    "chunk_size" : 0.0,
    "verbose" : True
}

The tweety_output parameter sets whether to use TweetyNET's native output (True) or PyHa's isolation techniques (False). If set to False, the output is post-processed with the technique specified by the technique parameter.
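
As an illustration, a filled-in TweetyNET configuration might look like the following; the numeric values mirror the Microfaune example in the Examples section and are illustrative rather than recommended settings:

isolation_parameters = {
    "model" : "tweetynet",
    "tweety_output" : False,
    "technique" : "steinberg",
    "threshold_type" : "median",
    "threshold_const" : 2.0,
    "threshold_min" : 0.0,
    "window_size" : 2.0,
    "chunk_size" : 5.0,
    "verbose" : True
}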

annotation_post_processing.py file

annotation_chunker

Found in annotation_post_processing.py

This function converts a Kaleidoscope-formatted dataframe of annotations into uniform chunks of chunk_length seconds. Any annotation shorter than chunk_length is dropped.

| Parameter | Type | Description |
| --- | --- | --- |
| kaleidoscope_df | Dataframe | Dataframe of automated or human labels in Kaleidoscope format |
| chunk_length | int | Duration in seconds of each annotation chunk |

This function returns a dataframe with annotations converted to uniform chunks of chunk_length seconds.

Usage: annotation_chunker(kaleidoscope_df, chunk_length)
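
A minimal sketch of chunking labels into 3-second chunks. The import path and most of the column names here are assumptions (only "IN FILE" appears elsewhere in this README), so adjust them to match your own Kaleidoscope-formatted dataframe:

import pandas as pd
from PyHa.annotation_post_processing import annotation_chunker  # module path assumed from the file name

# Hypothetical Kaleidoscope-formatted labels: OFFSET is the annotation start
# time in seconds, DURATION its length in seconds.
kaleidoscope_df = pd.DataFrame({
    "FOLDER": ["./TEST/", "./TEST/"],
    "IN FILE": ["ScreamingPiha2.wav", "ScreamingPiha2.wav"],
    "OFFSET": [0.0, 12.5],
    "DURATION": [7.0, 2.0],
    "MANUAL ID": ["bird", "bird"],
})

# The 7 s annotation is split into uniform 3 s chunks; the 2 s annotation is
# dropped because it is shorter than chunk_length.
chunked_df = annotation_chunker(kaleidoscope_df, 3)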

IsoAutio.py file

isolate

Found in IsoAutio.py

This function is the wrapper for the audio isolation techniques; it calls the respective technique function based on the isolation_parameters "technique" key.

| Parameter | Type | Description |
| --- | --- | --- |
| local_scores | list of floats | Local scores of the audio clip as determined by the Microfaune recurrent neural network |
| SIGNAL | list of ints | Samples that make up the audio signal |
| SAMPLE_RATE | int | Sampling rate of the audio clip, usually 44100 |
| audio_dir | string | Directory of the audio clip |
| filename | string | Name of the audio clip file |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |

This function returns a dataframe of automated labels for the audio clip based on the passed-in isolation technique.

Usage: isolate(local_scores, SIGNAL, SAMPLE_RATE, audio_dir, filename, isolation_parameters)

threshold

Found in IsoAutio.py

This function takes in the local score array output from a neural network and determines the threshold at which a local score counts as a positive ID of a class of interest (most proof-of-concept work is dedicated to bird presence). The threshold is determined by the "threshold_type" and "threshold_const" entries of the isolation_parameters dictionary.

| Parameter | Type | Description |
| --- | --- | --- |
| local_scores | list of floats | Local scores of the audio clip as determined by the Microfaune recurrent neural network |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |

This function returns a float: the threshold above which the local scores of an audio clip are treated as positive IDs.

Usage: threshold(local_scores, isolation_parameters)
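
To make the parameter semantics concrete, here is a sketch of how such a cutoff could be computed from a local score array. This illustrates the idea only; the exact formulas live in PyHa's threshold function:

import numpy as np

def threshold_sketch(local_scores, isolation_parameters):
    # Illustrative mapping from threshold_type to a score cutoff (assumed
    # behavior, not PyHa's exact implementation).
    scores = np.asarray(local_scores)
    t_type = isolation_parameters["threshold_type"]
    const = isolation_parameters["threshold_const"]
    if t_type == "median":
        return np.median(scores) * const
    if t_type in ("mean", "average"):
        return np.mean(scores) * const
    if t_type == "standard deviation":
        return np.mean(scores) + np.std(scores) * const
    if t_type == "pure":
        return const  # absolute cutoff, independent of the scores
    raise ValueError(f"unknown threshold_type: {t_type}")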

steinberg_isolate

Found in IsoAutio.py

This function uses the technique developed by Gabriel Steinberg, which groups the local scores output by a neural network to produce automated labels for a class across an audio clip. It is called by the isolate function when isolation_parameters["technique"] == "steinberg".

| Parameter | Type | Description |
| --- | --- | --- |
| local_scores | list of floats | Local scores of the audio clip as determined by the Microfaune recurrent neural network |
| SIGNAL | list of ints | Samples that make up the audio signal |
| SAMPLE_RATE | int | Sampling rate of the audio clip, usually 44100 |
| audio_dir | string | Directory of the audio clip |
| filename | string | Name of the audio clip file |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |
| manual_id | string | Name of the class written to the pandas dataframe |

This function returns a dataframe of automated labels for the audio clip.

Usage: steinberg_isolate(local_scores, SIGNAL, SAMPLE_RATE, audio_dir, filename, isolation_parameters, manual_id)

simple_isolate

Found in IsoAutio.py

This function uses the technique suggested by Irina Tolkova and implemented by Jacob Ayers. It attempts to produce automated annotations of an audio clip based on the local score array output of a neural network. It is called by the isolate function when isolation_parameters["technique"] == "simple".

| Parameter | Type | Description |
| --- | --- | --- |
| local_scores | list of floats | Local scores of the audio clip as determined by the Microfaune recurrent neural network |
| SIGNAL | list of ints | Samples that make up the audio signal |
| SAMPLE_RATE | int | Sampling rate of the audio clip, usually 44100 |
| audio_dir | string | Directory of the audio clip |
| filename | string | Name of the audio clip file |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |
| manual_id | string | Name of the class written to the pandas dataframe |

This function returns a dataframe of automated labels for the audio clip.

Usage: simple_isolate(local_scores, SIGNAL, SAMPLE_RATE, audio_dir, filename, isolation_parameters, manual_id)

stack_isolate

Found in IsoAutio.py

This function uses a technique created by Jacob Ayers. It attempts to produce automated annotations of an audio clip based on the local score array output of a neural network. It is called by the isolate function when isolation_parameters["technique"] == "stack".

| Parameter | Type | Description |
| --- | --- | --- |
| local_scores | list of floats | Local scores of the audio clip as determined by the Microfaune recurrent neural network |
| SIGNAL | list of ints | Samples that make up the audio signal |
| SAMPLE_RATE | int | Sampling rate of the audio clip, usually 44100 |
| audio_dir | string | Directory of the audio clip |
| filename | string | Name of the audio clip file |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |
| manual_id | string | Name of the class written to the pandas dataframe |

This function returns a dataframe of automated labels for the audio clip.

Usage: stack_isolate(local_scores, SIGNAL, SAMPLE_RATE, audio_dir, filename, isolation_parameters, manual_id)

chunk_isolate

Found in IsoAutio.py

This function uses a technique created by Jacob Ayers. It attempts to produce automated annotations of an audio clip based on the local score array output of a neural network. It is called by the isolate function when isolation_parameters["technique"] == "chunk".

| Parameter | Type | Description |
| --- | --- | --- |
| local_scores | list of floats | Local scores of the audio clip as determined by the Microfaune recurrent neural network |
| SIGNAL | list of ints | Samples that make up the audio signal |
| SAMPLE_RATE | int | Sampling rate of the audio clip, usually 44100 |
| audio_dir | string | Directory of the audio clip |
| filename | string | Name of the audio clip file |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |
| manual_id | string | Name of the class written to the pandas dataframe |

This function returns a dataframe of automated labels for the audio clip.

Usage: chunk_isolate(local_scores, SIGNAL, SAMPLE_RATE, audio_dir, filename, isolation_parameters, manual_id)

generate_automated_labels

Found in IsoAutio.py

This function generates labels across a folder of audio clips using the model and other parameters specified in the isolation_parameters dictionary.

| Parameter | Type | Description |
| --- | --- | --- |
| audio_dir | string | Directory with WAV audio files |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |
| manual_id | string | Name of the class written to the pandas dataframe |
| weight_path | string | File path of weights to be used by the RNNDetector for determining the presence of bird sounds |
| normalized_sample_rate | int | Sampling rate that the audio files should all be normalized to |
| normalize_local_scores | boolean | Whether or not to normalize the local scores |

This function returns a dataframe of automated labels for the audio clips in audio_dir.

Usage: generate_automated_labels(audio_dir, isolation_parameters, manual_id, weight_path, normalized_sample_rate, normalize_local_scores)
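
For instance, using the Microfaune isolation_parameters dictionary shown in the Examples section (the directory path is illustrative):

from PyHa.IsoAutio import generate_automated_labels

# Label every WAV file in ./TEST/ and normalize local scores to [0, 1].
automated_df = generate_automated_labels("./TEST/", isolation_parameters, normalize_local_scores=True)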

generate_automated_labels_birdnet

Found in IsoAutio.py

This function is called by generate_automated_labels when isolation_parameters["model"] is set to "birdnet". It generates bird labels across a folder of audio clips using BirdNET-Lite, given the isolation parameters.

| Parameter | Type | Description |
| --- | --- | --- |
| audio_dir | string | Directory with WAV audio files |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |

This function returns a dataframe of automated labels for the audio clips in audio_dir.

Usage: generate_automated_labels_birdnet(audio_dir, isolation_parameters)

generate_automated_labels_microfaune

Found in IsoAutio.py

This function is called by generate_automated_labels when isolation_parameters["model"] is set to "microfaune". It applies the isolation technique determined by the isolation_parameters dictionary across a whole folder of audio clips.

| Parameter | Type | Description |
| --- | --- | --- |
| audio_dir | string | Directory with WAV audio files |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |
| manual_id | string | Name of the class written to the pandas dataframe |
| weight_path | string | File path of weights to be used by the RNNDetector for determining the presence of bird sounds |
| normalized_sample_rate | int | Sampling rate that the audio files should all be normalized to |
| normalize_local_scores | boolean | Whether or not to normalize the local scores |

This function returns a dataframe of automated labels for the audio clips in audio_dir.

Usage: generate_automated_labels_microfaune(audio_dir, isolation_parameters, manual_id, weight_path, normalized_sample_rate, normalize_local_scores)

generate_automated_labels_tweetynet

Found in IsoAutio.py

This function is called by generate_automated_labels when isolation_parameters["model"] is set to "tweetynet". It applies the isolation technique determined by the isolation_parameters dictionary across a whole folder of audio clips.

| Parameter | Type | Description |
| --- | --- | --- |
| audio_dir | string | Directory with WAV audio files |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |
| manual_id | string | Name of the class written to the pandas dataframe |
| weight_path | string | File path of weights to be used by the RNNDetector for determining the presence of bird sounds |
| normalized_sample_rate | int | Sampling rate that the audio files should all be normalized to |
| normalize_local_scores | boolean | Whether or not to normalize the local scores |

This function returns a dataframe of automated labels for the audio clips in audio_dir.

Usage: generate_automated_labels_tweetynet(audio_dir, isolation_parameters, manual_id, weight_path, normalized_sample_rate, normalize_local_scores)

kaleidoscope_conversion

Found in IsoAutio.py

This function strips away the pandas dataframe columns that the PyHa package needs but that are not compatible with the Kaleidoscope software.

| Parameter | Type | Description |
| --- | --- | --- |
| df | Pandas Dataframe | PyHa-compatible dataframe of human or automated labels |

This function returns a Pandas Dataframe compatible with Kaleidoscope.

Usage: kaleidoscope_conversion(df)

statistics.py file

annotation_duration_statistics

Found in statistics.py

This function calculates basic statistics related to the duration of annotations in a PyHa-compatible pandas dataframe.

| Parameter | Type | Description |
| --- | --- | --- |
| df | Pandas Dataframe | Dataframe of automated or manual labels |

This function returns a Pandas Dataframe containing the count, mean, mode, standard deviation, and IQR of annotation durations.

Usage: annotation_duration_statistics(df)

clip_general

Found in statistics.py

This function generates a dataframe with statistics relating to the efficiency of the automated labels compared to the human labels for a single clip. These statistics include true positive, false positive, false negative, true negative, union, precision, recall, F1, and global IoU for general clip overlap.

| Parameter | Type | Description |
| --- | --- | --- |
| automated_df | Dataframe | Dataframe of automated labels for one clip |
| human_df | Dataframe | Dataframe of human labels for one clip |

This function returns a dataframe with general clip overlap statistics comparing the automated and human labeling.

Usage: clip_general(automated_df, human_df)

automated_labeling_statistics

Found in statistics.py

This function lets users pass in a dataframe of manual labels and a dataframe of automated labels, and returns a dataframe of statistics examining the efficiency of the automated labeling system relative to the human labels across multiple clips.

| Parameter | Type | Description |
| --- | --- | --- |
| automated_df | Dataframe | Dataframe of automated labels for multiple clips |
| manual_df | Dataframe | Dataframe of human labels for multiple clips |
| stats_type | String | String that determines which type of statistics are of interest |
| threshold | float | Defines a threshold for certain types of statistics |

This function returns a dataframe of statistics comparing automated labels and human labels for multiple clips.

The stats_type parameter can be set as follows:

| Name | Description |
| --- | --- |
| "IoU" | Default. Compares the intersection over union of automated annotations with respect to manual annotations for individual clips |
| "general" | Consolidates all automated annotations and compares them to all of the manual annotations that have been consolidated across a clip |

Usage: automated_labeling_statistics(automated_df, manual_df, stats_type, threshold)
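
For example, to score automated labels against the tutorial's manual labels, counting an annotation as a true positive at IoU >= 0.5:

import pandas as pd
from PyHa.statistics import automated_labeling_statistics

# automated_df as produced by generate_automated_labels above.
manual_df = pd.read_csv("ScreamingPiha_Manual_Labels.csv")
stats_df = automated_labeling_statistics(automated_df, manual_df, stats_type="IoU", threshold=0.5)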

global_dataset_statistics

Found in statistics.py

This function takes in a dataframe of efficiency statistics for multiple clips and outputs their global values.

| Parameter | Type | Description |
| --- | --- | --- |
| statistics_df | Dataframe | Dataframe of statistics values for multiple audio clips, as returned by the function automated_labeling_statistics |

This function returns a dataframe of global statistics for the labeling of the multiple audio clips.

Usage: global_dataset_statistics(statistics_df)

clip_IoU

Found in statistics.py

This function takes in the manual and automated labels for a clip and outputs the IoU metrics of each human label with respect to each automated label.

| Parameter | Type | Description |
| --- | --- | --- |
| automated_df | Dataframe | Dataframe of automated labels for one clip |
| manual_df | Dataframe | Dataframe of human labels for one clip |

This function returns IoU_Matrix (arr): a (human label count) x (automated label count) matrix where each row contains the IoU of each automated annotation with respect to one human label.

Usage: clip_IoU(automated_df, manual_df)

matrix_IoU_Scores

Found in statistics.py

This function takes in the IoU matrix produced by clip_IoU along with the human labels for a clip, and computes clip-level statistics given an IoU threshold.

| Parameter | Type | Description |
| --- | --- | --- |
| IoU_Matrix | arr | (human label count) x (automated label count) matrix where each row contains the IoU of each automated annotation with respect to one human label |
| manual_df | Dataframe | Dataframe of human labels for an audio clip |
| threshold | float | IoU threshold for determining true positives, false positives, and false negatives |

This function returns a dataframe of clip statistics such as true positive, false negative, false positive, precision, recall, and F1 values for an audio clip.

Usage: matrix_IoU_Scores(IoU_Matrix, manual_df, threshold)
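
Chaining the two functions above for a single clip (the dataframe names follow the Examples section later in this README):

from PyHa.statistics import clip_IoU, matrix_IoU_Scores

# Build the IoU matrix for one clip, then score it with an IoU threshold
# of 0.5 for counting true positives.
iou_matrix = clip_IoU(automated_piha_df, manual_piha_df)
clip_scores = matrix_IoU_Scores(iou_matrix, manual_piha_df, 0.5)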

clip_catch

Found in statistics.py

This function determines whether or not each human label has been found across all of the automated labels.

| Parameter | Type | Description |
| --- | --- | --- |
| automated_df | Dataframe | Dataframe of automated labels for one clip |
| manual_df | Dataframe | Dataframe of human labels for one clip |

This function returns a Numpy array of statistics regarding the amount of overlap between the manual and automated labels relative to the number of samples.

Usage: clip_catch(automated_df, manual_df)

global_statistics

Found in statistics.py

This function takes the IoU statistics output of automated_labeling_statistics (stats_type = "IoU") and outputs a global count of true positives and false positives, as well as the precision, recall, and F1 metrics across the dataset.

| Parameter | Type | Description |
| --- | --- | --- |
| statistics_df | Dataframe | Dataframe of matrix IoU scores for multiple clips |

This function returns a dataframe of global IoU statistics: the number of true positives, false positives, and false negatives, along with the precision, recall, and F1 metrics.

Usage: global_statistics(statistics_df)

dataset_Catch

Found in statistics.py

This function determines the overlap of each human label with respect to all of the automated labels in a clip, across a large number of clips.

| Parameter | Type | Description |
| --- | --- | --- |
| automated_df | Dataframe | Dataframe of automated labels for multiple clips |
| manual_df | Dataframe | Dataframe of human labels for multiple clips |

This function returns a dataframe of human labels with a column for the catch values of each label.

Usage: dataset_Catch(automated_df, manual_df)

clip_statistics

Found in statistics.py

This function computes clip overlap statistics comparing automated and human labels across multiple classes.

| Parameter | Type | Description |
| --- | --- | --- |
| automated_df | Dataframe | Dataframe of automated labels for multiple classes |
| manual_df | Dataframe | Dataframe of human labels for multiple classes |
| stats_type | String | String that determines which statistics are of interest |
| threshold | float | Defines a threshold for certain types of statistics |

This function returns a dataframe with clip overlap statistics comparing automated and human labeling for multiple classes.

The stats_type parameter can be set as follows:

| Name | Description |
| --- | --- |
| "IoU" | Default. Compares the intersection over union of automated annotations with respect to manual annotations for individual clips |
| "general" | Consolidates all automated annotations and compares them to all of the manual annotations that have been consolidated across a clip |

Usage: clip_statistics(automated_df, manual_df, stats_type, threshold)

class_statistics

Found in statistics.py

This function takes in the multi-class clip statistics produced by clip_statistics and computes global efficacy values for each class.

| Parameter | Type | Description |
| --- | --- | --- |
| clip_statistics | Dataframe | Dataframe of multi-class statistics values for audio clips, as returned by the function clip_statistics |

This function returns a dataframe of global efficacy values for multiple classes.

Usage: class_statistics(clip_statistics)
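
A sketch of the multi-class pipeline, chaining the two functions above:

from PyHa.statistics import clip_statistics, class_statistics

# Per-clip statistics across classes, then a per-class global rollup.
clip_stats = clip_statistics(automated_df, manual_df, stats_type="IoU", threshold=0.5)
class_stats = class_statistics(clip_stats)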

visualizations.py file

spectrogram_graph

Found in visualizations.py

This function produces graphs with the spectrogram of an audio clip. It is integrated with pandas dataframes so that human and automated annotations can be visualized.

| Parameter | Type | Description |
| --- | --- | --- |
| clip_name | string | Directory of the clip |
| sample_rate | int | Sample rate of the audio clip, usually 44100 |
| samples | list of ints | Each of the samples from the audio clip |
| automated_df | Dataframe | Dataframe of automated labeling of the clip |
| premade_annotations_df | Dataframe | Dataframe of labels that have been made outside the scope of this function |
| premade_annotations_label | string | Descriptor of premade_annotations_df |
| save_fig | boolean | Whether the plot should be saved in a directory as a png file |
| normalize_local_scores | boolean | Whether or not to normalize the local scores |

This function does not return anything.

Usage: spectrogram_graph(clip_name, sample_rate, samples, automated_df, premade_annotations_df, premade_annotations_label, save_fig, normalize_local_scores)

local_line_graph

Found in visualizations.py

This function produces graphs with the local score plot and spectrogram of an audio clip. It is integrated with pandas dataframes so that human and automated annotations can be visualized.

| Parameter | Type | Description |
| --- | --- | --- |
| local_scores | list of floats | Local scores for the clip, determined by the RNN |
| clip_name | string | Directory of the clip |
| sample_rate | int | Sample rate of the audio clip, usually 44100 |
| samples | list of ints | Each of the samples from the audio clip |
| automated_df | Dataframe | Dataframe of automated labeling of the clip |
| premade_annotations_df | Dataframe | Dataframe of labels that have been made outside the scope of this function |
| premade_annotations_label | string | Descriptor of premade_annotations_df |
| log_scale | boolean | Whether the local score axis should be logarithmically scaled on the plot |
| save_fig | boolean | Whether the plot should be saved in a directory as a png file |
| normalize_local_scores | boolean | Whether or not to normalize the local scores |

This function does not return anything.

Usage: local_line_graph(local_scores, clip_name, sample_rate, samples, automated_df, premade_annotations_df, premade_annotations_label, log_scale, save_fig, normalize_local_scores)

spectrogram_visualization

Found in visualizations.py

This is the wrapper function for the local_line_graph and spectrogram_graph functions for ease of use. It processes the clip to produce the local scores used by local_line_graph.

| Parameter | Type | Description |
| --- | --- | --- |
| clip_path | string | Path to an audio clip |
| weight_path | string | Weights to be used for the RNNDetector |
| premade_annotations_df | Dataframe | Dataframe of annotations to be displayed that were created outside the function |
| premade_annotations_label | string | Descriptor of the premade_annotations dataframe |
| automated_df | Dataframe or boolean | Dataframe of automated labels to plot; if True, the clip is labelled by the isolate function (using isolation_parameters) and the result is plotted |
| isolation_parameters | dict | Python dictionary that controls the label creation techniques |
| log_scale | boolean | Whether the local score axis should be logarithmically scaled on the plot |
| save_fig | boolean | Whether the plots should be saved in a directory as a png file |
| normalize_local_scores | boolean | Whether or not to normalize the local scores |

This function does not return anything.

Usage: spectrogram_visualization(clip_path, weight_path, premade_annotations_df, premade_annotations_label, automated_df = False, isolation_parameters, log_scale, save_fig, normalize_local_scores)

binary_visualization

Found in visualizations.py

This function visualizes automated and human annotations across an audio clip and computes overlap statistics.

| Parameter | Type | Description |
| --- | --- | --- |
| automated_df | Dataframe | Dataframe of automated labels for one clip |
| human_df | Dataframe | Dataframe of human labels for one clip |
| plot_fig | boolean | Whether or not the efficiency statistics should be displayed |
| save_fig | boolean | Whether or not the plot should be saved within a file |

This function returns a dataframe with statistics comparing the automated and human labeling.

Usage: binary_visualization(automated_df, human_df, plot_fig, save_fig)

annotation_duration_histogram

Found in visualizations.py

This function builds a histogram visualizing the distribution of annotation lengths.

| Parameter | Type | Description |
| --- | --- | --- |
| annotation_df | Dataframe | Dataframe of automated or human labels |
| n_bins | int | Number of bins in the histogram |
| min_length | int | Minimum annotation length included in the histogram |
| max_length | int | Maximum annotation length included in the histogram |
| save_fig | boolean | Whether or not the plot should be saved within a file |
| filename | String | Name of the file to save the histogram to |

This function returns a histogram of annotation lengths.

Usage: annotation_duration_histogram(annotation_df, n_bins, min_length, max_length, save_fig, filename)
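
A minimal call might look like the following (function name as assumed above; bin settings are illustrative):

from PyHa.visualizations import annotation_duration_histogram

# Histogram of annotation lengths between 0 and 10 seconds, in 20 bins.
annotation_duration_histogram(manual_df, n_bins=20, min_length=0, max_length=10)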

All files in the birdnet_lite directory are from a modified version of the BirdNET Lite repository, and their associated documentation can be found there.

All files in the microfaune_package directory are from the microfaune repository, and their associated documentation can be found there.

All files in the tweetynet directory are from the tweetynet repository, and their associated documentation can be found there.


Examples

These examples were created on an Ubuntu 16.04 machine. Results may vary across operating systems and TensorFlow versions.

Examples using Microfaune were created using the following dictionary for isolation_parameters:

isolation_parameters = {
    "model" : "microfaune",
    "technique" : "steinberg",
    "threshold_type" : "median",
    "threshold_const" : 2.0,
    "threshold_min" : 0.0,
    "window_size" : 2.0,
    "chunk_size" : 5.0
}

To generate automated labels and get manual labels:

automated_df = generate_automated_labels(path,isolation_parameters,normalize_local_scores=True)
manual_df = pd.read_csv("ScreamingPiha_Manual_Labels.csv")

Function that gathers statistics about the duration of labels

annotation_duration_statistics(automated_df)


annotation_duration_statistics(manual_df)


Function that converts annotations into 3 second chunks

annotation_chunker(automated_df, 3)


Helper function to convert to kaleidoscope-compatible format

kaleidoscope_conversion(manual_df)


Baseline Graph without any annotations

clip_path = "./TEST/ScreamingPiha2.wav"
spectrogram_visualization(clip_path)


Baseline Graph with log scale

spectrogram_visualization(clip_path,log_scale = True)


Baseline graph with normalized local score values between [0,1]

spectrogram_visualization(clip_path, normalize_local_scores = True)


Graph with Automated Labeling

spectrogram_visualization(clip_path,automated_df = True, isolation_parameters = isolation_parameters)


Graph with Human Labelling

spectrogram_visualization(clip_path, premade_annotations_df = manual_df[manual_df["IN FILE"] == "ScreamingPiha2.wav"],premade_annotations_label = "Piha Human Labels")


Graph with Both Automated and Human Labels

Legend:

- Orange ==> True Positive
- Red ==> False Negative
- Yellow ==> False Positive
- White ==> True Negative

spectrogram_visualization(clip_path, automated_df = True, isolation_parameters = isolation_parameters, premade_annotations_df = manual_df[manual_df["IN FILE"] == "ScreamingPiha2.wav"])


Another Visualization of True Positives, False Positives, False Negatives, and True Negatives

automated_piha_df = automated_df[automated_df["IN FILE"] == "ScreamingPiha2.wav"]
manual_piha_df = manual_df[manual_df["IN FILE"] == "ScreamingPiha2.wav"]
piha_stats = binary_visualization(automated_piha_df,manual_piha_df)


Function that generates statistics to gauge efficacy of automated labeling compared to human labels

statistics_df = automated_labeling_statistics(automated_df,manual_df,stats_type = "general")


Function that takes the statistical output of all of the clips and gets the equivalent global scores

global_dataset_statistics(statistics_df)


Function that takes in the manual and automated labels for a clip and outputs human label-by-label IoU Scores. Used to derive statistics that measure how well a system is isolating desired segments of audio clips

Intersection_over_Union_Matrix = clip_IoU(automated_piha_df,manual_piha_df)


Function that turns the IoU Matrix of a clip into true positive and false positives values, as well as computing the precision, recall, and F1 statistics

matrix_IoU_Scores(Intersection_over_Union_Matrix,manual_piha_df,0.5)


Wrapper function that takes matrix_IoU_Scores across multiple clips. Allows user to modify the threshold that determines whether or not a label is a true positive.

stats_df = automated_labeling_statistics(automated_df,manual_df,stats_type = "IoU",threshold = 0.5)


Function that takes the IoU statistics output of automated_labeling_statistics and outputs a global count of true positives and false positives, as well as computing common metrics across the dataset

global_stats_df = global_statistics(stats_df)


All relevant audio from the PyHa tutorial can be found within the "TEST" folder. To replicate the results displayed in the GitHub repository, make sure the audio clips are located in a folder called "TEST" in the same directory path used in the Jupyter notebook tutorial.

All audio clips can be found on xeno-canto.org under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license: https://creativecommons.org/licenses/by-nc-sa/4.0/

The manual labels provided for this dataset are automatically downloaded as a .csv when the repository is cloned.
