PiSchool/spoken-language-id

Spoken Language Identification from Short Utterances

This is a model for identifying the language spoken in a short audio segment.

Installation

To install the required libraries (tested on Ubuntu 17.10), run:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Predicting the language in an audio file

  1. Convert an audio file to a spectrogram:

     python data/dataset_gen.py -z speech.wav -o .
    
  2. Obtain the prediction using a pre-trained model:

     python main.py --model-dir your-trained-model/ --params your-trained-model/params.json --model combo --predict speech.png
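
The two steps above can be chained from Python. The following is only a minimal sketch, not part of the repository: the helper names `build_commands` and `predict_language` are hypothetical, and it assumes that `data/dataset_gen.py` writes the spectrogram into the output folder under the input's base name with a `.png` extension (as the `speech.wav` / `speech.png` pair above suggests).

```python
import subprocess
from pathlib import Path


def build_commands(wav_path, model_dir, out_dir="."):
    """Return the two command lines from the steps above as argument lists.

    Hypothetical helper: only the flags shown in this README are used.
    """
    wav = Path(wav_path)
    model = Path(model_dir)
    # Assumption: the spectrogram lands in out_dir as <base name>.png.
    png = Path(out_dir) / (wav.stem + ".png")

    convert = ["python", "data/dataset_gen.py", "-z", str(wav), "-o", str(out_dir)]
    predict = [
        "python", "main.py",
        "--model-dir", str(model),
        "--params", str(model / "params.json"),
        "--model", "combo",
        "--predict", str(png),
    ]
    return convert, predict


def predict_language(wav_path, model_dir, out_dir="."):
    """Run the conversion, then the prediction; raise if either step fails."""
    for cmd in build_commands(wav_path, model_dir, out_dir):
        subprocess.run(cmd, check=True)
```

Calling `predict_language("speech.wav", "your-trained-model/")` from the repository root should then be equivalent to running the two commands by hand, provided the output-name assumption holds.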
    

Training the model from scratch

  1. Prepare a dataset:

    • Place your spectrograms in a folder
    • Create a training set CSV file containing "Filename,Language" pairs
    • Create an evaluation set CSV file (same format as the training set)
  2. Train the model:

     python main.py --model-dir your-trained-model/ --params your-trained-model/params.json --model combo --image-dir your-data/ --train-set your-data/train-set.csv --eval-set your-data/eval-set.csv
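
The two CSV files from step 1 could be produced with a short script like the one below. This is a sketch under stated assumptions rather than the project's own tooling: `split_pairs` and `write_csv` are hypothetical names, the 80/20 split is arbitrary, and whether `main.py` expects a `Filename,Language` header row is not stated here, so the header is optional and off by default.

```python
import csv
import random


def split_pairs(pairs, eval_fraction=0.2, seed=0):
    """Shuffle (filename, language) pairs and split them into train and eval lists."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_eval = int(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


def write_csv(path, pairs, header=False):
    """Write one "Filename,Language" row per pair.

    Assumption: the header row is optional; enable it only if the loader needs it.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        if header:
            writer.writerow(["Filename", "Language"])
        writer.writerows(pairs)
```

For example, `train, held_out = split_pairs(pairs)` followed by `write_csv("your-data/train-set.csv", train)` and `write_csv("your-data/eval-set.csv", held_out)` yields the files passed to `--train-set` and `--eval-set` above.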
    

Author

This project was developed by Rimvydas Naktinis during Pi School's AI programme in Fall 2017.
