This repository contains the CSV files for the processed dataset used to train VoiceLDM. These files include the transcriptions generated using the Whisper model.

Speech Segments

  • as_speech_en.csv
  • cv1.csv (cv.csv is split into cv1.csv and cv2.csv because of GitHub's file size limit.)
  • cv2.csv
  • voxceleb.csv
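
Since cv.csv is distributed as two parts, it may be convenient to join them back into a single file before use. The following is a minimal sketch using only the Python standard library. It assumes that each part starts with its own header row, which this README does not confirm; pass `each_part_has_header=False` if only cv1.csv carries the header. The function name `merge_csv_parts` and the output filename are illustrative, not part of this repository.

```python
import csv
from pathlib import Path


def merge_csv_parts(part_paths, out_path, each_part_has_header=True):
    """Concatenate CSV parts into one file.

    Returns the number of rows written, counting the header row if present.
    ASSUMPTION: when each_part_has_header is True, every part begins with
    the same header row, and only the first copy is kept.
    """
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for part_index, part_path in enumerate(part_paths):
            with open(part_path, newline="", encoding="utf-8") as in_file:
                for row_number, row in enumerate(csv.reader(in_file)):
                    # Skip the repeated header of every part after the first.
                    if each_part_has_header and part_index > 0 and row_number == 0:
                        continue
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


if __name__ == "__main__":
    # Only runs the merge when both parts are present in the working directory.
    if Path("cv1.csv").exists() and Path("cv2.csv").exists():
        total = merge_csv_parts(["cv1.csv", "cv2.csv"], "cv.csv")
        print(f"wrote {total} rows to cv.csv")
```

Using the `csv` module rather than raw line concatenation keeps quoted fields intact, which matters if any transcription contains commas or line breaks.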

Non-Speech Segments

  • as_noise.csv
  • noise_demand.csv

Evaluation Segments

I've also included the CSV file for the ac_filtered test set, which was used to evaluate VoiceLDM.

  • ac_filtered.csv