Possibly corrupted files in fma_small #36

Closed
Samaritan1011001 opened this issue Apr 3, 2020 · 3 comments

Comments

@Samaritan1011001

I apologize if I missed a step or did not do something on my part. Thank you for the data and all the examples.

CNN training on the pre-processed audio files starts normally, but as soon as certain files are fetched, it stops with the following error:
Unknown: CalledProcessError: Command '['ffmpeg', '-i', 'path-to-dataset\\fma_small\\099\\099134.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-']' returned non-zero exit status 1.

Looking into this, I checked file 099134: my default audio player could not play it, and its metadata (in File Explorer) seems to be missing, as shown below.
image
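
A single file can also be checked outside the training loop. The following is a minimal sketch, assuming ffmpeg is on the PATH; the helper names `ffmpeg_decode_command` and `is_decodable` are hypothetical, and only the command itself is taken from the error above.

```python
import subprocess


def ffmpeg_decode_command(path):
    """Build the decode command that appears in the error message above."""
    return ['ffmpeg', '-i', path, '-f', 's16le',
            '-acodec', 'pcm_s16le', '-ac', '1', '-']


def is_decodable(path, build_command=ffmpeg_decode_command):
    """Return True if the decode command exits with status 0.

    build_command is a hypothetical hook so a different decoder
    command can be swapped in; it defaults to the ffmpeg call above.
    """
    result = subprocess.run(
        build_command(path),
        stdout=subprocess.DEVNULL,  # discard the decoded PCM stream
        stderr=subprocess.DEVNULL,  # discard ffmpeg's log output
    )
    return result.returncode == 0
```

Calling `is_decodable` on the path from the error should return `False` if the file really is corrupted. If ffmpeg itself is missing, `subprocess.run` raises `FileNotFoundError` rather than reporting the file as bad.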

@Samaritan1011001
Author

Librosa throws the error "File contains data in an unknown format" when trying to load that file, but works fine with other files. Other files, such as 133297, behave the same way.

Here is a gist to reproduce it:

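import utils  # helper module from this repository
import librosa
import os
import IPython.display as ipd

# Root directory of the extracted fma_small archive.
AUDIO_DIR = os.environ.get('AUDIO_DIR')

# Resolve track ID 99134 to its file path.
filename = utils.get_audio_path(AUDIO_DIR, 99134)
print('File: {}'.format(filename))

# This call fails for the affected file.
x, sr = librosa.load(filename, sr=None, mono=True)
print('Duration: {:.2f}s, {} samples'.format(x.shape[-1] / sr, x.size))

# Play seconds 7 to 17 of the clip.
start, end = 7, 17
ipd.Audio(data=x[start*sr:end*sr], rate=sr)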
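
To list every affected file rather than test one at a time, the same load call can be run over the whole directory. This is only a sketch: `find_unreadable` and `load_with_librosa` are hypothetical names, and it assumes that any exception raised by the loader means the file is unreadable.

```python
import os


def load_with_librosa(path):
    """Default loader, using the same librosa.load arguments as the gist above."""
    import librosa  # imported lazily so the scan logic works with any loader
    return librosa.load(path, sr=None, mono=True)


def find_unreadable(audio_dir, loader=load_with_librosa):
    """Walk audio_dir and return the sorted paths of .mp3 files the loader fails on."""
    failed = []
    for root, _dirs, files in os.walk(audio_dir):
        for name in files:
            if not name.lower().endswith('.mp3'):
                continue  # skip checksums, metadata and other non-audio files
            path = os.path.join(root, name)
            try:
                loader(path)
            except Exception as exc:  # assumption: any failure marks the file as bad
                print('Unreadable: {} ({})'.format(path, exc))
                failed.append(path)
    return sorted(failed)
```

Running `find_unreadable(AUDIO_DIR)` decodes every track, so it may take a while on the full dataset. The returned list can be saved and used to skip those tracks during training.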

@andimarafioti

This has already been reported in #8.
In other words, you're not alone.

@Samaritan1011001
Author

Oh yeah, right! Thanks.
I forgot to note down the names of the files I found to be corrupted, so this issue isn't of much use. I'll close it for now.
