cheyenne_h (FMA Admin)

cheyenne_h on 12/15/2016 at 02:07PM

FMA Data Set for Researchers Released

Oh, Flickr Commons, how I love thee. "Science Gossip" from flickr commons/Internet Archive Book Images. No known copyright restrictions.

A team of researchers recently released three data sets with music from the Free Music Archive. The team is from the Signal Processing Laboratory at the École polytechnique fédérale de Lausanne (EPFL) in Switzerland.

The FMA data sets consist of audio excerpts and metadata for a collection of songs from the Free Music Archive, and are intended for 'training' and 'testing' music information software. The sets provide a legal and up-to-date alternative to other available music data sets, which are outdated, sometimes inaccessible, and fettered by copyright concerns. By embracing the "some rights reserved" philosophy of Creative Commons, artists are making their music available not only for the public to listen to, but also for educational and research applications.

Kirell Benzi, a PhD student at the lab, said, "[R]esearchers now have access to large datasets for image processing, pushing science forward... However, because of copyright issues, Music Information Retrieval researchers had no equivalent. As a result, the image processing community's methods are more advanced than the ones for audio. Indeed, the former biggest dataset was the Million Song Dataset with Echonest features but no access to raw files. We noticed that Echonest features are subject to intellectual property [laws] and are outdated. Worse, the website Echonest for developers seems down for good, leaving MIR [Music Information Retrieval] researchers with the old GTZAN dataset of 1000 illegal mp3 excerpts. With the new FMA dataset, all these issues are history!"

"The main goal is to advance Machine Learning research for music, and we think this dataset can have a great impact," said Michaël Defferrard, another member of the team.

If you're interested in learning more about the rationale behind the project, or in seeing a quick survey of other data sets that are currently available to music researchers, you can read their article (PDF) here: https://arxiv.org/abs/1612.01840

The actual sets can be found here (small and medium are the only ones available at this time) and the code is on GitHub.

Thanks to Kirell Benzi, Michaël Defferrard, Pierre Vandergheynst and Xavier Bresson for sharing the FMA in a new way!
