mdeff / fma

FMA: A Dataset For Music Analysis

Home Page: https://arxiv.org/abs/1612.01840

Are genres sorted by importance?

kristijanbartol opened this issue

Hi, is the genres list for each track sorted by significance, or is the order random? That information would be valuable because it would let you say "This song is mostly jazz with elements of experimental rock and a bit of reggae". The task is probably too fuzzy to support strong claims, but relying even a little on the ordering seems better than having a collection of tags in random order.

The genres are not ordered by significance, as far as we know. The genres in the genres column appear in the order returned by the https://freemusicarchive.org API, so there might be some implicit order, e.g. if the API kept the order in which the artists entered them (though we don't know how different artists made their choice).

"This song is mostly jazz with elements of experimental rock and a bit of reggae"

That would be terrific, but I don't think it's achievable with the data at hand. Maybe something could be inferred from the listening patterns of users, or from the public playlists (I collected none of those). Anyway, the best way forward in my opinion would be to run a crowd-sourcing task and ask multiple people to rate each track with e.g. "predominantly <genre x>", "contains some of <genre y>", "contains none of <genre z>". With multiple annotators, we would get an accurate genre representation of each track with a measure of uncertainty. But that would cost quite some time and money.
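To make the crowd-sourcing idea concrete, here is a minimal sketch of how such ratings could be aggregated for one track. Everything in it is an assumption for illustration: the label names, the numeric scores assigned to them (1.0, 0.5, 0.0), the `aggregate_ratings` helper, and the toy ratings are hypothetical, and no such annotations exist in the dataset. The mean gives a genre strength and the standard deviation gives a rough measure of annotator disagreement.

```python
from statistics import mean, pstdev

# Assumed mapping from annotation labels to numeric scores (hypothetical).
SCORES = {"predominantly": 1.0, "some": 0.5, "none": 0.0}


def aggregate_ratings(ratings):
    """Aggregate (genre, label) ratings from several annotators for one track.

    Returns {genre: (mean_score, std_dev)}, where the standard deviation
    is a simple proxy for how much the annotators disagree.
    """
    by_genre = {}
    for genre, label in ratings:
        by_genre.setdefault(genre, []).append(SCORES[label])
    return {g: (mean(v), pstdev(v)) for g, v in by_genre.items()}


# Made-up ratings from three annotators for a single track.
toy_ratings = [
    ("Jazz", "predominantly"), ("Jazz", "predominantly"), ("Jazz", "some"),
    ("Reggae", "some"), ("Reggae", "none"), ("Reggae", "none"),
]

for genre, (strength, spread) in aggregate_ratings(toy_ratings).items():
    print(f"{genre}: strength={strength:.2f}, disagreement={spread:.2f}")
```

Other aggregation schemes (majority vote, or a model of annotator reliability) would work too; the mean and spread are just the simplest choice.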

Sure, that's a pity. It is convenient to define a classification problem using ordered tags. Nonetheless, I will try training the model treating all the given genres equally, and I will also try to introduce some bias, for example by putting higher importance on tags that are used more frequently in the dataset, or similar... Thank you!
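As a rough sketch of the frequency-based bias mentioned above, one option is to weight each of a track's genres in proportion to how many tracks in the dataset carry that tag. The `frequency_weights` helper and the toy mapping of track IDs to genre lists are hypothetical; the sketch assumes the genres have already been loaded into a plain dictionary and does not use the dataset's own loading code.

```python
from collections import Counter


def frequency_weights(track_genres):
    """Weight each track's genres by dataset-wide tag frequency.

    track_genres: {track_id: [genre, ...]}
    Returns {track_id: {genre: weight}}, with weights summing to 1 per track.
    """
    # Count in how many tracks each genre appears.
    counts = Counter(g for genres in track_genres.values() for g in set(genres))
    weights = {}
    for track_id, genres in track_genres.items():
        unique = sorted(set(genres))
        total = sum(counts[g] for g in unique)
        weights[track_id] = {g: counts[g] / total for g in unique} if total else {}
    return weights


# Made-up example data, not taken from the dataset.
toy_tracks = {
    1: ["Jazz", "Experimental Rock", "Reggae"],
    2: ["Jazz"],
    3: ["Jazz", "Reggae"],
}

for track_id, w in frequency_weights(toy_tracks).items():
    print(track_id, w)
```

Treating all genres equally corresponds to replacing the counts with a constant, so both variants can be compared with the same training code by using the weights as soft targets.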