[Cross-posted from http://www.open.ac.uk/blogs/vem/2014/06/exploring-genre-on-soundcloud-part-ii/]
In my previous post on this topic, I introduced a problem (how to understand the work that explicit genre categorisations are made to do by people uploading tracks to the SoundCloud audio-sharing website) and a potential solution: identifying the three categories most frequently used by each individual in a sample, then studying regularities in the ways in which pairs of categories tend to turn up within the same group of three. I also presented some partial and preliminary findings in the form of a matrix comparing co-occurrences of the five genre categories most frequently used by people within an initial sample. And I either glossed over or left unmentioned a slew of problems, some of which we’ve so far been more successful in addressing than others (these are only blog posts, and we haven’t finished the research yet). The biggest problem is the sample itself: the analysis was done on the basis of a snowball sample, when a random sample would be more appropriate. Hence the provisionality of all this. The analysis will be redone soon on the basis of a sample that will enable us to make more robust claims, but in the meantime I wanted to share our thought processes and working methods with the world because, quite apart from anything else, I’m excited about the patterns that are emerging.
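For readers who find code easier to follow than prose, the approach recapped above (take each uploader's three most-used genre categories, then count how often pairs of categories share a group of three) might be sketched roughly as follows. This is an illustration only, not the code used in the research: the function names, the `uploads` dictionary, the user names and the genre labels are all invented for the example, and ties between equally frequent categories are broken arbitrarily.

```python
from collections import Counter
from itertools import combinations


def top_genres(track_genres, k=3):
    """Return the k genre labels one uploader used most often."""
    return [genre for genre, _ in Counter(track_genres).most_common(k)]


def cooccurrence_counts(uploads, k=3):
    """Count how often each pair of genres shares an uploader's top-k group."""
    pair_counts = Counter()
    for track_genres in uploads.values():
        # Sort so that each pair is always stored in the same order.
        top = sorted(top_genres(track_genres, k))
        pair_counts.update(combinations(top, 2))
    return pair_counts


def cooccurrence_matrix(pair_counts, genres):
    """Lay the pair counts out as a symmetric matrix over the chosen genres."""
    return [
        [0 if a == b else pair_counts[tuple(sorted((a, b)))] for b in genres]
        for a in genres
    ]


# Invented toy data: one list of genre labels per (hypothetical) uploader.
uploads = {
    "user_a": ["house", "house", "house", "techno", "techno", "ambient"],
    "user_b": ["techno", "techno", "techno", "house", "house", "dubstep"],
    "user_c": ["ambient", "ambient", "ambient", "house", "house", "house",
               "techno", "techno", "hiphop"],
}

pair_counts = cooccurrence_counts(uploads)
matrix = cooccurrence_matrix(pair_counts, ["house", "techno", "ambient"])
for row in matrix:
    print(row)
```

With this toy data, "house" and "techno" appear together in all three uploaders' groups of three, so their cell in the matrix is 3. A real analysis would also have to decide how to handle people who use fewer than three categories, which this sketch simply lets through with a smaller group.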