19 January 2011

The Music of FluidDB I: Albums, Tracks and Songs

I have been thinking for a while about what conventions to use for tagging various kinds of musical entities in FluidDB. The kinds of things I have in mind include recordings of music, pieces of music (compositions), artists and composers. My firmest conclusion so far is that it’s complicated and I can’t tackle it all in one go.

In particular, classical music feels very complicated to me. A classical “record” commonly contains recordings of several pieces with somewhat variable names, often by different composers, and often played by a somewhat fluid and ambiguous collection of musicians.

In this post, therefore, I’m going to try to tackle what feels like a simpler problem by restricting myself to considering non-classical music and three kinds of entities—albums, tracks and songs.

My basic suggestion is to adopt conventions very similar to those I have been championing for books, in the form of the book-1 convention.

Books (Recap)

Recall that the book-1 convention for about tags for books in English has the following basic components:

  • the prefix book:
  • the title of the book, normalized using NACO-like conventions, which standardize to lower case, remove most punctuation and accents and regularize spacing;
  • the author, again normalized in a NACO-like manner, in parentheses.

For example, Alice in Wonderland, by Lewis Carroll, uses the about tag

book:alice in wonderland (lewis carroll)

So far this convention seems to have worked quite well. Its virtues include:

  • it is simple to construct with only easily available information (the stuff you can see if you have the book or a normal reference to it)
  • it is unique for almost all books
  • it is clearly identified as a book (and thus disambiguated from a film, for example).

The next stage beyond a single-author book is multi-author books, and there the convention is simply to list the authors, in the order they appear on the book, separated by semicolons. For example, The Feynman Lectures on Physics, by Richard P. Feynman, Robert B. Leighton and Matthew Sands uses the about tag:

book:the feynman lectures on physics (richard p feynman; robert b leighton; matthew sands)
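As a rough illustration, the book-1 recipe can be sketched in a few lines of Python. This is not the abouttag library's code: normalize and book_about are hypothetical names, and the normalization is only an approximation of the NACO-like rules (lower case, accents stripped, apostrophes dropped, other punctuation turned into spaces). It assumes Latin-alphabet input; letters that do not decompose into a base letter plus an accent would simply be lost.

```python
import re
import unicodedata


def normalize(text):
    """Rough NACO-like normalization (a sketch, not the abouttag library's code)."""
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    lowered = no_accents.lower()
    # Apostrophes (straight or curly) are deleted rather than replaced by a space.
    no_apostrophes = re.sub("['\u2018\u2019]", "", lowered)
    # Any other run of punctuation or spacing becomes a single space; '&' is kept.
    spaced = re.sub(r"[^a-z0-9&]+", " ", no_apostrophes)
    return spaced.strip()


def book_about(title, authors):
    """Build a book-1 style about tag; authors are given in cover order."""
    names = "; ".join(normalize(a) for a in authors)
    return "book:%s (%s)" % (normalize(title), names)


print(book_about("Alice in Wonderland", ["Lewis Carroll"]))
print(book_about("The Feynman Lectures on Physics",
                 ["Richard P. Feynman", "Robert B. Leighton", "Matthew Sands"]))
```

The two calls reproduce the about tags shown above for Alice in Wonderland and The Feynman Lectures on Physics.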

Albums, Tracks and Songs

Recorded non-classical music consists primarily of albums—a named collection of tracks, normally purchased together—and individual tracks, sometimes known as singles or songs.

At the simplest level, the conventions I am going to propose for about tags for albums and tracks are very similar to those for books but using the prefixes album: and track:. So the album, The Dark Side of the Moon, by Pink Floyd, is

album:the dark side of the moon (pink floyd)

and the track The Great Gig in the Sky, from that same album, is

track:the great gig in the sky (pink floyd)

But there are a number of points to discuss.

Albums

The suggested about tag for albums is fairly straightforward. The main complication/ambiguity I can see concerns multi-volume sets. So, on vinyl, for example, Neil Young’s Decade has three disks; and it is a double CD. This is quite an easy case: I think we ignore the ‘disk’ number entirely and just regard double and triple albums as albums. So all of Decade is:

album:decade (neil young)

For multi-volume collections that are normally sold separately, simply include the volume number. So, for example, The Tatum Group Masterpieces Volume 1, by Art Tatum, Benny Carter, Louis Bellson, becomes

album:the tatum group masterpieces volume 1 (art tatum; benny carter; louis bellson)

The NACO-like normalization conventions were described in this post and are implemented in the abouttag library.

The handling of artists is in principle quite simple, though in practice slightly hard to automate completely. My suggestion is that whenever there is a list of musicians, as with authors, they are simply separated with semicolons (and a space); any ampersands or ands are removed. In the case of groups, the group name is simply used. The interesting and slightly troubling cases are those where a group combines with a person. The most common case of this is exemplified by Diana Ross and the Supremes. My suggestion is that such cases are left intact, other than normalization, using ‘and’ rather than ampersand (&). So the album “Reflections” becomes

album:reflections (diana ross and the supremes)

There are probably awkward corner cases, but I think this handles most.

The biggest problem I foresee is that it will be hard to automate the construction of the standard form of an artist from something like iTunes metadata because the input (from Gracenote) doesn’t separate out a list of artists in any remotely consistent way, so I think standardizing them will require a degree of human intervention. This is not, however, in any way particular to this suggested convention; it’s fundamentally to do with the fact that some artists are identified as a list of people, and others have a group name, and telling these apart is hard, even without complications such as the band Alice Cooper!
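To make the artist rules concrete, here is a minimal sketch in Python. It sidesteps the hard part by assuming a human has already split the credit into a list, with one entry per person or group; it makes no attempt to guess whether "X & Y" is two people or one act. The names normalize, artist_part and album_about are hypothetical and are not taken from the abouttag library.

```python
import re
import unicodedata


def normalize(text):
    """Rough NACO-like normalization (hypothetical helper, not abouttag's)."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c)).lower()
    text = re.sub("['\u2018\u2019]", "", text)       # drop apostrophes
    return re.sub(r"[^a-z0-9&]+", " ", text).strip()  # other punctuation -> space


def artist_part(artists):
    """Join a hand-curated list of artists; each entry is one person or one group."""
    cleaned = []
    for artist in artists:
        # Inside a single entry (e.g. "Diana Ross & The Supremes"),
        # an ampersand is spelled out as 'and' and the entry is left intact.
        cleaned.append(normalize(artist.replace("&", " and ")))
    return "; ".join(cleaned)


def album_about(title, artists):
    """Build an album about tag from a title and a curated artist list."""
    return "album:%s (%s)" % (normalize(title), artist_part(artists))


print(album_about("Reflections", ["Diana Ross & The Supremes"]))
print(album_about("A Matter Of Time", ["Gordon Giltrap", "Martin Taylor"]))
```

Passing "Gordon Giltrap & Martin Taylor" as a single entry would instead yield "gordon giltrap and martin taylor", which is exactly the decision the sketch leaves to the person supplying the list.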

Here are a few examples of the sorts of album about tags I’m suggesting:

  • The Black Balloon, by John Renbourn album:the black balloon (john renbourn)
  • The Composer, by Thelonious Monk album:the composer (thelonious monk)
  • Fleetwood Mac, by Fleetwood Mac album:fleetwood mac (fleetwood mac)
  • Wu Wei, by Pierre Bensusan album:wu wei (pierre bensusan)
  • The Tatum Group Masterpieces Volume 1, by Art Tatum, Benny Carter, Louis Bellson album:the tatum group masterpieces volume 1 (art tatum; benny carter; louis bellson)
  • Ms. Right, by Duck Baker album:ms right (duck baker)
  • ‘Round About Midnight, by The Miles Davis Quintet album:round about midnight (the miles davis quintet)
  • A Matter Of Time, by Gordon Giltrap & Martin Taylor album:a matter of time (gordon giltrap; martin taylor)
  • Musiques / Solilaï, by Pierre Bensusan album:musiques solilai (pierre bensusan)
  • Live Au New Morning, by Bensusan & Malherbe album:live au new morning (bensusan; malherbe)
  • Eye To The Telescope, by KT Tunstall album:eye to the telescope (k t tunstall)
  • Grace & Danger, by John Martyn album:grace & danger (john martyn)
  • Alas, I Cannot Swim, by Laura Marling album:alas i cannot swim (laura marling)
  • Lady In Autumn: The Best Of The Verve Years, by Billie Holiday album:lady in autumn the best of the verve years (billie holiday)

Tracks

I was originally minded to suggest using song: as the prefix for individual album tracks, notwithstanding the fact that this is slightly inappropriate for instrumental pieces. This was until I realised that we will certainly want to have entries for songs themselves (independent of artist) in FluidDB. Given this, I think we have little choice but to fall back to track, which is perhaps more appropriate anyway.

I think there are a couple of points to be made about tracks. The first is that I do not propose to tie them to albums. Thus if an artist records a track (piece/song), I suggest that in the common case we don’t distinguish between different records. When you talk about Billie Holiday’s recording of God Bless the Child, you actually talk about all her records of that song, in the general case.

track:god bless the child (billie holiday)

Similarly, if, as is quite common, a track is qualified by (live) or [live], I suggest that be omitted in the standard case.

The other reasonably common complication, particularly for folk music, is the medley. In this case, my suggestion is just to hand the track name to the NACO-like normalization routine and use what it produces. In most cases, this works fine.

To try to illustrate lots of common cases, here is a fairly long list of examples:

  • Rhythm-a-Ning, by Thelonious Monk track:rhythm a ning (thelonious monk)
  • Round Midnight, by Thelonious Monk track:round midnight (thelonious monk)
  • Straight, No Chaser, by Thelonious Monk track:straight no chaser (thelonious monk)
  • Bourrée I and II, by John Renbourn track:bourree i and ii (john renbourn)
  • Medley: The Mist Covered Mountains of Home / The Orphan / Tarboulton, by John Renbourn track:medley the mist covered mountains of home the orphan tarboulton (john renbourn)
  • Monday Morning, by Fleetwood Mac track:monday morning (fleetwood mac)
  • Poussière d’Amants, by Pierre Bensusan track:poussiere damants (pierre bensusan)
  • Doherty’s - Return to Milltown - Tommy People’s, by Tony McManus track:dohertys return to milltown tommy peoples (tony mcmanus)
  • Jackie Coleman’s - The Milliner’s Daughter - Rakish Paddy - Connor Dunn’s, by Tony McManus track:jackie colemans the milliners daughter rakish paddy connor dunns (tony mcmanus)
  • Blues in C, by Art Tatum, Benny Carter, Louis Bellson track:blues in c (art tatum; benny carter; louis bellson)
  • S’Wonderful, by Art Tatum, Benny Carter, Louis Bellson track:swonderful (art tatum; benny carter; louis bellson)
  • Makin’ Whoopee, by Art Tatum, Benny Carter, Louis Bellson track:makin whoopee (art tatum; benny carter; louis bellson)
  • (I’m Left With the) Blues in my Heart, by Art Tatum, Benny Carter, Louis Bellson track:im left with the blues in my heart (art tatum; benny carter; louis bellson)
  • The Nine Maidens a. Clarsach b. The Nine Maidens c. The Fiddler, by John Renbourn track:the nine maidens a clarsach b the nine maidens c the fiddler (john renbourn)
  • Ms. Right, by Duck Baker track:ms right (duck baker)
  • ‘Round Midnight, by The Miles Davis Quintet track:round midnight (the miles davis quintet)
  • Ah-Leu-Cha, by The Miles Davis Quintet track:ah leu cha (the miles davis quintet)
  • Across The Pond, by Gordon Giltrap & Martin Taylor track:across the pond (gordon giltrap; martin taylor)
  • G & T Blues, by Gordon Giltrap & Martin Taylor track:g & t blues (gordon giltrap; martin taylor)
  • Abide With Me / Old Gloryland, by Stefan Grossman & John Renbourn track:abide with me old gloryland (stefan grossman; john renbourn)
  • Badhra, by Anouar Brahem, John Surman, Dave Holland track:badhra (anouar brahem; john surman; dave holland)
  • Biodag Aig Mac Thomais/The Nine Pint Coggie/The Spike Island Lasses, by Tony McManus track:biodag aig mac thomais the nine pint coggie the spike island lasses (tony mcmanus)
  • Three Pieces By O’Carolan;The Lamentation Of Owen Roe O’Neill; Lord Inchiquin; Mrs Power (O’Carlan’s Concerto), by John Renbourn track:three pieces by ocarolan the lamentation of owen roe oneill lord inchiquin mrs power ocarlans concerto (john renbourn)
  • Heman Dubh, by Pierre Bensusan track:heman dubh (pierre bensusan)
  • Le Voyage pour L’Irelande, by Pierre Bensusan track:le voyage pour lirelande (pierre bensusan)
  • 50 Ways To Leave Your Lover, by Paul Simon track:50 ways to leave your lover (paul simon)
  • La Danse Du Capricorne 1, by Pierre Bensusan track:la danse du capricorne 1 (pierre bensusan)
  • Reels - “The Pure Drop”/”The Flax In Bloom”, by Pierre Bensusan track:reels "the pure drop" "the flax in bloom" (pierre bensusan)
  • Mille Vallées, by Bensusan & Malherbe track:mille vallees (bensusan; malherbe)
  • Bamboo Shoot (Improvisation), by Bensusan & Malherbe track:bamboo shoot improvisation (bensusan; malherbe)
  • Black Horse And The Cherry Tree, by KT Tunstall track:black horse and the cherry tree (k t tunstall)
  • Universe & U, by KT Tunstall track:universe & u (k t tunstall)
  • Sigmund Freud’s Impersonation Of Albert Einstein In America, by Randy Newman track:sigmund freuds impersonation of albert einstein in america (randy newman)
  • Mr. President (Have Pity On The Working Man), by Randy Newman track:mr president have pity on the working man (randy newman)
  • I Love L.A., by Randy Newman track:i love l a (randy newman)
  • The Blues, by Randy Newman track:the blues (randy newman)
  • Through-Us-All, by Isaac Guillory track:through us all (isaac guillory)
  • A Terrible Pickle, by Dean Friedman track:a terrible pickle (dean friedman)
  • Money, by Pink Floyd track:money (pink floyd)
  • Take Five, by Dave Brubeck Quartet track:take five (dave brubeck quartet)
  • Pirates (So Long Lonely Avenue), by Rickie Lee Jones track:pirates so long lonely avenue (rickie lee jones)
  • The Returns, by Rickie Lee Jones track:the returns (rickie lee jones)
  • Chuck E’s In Love, by Rickie Lee Jones track:chuck es in love (rickie lee jones)
  • Harry’s House/Centerpiece, by Joni Mitchell track:harrys house centerpiece (joni mitchell)
  • I’s A Muggin’ (Rap), by Joni Mitchell track:is a muggin rap (joni mitchell)
  • Miles Beyond, by Mahavishnu Orchestra track:miles beyond (mahavishnu orchestra)
  • A Surfer Courted Me, by Martha Tilston and the Woods track:a surfer courted me (martha tilston and the woods)
  • Lookin’ On, by John Martyn track:lookin on (john martyn)
  • The Captain And The Hourglass, by Laura Marling track:the captain and the hourglass (laura marling)
  • Le Chien Sur Les Genoux de la Devineresse, by Anouar Brahem, Barbaros Erkose, Kudsi Erguner & Lassad Hosni track:le chien sur les genoux de la devineresse (anouar brahem; barbaros erkose; kudsi erguner; lassad hosni)
  • A Prayer, by Madeleine Peyroux track:a prayer (madeleine peyroux)
  • Was I?, by Madeleine Peyroux track:was i (madeleine peyroux)
  • (I Got A Man Crazy For Me) He’s Funny That Way, by Billie Holiday track:i got a man crazy for me hes funny that way (billie holiday)
  • Lover Man (Oh, Where Can You Be?), by Billie Holiday track:lover man oh where can you be (billie holiday)
  • St. Louis Blues, by Billie Holiday track:st louis blues (billie holiday)
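The track rules can be sketched in Python along the following lines: drop a trailing (live) or [live] qualifier, hand the rest of the title (medleys included) to the normalizer, and ignore the album entirely. As before, this is an illustrative sketch under my own assumptions, not the abouttag library's implementation; normalize, strip_live and track_about are hypothetical names, and the artists are assumed to arrive as an already separated list.

```python
import re
import unicodedata


def normalize(text):
    """Rough NACO-like normalization (hypothetical helper, not abouttag's)."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c)).lower()
    text = re.sub("['\u2018\u2019]", "", text)       # drop apostrophes
    return re.sub(r"[^a-z0-9&]+", " ", text).strip()  # other punctuation -> space


def strip_live(title):
    """Drop a trailing (live) or [live] qualifier, ignoring case."""
    return re.sub(r"\s*[\(\[]\s*live\s*[\)\]]\s*$", "", title, flags=re.IGNORECASE)


def track_about(title, artists):
    """Build a track about tag; the album the track came from plays no part."""
    names = "; ".join(normalize(a) for a in artists)
    return "track:%s (%s)" % (normalize(strip_live(title)), names)


print(track_about("God Bless the Child [live]", ["Billie Holiday"]))
print(track_about("Harry\u2019s House/Centerpiece", ["Joni Mitchell"]))
```

Only a qualifier at the very end of the title is removed, so a parenthesised part of the name such as (Improvisation) is kept and normalized with the rest of the title.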

Songs

[UPDATE 2011/01/19: I have modified this recommendation since it was first posted, after thinking more about the lack of consistency in how composers are identified.]

I have given less thought to songs (as distinct from tracks, or recordings of songs), but the obvious convention would seem to be to use the song: prefix, followed by the normalized song title, followed by the composer or composers in brackets, again in whatever order they are normally listed. The only real complication I can see there is the fairly common case in which music and lyrics are given separate credits. In that case, I suggest simply listing the music composer ahead of the lyrics composer.

The slightly subtle question concerns how to standardize the composer’s name. In the case of artists (and authors) my normal recommendation is to start from the name as it appears on the work, so John Martyn, J. D. Salinger etc. This works well because there is a well-defined, standard place to look: you just have to look at the work to see how the name is written.

Composers are more awkward, because it is much less clear where to look. If you own a record, the easy thing to do is to look at the sleeve, or the liner notes, or sometimes on the record (or CD) itself. But the same song can be recorded many times and the composer won’t always be displayed consistently. You could also look at the sheet music. Or in Wikipedia. In short, there is no consistency. A quick look through the first half dozen makes it clear there’s not even consistency on a single CD in many cases.

In this case, therefore, my recommendation is to use surnames only. So in a simple case, Summertime by George Gershwin, is

song:summertime (gershwin)

The Lennon/McCartney partnership would produce, for example

song:hey jude (lennon; mccartney)

A case in which lyrics and music are credited separately would be Officer Krupke, from West Side Story, by Leonard Bernstein (music) and Stephen Sondheim (lyrics). So this would be:

song:officer krupke (bernstein; sondheim)

The reason I’ve gone for surname only is that it seems to involve very little loss of precision (it will be rare indeed for two songs with the same title to have different composers with the same surname but different forenames), and to use the smallest amount of information that is commonly available. I think this is probably a fairly good convention.
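A minimal Python sketch of the song convention follows. It makes one loudly naive assumption: the surname is taken to be the last word of the composer's name, which will be wrong for some names (multi-word surnames, for instance) and would need checking by hand. The names normalize, surname and song_about are hypothetical and do not come from the abouttag library. Composers are passed in credit order, with the music composer ahead of the lyricist.

```python
import re
import unicodedata


def normalize(text):
    """Rough NACO-like normalization (hypothetical helper, not abouttag's)."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c)).lower()
    text = re.sub("['\u2018\u2019]", "", text)       # drop apostrophes
    return re.sub(r"[^a-z0-9&]+", " ", text).strip()  # other punctuation -> space


def surname(name):
    """Naive assumption: the surname is the last whitespace-separated word."""
    return name.split()[-1]


def song_about(title, composers):
    """Build a song about tag from a title and composers (music first, then lyrics)."""
    names = "; ".join(normalize(surname(c)) for c in composers)
    return "song:%s (%s)" % (normalize(title), names)


print(song_about("Summertime", ["George Gershwin"]))
print(song_about("Hey Jude", ["John Lennon", "Paul McCartney"]))
```

Because only the surname is used, it does not matter whether the source credits "George Gershwin", "G. Gershwin" or just "Gershwin"; all three give the same about tag.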

Comments Invited

As ever, I’d be interested in thoughts from anyone, in the blog comments or directly. I haven’t pushed an updated version of the abouttag library containing these to github yet, but will probably do so in a few days unless there is significant push-back.

6 comments:

  1. I'd recommend looking at musicbrainz.org, which has solved a lot of the same problems already, and even if you decide to make different decisions, they make their enormous data-set available for free, so you could use it to import a lot of this information from.

    Things are, as you say, complicated, and I think the above barely scratches the surface. While classical music does have some unique metadata challenges, I wouldn't say that popular music is any simpler, when you consider remixes, mash-ups (often listing 'Artist A vs. Artist B' as a track's artist,) live recordings, bootlegs, and albums that are released with different track listings in different countries.

    Musicbrainz may not have solved all of these problems conclusively, but they're the closest thing to perfection I've seen, and they are continually addressing new issues.

  2. thisfred

    Wow, what a fabulous-looking site. Thanks so much for pointing it out. As you say, that looks like a tremendous resource, and licensing looks like it would be easy to get a lot of very useful stuff very painlessly. I haven't immediately seen whether they have a way of identifying tracks/albums etc. that would be suitable for an about tag, but it certainly merits looking into.

    You're obviously right when you say that there is more than the simple scheme I discuss here caters for perfectly, even for non-classical music, but I think some of it comes down to what level you want to work at. FluidDB can cater for lots of different levels, and I imagine that we will end up with different conventions for different "levels" of the music hierarchy. The case I wanted to tackle first is the one that I guess most people will care about most of the time. I think that is the conceptual work --- you know, Dark Side of the Moon by Pink Floyd or Summertime by Billie Holiday or whatever. Clearly, you could (entirely validly) have different objects for different recordings of Summertime by Billie Holiday, or even for (say) the original and the re-mastered version of Dark Side of the Moon, or even the CD vs. the LP etc. I'm not wanting to discourage this at all, but that's not really what the convention I'm discussing here for about tags is trying to do. (Similarly, for books, you could distinguish the paperback from the hardback, and all the different publishers and editions, but the book-1 convention deliberately operates at the level of the "work" because, for most purposes, it seems more useful to combine them. I accept that the case is less clear-cut for music, but in general I still think the same is true.)

    Thanks again for a really useful comment. I'll look at Musicbrainz some more.

    Nick

  3. Someone on IRC encouraged me to look more at your blog for info about standards or conventions for listing books in Fluidinfo. It's really been helpful, and I plan on trying out fdb soon to see what it can do.

    So far I've noticed that you've described conventions for many things, including books, songs, albums, tracks, but I was wondering if you've thought about films. I did separate queries to Fluidinfo for 'fluiddb/about matches "film"' and '..."movie"' but both of those came up with a ton of urls to BoingBoing and other sites, but few if any films.

    Using your book conventions, I created an object film:pulp fiction (quentin tarantino), using the director in place of the author. It's a start, but it might be nice to know if there are any thoughts or existing conventions. I've got about 900+ movies I want to add then tag for a possible app.

  4. Michael H:

    I haven't specifically thought about films, and I'm not aware of anyone's having systematically put any in, but, as you might expect, I like the film convention you suggest and if you use it and let me know I'll (1) add it to the reference page (2) add it to the about tag library and (3) (perhaps) add a category to about tag app (slightly more work, but not too much).

    Is it the best convention? I think it's great, but there are probably two alternatives worth considering. Clearly, the film title isn't enough, and adding the director seems very natural. A clear alternative (perhaps in more common use outside FluidDB) is to add the release year instead of the director 'film:pulp fiction (1994)'; I'm not saying that would be better, but it would certainly be an alternative. And I guess the third practical option would be to use some kind of IMDB identifier --- probably the page URL or the ID part of it. I don't like that nearly as much because it requires a specific look-up, whereas I think a great property for about tags is that they can be constructed for information you would naturally have for an object. Title clearly fits this, and director and year are pretty natural too.

    Good luck, and let me know what you decide and I'll do some of the above (and blog too, perhaps).

    Nick

  5. Wow, that was a quick response ;-)

    I think identifying films by year would probably be a good idea. Though I tend to think of films by their title, then by the director, some have multiple directors. One film, "Paris, je t'aime" came out a few years ago and it has over twenty directors. I don't know if there's a limit on the amount of data that can be in an about tag.

    An IMDB identifier might be good for a separate tag, but I don't think it would be great for an about tag. I've come across some films that aren't in IMDB, though it's rare. An IMDB identifier is almost like an ISBN. Not all books have ISBN's and not all movies make it into IMDB.

  6. Michael H:

    Right, the multiple directors is a good one and I agree that an IMDB identifier is better as a separate tag. So title + year does seem good (though, like you, I could name a lot more film directors than years...in fact, I'm not sure I could get the year of any film correct off the top of my head).

    I suppose people from some continents might think 'movie:' was a better prefix than 'film:', but I'm not from that continent ;-)
