09 June 2011

The Music of Fluidinfo II

I pushed an update to the abouttag.py library last night; it now includes some conventions and normalization for some music-related items. This supports work by Eric Seidel (@gridaphobe), who is looking at importing data from MusicBrainz to Fluidinfo.

These are similar (but not identical) to the ideas I proposed previously in the post The Music of FluidDB I: Albums, Tracks and Songs.

The first two kinds of things covered are works: named albums and named tracks. As with books, this conceptual work seems like the single most important level to represent in Fluidinfo. So there may be many different issues, editions and pressings of The Dark Side of the Moon by Pink Floyd, but there is only one work. Even more clearly, Billie Holiday may have recorded God Bless the Child a number of times (I’m playing through several as I type this), but there is only one conceptual work God Bless the Child sung by Billie Holiday.

  • Albums (as works). The convention for this, called album-u, is similar, but not identical, to the convention for books, having the general form

    album:name of album (artist)

    The name of the album and the artist are normalized using the usual normalize function from abouttag.py, removing some punctuation, regularizing spacing and converting to lower case, but not removing accents. (I think it’s increasingly clear that removing accents was a mistake in book-1; I’m now recommending using the related book-u convention, which is like book-1 except that it preserves accents.)

    The other main difference between the book-u convention and the album-u convention is that in the case of books, multiple authors are consolidated into a standard list, separated by semicolons (in fact, a semicolon followed by a space). This works less well for music, where artist credits take more varied forms: Diana Ross and the Supremes, Pink Floyd, John Renbourn and Stefan Grossman, Crosby, Stills, Nash & Young, etc. Since MusicBrainz, in particular, has a well-defined artist field, which is supposed to be the official recording credit, Eric is just planning to standardize that.

    Example usage is:

    from abouttag.music import album
    
    print album(u"Solilaï", u'Pierre Bensusan')
    print album(u"Déjà   Vu", u'Crosby, Stills, Nash & Young')
    

    producing

    album:solilaï (pierre bensusan)
    album:déjà vu (crosby stills nash & young)

    Of course, we may also create objects for particular releases of an album, but those would use a different convention.

  • Named Tracks / Recorded Songs. These have a form that is identical to albums except that they use track: as the prefix. Again, two different recordings of the same song get consolidated into a single (conceptual) track (convention track-u).

    Example usage is:

    from abouttag.music import track
    
    print track(u'Bamboulé', u'Bensusan and Malherbe')
    
    print track(u'''Archie Campbell/Marjorie Campbell/Miss Lyall's '''
                u'''Strathspey/Miss Lyall's Reel/The St Kilda Wedding''',
                u'The Cast')
    

    producing

    track:bamboulé (bensusan and malherbe)
    track:archie campbell marjorie campbell miss lyalls strathspey miss lyalls reel the st kilda wedding (the cast)

  • Recordings. In the case of conceptual tracks, it is particularly clear that the same artist may record the same track several times. Happily, there is a standard identifier for such recordings of tracks, the International Standard Recording Code (ISRC). This is a 12-character code, usually formatted for print as CC-XXX-YY-NNNNN, where CC is a registrant country code, XXX is a registrant code, YY is the last two digits of the registration year and NNNNN identifies the recording.

    Despite minor misgivings on my part (I would have chosen to keep the dashes, since Fluidinfo generally favours humans over machines), we have chosen to standardize this in the form isrc:CCXXXYYNNNNN.

    Examples:

    from abouttag.music import isrc_recording
    
    print isrc_recording(u'US-PR3-73-00012')
    print isrc_recording(u'uspr37300012')
    

    produces

    isrc:USPR37300012
    isrc:USPR37300012

  • Artist. The artist is simply identified by artist:name, where name is normalized as usual. Since accents are preserved, metal fans need not fear for their heavy metal umlauts (a.k.a. röck döts). For example:

    from abouttag.music import artist
    
    print artist(u"Crosby, Stills, Nash & Young")
    print artist(u"Motörhead")
    

    produces:

    artist:crosby stills nash & young
    artist:motörhead
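
For readers who want to see how the four conventions above fit together, here is a minimal sketch that reproduces the example outputs shown in this post. It is not the abouttag.py code: the function names (rough_normalize, sketch_work, sketch_artist, sketch_isrc) are made up for illustration, the punctuation rules are guessed from the examples (commas and slashes become spaces, apostrophes are dropped, & is kept), and the ISRC validation step is an assumption that abouttag.py may or may not share. It is written for Python 3, unlike the Python 2 examples above.

```python
import re

# Assumed ISRC layout: 2 letters, 3 alphanumerics, 2-digit year, 5 digits.
ISRC_PATTERN = re.compile(r"[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}")


def rough_normalize(text):
    """Approximate the normalization described above (not abouttag's own code)."""
    text = text.lower()                     # lower case; accents are left alone
    text = re.sub("['\u2019]", "", text)    # assumption: apostrophes are dropped
    text = re.sub(r"[,/]", " ", text)       # assumption: commas, slashes become spaces
    return " ".join(text.split())           # collapse runs of whitespace


def sketch_work(prefix, name, artist_name):
    """Build 'prefix:name (artist)', the shape shared by album-u and track-u."""
    return "%s:%s (%s)" % (prefix, rough_normalize(name),
                           rough_normalize(artist_name))


def sketch_artist(name):
    """Build 'artist:name' with the same normalization."""
    return "artist:" + rough_normalize(name)


def sketch_isrc(code):
    """Strip dashes, upper-case, and check the 12-character layout."""
    compact = code.replace("-", "").upper()
    if not ISRC_PATTERN.fullmatch(compact):
        raise ValueError("not a 12-character ISRC: %r" % (code,))
    return "isrc:" + compact
```

With these definitions, sketch_work("album", "Déjà   Vu", "Crosby, Stills, Nash & Young") gives album:déjà vu (crosby stills nash & young), and both sketch_isrc("US-PR3-73-00012") and sketch_isrc("uspr37300012") give isrc:USPR37300012, matching the outputs above. Inputs with other punctuation may well come out differently from the real library.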
