Hack 30. Correct Music Metadata with MusicBrainz
Learn about MusicBrainz and how to use and contribute to the project.
MusicBrainz aims to aggregate the "brainz" of thousands of music lovers to create an open, user-contributed music encyclopedia. MusicBrainz collects information about the music that artists have released and how musical works and musicians relate to one another. In turn, people can use the project, along with tools that utilize MusicBrainz (see "Tools that Use MusicBrainz"), to add metadata to their audio files and correct the data that is already there. The community of moderators (editors) collects information about all kinds of music, regardless of its genre, time period, ethnicity, or religion. Unlike some music information databases out there, MusicBrainz strives to eliminate duplicate data, fix incorrect data, and augment missing data. The end result is a project that can be used as a general music reference, a place to share your music trivia, and a resource for cleaning up your music collection.
MusicBrainz got its start in 1998 as the CD Index, which intended to provide an open data/open source alternative to the formerly open CDDB project. After a short while, the CD Index reinvented itself as MusicBrainz and expanded its focus to include support for tagging digital music files and not just Audio CDs. Today MusicBrainz has more than 100,000 members and information for over 3,000,000 tracks.
2.19.1. MusicBrainz's Database
The MusicBrainz database contains a list of artists, the albums they have released, and a complete track listing for each album. MusicBrainz assigns each artist, album, and track a unique identifier (a UUID), so that client software can unambiguously talk about any of these musical entities. These identifiers should eliminate confusion about different versions of the same piece of music. For instance, the live version of Pink Floyd's Great Gig in the Sky has a different track identifier than the studio version.
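To make the role of these identifiers concrete, here is a minimal Python sketch of how client software might check and use them. It assumes only what the text states, namely that each identifier is a UUID. The two identifier values and the `parse_mbid` helper are hypothetical placeholders invented for illustration; they are not real MusicBrainz identifiers or part of any MusicBrainz library.

```python
import uuid


def parse_mbid(text):
    """Return a uuid.UUID if text is a canonical hyphenated UUID, else None."""
    try:
        parsed = uuid.UUID(text)
    except ValueError:
        return None
    # uuid.UUID also accepts braces and un-hyphenated hex, so insist on
    # the canonical 8-4-4-4-12 form here.
    return parsed if str(parsed) == text.lower() else None


# Hypothetical identifiers, NOT real MusicBrainz track identifiers.
STUDIO_ID = "11111111-2222-4333-8444-555555555555"
LIVE_ID = "66666666-7777-4888-8999-aaaaaaaaaaaa"

# Two versions of the same song stay distinct because the keys differ.
tracks = {
    parse_mbid(STUDIO_ID): "Great Gig in the Sky (studio)",
    parse_mbid(LIVE_ID): "Great Gig in the Sky (live)",
}

for track_id, title in tracks.items():
    print(track_id, title)
```

Because the titles alone are nearly identical, keying on the identifier rather than the title is what lets a client tell the two recordings apart.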
Most albums in MusicBrainz have CD Index IDs (grandfathered in from the old CD Index project) derived from the Audio CD's table of contents, so MusicBrainz-enabled players can identify Audio CDs by reading the table of contents and retrieving the music metadata from the MusicBrainz server.
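As a rough illustration of how an ID can be derived from a table of contents, the following Python sketch hashes the first and last track numbers, the lead-out offset, and the per-track sector offsets, then encodes the digest as URL-friendly text. The exact layout used here (SHA-1 over uppercase hex fields with 99 track slots, followed by a modified Base64 encoding) follows my understanding of the published disc ID scheme and should be treated as an assumption to verify against the MusicBrainz documentation. The function name and the sample table of contents are made up for the example.

```python
import base64
import hashlib


def toc_to_disc_id(first_track, last_track, leadout_offset, track_offsets):
    """Derive an ID string from a CD table of contents (sketch, see lead-in)."""
    if len(track_offsets) != last_track - first_track + 1:
        raise ValueError("need exactly one offset per track")

    # Slot 0 holds the lead-out offset; slots 1..99 hold track offsets,
    # with unused slots left at zero.
    slots = [0] * 100
    slots[0] = leadout_offset
    for number, offset in zip(range(first_track, last_track + 1), track_offsets):
        slots[number] = offset

    text = "%02X%02X" % (first_track, last_track)
    text += "".join("%08X" % offset for offset in slots)

    digest = hashlib.sha1(text.encode("ascii")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    # Swap the characters that are awkward in URLs.
    return encoded.replace("+", ".").replace("/", "_").replace("=", "-")


# Made-up table of contents for a six-track disc (offsets in sectors).
sample_id = toc_to_disc_id(1, 6, 95462, [150, 15363, 32314, 46592, 63414, 80489])
print(sample_id)
```

The point of the design is that two copies of the same pressing yield the same ID without any metadata being stored on the disc itself, so a player only needs to read the table of contents before querying the server.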
Recently MusicBrainz also added support for Advanced Relationships, which lets users create relationships between artists, albums, tracks, and URLs to web resources. For instance, the page for the rapper Snoop Dogg shows that Snoop Dogg:
The Advanced Relationships concept enables MusicBrainz to become a true music encyclopedia. Any contributor with any bit of information about an artist or a piece of music can now capture that detail in MusicBrainz.
MusicBrainz attempts to keep its numerous contributors on the same page by providing style guidelines (http://musicbrainz.org/style.html). Note the use of the term guidelines and not rules: MusicBrainz attempts to capture widely varying uses of metadata, and few pieces of this data ever fit neatly into rules. MusicBrainz recognizes this and tries to keep the guidelines flexible, in hopes of capturing the world's unruly music metadata in one single database.
2.19.2. Open Source and Open Data
MusicBrainz embodies the principles of open source and applies open source methods to data, coining the term "open data." Open data, much like open source, benefits from making data widely available under liberal licenses. Eric S. Raymond famously summarized this idea as Linus's Law, named for Linus Torvalds: "Given enough eyeballs, all bugs are shallow." This same concept applies to data as well: "Given enough users, all data problems and omissions are obvious."
With this idea in mind, MusicBrainz makes its data available for anyone to download at http://musicbrainz.org/products/server/download.html. MusicBrainz makes the core data (artists, albums, tracks, identifiers, relationships) available in the Public Domain. Ancillary data (search indexes, changes to the database, etc.) is available under the Creative Commons Non-Commercial Share-Alike 2.0 license. MusicBrainz makes its software available under the GPL and/or the LGPL.
2.19.3. Tools that Use MusicBrainz
The MusicBrainz project released two flagship tagging applications, pimpmytunes and Picard, which are covered in "Clean Music Metadata at the Command Line" [Hack #31] and "Clean Music Metadata with a GUI" [Hack #32], respectively.
The following applications also make use of MusicBrainz for metadata cleanup or audio CD lookup:
More applications that support MusicBrainz can be found on their applications page at http://musicbrainz.org/wd/MusicBrainzEnabledApplications.
2.19.4. How to Participate
MusicBrainz needs all kinds of help. First and foremost, MusicBrainz needs the music knowledge that you have in your brain, so please go to MusicBrainz (http://musicbrainz.org) and create yourself an account. Once you've logged in, use the search page to find your favorite artist(s) and see if the artist pages are complete. Does MusicBrainz have all the albums for this artist and are they properly classified? Do they adhere to the official style guidelines? Check the release dates, album language, and CD Index IDs on albums and add missing ones or fix incorrect ones.
If the MusicBrainz database has all that data in order, consider grabbing your favorite album to see if anyone has captured the credits for the album using the Advanced Relationships feature. If not, use the Add Relationship link to add relationships that capture contributions to this album.
Second, MusicBrainz needs developers. Every open source project has more people coming up with ideas than it has people bringing those ideas to life, and MusicBrainz is no different. Once you've played with the MusicBrainz web site a bit, you may find shortcomings or think of features you'd like to see implemented. If you've hacked on Apache, Perl, or PostgreSQL, you'll likely make a great candidate for hacking on MusicBrainz. To find out more, visit the developer pages (http://musicbrainz.org/development/index.html) or the help wanted page listed next.
Finally, even if you don't hack on code, MusicBrainz can still use your help. The project always needs people to help test new applications, file bug reports, and write documentation for the site and the applications. For all the gory details on how you can participate, visit the MusicBrainz help wanted page (http://musicbrainz.org/about/helpwanted.html).