Parsing ID3 Tags from MP3s using Python

While working on my Python mp3 player, I realized I needed to research what Python had to offer for parsing ID3 tags. There are tons of projects out there, but most of them appear to be dead, undocumented, or both. In this post, you will discover the wild world of MP3 tag parsing in Python along with me, and we’ll see if we can find something that I can use to enhance my mp3 player project.

For this exercise, we’ll try to get the following information from our parsers:

  • Artist
  • Title of Album
  • Track Title
  • Length of Track
  • Album Release Date
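
To compare the parsers on equal footing, it can help to pour whatever each one returns into the same small record. Here is a minimal sketch, assuming a namedtuple is enough for now; TrackInfo and describe are names I made up for illustration and are not part of any library covered here:

```python
from collections import namedtuple

# Hypothetical record type holding the five fields listed above.
TrackInfo = namedtuple("TrackInfo", ["artist", "album", "title", "length", "year"])


def describe(info):
    """Return one printable line per field, skipping fields set to None."""
    lines = []
    for name in TrackInfo._fields:
        value = getattr(info, name)
        if value is not None:
            lines.append("%s: %s" % (name.capitalize(), value))
    return lines
```

A parser that can’t supply a field just passes None for it, and describe leaves that line out.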

We’ll probably need more metadata than that, but this is the usual stuff I care about in my mp3 playing experience. We will look at the following 3rd party libraries to see how they hold up:

  • Mutagen
  • eyeD3
  • Ned Batchelder’s id3reader.py

Let’s get started!

Can Mutagen Save the Day?

One reason to include Mutagen in this round-up is that it supports ASF, FLAC, M4A, Monkey’s Audio, Musepack, Ogg FLAC, Ogg Speex, Ogg Theora, Ogg Vorbis, True Audio, WavPack and OptimFROG in addition to MP3 parsing. Thus, we could potentially expand our MP3 player quite a bit. I was pretty excited when I found this package. However, while the package appears to be actively developed, the documentation is almost non-existent. If you are a new Python programmer, you will find this library difficult to just jump into and use.

To install Mutagen, you’ll need to unpack it and navigate to its folder using the command line. Then execute the following:


python setup.py install

You may be able to use easy_install or pip as well, although its website doesn’t really say one way or the other. Now comes the fun part: trying to figure out how to use the module without documentation! Fortunately, I found a blog post that gave me some clues. From what I gather, Mutagen follows the ID3 specification pretty closely, so rather than abstracting it into functions like GetArtist, it has you read ID3 text frames directly and use their terminology. Thus, TPE1 = Artist (or Lead Singer), TIT2 = Title, etc. Let’s look at an example:

>>> path = r'D:\mp3\12 Stones\2002 - 12 Stones\01 - Crash.mp3'
>>> from mutagen.id3 import ID3
>>> audio = ID3(path)
>>> audio
>>> audio['TPE1']
TPE1(encoding=0, text=[u'12 Stones'])
>>> audio['TPE1'].text
[u'12 Stones']

Here’s a more proper example:

from mutagen.id3 import ID3

#----------------------------------------------------------------------
def getMutagenTags(path):
    """Print the artist, track title and release year of an MP3 file."""
    audio = ID3(path)

    # Each ID3 text frame keeps its values as a list in .text
    print("Artist: %s" % audio['TPE1'].text[0])
    print("Track: %s" % audio["TIT2"].text[0])
    print("Release Year: %s" % audio["TDRC"].text[0])

I personally find this to be difficult to read and use, so I won’t be using this module for my mp3 player unless I need to add additional digital file formats to it. Also note that I wasn’t able to figure out how to get the track’s play length or album title. Let’s move on to our next ID3 parser and see how it fares.
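
One footnote on the two missing fields before leaving Mutagen: the ID3v2 specification defines a TALB frame for the album title, and Mutagen’s mutagen.mp3.MP3 class reports the play length in seconds as info.length. The following is only a sketch, assuming the file actually carries these frames; FRAME_LABELS, read_frames, format_length and get_mutagen_tags are hypothetical helper names of my own, not part of Mutagen:

```python
# Frame IDs come from the ID3v2 specification; the labels are our own.
FRAME_LABELS = [
    ("TPE1", "Artist"),
    ("TALB", "Album"),
    ("TIT2", "Track"),
    ("TDRC", "Release Year"),
]


def read_frames(tags):
    """Collect the first text value of each known frame.

    `tags` is anything dict-like whose values have a `.text` list,
    such as a mutagen.id3.ID3 object. Missing frames are skipped.
    """
    found = []
    for frame_id, label in FRAME_LABELS:
        if frame_id in tags:
            found.append((label, tags[frame_id].text[0]))
    return found


def format_length(seconds):
    """Turn a length in seconds (e.g. MP3(path).info.length) into m:ss."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return "%d:%02d" % (minutes, secs)


def get_mutagen_tags(path):
    """Print the tags plus the track length for one MP3 file."""
    # Imported here so the helpers above work without Mutagen installed.
    from mutagen.mp3 import MP3

    audio = MP3(path)  # audio.tags is the ID3 tag, or None if there is none
    for label, value in read_frames(audio.tags or {}):
        print("%s: %s" % (label, value))
    print("Track Length: %s" % format_length(audio.info.length))
```

Calling get_mutagen_tags(path) on a tagged file should then print the album and length alongside the other fields, provided the file really has a TALB frame.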

eyeD3

If you go to eyeD3’s website, you’ll notice that it doesn’t seem to support Windows. This is a problem for many users and almost caused me to drop it from this round-up. Fortunately, I found a forum that mentioned a way to make it work. The idea was to rename the “setup.py.in” file in the main folder to just “setup.py” and the “__init__.py.in” file to “__init__.py”, which you’ll find in “src\eyeD3”. Then you can install it using the usual “python setup.py install”. Once you have it installed, it’s really easy to use. Check out the following function:

import eyeD3

#----------------------------------------------------------------------
def getEyeD3Tags(path):
    """Print artist, album, title, play time and release year of an MP3 file."""
    trackInfo = eyeD3.Mp3AudioFile(path)
    tag = trackInfo.getTag()
    tag.link(path)

    print("Artist: %s" % tag.getArtist())
    print("Album: %s" % tag.getAlbum())
    print("Track: %s" % tag.getTitle())
    print("Track Length: %s" % trackInfo.getPlayTimeString())
    print("Release Year: %s" % tag.getYear())

This package does meet our arbitrary requirements. The only regrettable aspect of the package is its lack of official Windows support. We’ll reserve judgment until after we’ve tried out our third possibility though.

Ned Batchelder’s id3reader.py

This module is probably the easiest of the three to install since it’s just one file. All you need to do is download it and put the file into site-packages or somewhere else on your Python path. The primary problem with this parser is that Batchelder no longer supports it. Let’s see if there’s an easy way to get the information that we need.

import id3reader

#----------------------------------------------------------------------
def getTags(path):
    """Print the artist, album, title and release year of an MP3 file."""
    id3r = id3reader.Reader(path)

    # getValue returns None when the tag has no such value
    print("Artist: %s" % id3r.getValue('performer'))
    print("Album: %s" % id3r.getValue('album'))
    print("Track: %s" % id3r.getValue('title'))
    print("Release Year: %s" % id3r.getValue('year'))

Well, I didn’t see an obvious way to get the track length with this module without knowing the ID3 specification. Alas! While I like the simplicity and power of this module, the lack of support makes me reject it in favor of eyeD3, whose API is also super simple. For now, that will be my library of choice for my mp3 player. If you know of a great ID3 parsing script, feel free to drop me a line in the comments. I saw others listed on Google as well, but quite a few of them were just as dead as Batchelder’s was.
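
A postscript on track length: the ID3v2 specification does define a TLEN text frame that holds the length in milliseconds, though plenty of files simply don’t carry it. Here is a rough sketch, assuming id3reader’s getValue also accepts a raw four-character frame ID such as 'TLEN' (an assumption I have not verified here); tlen_to_seconds and get_length are hypothetical helpers:

```python
def tlen_to_seconds(value):
    """Convert a TLEN frame value (milliseconds, as text) to seconds.

    Returns None when the frame is missing or is not a whole number.
    """
    if value is None:
        return None
    try:
        return int(str(value).strip()) / 1000.0
    except ValueError:
        return None


def get_length(reader):
    """Look up TLEN on an id3reader.Reader-like object.

    Assumption: reader.getValue('TLEN') returns the frame text or None.
    """
    return tlen_to_seconds(reader.getValue("TLEN"))
```

If the file has no TLEN frame, get_length returns None and the length would have to come from the audio data itself, which is what eyeD3 appears to do.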

  • Gabriel

    mutagen can be easier to use … after struggling for a bit doing something similar I realized you can use the EasyID3 functions to get a dictionary of the common fields, namely for your example

    [code]
    from mutagen.mp3 import MP3
    from mutagen.easyid3 import EasyID3

    #----------------------------------------------------------------------
    def getTags(path):
        """Print common tags using Mutagen's EasyID3 interface."""
        m = MP3(path, ID3=EasyID3)

        print("Artist: %s" % m['artist'][0])
        print("Album: %s" % m['album'][0])
        print("Track: %s" % m['title'][0])
        print("Release Year: %s" % m['date'][0])
    [/code]
    You need to index because each result is a list. I usually just wrap this functionality in a simple-to-use class.

  • Gabriel

    oops … forgot to add that mutagen can also give you the length with:
    [code]
    m.info.length
    [/code]
    And the interface I have shown works across all the supported formats; just change the MP3 call to Vorbis, etc.

  • driscollis

    Hi Gabriel,

    I figured it was something simple that I was missing when it came to mutagen. Thank you for the information!

    – Mike

  • Thank you for sharing this code. I'm new to Python and I'm just finding ways to use it properly.

  • I prefer taglib, a versatile and fast C++ library with multi-format support.
    There are two python bindings available for taglib, the newer (and maintained) one being tagpy:
    http://mathema.tician.de/software/tagpy
    HTH,
    Hans