We all write bad code from time to time. I tend to do it deliberately when I’m experimenting in the interpreter or the debugger. I can’t say why I do it exactly except that it’s fun to see what I can fit in one line of code. Anyway, this week I needed to parse a file that looked like this:

00000       TM        YT                                                TSA1112223  0000000000000000000000000020131007154712PXXXX_Name_test_test_section.txt
data line 1
data line 2


You can download it here if you don’t want to scroll around the above snippet. Anyway, the objective was to parse out the part of the filename in the header that is labeled “section” in the above. The filename itself is XXXX_Name_test_test_section.txt. That part of the filename is used to identify something in a database. So in the Python interpreter, I came up with the following hack:

header.split("|")[0][125:].split("_")[-1].split(".")[0]

Let’s give this some more context. The above snippet fits into a function like this:

#----------------------------------------------------------------------
def getSection(path):
    """"""
    with open(path) as inPath:
        lines = inPath.readlines()
 
    header = lines.pop(0)
    section = header.split("|")[0][125:].split("_")[-1].split(".")[0]

That’s pretty ugly code and would require a fairly lengthy comment explaining what I’m doing. Not that I planned to actually use it in the first place, but it was fun to write. Anyway, I ended up breaking it down into some code that’s something like the following:

#----------------------------------------------------------------------
def getSection(path):
    """"""
    with open(path) as inPath:
        lines = inPath.readlines()
 
    header = lines.pop(0)
    filename = header[125:256].strip()
    fname = os.path.splitext(filename)[0]
    section = fname.split("_")[-1]
 
    print section
    print header.split("|")[0][125:].split("_")[-1].split(".")[0]
 
#----------------------------------------------------------------------
if __name__ == "__main__":
    path = "test.txt"
    getSection(path)

I’m leaving the “bad” code in the example above to show that the result is the same. Yes, it’s still a little ugly, but it’s much easier to follow and the code is self-documenting.

What funky code have you written lately?

Print Friendly