Category Archives: Cross-Platform

This article will be about a topic that can be used across platforms, such as Linux, Windows and Mac.

Python 101: An Intro to urllib

The urllib package in Python 3 is a collection of modules that you can use for working with URLs. If you are coming from a Python 2 background you will note that in Python 2 you had urllib and urllib2. These are now a part of the urllib package in Python 3. The current version of urllib is made up of the following modules:

  • urllib.request
  • urllib.error
  • urllib.parse
  • urllib.robotparser

We will be covering each part individually except for urllib.error. The official documentation actually recommends that you might want to check out the 3rd party library, requests, for a higher-level HTTP client interface. However, I believe that it can be useful to know how to open URLs and interact with them without using a 3rd party and it may also help you appreciate why the requests package is so popular.
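As a small taste of what these modules offer, here is a minimal sketch that uses urllib.parse to split a URL into its pieces and to build a query string. The URL and the parameter names are made up for illustration, and nothing here touches the network:

```python
from urllib.parse import urlencode, urlparse

# Hypothetical URL used only for illustration
url = 'https://www.example.com/search?q=python&page=2'

# Break the URL into its components
parts = urlparse(url)
print(parts.scheme)   # https
print(parts.netloc)   # www.example.com
print(parts.path)     # /search
print(parts.query)    # q=python&page=2

# Build a query string from a dictionary (made-up parameters)
query = urlencode({'q': 'urllib intro', 'page': 1})
print(query)          # q=urllib+intro&page=1
```

Opening the URL itself would be a job for urllib.request, which is left out here so the sketch runs without a connection.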

Python 101: Redirecting stdout

Redirecting stdout is something most developers will need to do at some point or other. It can be useful to redirect stdout to a file or to a file-like object. I have also redirected stdout to a text control in some of my desktop GUI projects. In this article we will look at the following:

  • Redirecting stdout to a file (simple)
  • The Shell redirection method
  • Redirecting stdout using a custom context manager
  • Python 3’s contextlib.redirect_stdout()
  • Redirect stdout to a wxPython text control
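To give a feel for one of the approaches listed above, here is a minimal sketch of contextlib.redirect_stdout() sending printed text to an in-memory file-like object. It assumes Python 3.4 or newer, and the message text is made up:

```python
import io
from contextlib import redirect_stdout

# Capture anything printed inside the with block in an in-memory buffer
buffer = io.StringIO()
with redirect_stdout(buffer):
    print('This goes to the buffer, not the terminal')

# Outside the block, stdout is back to normal
captured = buffer.getvalue()
print('Captured: ' + captured.strip())
```

Swapping the StringIO object for an open file handle would redirect the output to a file instead.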

Python: An Intro to Regular Expressions

Regular expressions are basically a tiny language all their own that you can use inside of Python and many other programming languages. You will often hear regular expressions referred to as “regex”, “regexp” or just “RE”. Some languages, such as Perl and Ruby, actually support regular expression syntax directly in the language itself. Python only supports them via a library that you need to import. The primary use for regular expressions is matching strings. You create the string matching rules using a regular expression and then you apply it to a string to see if there are any matches.

The regular expression “language” is actually pretty small, so you won’t be able to use it for all your string matching needs. Besides that, while there are some tasks that you can use a regular expression for, it may end up so complicated that it becomes difficult to debug. In cases like that, you should just use Python. It should be noted that Python is an excellent language for text parsing in its own right and can be used for anything you do in a regular expression. However, it may take a lot more code to do so and be slower than the regular expression because regular expressions are compiled down and executed in C.
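Here is a minimal sketch of that "create the rule, then apply it to a string" workflow using Python’s re module. The sample text and the patterns are made up for illustration:

```python
import re

# Made-up sample text for illustration
text = 'Order 66 shipped on 2016-05-05; order 99 is pending.'

# Compile a pattern that matches dates shaped like YYYY-MM-DD
date_pattern = re.compile(r'\d{4}-\d{2}-\d{2}')

match = date_pattern.search(text)
if match:
    print(match.group())   # 2016-05-05

# findall returns the captured group for every match in the string
order_numbers = re.findall(r'[Oo]rder (\d+)', text)
print(order_numbers)       # ['66', '99']
```

Note that the date pattern only checks the shape of the text. It would happily match 9999-99-99, which is one reason complicated validation is often easier in plain Python.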

Python 201: An Intro to importlib

Python provides the importlib package as part of its standard library of modules. Its purpose is to provide the implementation of Python’s import statement (and the __import__() function). In addition, importlib gives the programmer the ability to create their own custom objects (known as importers) that can be used in the import process.

What about imp?

There is another module called imp that provides an interface to the mechanisms behind Python’s import statement. This module was deprecated in Python 3.4. It is intended that importlib should be used in its place.

The importlib package is pretty complicated, so we’ll be limiting the scope of this article to the following topics:

  • Dynamic imports
  • Checking if a module can be imported
  • Importing from the source file itself

Let’s get started by looking at dynamic imports!
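As a rough sketch of the first two topics, the snippet below checks whether a module can be found and then imports it by name. The helper name load_if_available and the nonexistent module name are my own inventions for this example:

```python
import importlib
import importlib.util


def load_if_available(module_name):
    """Return the imported module, or None if it cannot be found."""
    # find_spec returns None when a top-level module is not importable
    if importlib.util.find_spec(module_name) is None:
        return None
    # import_module is the programmatic equivalent of "import name"
    return importlib.import_module(module_name)


json_module = load_if_available('json')
print(json_module.dumps({'a': 1}))   # {"a": 1}

# 'no_such_module_here' is a made-up name that should not exist
print(load_if_available('no_such_module_here'))   # None
```

Because the module name is just a string, it could come from a config file or user input, which is the whole point of a dynamic import.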

Python 201 – super

I’ve written about super briefly in the past, but decided to take another go at writing something more interesting about this particular Python function.

The super built-in function was introduced way back in Python 2.2. The super function will return a proxy object that will delegate method calls to a parent or sibling class of type. If that was a little unclear, what it allows you to do is access inherited methods that have been overridden in a class. The super function has two use cases. The first is in single inheritance where super can be used to refer to the parent class or classes without actually naming them explicitly. This can make your code more maintainable in the future. This is similar to the behavior that you will find in other programming languages, like Dylan’s *next-method*.

The second use case is in a dynamic execution environment where super supports cooperative multiple inheritance. This is actually a pretty unique use case that may only apply to Python as it is not found in languages that only support single inheritance nor in statically compiled languages.

super has had its fair share of controversy even among core developers. The original documentation was confusing and using super was tricky. There were some who even labeled super as harmful, although that article seems to apply more to the Python 2 implementation of super than the Python 3 version. We will start out this chapter by looking at how to call super in both Python 2 and 3. Then we will learn about Method Resolution Order.
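To make the cooperative multiple inheritance idea a little more concrete, here is a minimal sketch written for Python 3. The class names and the greet method are made up for illustration:

```python
class Base:
    def greet(self):
        return ['Base']


class Left(Base):
    def greet(self):
        # Python 3 form; Python 2 would need super(Left, self).greet()
        return ['Left'] + super().greet()


class Right(Base):
    def greet(self):
        return ['Right'] + super().greet()


class Child(Left, Right):
    def greet(self):
        return ['Child'] + super().greet()


# Each class is visited exactly once, in Method Resolution Order
print(Child().greet())   # ['Child', 'Left', 'Right', 'Base']
print([cls.__name__ for cls in Child.__mro__])
# ['Child', 'Left', 'Right', 'Base', 'object']
```

Notice that the super() call inside Left ends up calling Right, a sibling, rather than Base. That is the "parent or sibling" delegation mentioned earlier.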

Python 101: An Intro to Benchmarking your code

What does it mean to benchmark one’s code? The main idea behind benchmarking or profiling is to figure out how fast your code executes and where the bottlenecks are. The main reason to do this sort of thing is for optimization. You will run into situations where you need your code to run faster because your business needs have changed. When this happens, you will need to figure out what parts of your code are slowing it down.

This chapter will only cover how to profile your code using a variety of tools. It will not go into actually optimizing your code. Let’s get started!
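As a quick illustration of the idea, here is a minimal sketch that times a toy function with the standard library’s timeit module. The function build_squares is a made-up workload, and the number you see will vary from machine to machine:

```python
import timeit


def build_squares():
    """Toy workload: build a list of the first 1000 squares."""
    return [n * n for n in range(1000)]


# Run the function 1000 times and report the total elapsed seconds
elapsed = timeit.timeit(build_squares, number=1000)
print('1000 runs took {:.4f} seconds'.format(elapsed))
```

Running the code many times and looking at the total helps smooth out the noise you would get from timing a single call.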

Python 3: An Intro to Encryption

Python 3 doesn’t have very much in its standard library that deals with encryption. Instead, you get hashing libraries. We’ll take a brief look at those in this chapter, but the primary focus will be on the following 3rd party packages: PyCrypto and cryptography. We will learn how to encrypt and decrypt strings with both of these libraries.


If you need secure hashes or message digest algorithms, then Python’s standard library has you covered in the hashlib module. It includes the FIPS secure hash algorithms SHA1, SHA224, SHA256, SHA384, and SHA512 as well as RSA’s MD5 algorithm. Python also supports the adler32 and crc32 hash functions, but those are in the zlib module.

One of the most popular uses of hashes is storing the hash of a password instead of the password itself. Of course, the hash has to be a good one or it can be cracked. Another popular use case for hashes is to hash a file and then send the file and its hash separately. Then the person receiving the file can run a hash on the file to see if it matches the hash that was sent. If it does, then that means no one has changed the file in transit.
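The file-checking use case can be sketched with hashlib in a few lines. This is only an illustration: the helper name sha256_of_bytes is my own, and the byte strings stand in for real file contents:

```python
import hashlib


def sha256_of_bytes(data):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


# Made-up payload standing in for a file's contents
payload = b'contents of the file being sent'
sent_digest = sha256_of_bytes(payload)

# The receiver hashes what arrived and compares it with the sent digest
received = b'contents of the file being sent'
print(sha256_of_bytes(received) == sent_digest)   # True

# Changing even one byte produces a different digest
tampered = b'Contents of the file being sent'
print(sha256_of_bytes(tampered) == sent_digest)   # False
```

For a real file you would read the bytes from disk, ideally in chunks, and feed them to the hash object with its update() method.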

Python 101: How to timeout a subprocess

The other day I ran into a use case where I needed to communicate with a subprocess I had started but I needed it to time out. Unfortunately, Python 2 does not have a way to time out the communicate method call so it just blocks until it either returns or the process itself closes. There are lots of different approaches that I found on StackOverflow, but I think my favorite was using Python’s threading module’s Timer class:

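import subprocess
from threading import Timer

# Callback the timer uses to kill the process
kill = lambda process: process.kill()
cmd = ['ping', '']  # the target host is blank in this excerpt
ping = subprocess.Popen(
    cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# Kill the process if it is still running after 5 seconds
my_timer = Timer(5, kill, [ping])
try:
    my_timer.start()
    stdout, stderr = ping.communicate()
finally:
    my_timer.cancel()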
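For comparison, Python 3.3 and later accept a timeout argument on communicate(), so the Timer is not needed there. Here is a minimal sketch, assuming Python 3. The child process is a stand-in (the Python interpreter sleeping for five seconds) rather than the ping call from the example above:

```python
import subprocess
import sys

# Stand-in child process: a Python one-liner that sleeps for 5 seconds
cmd = [sys.executable, '-c', 'import time; time.sleep(5)']
proc = subprocess.Popen(
    cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

try:
    # Wait at most one second for the process to finish
    stdout, stderr = proc.communicate(timeout=1)
    timed_out = False
except subprocess.TimeoutExpired:
    # communicate() does not kill the child on timeout, so do it here
    proc.kill()
    stdout, stderr = proc.communicate()
    timed_out = True

print('Timed out:', timed_out)
```

The second communicate() call after kill() collects any remaining output and lets the process be cleaned up properly.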

Python 201: namedtuple

Python’s collections module has specialized container datatypes that can be used to replace Python’s general purpose containers. The one that we’ll be focusing on here is the namedtuple which you can use to replace Python’s tuple. Of course, the namedtuple is not a drop-in replacement as you will soon see. I have seen some programmers use it like a struct. If you haven’t used a language with a struct in it, then that needs a little explanation. A struct is basically a complex data type that groups a list of variables under one name. Let’s look at an example of how to create a namedtuple so you can see how they work:

from collections import namedtuple
Parts = namedtuple('Parts', 'id_num desc cost amount')
auto_parts = Parts(id_num='1234', desc='Ford Engine',
                   cost=1200.00, amount=10)
print(auto_parts.id_num)

Here we import namedtuple from the collections module. Then we call namedtuple, which returns a new subclass of tuple with named fields. So basically we just created a new tuple class. You will note that we have a strange string as our second argument. This is a space-delimited list of the properties that we want to create.

Now that we have our shiny new class, let’s create an instance of it! As you can see above, we do that as our very next step when we create the auto_parts object. Now we can access the various items in our auto_parts using dot notation because they are now properties of our Parts class.

One of the benefits of using a namedtuple over a regular tuple is that you no longer have to keep track of each item’s index because now each item is named and accessed via a class property. Here’s the difference in code:

>>> auto_parts = ('1234', 'Ford Engine', 1200.00, 10)
>>> auto_parts[2]  # access the cost
1200.0
>>> id_num, desc, cost, amount = auto_parts
>>> id_num
'1234'

In the code above, we create a regular tuple and access the cost of the engine by giving Python the index we want. Alternatively, we can extract everything from the tuple using multiple assignment. Personally, I prefer the namedtuple approach because it is easier to keep in your head and you can use Python’s built-in dir() function to inspect the namedtuple and find out its properties. Give that a try and see what happens!
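If you would rather not dig through the dir() output, here is a small sketch of a few namedtuple helpers I find handy. It reuses the Parts class from earlier, and the new cost value is made up:

```python
from collections import namedtuple

Parts = namedtuple('Parts', 'id_num desc cost amount')
auto_parts = Parts(id_num='1234', desc='Ford Engine',
                   cost=1200.00, amount=10)

# The field names are stored on the class
print(Parts._fields)        # ('id_num', 'desc', 'cost', 'amount')

# Index access still works because a namedtuple is still a tuple
print(auto_parts[2] == auto_parts.cost)   # True

# namedtuples are immutable; _replace returns a modified copy
cheaper = auto_parts._replace(cost=999.99)
print(cheaper.cost)         # 999.99
print(auto_parts.cost)      # 1200.0
```

The leading underscores on _fields and _replace are there to avoid clashing with your own field names, not to mark them as private.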

The other day I was looking for a way to convert a Python dictionary into an object and I came across some code that did something like this:

>>> from collections import namedtuple
>>> Parts = {'id_num':'1234', 'desc':'Ford Engine',
             'cost':1200.00, 'amount':10}
>>> parts = namedtuple('Parts', Parts.keys())(**Parts)
>>> parts
Parts(amount=10, cost=1200.0, id_num='1234', desc='Ford Engine')

This is some weird code, so let’s take it a piece at a time. On the first line we import namedtuple as before. Next we create a Parts dictionary. So far, so good. Now we’re ready for the weird part. Here we create our namedtuple class and name it ‘Parts’. The second argument is the keys from our dictionary, which become the field names. The last piece is this strange piece of code: (**Parts). The double asterisk unpacks our dictionary into keyword arguments, so each key/value pair is passed to the class as a named field. We could split this line into two parts to make it a little clearer:

>>> parts = namedtuple('Parts', Parts.keys())
>>> parts
<class '__main__.Parts'>
>>> auto_parts = parts(**Parts)
>>> auto_parts
Parts(amount=10, cost=1200.0, id_num='1234', desc='Ford Engine')

So here we do the same thing as before, except that we create the class first, then we call the class with our dictionary to create an object. The only other piece I want to mention is that namedtuple also accepts a verbose argument and a rename argument. The verbose argument is a flag that will print out the class definition right before it’s built if you set it to True. Note that verbose was removed in Python 3.7, so it is only available in older versions. The rename argument is useful if you’re creating your namedtuple from a database or some other system that your program doesn’t control: when it is set to True, invalid field names (such as Python keywords or duplicates) are automatically replaced with positional names like _0 and _1.
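Here is a minimal sketch of the rename argument in action. The column names are made up to mimic what an external system might hand you:

```python
from collections import namedtuple

# Made-up column names: 'class' is a keyword and 'id' appears twice
columns = ['id', 'class', 'id']

# Without rename=True this call would raise a ValueError
Row = namedtuple('Row', columns, rename=True)
print(Row._fields)   # ('id', '_1', '_2')

row = Row(1, 'economy', 2)
print(row._1)        # economy
```

The replacement names are based on each field’s position, so the second and third columns become _1 and _2.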

Wrapping Up

Now you know how to use Python’s handy namedtuple. I am already finding multiple uses for it in my code base and I hope you’ll find it helpful in yours. Happy coding!

Python – The datefinder package

Earlier this week, I came across another fun package called datefinder. The idea behind this package is that it can take any string with dates in it and transform them into a list of Python datetime objects. I would have loved this package at my old job where I did a lot of text file and database query parsing as there were many a time when finding the date and getting it into a format I could easily use was quite the nuisance.

Anyway, to install this handy package all you need to do is this:

pip install datefinder

I should note that when I ran this, it ended up also installing the following packages:

  • PyYAML-3.11
  • dateparser-0.3.2
  • jdatetime-1.7.2
  • python-dateutil-2.4.2
  • pytz-2015.7
  • regex-2016.1.10
  • six-1.10.0
  • umalqurra-0.2

Because of all this extra stuff, you might want to install this package into a virtualenv first. Let’s take a look at some code. Here’s a quick demo I tried:

>>> import datefinder
>>> data = '''Your appointment is on July 14th, 2016. Your bill is due 05/05/2016'''
>>> matches = datefinder.find_dates(data)
>>> for match in matches:
...     print(match)
2016-07-14 00:00:00
2016-05-05 00:00:00

As you can see, it worked quite well with these two common date formats. Another format that I used to have to support was the fairly typical ISO 8601 date format. Let’s see how datefinder behaves with that.

>>> data = 'Your report is due: 2016-02-04T20:16:26+00:00'
>>> matches = datefinder.find_dates(data)
>>> for i in matches:
...     print(i)
2016-02-04 00:00:00
2016-02-04 20:16:26

Interestingly, this particular version of the ISO 8601 format causes datefinder to return two matches. The first is just the date while the second has both the date and the time. Anyway, hopefully you’ll find this package useful in your projects. Have fun!