Category Archives: Cross-Platform

This article will be about a topic that can be used across platforms, such as Linux, Windows and Mac.

Python 3 Concurrency – The concurrent.futures Module

The concurrent.futures module was added in Python 3.2. According to the Python documentation it provides the developer with a high-level interface for asynchronously executing callables. Basically concurrent.futures is an abstraction layer on top of Python’s threading and multiprocessing modules that simplifies using them. However it should be noted that while the abstraction layer simplifies the usage of these modules, it also removes a lot of their flexibility, so if you need to do something custom, then this might not be the best module for you.

Concurrent.futures includes an abstract class called Executor. It cannot be used directly though, so you will need to use one of its two subclasses: ThreadPoolExecutor or ProcessPoolExecutor. As you’ve probably guessed, these two subclasses are mapped to Python’s threading and multiprocessing APIs respectively. Both of these subclasses will provide a pool that you can put threads or processes into.

The term future has a special meaning in computer science. It refers to a construct that can be used for synchronization when using concurrent programming techniques. The future is actually a way to describe the result of a process or thread before it has finished processing. I like to think of them as a pending result.

Continue reading Python 3 Concurrency – The concurrent.futures Module

Python 201: A Tutorial on Threads

The threading module was first introduced in Python 1.5.2 as an enhancement of the low-level thread module. The threading module makes working with threads much easier and allows the program to run multiple operations at once.

Note that the threads in Python work best with I/O operations, such as downloading resources from the Internet or reading files and directories on your computer. If you need to do something that will be CPU intensive, then you will want to look at Python’s multiprocessing module instead. The reason for this is that Python has the Global Interpreter Lock (GIL) that basically makes all threads run inside of one master thread. Because of this, when you go to run multiple CPU intensive operations with threads, you may find that it actually runs slower. So we will be focusing on what threads do best: I/O operations!


Intro to Threads

A thread let’s you run a piece of long running code as if it were a separate program. It’s kind of like calling subprocess except that you are calling a function or class instead of a separate program. I always find it helpful to look at a concrete example. Let’s take a look at something that’s really simple:

import threading
 
 
def doubler(number):
    """
    A function that can be used by a thread
    """
    print(threading.currentThread().getName() + '\n')
    print(number * 2)
    print()
 
 
if __name__ == '__main__':
    for i in range(5):
        my_thread = threading.Thread(target=doubler, args=(i,))
        my_thread.start()

Continue reading Python 201: A Tutorial on Threads

Python: Visualization with Bokeh

The Bokeh package is an interactive visualization library that uses web browsers for its presentation. Its goal is to provide graphics in the vein of D3.js that look elegant and are easy to construct. Bokeh supports large and streaming datasets. You will probably be using this library for creating plots / graphs. One of its primary competitors seems to be Plotly.

Note: This will not be an in-depth tutorial on the Bokeh library as the number of different graphs and visualizations it is capable of is quite large. Instead, the aim of the article is to give you a taste of what this interesting library can do.

Let’s take a moment and get it installed. The easiest way to do so is to use pip or conda. Here’s how you can use pip:

pip install bokeh

This will install Bokeh and all its dependencies. You may want to install Bokeh into a virtualenv because of this, but that’s up to you. Now let’s check out a simple example. Save the following code into a file with whatever name you deem appropriate.

Continue reading Python: Visualization with Bokeh

Python 201: An Intro to mock

The unittest module now includes a mock submodule as of Python 3.3. It will allow you to replace portions of the system that you are testing with mock objects as well as make assertions about how they were used. A mock object is used for simulating system resources that aren’t available in your test environment. In other words, you will find times when you want to test some part of your code in isolation from the rest of it or you will need to test some code in isolation from outside services.

Note that if you have a version of Python prior to Python 3, you can download the Mock library and get the same functionality.

Let’s think about why you might want to use mock. One good example is if your application is tied to some kind of third party service, such as Twitter or Facebook. If your application’s test suite goes out and retweets a bunch of items or “likes” a bunch of posts every time its run, then that is probably undesirable behavior since it will be doing that every time the test is run. Another example might be if you had designed a tool for making updates to your database tables easier. Each time the test runs, it will do some updates on the same records every time and could wipe out valuable data.

Instead of doing any of those things, you can use unittest’s mock. It will allow you to mock and stub out those kinds of side-effects so you don’t have to worry about them. Instead of interacting with the third party resources, you will be running your test against a dummy API that matches those resources. The piece that you care about the most is that your application is calling the functions it’s supposed to. You probably don’t care as much if the API itself actually executes. Of course, there are times when you will want to do an end-to-end test that does actually execute the API, but those tests don’t need mocks!

Continue reading Python 201: An Intro to mock

Yet Another Python datetime Replacement: Pendulum

I stumbled across another new library that purports that it is better than Python’s datetime module. It is called Pendulum. Pendulum is heavily influenced by Carbon for PHP according to its documentation.

These libraries are always interesting, although I am not sure what makes this one better than Arrow or Delorean. There were some comments on Reddit between the creator of Pendulum where he stated that he found Arrow to behave strangely in certain circumstances.

Regardless, this article isn’t about comparing these competing libraries. It’s just to take a look at how Pendulum itself works. One of the primary purposes for these various libraries is to create a datetime module with a more “Pythonic” API and one that is timezone aware.

Let’s get it installed so we can check it out:

pip install pendulum

If you don’t like installing it directly to your main Python installation, feel free to install it in a virtualenv instead. Regardless of where you install it, you will see it install a couple of dependencies of its own such as tzlocal, python-translate, codegen, and polib.

Now that it is installed, let’s open up Python’s interpreter and give it a try:

>>> import pendulum
>>> pendulum.now('Japan')
<pendulum [2016-07-13T10:39:22.788807+09:00]>
>>> pendulum.now()
</pendulum><pendulum [2016-07-12T20:39:27.250320-05:00]>
 
</pendulum>

Continue reading Yet Another Python datetime Replacement: Pendulum

Python 101: An Intro to urllib

The urllib module in Python 3 is a collection of modules that you can use for working with URLs. If you are coming from a Python 2 background you will note that in Python 2 you had urllib and urllib2. These are now a part of the urllib package in Python 3. The current version of urllib is made up of the following modules:

  • urllib.request
  • urllib.error
  • urllib.parse
  • urllib.rebotparser

We will be covering each part individually except for urllib.error. The official documentation actually recommends that you might want to check out the 3rd party library, requests, for a higher-level HTTP client interface. However, I believe that it can be useful to know how to open URLs and interact with them without using a 3rd party and it may also help you appreciate why the requests package is so popular.

Continue reading Python 101: An Intro to urllib

Python 101: Redirecting stdout

Redirecting stdout to something most developers will need to do at some point or other. It can be useful to redirect stdout to a file or to a file-like object. I have also redirected stdout to a text control in some of my desktop GUI projects. In this article we will look at the following:

  • Redirecting stdout to a file (simple)
  • The Shell redirection method
  • Redirecting stdout using a custom context manager
  • Python 3’s contextlib.redirect_stdout()
  • Redirect stdout to a wxPython text control

Continue reading Python 101: Redirecting stdout

Python: An Intro to Regular Expressions

Regular expressions are basically a tiny language all their own that you can use inside of Python and many other programming languages. You will often hear regular expressions referred to as “regex”, “regexp” or just “RE”. Some languages, such as Perl and Ruby, actually support regular expression syntax directly in the language itself. Python only supports them via a library that you need to import. The primary use for regular expressions is matching strings. You create the string matching rules using a regular expression and then you apply it to a string to see if there are any matches.

The regular expression “language” is actually pretty small, so you won’t be able to use it for all your string matching needs. Besides that, while there are some tasks that you can use a regular expression for, it may end up so complicated that it becomes difficult to debug. In cases like that, you should just use Python. It should be noted that Python is an excellent language for text parsing in its own right and can be used for anything you do in a regular expression. However, it may take a lot more code to do so and be slower than the regular expression because regular expressions are compiled down and executed in C.

Continue reading Python: An Intro to Regular Expressions

Python 201: An Intro to importlib

Python provides the importlib package as part of its standard library of modules. Its purpose is to provide the implementation to Python’s import statement (and the __import__() function). In addition, importlib gives the programmer the ability to create their own custom objects (AKA an importer) that can be used in the import process.

What about imp?

There is another module called imp that provides an interface to the mechanisms behind Python’s import statement. This module was deprecated in Python 3.4. It is intended that importlib should be used in its place.

This module is pretty complicated, so we’ll be limiting the scope of this article to the following topics:

  • Dynamic imports
  • Checking is a module can be imported
  • Importing from the source file itself

Let’s get started by looking at dynamic imports!

Continue reading Python 201: An Intro to importlib

Python 201 – super

I’ve written about super briefly in the past, but decided to take another go at writing something more interesting about this particular Python function.

The super built-in function was introduced way back in Python 2.2. The super function will return a proxy object that will delegate method calls to a parent or sibling class of type. If that was a little unclear, what it allows you to do is access inherited methods that have been overridden in a class. The super function has two use cases. The first is in single inheritance where super can be used to refer to the parent class or classes without actually naming them explicitly. This can make your code more maintainable in the future. This is similar to the behavior that you will find in other programming languages, like Dylan’s *next-method*.

The second use case is in a dynamic execution environment where super supports cooperative multiple inheritance. This is actually a pretty unique use case that may only apply to Python as it is not found in languages that only support single inheritance nor in statically compiled languages.

super has had its fair share of controversy even among core developers. The original documentation was confusing and using super was tricky. There were some who even labeled super as harmful, although that article seems to apply more to the Python 2 implementation of super then the Python 3 version. We will start out this chapter by looking at how to call super in both Python 2 and 3. Then we will learn about Method Resolution Order.

Continue reading Python 201 – super