All posts by Mike

ReportLab: PDF Processing with Python is now Available!

My latest book, ReportLab: PDF Processing with Python is now available for purchase.

ReportLab has been around since the year 2000 and has remained the primary package that Python developers use for creating reports in the PDF format. It is an extremely powerful package that works across all the major platforms. This book will also introduce the reader to other Python PDF packages.

You can get the book at the following online retailers:

PyDev of the Week: Qumisha Goss

This week we welcome Qumisha Goss (@QatalystGoss) as our PyDev of the Week. Q is a librarian from Detroit who gave one of the best keynotes I’ve ever seen at PyCon US this year. For some reason, the people who uploaded the keynotes from that morning didn’t separate them from each other or from the morning’s lightning talks, so you have to seek about two-thirds of the way through the official video to find Q’s keynote here: Continue reading PyDev of the Week: Qumisha Goss

Anaconda Data Science Survey Results

Anaconda announced the results of their first “state of data science” survey today. Out of the 4,218 responses, 26% were from students, 16% were from data scientists, 15% were academics and 15% were software developers.

The key takeaways according to Anaconda are as follows:

  • 99% of respondents use Anaconda for Python
  • Docker and Kubernetes are beating out Hadoop / Spark
  • Google Cloud data services beat out Amazon Web Services and Microsoft’s Azure
  • Matplotlib is still the top package for visualization. However “Plotly, Tableau, Microsoft Power BI and Tibco Spotfire are all strong commercial
    competitors to Matplotlib and other open source projects like ggplot, Bokeh, D3 and Altair.”
  • 14% of respondents do machine learning with Anaconda
  • The fact that Anaconda is free was ranked as its most important attribute while the fact that it has open source licensing was second to last.

That last bullet point kind of bothers me, but I guess students don’t understand the importance of open source. Regardless, this is a really interesting survey on one of the top alternative versions of the Python distribution. You can read the full announcement here.

wxPython: Set Which Display the Frame is on

The other day, I saw an interesting question in the wxPython IRC channel. They were asking if there was a way to set which display their application would appear on. Robin Dunn, the creator of wxPython, gave the questioner some pointers, but I decided to go ahead and write up a quick tutorial on the topic.

The wxPython toolkit actually has all the bits and pieces you need for this sort of thing. The first step is getting the combined screen size. What I mean by this is asking wxPython what it thinks is the total size of the screen. This would be the total width and height of all your displays combined. You can get this by calling wx.DisplaySize(), which returns a tuple. If you would like to get individual display resolutions, then you have to call wx.Display and pass in the index of the display. So if you have two displays, then the first display’s resolution could be acquired like this:

import wx

index = 0  # index of the first display
display = wx.Display(index)
geo = display.GetGeometry()  # a wx.Rect with the display's position and size

Let’s write up a quick little application that has a single button that will just switch which display the application is on. Continue reading wxPython: Set Which Display the Frame is on
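As a rough sketch of the idea (not the application from the full post), the display-switching logic can be split into two small helpers plus one function that talks to wxPython. The names next_display_index, centered_origin and move_to_display are hypothetical and only used for illustration. The sketch assumes you want the frame centered on the target display.

```python
def next_display_index(current, count):
    """Cycle to the next display, wrapping back to 0 after the last one."""
    return (current + 1) % count


def centered_origin(geometry, frame_size):
    """Return the top-left point that centers a frame on a display.

    geometry is (x, y, width, height), the order that wx.Rect.Get() uses.
    frame_size is (width, height), the order that wx.Size.Get() uses.
    """
    x, y, width, height = geometry
    frame_w, frame_h = frame_size
    return (x + (width - frame_w) // 2, y + (height - frame_h) // 2)


def move_to_display(frame, index):
    """Move a wx.Frame onto the display with the given index."""
    import wx  # imported here so the helpers above work without wxPython

    geo = wx.Display(index).GetGeometry()
    origin = centered_origin(geo.Get(), frame.GetSize().Get())
    frame.SetPosition(wx.Point(*origin))
```

In a button handler you could keep the current index on the frame and call move_to_display(frame, next_display_index(current, wx.Display.GetCount())) each time the button is pressed.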

Python 3 – Assignment Expressions

I recently came across PEP 572, which is a proposal for adding assignment expressions to Python 3.8 from Chris Angelico, Tim Peters and Guido van Rossum himself! I decided to check it out and see what an assignment expression was. The idea is actually quite simple. The Python core developers want a way to assign variables within an expression using the following notation:

NAME := expr

This topic has had a LOT of arguments about it and you can read the details on the Python-Dev Google group if you want to. I personally found reading through the various pros and cons put forth by Python’s core development community to be very insightful.

Regardless, let’s look at some of the examples from PEP 572 to see if we can figure out how you might use an assignment expression yourself.

# Handle a matched regex
if (match := pattern.search(data)) is not None:
    ...

# A more explicit alternative to the 2-arg form of iter() invocation
while (value := read_next_item()) is not None:
    ...

# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]

In these 3 examples, we are creating a variable in the expression statement itself. The first example creates the variable, match, by assigning it the result of the regex pattern search. The second example assigns the variable, value, to the result of calling a function in the while loop’s expression. Finally we assign the result of calling f(x) to the variable y inside of a list comprehension.
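If you want to try these patterns yourself, here is a small runnable sketch. It assumes Python 3.8 or newer. The sample data and the helpers read_next_item and f are made up for illustration and are not part of the PEP.

```python
import re

# Handle a matched regex: match is bound inside the if condition
pattern = re.compile(r"\d+")
data = "order 42 shipped"
if (match := pattern.search(data)) is not None:
    first_number = match.group()

# Loop until a hypothetical reader function signals the end with None
items = iter([3, 1, None, 7])


def read_next_item():
    return next(items, None)


seen = []
while (value := read_next_item()) is not None:
    seen.append(value)


# Share a subexpression between the filter clause and the output
def f(x):
    """Hypothetical transform: double odd numbers, reject even ones."""
    return x * 2 if x % 2 else None


filtered_data = [y for x in range(5) if (y := f(x)) is not None]
```

Note that the while loop stops at the first None, so the 7 at the end of the sample list is never read.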

One of the most interesting features of assignment expressions (to me at least) is that they can be used in contexts where an assignment statement cannot, such as in a lambda or the previously mentioned comprehension. However, they also do NOT support some things that assignment statements can do. For example, you cannot do multiple target assignment:

x = y = z = 0  # Equivalent: (x := (y := (z := 0)))
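One way to check this for yourself, assuming Python 3.8 or newer, is to ask the built-in compile() function whether a given line parses. The is_valid helper below is a hypothetical name used only for this sketch.

```python
def is_valid(source):
    """Return True if source compiles as a Python statement."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# The regular chained assignment statement is fine
chained_statement = is_valid("x = y = z = 0")

# The nested, parenthesized assignment expression is also fine
nested_expression = is_valid("(x := (y := (z := 0)))")

# Chaining := without parentheses is rejected by the parser
chained_expression = is_valid("x := y := z := 0")
```

Here the first two flags come out True and the last one False, which matches the comment in the example above.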

You can see a full list of differences in the PEP.

There is a lot more information in the PEP that covers a few other examples, talks about rejected alternatives and scope.

Related Reading

PyDev of the Week: Naomi Ceder

This week we welcome Naomi Ceder (@NaomiCeder) as our PyDev of the Week. Naomi has been a long-time member of the Python community and is the author of The Quick Python Book. Naomi is the current chair of the board of directors for the Python Software Foundation and is a regular speaker at programming conferences. Let’s take a few moments to get to know Naomi better!

Can you tell us a little about yourself (hobbies, education, etc):

Well, a lot of people know that I’m currently chair of the PSF, and many have heard me speak about my journey involving gender.

People seem to be surprised to find out that I have a PhD in Classics (from the U of Wisconsin, Madison) – Ancient Greek, Latin, and historical Linguistics, with some Sanskrit and Egyptian Hieroglyphs thrown in. So I’ve been interested in different human languages and how they work for a long time and I think that’s helped me learn and think about computer languages. It certainly helps me read languages like Portuguese, Spanish, and French. Sadly, it doesn’t do much to help me speak those languages so I end up understanding a fair bit of what I read or hear, but then being tongue tied and looking like a dolt when I try to have a conversation.

My other odd claim to fame is that I was a competitive dog obedience trainer and judge (I even wrote a humor column for the national obedience training magazine) for several years, taking my first dog through all the levels of AKC and UKC obedience. That’s a pretty time-consuming hobby, and I retired from it a few years ago. Our current dog knows some obedience stuff, but she also knows that I’m no longer the stickler that I was when we were competing, and she takes advantage of that. But she’s cute and she knows how to work that, so who am I to nitpick? All of our dogs so far have been Australian shepherds – smart, reasonably athletic and energetic, but usually with a bit more of a sense of humor than you’d find in border collies. Anyone who knows the breed will understand what I mean.

Finally, since I started my current job leading a team for Dick Blick Art Materials, I’ve started drawing, something I’d always wanted to do. Like a lot of people, I think, I’d always wanted to capture some of the scenes around me, but I just felt it was beyond me. I guess the lure of using my employee discount, combined with thinking and talking about the tools and materials every day, overcame my reluctance and I started studying/practicing about a year and a half ago. In particular, I’m interested in what’s called “urban sketching”, quick sketches capturing city scenes, since I’ve always been fascinated by the scenes and shapes of cities. After PyCon I shared a sketch I did of the Rock and Roll Hall of Fame on Twitter, which was the first time I’ve done that. I don’t claim to be good, but I am getting better, and I do enjoy it.

Continue reading PyDev of the Week: Naomi Ceder

An Intro to PyPDF2

The PyPDF2 package is a pure-Python PDF library that you can use for splitting, merging, cropping and transforming pages in your PDFs. According to the PyPDF2 website, you can also use PyPDF2 to add data, viewing options and passwords to the PDFs too. Finally you can use PyPDF2 to extract text and metadata from your PDFs.

PyPDF2 is actually a fork of the original pyPdf, which was written by Mathieu Fenniak and released in 2005. However, the original pyPdf’s last release was in 2014. A company called Phaseit, Inc spoke with Mathieu and ended up sponsoring PyPDF2 as a fork of pyPdf.

At the time of writing, the PyPDF2 package hasn’t had a release since 2016. However, it is still a solid and useful package that is worth your time to learn.

The following lists what we will be learning in this article:

  • Extracting metadata
  • Splitting documents
  • Merging 2 PDF files into 1
  • Rotating pages
  • Overlaying / Watermarking Pages
  • Encrypting / decrypting

Let’s start by learning how to install PyPDF2!


PyPDF2 is a pure Python package, so you can install it using pip (assuming pip is in your system’s path):

python -m pip install pypdf2

As usual, you should install 3rd party Python packages to a Python virtual environment to make sure that it works the way you want it to. Continue reading An Intro to PyPDF2
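As a rough sketch of the virtual environment advice, you can create an environment and install into it from Python itself using the standard library’s venv module. The function names and the "pdf_env" directory are hypothetical, and this assumes a Python 3 interpreter with pip support available.

```python
import subprocess
import sys
import venv
from pathlib import Path


def env_python(env_dir):
    """Return the path of the interpreter inside a virtual environment."""
    env_dir = Path(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_env(env_dir):
    """Create a virtual environment that includes pip."""
    venv.create(env_dir, with_pip=True)
    return env_python(env_dir)


def install(env_dir, package):
    """Install a package with the environment's own pip."""
    command = [str(env_python(env_dir)), "-m", "pip", "install", package]
    subprocess.run(command, check=True)


if __name__ == "__main__":
    create_env("pdf_env")
    install("pdf_env", "pypdf2")
```

Running pip through the environment’s own interpreter means the package lands in that environment rather than in your system Python.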

Creating and Manipulating PDFs with pdfrw

Patrick Maupin created a package he called pdfrw and released it back in 2012. The pdfrw package is a pure-Python library that you can use to read and write PDF files. At the time of writing, pdfrw was at version 0.4. With that version, it supports subsetting, merging, rotating and modifying data in PDFs. The pdfrw package has been used by the rst2pdf package (see chapter 18) since 2010 because pdfrw can “faithfully reproduce vector formats without rasterization”. You can also use pdfrw in conjunction with ReportLab to re-use portions of existing PDFs in new PDFs that you create with ReportLab.

In this article, we will learn how to do the following:

  • Extracting certain types of information from a PDF
  • Splitting PDFs
  • Merging / Concatenating PDFs
  • Rotating pages
  • Creating overlays or watermarks
  • Scaling pages
  • Combining the use of pdfrw and ReportLab

Let’s get started! Continue reading Creating and Manipulating PDFs with pdfrw

Creating PDFs with PyFPDF and Python

ReportLab is the primary toolkit that I use for generating PDFs from scratch. However, I have found that there is another one called PyFPDF or FPDF for Python. The PyFPDF package is actually a port of the “Free”-PDF package that was written in PHP. There hasn’t been a release of this project in a few years, but there have been commits to its GitHub repository, so there is still some work being done on the project. The PyFPDF package supports Python 2.7 and Python 3.4+.

This article will not be exhaustive in its coverage of the PyFPDF package. However it will cover more than enough for you to get started using it effectively. Note that there is a short book on PyFPDF called “Python does PDF: pyFPDF” by Edwood Ocasio on Leanpub if you would like to learn more about the library than what is covered in this chapter or the package’s documentation.


Installing PyFPDF is easy since it was designed to work with pip. Here’s how:

python -m pip install fpdf

At the time of writing, this command installed version 1.7.2 on Python 3.6 with no problems whatsoever. You will notice when you are installing this package that it has no dependencies, which is nice. Continue reading Creating PDFs with PyFPDF and Python

PyDev of the Week: Maria Camila Remolina Gutiérrez

This week we welcome Maria Camila Remolina Gutiérrez (@holamariacamila) as our PyDev of the Week! Maria gave a talk at PyCon USA last month in their new PyCon Charlas track. You can learn more about Maria on her website or you can check out her GitHub profile to see what she has been doing in the open source world. Let’s take a few minutes to get to know her better!

Can you tell us a little about yourself (hobbies, education, etc):

My name is Maria Camila. I am a 23-year-old Colombian. I am a Physicist and soon to be Systems and Computing Engineer (I will graduate in June 2018). I am studying at Universidad de los Andes in Bogotá, Colombia. My main area of research so far is computational astrophysics, especially simulations. I am very interested in the topics of infrastructure, high performance computing and security. Regarding my hobbies, I love pottery and ceramics, and I have been doing it for 1.5 years so far. Also, I’ve been trying to learn bongo drums in my spare time. Finally, I love learning new languages, especially because they are bound to the culture of the countries that speak them, so it allows you to discover the world in a different way. Continue reading PyDev of the Week: Maria Camila Remolina Gutiérrez