PyDev of the Week: Paul Ganssle

This week we welcome Paul Ganssle (@pganssle) as our PyDev of the Week. Paul is the maintainer of the dateutil package and also a maintainer of the setuptools project. You can catch up with Paul on his website or check out some of his talks. Let’s take a few moments to get to know Paul better!

Can you tell us a little about yourself (hobbies, education, etc):

One thing that sometimes surprises people is that I started out my career as a chemist. I have a bachelor’s degree in Chemistry from the University of Massachusetts, Amherst and a Ph.D. in Physical Chemistry from the University of California, Berkeley. After that I worked for two years building NMR (nuclear magnetic resonance) devices for use in oil wells. In 2015 I was looking for a career with a bit more flexibility in terms of location and I made the switch to software development; one thing that is nice about the software industry is that tech companies are not afraid to hire people with non-traditional backgrounds if they know how to code.

I have the typical assortment of “hacker” and “autodidact” hobbies – learning languages, picking locks, electronics projects, etc. One of my favorite projects (which has unfortunately fallen a bit by the wayside) is my HapticapMag, a haptic compass that I built into a hat. I had it up and working for 2 or 3 weeks, but some parts broke and I never got around to fixing it. My tentative plan is to start up some new electronics projects in 4-5 years, when my son is old enough to be interested in that sort of thing.

Why did you start using Python?

I have two origin stories for this, actually. The more boring one is that around 2008 a friend of mine told me about this cool and increasingly popular programming language called Python that I should definitely learn, and I sort of picked it up and started using it for little system automation tasks.

What really got me into Python, though, was when I illustrated some point I was making in a forum post using a graph that I had made in Matlab and someone complained about the terrible aliasing in the plot and suggested I use matplotlib instead. I tried it out and the plots were so much better that I was instantly hooked. After that, I moved everything I could over from Matlab to Python and never looked back.

What other programming languages do you know and which is your favorite?

It’s hard to say when you “know” a programming language, but the programming languages I’m most confident with are C++, C and Rust (and probably some others like Matlab that I haven’t used in years but once knew pretty well). I can write enough JavaScript to get by, but to say I know it would be kind of like saying I speak Spanish because I can order a beer and ask where the bathroom is.

At the moment, I’m very excited about Rust, which is a memory-safe systems programming language targeting the use cases where C and C++ currently predominate. One of the very nice things about Rust is that there is a very enthusiastic community out there and it already has a flourishing ecosystem of third party packages, which I think is one reason there’s a lot of excitement about Rust in the Python community.

What projects are you working on now?

I do a lot of maintenance tasks that might not be considered “working on a project” for packages I maintain or help maintain like dateutil, setuptools and CPython (and I try to monitor the PyO3 issue tracker, though I have no official standing in that project); things like reviewing PRs, commenting on issues and participating in discussions.

In terms of features and other improvements, I’ve been trying to prepare a proof of concept for PEP 517-compatible editable installations and I’ve started working a bit on adding time zones to the standard library. I also have a few smaller projects in various states of completion that I occasionally work on, like my library variants, which I’ve given a few talks about, or my only half-complete library pyminimp3 – Python bindings around the C minimp3 library for processing MP3 files.

Which Python libraries are your favorite (core or 3rd party)?

I am a big fan of hypothesis, the property-based testing framework. With hypothesis, you write assertions about a property that your code has (e.g. “this parse operation is the inverse of this format operation”), given a domain of inputs (integers, datetimes, etc), and hypothesis randomly generates example inputs for you. I was originally very uneasy about the fact that the tests you write with it are non-deterministic, but I got over that very quickly; in reality, if there’s a bug in your code, it’s very rare for hypothesis to miss it on the first try, particularly if you are running tests for multiple platforms and multiple interpreters on each PR. The bigger problem I’ve had introducing hypothesis into a code base is that it finds a bunch of obscure edge cases that you haven’t handled, so you need to fix a bunch of bugs just to be able to start using it!
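As a rough illustration of the kind of property described above (this is not code from the interview), here is a minimal sketch of a hypothesis test. It assumes hypothesis is installed and picks the standard library's `datetime.isoformat` / `datetime.fromisoformat` pair as the format and parse operations; the test name is hypothetical.

```python
from datetime import datetime

from hypothesis import given, strategies as st


@given(st.datetimes())  # hypothesis generates naive datetimes as example inputs
def test_isoformat_round_trip(dt):
    # Property: parsing is the inverse of formatting.
    # The format/parse pair used here is an assumption chosen for illustration.
    assert datetime.fromisoformat(dt.isoformat()) == dt


if __name__ == "__main__":
    # Calling the decorated function runs it against many generated examples.
    test_isoformat_round_trip()
```

A test runner such as pytest would collect the function as usual; if the property fails, hypothesis reports a simplified input that triggers the failure.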

How did you become a core developer of Python?

I have been peripherally involved in Python development since I was asked to comment on PEP 495 (adding the fold attribute) as maintainer of dateutil. In late 2017 I started contributing bug fixes and features more actively, and monitoring the issue tracker for datetime-related issues. One of the biggest bottlenecks in the CPython development process is high-quality reviews, which is something I’m used to doing from years of maintaining open source packages, so I stepped in and started reviewing PRs as well.

In terms of the actual process of becoming a core dev, I took an increasingly common path for newer core devs, which is that Victor Stinner noticed my contributions and asked me if I was interested in eventually becoming a core dev; after I said yes, he and Pablo Galindo Salgado agreed to mentor me through the process of becoming a bug triager and eventually core developer.

Which modules do you work on and why?

My main expertise is with the datetime and related modules (e.g. time); this is partially a result of my randomly stumbling into maintaining dateutil back in early 2015 and partially down to the fact that most other people really would prefer not to work on datetimes. I’ve also been involved in packaging (as a maintainer of setuptools), but these days distutils doesn’t see much improvement because it is almost always preferable to make the improvements in setuptools instead.

Do you have any advice for others who would like to contribute to Python core?

If you have the opportunity to attend a sprint (as part of a conference or otherwise), that is usually the best way to make a first contribution to any open source project. If not, I think the best advice I can give is the same I would give for contributing to any open source project:

  1. Make a small PR with tightly scoped changes: these are much easier to review and merge.
  2. Minimize or eliminate changes to the public API: every change to the public API is something that the maintainers of that module will have to support indefinitely. Behind-the-scenes changes are a lot easier to accept because they’re a lot easier to undo.
  3. Add tests! Good tests are often the hardest part of a PR to write, so it’s very unlikely that your contribution will be accepted without them. They also make a contribution much easier to review, because they demonstrate exactly what your code is supposed to do and they enforce the behavior!

If you’re already a skilled reviewer, I also recommend looking around in the issue tracker for unreviewed PRs and giving your comments. This will really help the project and is a surefire way to build a reputation as a solid contributor.

Is there anything else you’d like to say?

Thank you for asking me to do this interview, and thanks to all the readers who’ve indulged my verbosity by reading all the way to the end.

Thanks for doing the interview, Paul!