PyDev of the Week: Martijn Faassen

This week we welcome Martijn Faassen (@faassen) as our PyDev of the Week! Martijn is the creator of the popular lxml package and the Morepath web framework, among others. You can see what else he is up to over on GitHub or check out his blog.

Let’s take a few moments to get to know Martijn better!

Can you tell us a little about yourself (hobbies, education, etc.)?

I’m from the Netherlands, married, and the father of an 8-year-old son.

I have a garden plot near my house where I grow fruit, vegetables and flowers, and I get a lot of enjoyment out of that. It also gets me moving and away from the computer. I listen to podcasts as I garden. With a garden, you either get very little harvest or a lot all at once. When you get a big harvest the challenge becomes to process it, so I make fruit cordials and jam, and my wife makes sambal (a southeast Asian chili sauce). I like the magic of fermentation too, so I’ve made pickles and alcohol as well as fizzy drinks. I also make yoghurt for fun!

I have a wide range of interests beyond that: I enjoy history, science, and critical thinking. I like to read, both non-fiction and fiction. The fiction I enjoy includes science fiction and mysteries.

Currently, I’m playing with an e-paper display and writing a Rust driver for it. I also like to dabble with artificial life simulations.

Why did you start using Python?

In 1998 I was working on an artificial life simulation in C++. I was interested in using a scripting language so I could do some automated checks of the source code. On my Linux computer I had Perl available, so I tried that for a little bit. I could see the appeal but much of it was difficult to get into my head. I’d heard of Python as an alternative scripting language. One night I printed out the Python tutorial and I read it before I slept. It made sense to me: “it fits your brain” as was the slogan of the 2001 Python conference. The next day I could write Python. Not very idiomatic Python, but that came fairly quickly afterward. Because it was so much easier than C++ I wanted to use it for everything. I had done some programming temp work already at that time, and the next temp job I got (at the local university) I asked whether I could use Python. “Is it readable?” they asked, and I said yes, and there I was, having a job writing Python in 1998, which was really a rarity at the time. I remember reading the Python mailing list where people wistfully dreamed about a job where they could write Python. Soon after I got involved in web development with Python using Zope, in which I was very involved for many years. For my very personal history of Zope see https://blog.startifact.com/posts/my-exit-from-zope.html

Python had a much smaller community back then. I had to explain what Python was to everybody, even to other programmers. I’ve watched it grow into one of the most popular programming languages in the world. That’s a great advantage in many ways, but a small community has different advantages.

It was so easy to get to know prominent people within it — even though I hadn’t produced much Python code of value in those early days, it was easy to get to know people through mailing lists and at conferences, including Guido. This helped me when I got the ball rolling on the EuroPython series of conferences back in 2002. The first mail I wrote was to Guido asking whether he’d come and whether I could mention his name. After he said “yes”, I mailed a lot of other people, some of whom did a lot more work than I did in organizing it. And then we had our first conference! These days it would take a lot more to get Guido to say “yes”, of course, as there is so much more going on in the community.

It’s interesting to see that already in 1998 Python was a language that had a lot of libraries and integrations: from numeric processing to Windows integration. The breadth and depth of the libraries available for Python is one of its enduring strengths.

What other programming languages do you know and which is your favorite?

I consider myself a web developer and since the frontend has become more prominent in the last decade, JavaScript and TypeScript are also languages I use a lot. I like going all around the stack. Many concepts and programming techniques are as useful in backend code as they are in a frontend codebase.

After looking at Rust for some years, I seriously started to learn it last year. I’ve only used it for hobby projects, but it’s an interesting language because it makes different choices.

I’ve used a lot of programming languages over the years: BASIC, assembler, C, C++, Pascal and SQL among others. I like reading programming language manuals and I play with them, so I’ve dabbled with quite a few more, like Prolog, Haskell, Nim and ReScript, but none of them particularly seriously.

Python was my favorite programming language for years, and I still really enjoy it. But I appreciate aspects of many programming languages, so I find it more difficult to pick a favorite now. I’m happy writing TypeScript or Rust as well.

What projects are you working on now?

A project that I’m quite proud of is Morepath. It’s a Python micro web framework. The codebase is small, it’s easy to use, but packs a lot of power. It’s designed to be used to build complex applications and REST APIs. I wrote a bit about what makes it different here: https://morepath.readthedocs.io/en/latest/compared.html

Morepath is layered on top of two libraries which I also created: Reg, which is an implementation of predicate dispatch, and Dectate, which is an advanced software configuration library built around decorators.

Let me explain Reg.

You can view Python methods as functions that dispatch on the class of “self”. So if you call the “noise” method on an instance of the Chicken class, it produces a cackle, but if you call “noise” on an instance of the Rocket class, it produces a roaring sound. Which implementation gets executed depends on the class of the object (or “self”) you call it on.
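As a minimal sketch of that idea (the class and method names come from the description above; the return strings are illustrative assumptions), ordinary method calls already pick an implementation based on the class of the instance:

```python
class Chicken:
    def noise(self):
        # Implementation chosen when "self" is a Chicken
        return "cackle"


class Rocket:
    def noise(self):
        # Implementation chosen when "self" is a Rocket
        return "roar"


def make_noise(thing):
    # The same call site runs different code depending on type(thing)
    return thing.noise()


chicken_sound = make_noise(Chicken())
rocket_sound = make_noise(Rocket())

# A method call is equivalent to calling the class's function with
# the instance passed explicitly as "self"
explicit_sound = Chicken.noise(Chicken())
```

The last line shows why a method can be read as a plain function whose first argument decides which implementation runs.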

You can do this for functions too: a function that dispatches on the class of the first argument. This way you can add something like “methods” to existing classes without having to change them. This can be useful if you want to strongly decouple domain models from a UI layer, for instance. This is also built into the Python standard library as the @functools.singledispatch decorator.
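A short sketch of the standard-library version, assuming the same hypothetical Chicken and Rocket classes (here deliberately defined without any methods of their own):

```python
from functools import singledispatch


class Chicken:
    pass


class Rocket:
    pass


@singledispatch
def noise(obj):
    # Fallback used when no implementation is registered for the type
    raise NotImplementedError(f"no noise registered for {type(obj).__name__}")


@noise.register(Chicken)
def _(obj):
    return "cackle"


@noise.register(Rocket)
def _(obj):
    return "roar"


chicken_sound = noise(Chicken())
rocket_sound = noise(Rocket())
```

Neither class had to change to gain a "noise" behavior, which is the decoupling described above. Calling noise() on an unregistered type, such as an int, falls through to the base function and raises NotImplementedError.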

But sometimes you want to dispatch not just on the first argument, but on multiple arguments at the same time. This is called multiple dispatch. And sometimes you want to dispatch not just on the *class* of an argument but other attributes as well. In Morepath for instance we use Reg to dispatch on web requests: the class of the model instance, the request method (GET/PUT/POST/DELETE, etc), and the name of the last segment of the URL path (the “edit” in “document/edit”). Reg lets you do this.
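To make the request example concrete, here is a hand-rolled sketch of dispatching on three things at once: the model class, the request method, and the view name. This is not Reg's API and not how Reg is implemented; every name below (view, dispatch, Document, Report, Folder) is a hypothetical chosen for illustration.

```python
# Registry mapping (model class, request method, view name) -> function
_views = {}


def view(model, request_method="GET", name=""):
    """Register a function for a combination of predicates."""
    def decorator(func):
        _views[(model, request_method, name)] = func
        return func
    return decorator


def dispatch(obj, request_method="GET", name=""):
    """Find the best match, falling back to base classes via the MRO."""
    for cls in type(obj).__mro__:
        func = _views.get((cls, request_method, name))
        if func is not None:
            return func(obj)
    raise LookupError(
        f"no view for {type(obj).__name__} {request_method} {name!r}"
    )


class Document:
    pass


class Report(Document):
    pass


class Folder:
    pass


@view(Document)
def document_default(doc):
    return "show document"


@view(Document, name="edit")
def document_edit_form(doc):
    return "show edit form"


@view(Document, request_method="POST", name="edit")
def document_edit_save(doc):
    return "save document"


shown = dispatch(Document())
saved = dispatch(Document(), request_method="POST", name="edit")
# Report has no registrations of its own, so it inherits Document's
inherited = dispatch(Report(), name="edit")
```

A dictionary lookup like this only handles exact matches on method and name; a real predicate dispatch library has to deal with ordering, fallbacks, and caching in a more general way.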

One thing that would be interesting to explore with Reg is support for Python typing.

To understand more about Dectate I recommend you read my blog article on Framework Patterns: https://blog.startifact.com/posts/framework-patterns.html The entry about “language integrated declaration” explains the idea behind Dectate.

That’s framework stuff. I also work on applications, of course.

I’ve spent the last few years on a big Python/Django project that does invoice processing, with a React + MobX frontend. I’ve helped build it and mentored developers on the team along the way.

Recently I started working on a project that doesn’t use Python. That’s the first time for me in more than 20 years. It’s all in TypeScript, using React, Next.js and various JS backend technologies. I enjoy discovering cool new ways to do things in JS frameworks, though every once in a while I run into something that I’m used to in the Python world that I miss in JS: most recently a way to do database rollbacks between test cases, something SQLAlchemy and the Django ORM offer out of the box. But we implemented a way to do it.

Which Python libraries are your favorite (core or 3rd party)?

My friend Holger Krekel created pytest for unit testing, and I’m a big fan of it. I like that you don’t need to remember much to get started with it.

I also like Black, which is a source code auto-formatter. Tools that can make manual labor and discussions go away are great.

What is the origin story for the lxml package?

It’s hard to believe now, but long ago XML was cool new technology. Python had a bunch of XML libraries. Some of them were reasonably fast but didn’t have much in the way of features, and others had more features but were slow and memory hungry. There was one library that was both fast and had a lot of features (XPath, XSLT, schemas), in the form of the Python bindings to libxml2 and libxslt which are written in C. But using them was like writing C in Python: you had to be aware of memory leaks and you could get segmentation faults. It wasn’t a nice API at all.

Fredrik Lundh had implemented a Pythonic API for XML that was much better than the DOM API: ElementTree, which was later added to the Python standard library. About the same time I was working on lxml he’d also implemented cElementTree, which was quite fast. But it was still mostly a DOM API and didn’t support other features.
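For readers who have not used it, here is a small sketch of what the ElementTree style of API looks like, using the standard-library module so it runs without extra packages. The XML document and its element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative document; the element and attribute names are invented
SOURCE = """
<library>
  <book year="1998"><title>First Book</title></book>
  <book year="2001"><title>Second Book</title></book>
</library>
"""

root = ET.fromstring(SOURCE)

# Elements behave like lists of their children...
years = [book.get("year") for book in root]

# ...and offer simple path-based lookups
titles = [book.findtext("title") for book in root.findall("book")]

# Building new content uses the same objects
new_book = ET.SubElement(root, "book", year="2020")
ET.SubElement(new_book, "title").text = "Third Book"
book_count = len(root)
```

Elements act like Python lists with dictionary-style attributes, rather than requiring DOM-style node traversal calls.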

I decided to see whether I could implement the ElementTree API on top of libxml2, ignoring the existing Python bindings. To do this I used Pyrex, the project that Cython was later forked from. I got it to work: I had a Pythonic API based on ElementTree that was very fast, expanded with support for XPath, XSLT and schemas.

I then had the great fortune that someone much better at maintaining it, Stefan Behnel, took it over early on and pushed it much further. He also helped start the Cython project to improve Pyrex.

It’s the one open-source project that I created that I run into everywhere.

Thanks for doing the interview, Martijn!