This week we welcome Zac Hatfield-Dodds as our PyDev of the Week! Zac is a core developer of the Hypothesis package, a Python library for property-based testing.
Let’s take a few moments to get to know Zac better!
Can you tell us a little about yourself (hobbies, education, etc):
I grew up in Australia, until moving to San Francisco for work last year, and for as long as I can remember I’ve loved the outdoors. My family goes camping regularly – at the beach all year as a kid, or for weeks in the mountains every summer. I’ve also always loved reading, so you can often find me tucked into a science fiction or fantasy novel in the evenings – I put up with ebooks when travelling to save weight – and I also enjoy nonfiction, usually about history or technology or both. I’ve also been known to make iridescent chocolate for parties, tempering it against a diffraction grating – it’s edible and beautiful nanotechnology!
My undergrad was in interdisciplinary studies at the Australian National University – I started programming towards the end of that time, when I discovered that the projects I wanted to do in science classes required traditional programming, for data analysis beyond what Microsoft Excel could handle. These days I lead a research team at Anthropic, but I’m still working towards my PhD on evenings and weekends – it’s in computer science and largely based on my open-source projects, so while there’s not much subject-matter overlap with work I still spend a lot of time thinking about it.
How did you get involved with the Hypothesis project?
I started using Python late in my degree for scientific data analysis, and loved it. I spent most of my project classes working on various packages for it. In fact, one of those packages was my first contribution to the Python Package Index and my first GitHub repository that was useful to someone else! Two of these projects also turned into part-time jobs, which was a lot of fun.
I actually discovered Hypothesis, which was started by David MacIver, through one of those science projects. There’s a research forest at the national arboretum, and the School of Biology had recently started using drones to capture high-resolution 3D models of the forest, which I still find incredibly cool. My project was to take this many-gigabyte 3D model, and… count the trees. We knew how many had been planted, but hadn’t actually counted them, and wanted to automatically derive some basic metrics – a list of trees with height, area, and canopy colour.
The problem was that my program worked perfectly for the small data sets that we had, and it worked perfectly when I clipped a subset of the large data set, but whenever we ran it on the full large data set my program would spend about 16 hours processing it and then crash. It’d get all the way through the data processing, create the final outputs, and then crash instead of writing out the output file. I didn’t know what to do, but when it takes you a full working day to do one run of your program before it crashes, you have plenty of time to study better ways to test code!
I’d read about property-based testing before and decided that it sounded pretty promising. I spent about a day reading the Hypothesis docs, playing around with it, and implementing some extensions to its numpy support that I needed to generate arrays of structs (to create input files). Once I had that, it was a matter of about 20 seconds to discover that the library I was using would crash if you attempted to write a file where one of the column names contained a space character. 15 minutes later I had patched it, installed the patched version of the library, and set off another run! The next day I discovered that this had indeed fixed my problem, and so after several weeks of confusion and stress the project was successful after all – and I was hooked.
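The workflow Zac describes – generate structured inputs, feed them to the code under test, and keep any input that triggers a crash – is the core loop of property-based testing. A hand-rolled, stdlib-only sketch of that idea (this is not Hypothesis’s actual API, and `write_columns` is an invented stand-in for the buggy library call) might look like:

```python
import random
import string


def write_columns(names):
    # Hypothetical stand-in for the buggy library: it rejects any
    # column name containing a space, like the real bug described above.
    for name in names:
        if " " in name:
            raise ValueError(f"invalid column name: {name!r}")
    return len(names)


def random_name(rng):
    # Draw a short name from lowercase letters plus space,
    # so the buggy case is reachable by random generation.
    alphabet = string.ascii_lowercase + " "
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(1, 8)))


def find_failing_input(trials=1000, seed=0):
    # The property under test: write_columns should accept any name list.
    rng = random.Random(seed)
    for _ in range(trials):
        names = [random_name(rng) for _ in range(rng.randint(1, 5))]
        try:
            write_columns(names)
        except ValueError:
            return names  # a counterexample, found in seconds not hours
    return None


counterexample = find_failing_input()
```

Hypothesis adds a lot on top of this sketch – rich input strategies, shrinking of failing examples to a minimal case, and a database of known failures – which is why twenty seconds of fuzzing beat weeks of 16-hour debugging runs.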
I contributed my extensions upstream, hung around and fixed a few other bugs and documentation problems that I’d seen, and because I had enjoyed that so much I kept hanging around and kept fixing more. Six years later I lead the Hypothesis project, and I’ve made some great friends and gotten to work on fascinating projects that I don’t think would have happened otherwise!
Which Python libraries are your favourite (core or 3rd party)?
I often end up contributing to my favourite libraries, so I want to be clear that I was a fan first – it’s not because I help maintain things that I like them, though I find it does deepen my appreciation.
I’m a huge fan of Trio, the library for async structured concurrency – I feel like I can in fact write nontrivial concurrent code and get it right, which is incredible. Read Notes on Structured Concurrency or watch my talk from PyCon to learn more!
I also really like Hypothesis and the surrounding ecosystem, or I wouldn’t spend so much time working on it. See the rest of this interview!
LibCST makes automated refactoring remarkably easy, and I’ve enjoyed building auto-fixer tools for deprecations in Hypothesis or async antipatterns in flake8-trio, and autoformatting tools like shed have also been fun.
In the standard library, contextlib is an under-appreciated tool for correct code – without try/finally or context managers, forgetting to clean up or close a resource is all too easy when an unexpected exception is raised. I also find functools remarkably useful – easy caching and partial application are great.
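As a brief illustration of both points (the names here are invented for the example), a context manager built with contextlib guarantees cleanup even when an exception escapes, and functools gives caching and partial application in a line each:

```python
import contextlib
import functools


@contextlib.contextmanager
def managed_resource(log):
    # The `finally` block runs even if the body raises,
    # so the resource is always "closed".
    log.append("open")
    try:
        yield log
    finally:
        log.append("close")


events = []
with contextlib.suppress(RuntimeError):
    with managed_resource(events) as log:
        log.append("work")
        raise RuntimeError("unexpected failure")
# Despite the exception, "close" was still recorded.


@functools.lru_cache(maxsize=None)
def fib(n):
    # Memoised recursion: exponential without the cache, linear with it.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


# Partial application: fix the first argument of a two-argument function.
double = functools.partial(lambda x, y: x * y, 2)
```

Without the context manager, an early `raise` would skip any cleanup code placed after it; with it, the guarantee is structural rather than something each caller has to remember.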
What projects are you working on now?
My current open source projects are mostly related to testing in Python. I lead the Hypothesis project, and have a variety of related spinoffs / extensions / new features as part of my ongoing PhD – thankfully the implementation is done, so I’m “just” writing my dissertation around work.
The Ghostwriter writes tests for you – sometimes ready to run, other times partially-filled templates to help you get started. The idea is to make it as quick and easy as possible to start using Hypothesis!
Hypothesis’ explain phase makes it easier to understand why your test failed, going beyond the traditional minimal failing example. We vary each part of the input, trying to find parts of the minimal failing example which can vary without the test ever passing – for example, in `divide(x=0, y=0)` x can be any value! We also report lines of code which were always run by failing examples but never passing examples (if any!), which can be a useful complement to a traceback.
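The divide example is easy to reproduce by hand: once y is pinned at zero, every value of x produces the same ZeroDivisionError, which is exactly the signal the explain phase surfaces. A minimal stdlib check of that claim (the `divide` function here is just a hypothetical test target, not part of Hypothesis):

```python
def divide(x, y):
    # Hypothetical function under test.
    return x / y


# With y=0 fixed, vary x and record which values still fail.
failures = []
for x in (-10, 0, 3, 1_000_000):
    try:
        divide(x, 0)
    except ZeroDivisionError:
        failures.append(x)

# Every x failed, so x is irrelevant to this bug: y=0 is the culprit.
```

Reporting "x can be anything" points the developer straight at the argument that actually matters, instead of leaving them to guess from a single shrunk example like `divide(x=0, y=0)`.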
HypoFuzz adds smart, coverage-guided fuzzing for your whole test suite. It’s a higher-overhead way to find failing inputs, but if you want to run something overnight or all weekend it’s a fantastic addition to your workflow – and has zero-effort integration with your local or CI test runs via the shared database of failing examples. You just run your tests, and they replay whatever the fuzzer found!
Finally, I work on a variety of unrelated projects too – as a Pytest core dev I’m particularly interested in making warnings and error messages as helpful as possible (sometimes in CPython); I contribute to Trio occasionally for the async ecosystem; and have added several linter warnings to flake8-bugbear and started flake8-trio. On the niche side, I have an autoformatter named shed which wraps black, pyupgrade, isort, autoflake, and a bunch of custom refactoring logic to provide a maximally standardised code style.
Thanks for doing the interview, Zac!