PyDev of the Week: Jim Crist-Harif

This week we welcome Jim Crist-Harif (@jcristharif) as our PyDev of the Week! Jim is a contributor to Dask, Skein, and several other data science and machine learning Python packages. Jim also blogs about Python. You can see what he is currently working on over on GitHub.

Let’s spend some time getting to know Jim!

Can you tell us a little about yourself (hobbies, education, etc.)?

Hi, I’m Jim! I grew up near Minneapolis, MN. Growing up we weren’t allowed much screen time, so I didn’t really get into computer-y things until college. I was more into building physical things, and spent a large amount of time in my dad’s workshop.

In college I studied Mechanical Engineering, and liked it so much I continued on to graduate school, focusing on System Dynamics and Controls. Graduate school ended up being fairly detrimental to my mental health, so after 2 years I quit, moved to TX, and took a job with Anaconda. This turned out to be a great decision! My job there was to better the Python ecosystem, which let me work on all sorts of interesting projects (it also led me to give several talks).

Most of my work in the last 5 years has been related (in one way or another) to Dask – a flexible library for parallel computing. When I started working on Dask it was a small side project – it’s been rewarding to see it grow into a large ecosystem, with multiple companies heavily invested in its development and maintenance.

Outside of work, I’m an avid rock climber, cyclist, and wood carver.

Why did you start using Python?

Like any good grad student, I got into hacking on open source Python as a way to procrastinate on my research :). I was using SymPy to help derive the equations of motion for my research, and started to contribute back features I needed to continue my work. This gradually cascaded, until in the summer of 2014 I was accepted to participate in Google Summer of Code, working on improving SymPy’s classical mechanics and code generation. I learned a lot about software development best practices that summer (my academic code had previously been quite unmaintainable), and by the end of it I was hooked.

What other programming languages do you know and which is your favorite?

Most of the work I do is in Python (with some extensions written in Cython or C). I’ve also written and maintained projects in Julia, C++, Java (Skein), and Go (Dask-Gateway). Python is definitely my favorite user-facing language though – if I’m writing code in another language it’s usually to expose something to a Python dev.

What projects are you working on now?

Currently, I’m a software engineer at Prefect, a next-gen workflow management system. Prefect workflows frequently run on Dask, so a lot of the work I’ve been doing lately has been improving their Dask integrations.

I’m also still working on Dask as part of the core maintenance team. Most of my Dask work lately has been helping respond to PRs and issues, not so much development anymore. When I do have development time I’m mostly focused on Dask-Gateway – a centralized service for deploying and managing Dask clusters.
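
(Editor's note: for readers who haven't used Dask, here is a minimal sketch of the kind of parallel task graph it builds. It is not from the interview; the `square` function and the variable names are made up for illustration, and it assumes the `dask` package is installed.)

```python
from dask import delayed


def square(x):
    """Toy workload standing in for an expensive computation."""
    return x * x


# Wrapping calls in `delayed` records them as tasks instead of running them.
squares = [delayed(square)(i) for i in range(5)]

# Combining the lazy results adds one more task that depends on all the others.
total = delayed(sum)(squares)

# Nothing has executed so far; `compute()` hands the graph to a scheduler,
# which is free to run the independent `square` tasks in parallel.
result = total.compute()
print(result)
```

The same graph can be sent to a distributed cluster rather than run locally, which is the deployment side that Dask-Gateway is meant to manage.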

Which Python libraries are your favorite (core or 3rd party)?

I wouldn’t be able to work nearly as efficiently without the following tools:

  • Conda – cross-platform (and cross-language) dependency management. When developing for multiple platforms (and Python versions), conda makes it painless to create and switch between test environments. And with a robust dependency solver, you can upgrade packages without (or at least with less) fear of breaking things.
  • IPython Shell – fancy Python shell, I always have one running while developing.
  • Black – no longer argue about Python formatting, automate it!
  • pytest – readable tests, powerful fixtures, best testing library I’ve used in any language.
  • flake8 – besides PEP 8 issues, flake8 catches missing/unnecessary imports, typos, and more! flake8 runs as part of the CI on all the projects I maintain.

Big thanks to all the developers who helped make these tools.

Is there anything else you’d like to say?

We’re always looking for new contributors to Dask – if anyone wants to start contributing we have plenty of good starter issues. If you don’t know where to start, feel free to reach out on our Gitter and we’ll give you some pointers.

Thanks for doing the interview, Jim!