PyDev of the Week: Allan Campopiano

This week we welcome Allan Campopiano (@AllanCampopiano) as our PyDev of the Week! Allan is the creator of Hypothesize, a Python package for hypothesis testing using robust statistics.

If you’d like to see what else Allan is up to, you can check out his GitHub profile.

Let’s take a few moments to get to know Allan better!

Can you tell us a little about yourself (hobbies, education, etc.)?

I initially wanted to be a professional drummer—even went to school for it. Turns out that I’m only good enough to do it in my spare time.

After realizing that I should choose a different career, I decided to go back to school for Psychology. This eventually led to me getting a neuroscience degree and falling in love with coding.

Why did you start using Python?

  • I volunteered in a neuroscience laboratory that dealt with massive amounts of electrophysiology data (recordings of electrical brain activity).
  • I noticed that there was always a lineup at the technician’s door (his name was James). Grad students and professors relied on his coding expertise to process their data.
  • I figured that if I learned those skills, I too might be able to help others and build a viable career.
  • Based on this, I decided to learn how to code—under James’s kind tutelage. It started out as MATLAB, then R, then finally Python.

What other programming languages do you know and which is your favorite?

Python is by far my favorite general-purpose programming language. I find that SQL and Python work very well together in modern coding interfaces (notebooks and IDEs).

What projects are you working on now?

  • I’m preparing for my PyCon 2023 talk! It’s all about the so-called contaminated normal curve. It looks normal but it will ruin your stats (did I mention that I’m a stats enthusiast?).
  • I’m investigating Spark deployment options (Dataproc, K8s, EMR).
  • As a solutions engineer (and data scientist by trade), I’m building notebook-based solutions for users of Deepnote. Shameless plug, but I am extremely proud of where we are taking notebooks at Deepnote.
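
For readers who have not met the contaminated normal curve mentioned above, here is a minimal sketch using only the Python standard library. It assumes the common textbook parameterization (90% of draws from a standard normal, 10% from a normal with standard deviation 10); those numbers, and the function names, are illustrative assumptions and are not taken from the interview or from Allan's talk. The point it tries to show is that a small share of wide-tailed draws can inflate the variance roughly tenfold while the center of the distribution barely moves.

```python
import random
import statistics


def sample_contaminated_normal(n, eps=0.1, wide_sd=10.0, seed=0):
    """Draw n values: N(0, 1) with probability 1 - eps, else N(0, wide_sd**2).

    eps and wide_sd are illustrative defaults (an assumption, not from the source).
    """
    rng = random.Random(seed)
    return [
        rng.gauss(0.0, wide_sd if rng.random() < eps else 1.0)
        for _ in range(n)
    ]


def mixture_variance(eps=0.1, wide_sd=10.0):
    """Exact variance of the zero-mean mixture: (1 - eps) * 1 + eps * wide_sd**2."""
    return (1.0 - eps) * 1.0 + eps * wide_sd ** 2


draws = sample_contaminated_normal(100_000)

# A standard normal has variance 1; the mixture's variance is far larger.
print("theoretical variance:", mixture_variance())
print("sample variance:", statistics.variance(draws))

# The median stays close to 0, so the bulk of the data still looks "normal".
print("sample median:", statistics.median(draws))
```

With the default settings the theoretical variance works out to about 10.9, compared with 1 for a standard normal, which is why variance-based methods such as the classic t-test can lose so much power on data like this.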

Which Python libraries are your favorite (core or 3rd party)?

I must say Altair. I just love this library. Much of my work involves exploring data quickly and Altair was a game-changer for me in this respect. Altair also provided much inspiration for no-code chart blocks—making data exploration very easy.

Another killer library is Snowpark. I do a lot of work with data in Snowflake, and Snowpark gives me a Python interface for machine learning in the warehouse. So cool.

What is the origin story for Hypothesize?

Hypothesize is a great passion of mine. It’s all about bringing robust statistics to the Python library ecosystem. It is based on Rand Wilcox’s package in R called WRS.

It began when James and I started coding robust statistical algorithms in our laboratory days. A few years later, I decided to create a proper Python library for robust statistics.

Publishing the associated manuscript with Rand Wilcox was a career highlight for me. He was extremely helpful during the development of Hypothesize.

Is there anything else you’d like to say?

Thanks to Mike Driscoll for inviting me to PyDev of the Week! Been a pleasure!

Thanks for doing the interview, Allan!