PyDev of the Week: Marius van Niekerk

This week we welcome Marius van Niekerk (@__mvn__) as our PyDev of the Week! Marius works on conda-forge, among other projects. You can see what else Marius is working on over on GitHub.

Let’s spend some time getting to know Marius better!

Can you tell us a little about yourself (hobbies, education, etc.)?

I’m Marius. I live in Brooklyn, NY these days.

I’ve had an interesting education path. I started out doing actuarial science at the University of Stellenbosch (South Africa), switched over to pure statistics, and then somehow ended up writing hotel management software full-time whilst studying.
After that, I transitioned into data science/engineering when I immigrated to the United States to work at an ad-tech startup.
I’m presently a principal engineer at Voltron Data working on some very cool problems in the data engineering space.

Of course, I’m not all work and no play! In my free time, I indulge in my love for culinary delights, whipping up mouthwatering dishes in the kitchen. And what better way to wash down all that delicious food than with a glass of fine wine? When I’m not busy sipping and savoring, I love to explore the vibrant streets of New York, always on the lookout for the next adventure!

Why did you start using Python?

Many, many years ago (2007) I was a Delphi programmer and used Python to cobble together a build bot that invoked MSBuild to churn out CI builds with ease. Shortly after that, I used web2py to build an online hotel reservation system.

When I transitioned to data science I started using Python for basically everything. I’ve been using Python ever since.

What other programming languages do you know and which is your favorite?

I’ve dabbled in many programming languages over the years – Delphi, Go, Scala, and a distressing amount of Bash (and we’re not including the mountains of YAML)! My favorite these days is probably Go due to the ease of maintaining high developer velocity in the language. Python can be a bit slow to iterate in (particularly for codebases without proper typing support).

Go and Delphi, oddly enough, feel somewhat similar due to the snappy compilation speeds achieved by single-pass compilers.

What projects are you working on now?

Open source-wise, I’m working on some of the ecosystem utilities and standards for conda and conda-forge.

Most recently my focus has been on conda-lock, a tool for generating npm-like lockfiles for conda environments.

Which Python libraries are your favorite (core or 3rd party)?

I’m a huge fan of pydantic for the almost immediate improvement in code quality it has on code bases that adopt it broadly. I also really like typer for building CLI applications.

How did you get involved with the conda-forge project?

So I’ve been contributing packages to conda-forge since 2016, which is just when the project was announced at SciPy. I was already heavily using conda for managing big data computing environments. Much of this was before wheels got reasonably good, so conda packages were the only real way to get easily installable packages for the C-extension-heavy data ecosystem.

I joined the conda-forge core team in 2018 and have been helping out ever since. My largest contributions were probably a bunch of the larger migration efforts that we’ve done over the years.

The gcc4 to gcc7 migration stands out to me in particular, as this was the first of the truly large-scale rebuild-the-world migrations that the conda-forge infrastructure had to handle. For around 7 months we basically built all the artifacts twice (once with gcc4 and once with gcc7) and then switched the default compiler to gcc7. This was a huge effort and involved a lot of work from the conda-forge team and the community. Ultimately, when we swapped over the channels to point to the gcc7 channel, there was relatively minimal fallout and the community could move forward easily.

As part of the move to make conda a fully open-source governed software project in 2022, I joined the steering council for conda to act as an advocate for standards and the broader user community. I’m excited to help shape the future of conda and take it to new heights.

What are the top three things you enjoy about conda?

  • Versatility of use
    Conda is as powerful as Docker for building environments without root access, and it works on platforms other than Linux. Being able to install terraform and pandas with conda is impressive.
  • Openness
    The conda-forge community is filled with incredible people who are willing to help out and contribute to the project. I’ve been really impressed with the level of engagement and the quality of the contributions that we’ve received over the years.
  • Scale
    Conda-forge is rather massive in scale, so it is an interesting space to build tools that try to mutate large swaths of the packaging ecosystem at once, whilst deftly avoiding GitHub rate limits.

Is there anything else you’d like to say?

Check out conda-lock. It’s a great tool for creating reproducible conda+pip environments. I’ve used it in a bunch of projects and it’s been a huge help when it comes to deploying a conda environment in any kind of distributed environment.

Thanks for doing the interview, Marius!