This week we welcome Pierre Denis as our PyDev of the Week! Pierre is the creator of Lea, a probabilistic programming package in Python. He can be found on LinkedIn where you can see his CV and learn more about some of the things he is up to. Let’s take a few moments to get to know Pierre better!
Can you tell us a little about yourself (hobbies, education, etc.)?
I have a Master’s degree in Computer Science from UCL Louvain-la-Neuve, Belgium, where I reside. I have been working for 20 years as a software engineer at [Spacebel](http://www.spacebel.be), a company developing systems for Space. Basically, I like everything creative and elegant. Besides arts, music and literature, I look for this in physics, algorithms, GUIs and mathematics. I love programming, especially in Python. So far, I have initiated three open-source Python projects: UFOPAX (textual virtual universe), Unum (quantities with unit consistency) and Lea (probabilistic programming). For these developments, I tend to be a perfectionist and consequently slow: I’m the kind of guy who rewrites the same program ten times, just for the sake of inner beauty!
Besides programming, I’m doing research in number theory (the twin prime conjecture). I also write short stories in French, my mother tongue, with some reference to the ‘Pataphysics of Alfred Jarry and a lot of nonsense. Incidentally and fortunately, programs can be good for producing nonsense, as I showed in my bullshit generator!
Why did you start using Python?
One day, a colleague showed me very interesting things he had made with that language, which was completely unknown to me. It was Python 1.5, in 1999! At that time, I was very much in favor of statically-typed languages (C++, Java, Ada, …). Intrigued, I read “Whetting your appetite” by G. van Rossum, then I devoured the wonderful “Learning Python” book by M. Lutz and D. Ascher. I was quickly won over by the clarity, the conciseness and the beauty of the language. So, I started Python basically because it was so appealing: it had the simplicity of an interpreted language with built-in containers, exceptions, OO, operator overloading, and much, much more. I soon became a zealous advocate of Python in my company.
What other programming languages do you know and which is your favorite?
Well, I’ve practiced Pascal, Ada, C, C++, Smalltalk, Java, Prolog, Scala and a few others. Python is my favorite one, by far. Now, if I had to award a silver medal, it would go to Scala. Actually, my programming experience with Scala somewhat changed the way I program in Python: I smoothly shifted to a more functional style, in particular preferring immutable objects to mutable ones. Besides these, I have to mention that C, Smalltalk and Prolog have been very influential for me.
What projects are you working on now?
For my company, I’ve worked recently on the qualification of MicroPython, the awesome project of Damien George, for onboard flight software (in short, Python running on a spacecraft!). Independently, I’m now contributing to a revamp of our workflow execution engine. This uses Python, RabbitMQ, OpenAPI and MongoDB, as well as Angular for the GUI. Like the previous version, it will be used in satellite operations centers (e.g. to switch on an antenna and perform uplink/downlink during satellite passes).
As for personal Python projects, I’m maintaining Lea, in particular to extend its ML abilities. Also, I’m working on SiriusBee, a videogame prototype using PyQt, where you drive a spacecraft through alien caves. As a longer background task, I’m still in search of a proof of the twin prime conjecture…
Which Python libraries are your favorite (core or 3rd party)?
Oh, so many of them are just great! Among the standard Python modules, I like ast, functools, datetime, json and bisect, just to name a few that I have used recently. In my experience, bisect is not very well known, probably due to its cryptic and unappealing name. I have recommended it several times to people needing to do efficient “range” lookups (they had usually started by implementing one themselves).
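To illustrate the kind of “range” lookup mentioned here, the following is a minimal sketch using the standard bisect module. The grading scale, the breakpoints and the `grade` function are hypothetical names chosen for the example, not something taken from the interview.

```python
import bisect

# Hypothetical scale: each breakpoint is the lowest score of the next range.
BREAKPOINTS = [60, 70, 80, 90]
GRADES = ["F", "D", "C", "B", "A"]  # one more label than breakpoints


def grade(score):
    """Return the label of the range that contains score."""
    # bisect_right counts how many breakpoints are <= score;
    # that count is the index of the matching range.
    return GRADES[bisect.bisect_right(BREAKPOINTS, score)]


print([grade(s) for s in (33, 60, 77, 89, 90, 100)])
```

Because the breakpoints are kept sorted, each lookup is a binary search rather than a chain of comparisons, which is what makes the approach efficient for large tables.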
For 3rd party ones, I do like PyQt, which I’ve used a lot. BTW, I’m coauthor of an introductory book on PyQt in French. SymPy is also really impressive, even if I’m acquainted only with the symbolic computation part. This allowed Lea to produce probability formulae, like p**5 * (1 - q)**3, instead of numbers. To my surprise, this was done almost for free, without changing a single line of Lea’s core algorithm. The magic of Python and duck-typing!
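The duck-typing effect described here can be sketched with the standard library alone. The function below is a hypothetical illustration, not Lea code: it only uses `**`, `*` and `-`, so any object supporting those operators can be passed in. The example uses floats and `Fraction`; objects such as SymPy symbols support the same operators, which is how a formula rather than a number can come out of unchanged code.

```python
from fractions import Fraction


def pattern_probability(p, q, hits=5, misses=3):
    """Compute p**hits * (1 - q)**misses for any number-like p and q."""
    # No type checks: the arguments only need to support **, * and -.
    return p**hits * (1 - q)**misses


# Same code, two different numeric types.
as_float = pattern_probability(0.5, 0.5)
as_fraction = pattern_probability(Fraction(1, 2), Fraction(1, 3))

print(as_float)     # a float
print(as_fraction)  # an exact Fraction
```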
What is the Lea package?
In essence, Lea lets you play with probabilities in a simple way. You can define discrete random variables (RVs) as dictionaries of objects with given probabilities. Then, Lea lets you combine these random variables, mostly as you would do for usual (deterministic) objects: typically, using arithmetic and comparison operators. You then obtain a new RV that you may query to get the calculated probabilities. An important concept is that the dependencies you define between RVs are kept internally. Think of that as some kind of “lazy evaluation”. This allows conditioning and Bayesian reasoning (what is the probability of a hypothesis H given observation X?). Lea fits very well in the field of probabilistic programming languages (PPL). Several packages, in Python and other languages, provide similar functionality but, from the feedback I have received, I honestly think that Lea is among the most intuitive and easy to use.
On the design side, the package is made of a few building blocks that can be combined to make complex probabilistic models. These are processed by an original inference algorithm in which Python generators play a key role. These building blocks are subclasses of “Lea”, the root abstract class, and are named “Alea”, “Blea”, “Flea”, “Clea”, etc. I know that this may sound odd, but having such short names helped me to reason; these are like members of a fairy family, whom I can summon so they converse together and… play Statues (the game that inspired the algorithm’s name). Needless to say, Lea is the piece of programming that I’m most proud of!
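The ideas above (discrete RVs as dictionaries, combination through ordinary operations, conditioning, and generators driving the enumeration) can be sketched in plain Python. This is only an illustration under simplifying assumptions: it is not Lea’s API and not the Statues algorithm, it handles just two independent variables by brute-force enumeration, and the names `uniform`, `joint_outcomes` and `distribution` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def uniform(*values):
    """Discrete RV as a dict mapping each value to an equal probability."""
    p = Fraction(1, len(values))
    return {v: p for v in values}


def joint_outcomes(x, y):
    """Generator yielding ((vx, vy), probability) for independent x and y."""
    for (vx, px), (vy, py) in product(x.items(), y.items()):
        yield (vx, vy), px * py


def distribution(outcomes, value=lambda o: o, condition=lambda o: True):
    """Build P(value(outcome) | condition(outcome)) as a dict."""
    weights = {}
    for outcome, p in outcomes:
        if condition(outcome):  # keep only outcomes matching the evidence
            v = value(outcome)
            weights[v] = weights.get(v, 0) + p
    total = sum(weights.values())  # normalize over the kept outcomes
    return {v: w / total for v, w in weights.items()}


die = uniform(1, 2, 3, 4, 5, 6)

# Distribution of the sum of two dice.
two_dice = distribution(joint_outcomes(die, die), value=sum)

# Distribution of the first die, given that the sum is at least 10.
first_given_high_sum = distribution(
    joint_outcomes(die, die),
    value=lambda o: o[0],
    condition=lambda o: sum(o) >= 10,
)

print(two_dice[7])
print(first_given_high_sum)
```

Keeping the joint outcomes, rather than only the final distribution, is what makes the conditional query possible; a real engine would track such dependencies far more efficiently than this exhaustive enumeration.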
What have you learned creating Python packages?
One lesson I learned is that any new function that you program requires much effort downstream: documentation, formal tests and tutorial updates. I spent much more time on these activities than on the programming itself. In Lea’s code, you may find one-liner methods with 20-line docstrings!
Also, I must confess that, for my personal projects, I had an inclination to work alone. I’ve learned, however, that there are many highly skilled people ready to help with proposals for improvements or direct contributions. What is great with open-source is that there always seems to be someone, somewhere, who can fill your gaps on any topic. I think especially of Chris MacLeod, who took over the maintenance of Unum, as well as Paul Moore and Nicky van Foreest, who made great contributions to Lea for optimization, test suite and packaging. Also, I have many examples of open discussions with knowledgeable people that led to important new ideas. I strongly believe in the virtues of cross-fertilization and serendipity.
Is there anything else you’d like to say?
Last week, my son wrote a Python script implementing a genetic algorithm. Basically, individuals learn by evolution/selection how to follow a given road to reach a target zone. After some tuning, his program works very well. He programmed everything without any third-party library, except PyGame for the display. I’m very impressed and proud of him.