PyDev of the Week: Tony Roberts

This week we welcome Tony Roberts as our PyDev of the Week! Tony is the creator and maintainer of PyXLL, a paid add-in for Microsoft Excel that lets you script Excel using the Python programming language. You can see what else Tony is working on over on GitHub. Tony writes on Medium and the Towards Data Science website.

Tony Roberts

Let’s take some time to get to know Tony better!

Can you tell us a little about yourself (hobbies, education, etc)?

Sure! I’ve been writing software in one form or another from an early age. I was always interested in programming growing up but imagined I’d go into something more hardware based as I was more into electronics. I became more serious about software development while studying at university, and was most interested in computer graphics. After graduating I was lucky enough to get a job working on video games with some really amazing developers. Although I spend a lot of my time in Python these days I still have a soft spot for low level programming.

Later I moved to London and got a job working in a small team in a hedge fund. Coming from video games to finance felt like a huge change and I wasn’t at all sure I was doing the right thing, but that was an incredible team that I thoroughly enjoyed working with. It was then I started to appreciate just how essential Excel is to so many people in their work and that’s what set me on the path to create the Python Excel add-in, PyXLL.

I’m also really into white water kayaking. I’ve been doing it probably almost as long as I’ve been writing code! It’s a great sport and has taken me to some amazing places all over the world, like Zambia, Ecuador and Nepal to name just a few. I’m fortunate to have the UK’s Olympic white water slalom course a short drive away and although it’s not the same as a remote river in an exotic country it’s a fantastic training facility that I take advantage of regularly.

Why did you start using Python?

I was working on a web application for a client while still at university, I guess around 1999 or 2000. This was back in the days when every website was using the LAMP stack and Python was not really on many people’s radar. I went to a developers conference and heard a talk about Python and that’s what first got me interested. I did finish the back end for the application in Perl as it wouldn’t have been worth the extra time and effort to switch languages midway through, but in my spare time I began playing around and doing some hobby projects in Python.

It was a few years later when I found myself working in finance for a hedge fund in London that I really started using Python in a professional capacity. The first large project I worked on was migrating the fund from a legacy system inherited from the parent bank to a newer third party system. This third party system had chosen Python (2.3 at the time) as their scripting language. So, we started using Python more and more until virtually everything new we created was written in Python.

What really struck me was how great Python is for being extensible. There are other languages (arguably) that are as expressive or have features compatible with Python, but not many come close to how easy it is in Python to drop down into C for integrating with native libraries. I know it’s actually relatively niche to write C extensions, but it’s one of my favorite features of Python.

What other programming languages do you know and which is your favorite?

That’s a tough one! I’ve used so many different languages over the years from assembly to Java and everything in between.They all have their strengths and weaknesses. The trick is deciding on which language is right for each task. I think more than knowing programming languages, what I really enjoy is learning new ones. Learning a new language really makes you think about the others you already know. It’s interesting to see how different people and communities approach things. If you only ever work in one language I feel you’re really limiting yourself as a developer.

One language that I keep coming back to over and over again is C++. It might seem surprising to bring up C++ when talking with a Python developer about favorite languages, but I really enjoy it. It’s been fascinating watching C++ evolve and transform. It’s so far away from the language it was 10 years ago it’s staggering (in a really good way!). My Python Excel add-in, PyXLL, is in fact around 80% C++ and the rest is Python (with the odd bit of assembly thrown in too!).

TypeScript is one I’ve been using a lot recently and I’m really enjoying it. I was skeptical to begin with and was extremely tempted to just stick with plain old Javascript. I’m glad I didn’t though. It’s a really impressive language once you get into it, and the tooling around it and React means you can be amazingly productive once you’ve got your head round the basics.

What projects are you working on now?

As you can imagine, running PyXLL is a full time job in itself! We’ve really grown a lot in the last few years and I spend a lot of my time talking to users to understand how they work and what they need from PyXLL. I still do a lot of actual product development and it’s critical for me to understand what people like and don’t like so much so the product can continually improve.

Just recently support for “Custom Task Panes” was added to PyXLL. This lets you write a user interface using one of the Python GUI toolkits like PySide, PyQt, wxPython or Tkinter and then PyXLL will host it seamlessly in a native Excel task pane. This was something a long time PyXLL user had asked for and it was a pretty involved project with some really technical challenges. Once it was ready it occurred to me that Jupyter notebooks run in any browser and Qt has a browser widget… so, using this new feature a Juypyter notebook could be hosted in an Excel task pane! It was then pretty straightforward to hook everything together with the underlying Python kernel also running in Excel. Now you can code in Python, write Excel functions and macros, and move data between Python and Excel, without ever leaving Excel. I still prefer using a Python IDE for writing any substantial amount of code but being able to interactively play around with ideas and try things out in Excel with Python is really cool.

It’s fun how often one idea leads to another like this. Jupyter notebooks in Excel wasn’t ever something that was really planned, it just came about because of the task panes feature.

Which Python libraries are your favorite (core or 3rd party)?

Aside from the usual numpy, scipy, pandas etc. there are a couple I’ll mention that perhaps aren’t so obvious and in my opinion are sometimes under-rated. The pywin32 package has been extremely useful to me for so many different tasks in my career. I’m also continually impressed by SqlAlchemy.

What Mark Hammond did with the pywin32 package is incredible. You could argue that with the addition of ctypes it’s less relevant than it used to be, but it’s still the easiest way to access the Windows APIs from Python. The win32com package is extremely well thought out. Even for COM haters it makes scripting Windows applications about as easy as I think you could hope for!

Whenever I have any database work to do in Python I will use SqlAlchemy. The more I dig into it the more I appreciate the design and how well it works. I love the expression building part of it. It’s so powerful for building complex queries on demand.

I could continue naming Python packages that I have a huge amount of appreciation for, but I guess what I’m most grateful for is the variety of what’s out there and the Python community that’s continually building it. I’ve been involved in many open source projects and lost count of how many meetups and conferences I’ve attended. The Python community is something quite unique.

How did PyXLL, the Python Excel add-in, come about?

I had a career change and moved from video games to finance when I moved to London in around 2004. I’d never really paid that much attention to Excel before, but obviously, I knew what it was. I was really lucky that I was working with some great people and spent a lot of my time working with traders, quants and other people outside of the development team. These guys were Excel gurus and lived and breathed in Excel. Everything we were building was in Python though, so that meant for anything to be really useful to them they needed a way to import and export to and from Excel.

I spent some time during that job figuring out how Python and Excel can interact via COM. You can write COM controls in Python which Excel can use, and you can script Excel using COM from Python. This worked to an extent, but it was always limited to what you can do with Excel via its COM API, and not to mention quite slow.

I had written Excel add-ins in C# and that gave a much better level of integration than was possible using Python and COM, and I figured there should be a better way to do the same in Python.

After I left that company I couldn’t let the idea go and came up with a way to actually embed Python into Excel in such a way that Python functions could be dynamically reflected out to Excel. It involved some pretty hairy code to make it work, but I had shown it was possible! I showed this first version to a few people, one of whom ran the London Python meetup at the time and suggested I show it there. The reaction I got in the first few weeks and months was amazing and I threw myself into building a first version to get it into the hands of some users. I think one of the things I find most satisfying is that those first customers are all still PyXLL users over 10 years later and I’m so grateful to them for their support.

What makes PyXLL better than the other Python Excel packages, like XLWings or OpenPyXL?

These packages are really all very different, but it does often cause confusion as on the surface they can appear to be tackling the same sorts of problems.

PyXLL is the only one of these that actually embeds the Python interpreter into Excel as an Excel add-in. It lets you write Excel add-ins in Python. That’s how you can call Python functions directly from Excel without any need for VBA or COM or anything like that. PyXLL is for people who want to write their own Excel add-ins to expose Python code to Excel users. Often they are also Excel users themselves and find the combination of working in Excel with a library of Python functions works well for them, but just as often it can be a development team delivering functionality to a team of non-developers. For example, many PyXLL users are trading teams in banks and hedge funds who work in Excel but need to access Python analytics. I always like to say that with PyXLL you can use Excel like you use a web browser; it’s just the front end to let users access your Python code. With PyXLL the Excel user often won’t even know what they’re doing is written in Python as everything integrates into Excel so seamlessly.

OpenPyXL is a different thing altogether. It’s a great package and one I often recommend to people. It’s for reading and writing Excel files rather than extending or working with a live instance of Excel. For many tasks that’s exactly what you want. If you’re running an overnight batch process that has to email a report to a team then you really don’t want to be running Excel on a Windows server! OpenPyXL understands Excel’s file formats and works directly with those rather than dealing with the Excel application itself.

Xlwings is different again. Xlwings is intended to be used for scripting Excel, and it does so by using the Excel COM API on Windows and the AppleScript API on macOS. If you need to interact with a running Excel application, for example, to read some values from a live spreadsheet or paste some values in, you can do it via Excel’s COM API. Xlwings provides a higher level abstraction on top of this but it’s not the same as writing a native, high performance, Excel add-in that integrates Python into Excel in the way PyXLL does.

PyXLL is used by a really diverse group of companies and individuals in all different industries. The most common theme or use case is that people are developing, or already have, some Python code that they need to expose to Excel to call interactively as part of some analysis. Sometimes it’s for themselves if they use Excel as part of their workflow, or other times it’s to roll out across an organization to serve all Excel users. For example, financial analytics that traders use to make real-time decisions. That’s not something you can do with any other product that I’m aware of. Just this morning I received an email from someone telling me how helpful PyXLL had been when refinancing their house! I never get tired of hearing about how PyXLL’s being used and often it’s a world away from what I envisaged when I began.

Thanks so much for doing the interview, Tony!