This week we welcome Claudia Regio (@ClaudiaRegio) as our PyDev of the Week! Claudia is a program manager for Python Data Science with a focus on Python Notebooks in Visual Studio Code at Microsoft. She also blogs on Microsoft’s dev blog.
Let’s spend some time getting to know Claudia better!
Can you tell us a little about yourself (hobbies, education, etc.)?
I am originally from Italy and moved to the Greater Seattle Area when I was 4 years old. Growing up I lived and breathed squash, and had it not been for COVID I still would be! I have been and always will be a huge math nerd and have been tutoring in math for over 10 years now. I attended the University of Washington where I majored in Applied Physics and received two minors in Comprehensive Mathematics and Applied Mathematics while a member of both the Delta Zeta Sorority and the UW Men’s Squash Team.
After graduating I pursued a Data Science Certificate from the University of Washington to enhance my data analysis + data science skills while working at T-Mobile as a Systems Network Architecture Engineer.
After two years in that role, I transitioned to Program Manager at Microsoft for the Python Extension in VS Code, focusing on the development of the Data Science & AI components and features.
Why did you start using Python?
The courses in my data science certificate got me started on Python back in 2017.
What other programming languages do you know and which is your favorite?
I learned Java during my time in college, and while I enjoyed Java being a strongly typed language, no language beats Python when it comes to data science.
What projects are you working on now?
I am currently managing the Python + Jupyter Extension partnership in VS Code. While our recently released Jupyter Extension provides Jupyter Notebook support for other languages in VS Code Insiders, I focus on the collaboration of the two extensions to create an optimal notebooks experience for data scientists using Python.
Which Python libraries are your favorite (core or 3rd party)?
Scikit-learn will forever have my heart <3
What do you see as the best features of Python Notebooks?
Our best features in Python Notebooks currently include the variable explorer, data viewer, and my personal favorite, Gather (a complimentary VS Code extension).
When experimenting and prototyping in a notebook, it can often become busy as a user explores different approaches. After eventually reaching the desired result (for instance, a specific visualization of the data), a user would then need to manually curate the cells involved in this specific flow. This task can be laborious and error-prone, leaving users without a strong approach for aggregating related code. A second scenario that is common among Notebooks users is that software engineers are tasked with turning an existing notebook into a production-ready script. The process of pulling out unneeded imports, graphs, and outputs is often highly time-consuming and can lead to errors as well. Gather is a complimentary VS Code extension that grabs all the relevant and dependent code required to create the contents of a selected cell and extracts those code dependencies into a new notebook or Python script. This helps save data scientists and developers a lot of notebook cleaning time!
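To make the idea of collecting dependent code concrete, here is a minimal sketch of a cell-level backward dependency search built on Python's standard `ast` module. It is an illustration only and is not Gather's actual implementation: the helper names `names_defined_and_used` and `gather_cells` and the sample `cells` list are hypothetical, and the analysis is deliberately coarse (it tracks only variable names per cell, so it would miss in-place mutations such as `df.dropna(inplace=True)`).

```python
import ast


def names_defined_and_used(source):
    """Return (defined, used) sets of top-level names for one cell's source."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            # "import a.b as c" binds c; "import a.b" binds a
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
    return defined, used


def gather_cells(cells, target_index):
    """Return sorted indices of the cells needed to reproduce the target cell."""
    needed = {target_index}
    _, wanted = names_defined_and_used(cells[target_index])
    # Walk backwards: keep any earlier cell that defines a name we still need,
    # then start looking for the names that cell itself depends on.
    for i in range(target_index - 1, -1, -1):
        defined, used = names_defined_and_used(cells[i])
        if defined & wanted:
            needed.add(i)
            wanted = (wanted - defined) | used
    return sorted(needed)


# Hypothetical notebook: the cell sources are only parsed, never executed.
cells = [
    "import pandas as pd",
    "import matplotlib.pyplot as plt",
    "df = pd.DataFrame({'a': [1, 2, 3]})",
    "plt.plot([1, 2, 3])",        # exploratory plot, unrelated to the result
    "scratch = 42",               # leftover experiment
    "total = df['a'].sum()",
    "print(total)",
]

selected = gather_cells(cells, 6)
print(selected)  # [0, 2, 5, 6]
```

Selecting the final cell keeps only the pandas import, the DataFrame definition, and the sum, and drops the plotting import, the exploratory plot, and the scratch variable. A real tool would need a finer-grained analysis (per statement rather than per cell, and aware of mutation and execution order) to be reliable.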
Why should Python developers and data scientists use Visual Studio Code over another editor?
VS Code is a free and open source editor with a family of extensions (from both Microsoft and the open source community), products, and features that aim to make a seamless experience for developers and data scientists. A few examples include:
- Python (Comes with the Jupyter Extension): Includes features such as IntelliSense, linting, debugging, code navigation, code formatting, Jupyter notebook support, refactoring, variable explorer, test explorer, snippets, and more!
- Pylance: Language server that supercharges your Python IntelliSense experience with rich type information, helping you write better code faster.
- Live Share: Enables you to collaboratively edit and debug with others in real-time, regardless of what programming languages you’re using or app types you’re building. It allows you to instantly (and securely) share your current project, and then as needed, share debugging sessions, terminal instances, localhost webapps, voice calls, and more!
- Gather: A code cleaning tool that uses a static analysis technique to find all of the dependent code that was used to generate a selected cell’s result and then copy it into a new notebook or script.
- Coding Pack for Python Installer: An installer pack that helps students and new coders quickly get started by installing VS Code, all of the extensions above, as well as Python and common packages such as numpy and pandas.
- Azure Machine Learning: Easily build, train, and deploy machine learning models to the cloud or the edge with Azure Machine Learning service from the Visual Studio Code interface.
- Over 350 community-contributed Python-related extensions on the VS Code Marketplace!
It is the partnership constructed amongst these extensions and the open-source community as well as the Developer Division mindset to always build for the customer that creates an unmatchable experience for both developers and data scientists in VS Code.
Is there anything else you’d like to say?
I would like to thank the incredible team I get to work with (David Kutugata, Don Jayamanne, Ian Huff, Jim Griesmer, Joyce Er, Rich Chiodo, Rong Lu), who make this tool come to life, and all the customers who engage with us and are helping us build the best tool for data scientists!
If anyone would like to provide any additional feedback, feature requests, or help contribute back to the product you can do so here!
Thanks for doing the interview, Claudia!