PyDev of the Week: Gláucia Esppenchutz

This week we welcome Gláucia Esppenchutz (@glauesppen) as our PyDev of the Week! Gláucia is the author of Data Ingestion with Python Cookbook.

Let’s spend some time getting to know Gláucia better!

Can you tell us a little about yourself (hobbies, education, etc.)?

Hi, my name is Gláucia. I'm 31 years old, Brazilian, and living in Portugal.

I'm married and a “mother” of beautiful dogs! Last year, I bought a 3D printer and got utterly addicted to it. So, my hobbies include printing random stuff, playing video games, and reading.

I have worked as a Data Engineer for the past eight years and love what I do. I enjoy reading about data, how to optimize ingestion and transformation pipelines, and how to better monitor them.

I’ve been recently allocated to a team focused on Data Operations, which thrills me! Monitoring data and ensuring data quality is challenging.

A fun fact about me is that I don't have a degree in Computer Science or any engineering field. Actually, I graduated in the biomedical field. I changed my career when I met my husband, who is a software engineer.

I am late-diagnosed autistic, and the diagnosis saved my life.

Why did you start using Python?

Python is my mother language! I started using it when I shifted my career path. The language’s simplicity helped me learn it quickly and start working at a small startup.

What other programming languages do you know, and which is your favorite?

I learned how to program in JavaScript and PHP, but it was so long ago that I have no idea how to do it anymore, haha.

I had to learn Scala because of a project at a previous job. It’s not my favorite language, but it helps me a lot when I need to debug something in Spark.

Python will always be the language of my heart <3

What projects are you working on now?

Currently, I am working on two personal projects. One is called Apache Wayang, and it is in the incubator phase at the Apache Software Foundation. I work with them as a release manager, improving the docs and website.

The other project I am working on is the DE&I initiative at the Apache Software Foundation. The idea is to increase diversity in the open-source community and remove the biases we find in the tech area.

Both are long-term projects but very exciting!

Which Python libraries are your favorite (core or 3rd party)?

Hmm… that’s a tricky question. Based on the work I do, I would say Pandas. I can’t count how many times this lib has saved me when analyzing data. Even when using PySpark, I sometimes use its built-in conversion to Pandas (.toPandas()) to analyze something.

On the core side, the datetime lib is at the top of my list. Who hasn’t had problems with date formats when working with data? This core lib always saves me.
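As a small illustration of the date-format problem mentioned above, here is a minimal sketch using only the standard datetime module. It is not taken from the interview or the book: the function names and the list of candidate formats are hypothetical, and real data would dictate which formats to try and in which order.

```python
from datetime import datetime

# Hypothetical list of formats seen in incoming data; adjust to the real source.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y-%m-%dT%H:%M:%S")


def parse_date(raw: str) -> datetime:
    """Try each candidate format in order and return the first successful parse."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt)
        except ValueError:
            # This format did not match; try the next one.
            continue
    raise ValueError(f"Unrecognized date format: {raw!r}")


def to_iso_date(raw: str) -> str:
    """Normalize a raw date string to ISO 8601 (YYYY-MM-DD)."""
    return parse_date(raw).date().isoformat()
```

Calling to_iso_date("31/07/2023") and to_iso_date("2023-07-31T08:15:00") both yield "2023-07-31". Ambiguous inputs such as 01/02/2023 are resolved by the order of the format list, so that order is a design decision worth making explicitly.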

How did your book, Data Ingestion with Python Cookbook, come about?

I got an invitation from the publisher Packt. They wanted to make a book about Python and data in a cookbook format. Then, I proposed something for beginners starting out in the data world, but with some intermediate topics for those who already work with data pipelines.

The book covers the beginning of the data journey, like understanding the data we will work on and how to plan and monitor the pipelines.

What are the top three things you learned writing a book?

The first thing I learned was how to structure and plan a chapter. It seems simple, but creating a content flow and connecting the topics can take a lot of work. Now I feel more confident creating written content for my Medium blog, which I started writing after the book was released.

Second, my English improved a lot! I constantly had to search for synonyms and different ways to phrase things, which made me read a lot of new material.

Third, I learned how to do proper research. All the explanations in the book are based on pieces of code or documentation found in the source code. Of course, there are citations of other writers and blog posts. Still, I double-checked all the information so that my assumptions and content would be correct.