PyDev of the Week: Daniel Alejandro Mesejo-León

This week we welcome Daniel Alejandro Mesejo-León (@searchsort) as our PyDev of the Week! Daniel is the creator of the trrex package, which is used for efficient keyword extraction with regular expressions.

You can see what else Daniel is up to by checking out his GitHub profile.

Let’s spend some time getting to know Daniel better!

Can you tell us a little about yourself (hobbies, education, etc):

My name is Daniel Mesejo. I’m a software engineer from Cuba currently living in Spain. I love sharing knowledge: I have answered several questions related to Python, pandas, and numpy on StackOverflow, and I have occasionally spoken at Python events. I enjoy reading and hiking; a curious fact about me is that my Erdős number is 3.

Why did you start using Python?

I started using Python in college for some research projects. Once I started, I became obsessed with it. Python makes programming joyful; it gives me the same feeling I had as a kid solving logic puzzles.

What other programming languages do you know, and which is your favorite?

In my daily work, I use JVM-based languages: Java, Kotlin, and Groovy. Of those, I prefer Kotlin; for me, it is a mix of Java and Python.

What projects are you working on now?

I want to get back to contributing to open-source projects such as dask, eland, and prefect. I also want to grow the usage of my own project, trrex.

Which Python libraries are your favorite (core or 3rd party)?

My favorite libraries are collections, itertools, and operator; the number of things you can get done using only those three standard-library modules is impressive.
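As a small illustration of that point, here is a minimal sketch that uses only collections, itertools, and operator. The word list and variable names are made up for the example and do not come from the interview.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data, chosen only for illustration.
words = ["bat", "bad", "baby", "cat", "cab", "dog"]

# itemgetter(0) picks the first character of each word;
# Counter tallies how many words start with each letter.
first_letter_counts = Counter(map(itemgetter(0), words))

# groupby only merges adjacent items, so the words are sorted first.
grouped = {
    letter: list(group)
    for letter, group in groupby(sorted(words), key=itemgetter(0))
}

print(first_letter_counts.most_common(1))  # [('b', 3)]
print(grouped["c"])                        # ['cab', 'cat']
```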

How did the trrex package come about?

trrex transforms a list of strings into a regular expression so you can search and replace those words efficiently. I wrote it after reading https://stackoverflow.com/q/42742810/4001592. The question is commonly asked on StackOverflow, so I thought having a library for it might be helpful for others.
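To make the idea concrete, here is a minimal standard-library sketch of turning a word list into a single regular expression by merging shared prefixes in a trie. It is not trrex's actual code or API: the function name `build_pattern` and its `prefix`/`suffix` parameters are assumptions made for this example.

```python
import re


def build_pattern(words, prefix=r"\b", suffix=r"\b"):
    """Return one regex string that matches any of `words` (illustrative sketch)."""
    words = list(words)
    if not words:
        raise ValueError("at least one word is required")

    # Build a trie as nested dicts; the empty-string key marks the end of a word.
    trie = {}
    for word in words:
        node = trie
        for char in word:
            node = node.setdefault(char, {})
        node[""] = {}

    def to_regex(node):
        # A node holding only the end marker contributes nothing further.
        if "" in node and len(node) == 1:
            return ""
        branches = [
            re.escape(char) + to_regex(child)
            for char, child in sorted(node.items())
            if char != ""
        ]
        body = branches[0] if len(branches) == 1 else "(?:" + "|".join(branches) + ")"
        if "" in node:
            # A word ends here, so whatever follows is optional.
            body = "(?:" + body + ")?"
        return body

    return prefix + to_regex(trie) + suffix


pattern = re.compile(build_pattern(["baby", "bat", "bad"]))
print(pattern.findall("The baby was scared by the bad bat"))
# ['baby', 'bad', 'bat']
```

Because the three words share the prefix "ba", the generated pattern tests that prefix once instead of trying each word as a separate alternative, which is what makes this kind of pattern efficient for large word lists.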

What are the top three things you love about trrex?

Its ease of use: it is just one function with three parameters that returns a string.

The performance of the obtained regular expression pattern matches that of even specialized libraries like flashtext.

It integrates well with other data science libraries: spacy, pandas, regex, scikit-learn.

Ease of use and combining well with others are traits of built-in Python functions, so I’m proud of that fact.

Is there anything else you’d like to say?

For those who are starting to learn Python and want to use StackOverflow, I suggest you look at https://sopython.com/canon/; it collects pointers to the most common questions. I also encourage everyone to try trrex when using regular expressions for keyword manipulation.

Thanks for doing the interview, Daniel!