Python: Create Fake Data with Faker

Every once in a while, I run into a situation where I need dummy data to test my code against. If you test against a new database or table, you will need dummy data often. I recently came across an interesting package called Faker, whose sole purpose is to create semi-random fake data. It can create fake names, addresses, browser user agents, domains, paragraphs and much more. We will spend a few moments in this article demonstrating some of Faker’s capabilities.


Getting Started

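First of all, you will need to install Faker. If you have pip (and why wouldn’t you?), all you need to do is this:

pip install Faker

The package used to be published on PyPI as fake-factory, but that name has since been deprecated in favor of Faker.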

Now that you have the package installed, we can start using it!


Creating Fake Data

Creating fake data with Faker is really easy. We will start with a couple of examples that create fake names:

from faker import Factory

#----------------------------------------------------------------------
def create_names(fake):
    """Print ten randomly generated full names."""
    for i in range(10):
        print(fake.name())

if __name__ == "__main__":
    fake = Factory.create()
    create_names(fake)

If you run the code above, you will see 10 different names printed to stdout. This is what I got when I ran it:

Mrs. Terese Walter MD
Jess Mayert
Ms. Katerina Fisher PhD
Mrs. Senora Purdy PhD
Gretchen Tromp
Winnie Goodwin
Yuridia McGlynn MD
Betty Kub
Nolen Koelpin
Adilene Jerde

You will likely receive something different. Every time I ran the script, the results were different. Most of the time, I don’t want the name to have a prefix or a suffix, so I created another script that only produces a first and last name:

from faker import Factory

#----------------------------------------------------------------------
def create_names2(fake):
    """Print ten names made of only a first and a last name."""
    for i in range(10):
        name = "%s %s" % (fake.first_name(),
                          fake.last_name())
        print(name)

if __name__ == "__main__":
    fake = Factory.create()
    create_names2(fake)

If you run this second script, the names you see should not contain a prefix (e.g. Ms., Mr.) or a suffix (e.g. PhD, Jr.). Let’s take a look at some of the other types of fake data that we can generate with this package.


Creating Other Fake Stuff

Now we’ll spend a few moments learning about some of the other fake data that Faker can generate. The following piece of code will create six pieces of fake data. Let’s take a look:

from faker import Factory

#----------------------------------------------------------------------
def create_fake_stuff(fake):
    """Print one fake value for each provider method named below."""
    stuff = ["email", "bs", "address",
             "city", "state",
             "paragraph"]
    for item in stuff:
        # Look up the method by name, call it, and print the result
        print("%s = %s" % (item, getattr(fake, item)()))

if __name__ == "__main__":
    fake = Factory.create()
    create_fake_stuff(fake)

Here we use Python’s built-in getattr function to look up each of Faker’s methods by name and then call it. When I ran this script, I got the following output (the address spans two lines because the generated string contains a newline):

email = pacocha.aria@kris.com
bs = reinvent collaborative systems
address = 57188 Leuschke Mission
Lake Jaceystad, KY 46291
city = West Luvinialand
state = Oregon
paragraph = Possimus nostrum exercitationem harum eum in. Dicta aut officiis qui deserunt voluptas ullam ut. Laborum molestias voluptatem consequatur laboriosam. Omnis est cumque culpa quo illum.

Wasn’t that fun?


Wrapping Up

The Faker package has many other methods that are not covered here. You should check out their full documentation to see what else you can do with this package. With a little work, you can use this package to populate a database or a report quite easily.
