Bootstrapping Django Database with Faker


Sept. 26, 2021, 2:23 p.m.




Developers and QA engineers need a reasonable amount of data in the database to implement, enhance and test features. Populating the database by hand is tedious work. The Python Faker package can generate that data programmatically, which saves development and testing time. In this blog, we will discuss integrating the Faker package with Django. You can find the Faker package documentation here.

Please note: this blog is aimed at intermediate-level learners who have a fair knowledge of Django's core structures and database queries.
 

We will create a custom Django admin command, which adds a manage.py action for a Django app. In this case, the command is called populate_database. To do this, create a management directory inside an app and a commands subdirectory inside it, then create the populate_database.py module in that subdirectory. Django registers a manage.py command for each Python module in that directory whose name doesn't begin with an underscore. The custom command is available to any project that includes the app.
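The directory layout described above can be sketched with a few lines of standard-library Python. This is only an illustration: the app name music is hypothetical, the files are created in a temporary directory rather than a real project, and the __init__.py files follow the layout shown in the Django documentation.

```python
import tempfile
from pathlib import Path

APP_NAME = "music"  # hypothetical app name, not taken from the original post


def scaffold_command(project_root: Path, app_name: str, command_name: str) -> Path:
    """Create <app>/management/commands/<command_name>.py and return its path."""
    commands_dir = project_root / app_name / "management" / "commands"
    commands_dir.mkdir(parents=True, exist_ok=True)

    # Mark both directories as Python packages.
    for package_dir in (commands_dir.parent, commands_dir):
        (package_dir / "__init__.py").touch()

    # The module name becomes the name of the manage.py command.
    module = commands_dir / f"{command_name}.py"
    module.touch()
    return module


with tempfile.TemporaryDirectory() as tmp:
    scaffold_command(Path(tmp), APP_NAME, "populate_database")
    # Collect the created files as paths relative to the temporary project root.
    layout = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*.py"))
    print("\n".join(layout))
```

In a real project you would create these files by hand (or with your editor) inside the existing app directory; the point is only the shape of the tree.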

 

screenshot of command

 

Install the Faker package:

pip install Faker

 

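To implement the command, we extend the BaseCommand class and define a handle method on it.

from django.core.management.base import BaseCommand

class Command(BaseCommand):
    # Short description shown by "python manage.py help populate_database"
    help = "Command Information"

    def handle(self, *args, **kwargs):
        # Placeholder body; the Faker code goes here later
        print("hello world!")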

 

Test the custom command:

python manage.py populate_database

 

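This will print hello world!. We are now ready to implement Faker, but first we need to take a look at the database. I am using three models, Musician, Album and Concert, which between them include a many-to-one and a many-to-many relationship, so they should be enough for reference. As the code later in this post shows, Musician has first_name, last_name and age fields; Album has name, release_date and an artist foreign key to Musician; and Concert has name, location and an artists many-to-many field to Musician.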

 

 

 

 

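Import Faker and providers from faker, along with the random module, which is used later to generate ages. The three models also need to be imported from your app's models module.

import random
from faker import Faker, providers

We will populate the Musician table first, because the Album and Concert tables both reference it (through a foreign key and a many-to-many field respectively). Inside the handle method, instantiate the Faker class:

fake = Faker()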

 

At this stage, we need to know about providers. Providers are the components that generate data for us. There are standard providers, which are predefined and can be used right away, and community providers, which need to be installed separately. You can also write your own custom providers. In our case we will use standard and custom providers. Feel free to explore the standard and community providers.

We will populate 15 rows of the Musician table with the following code:

for _ in range(15):
    f_name = fake.first_name()
    # unique guarantees that no last name is generated twice
    l_name = fake.unique.last_name()
    m_age = random.randint(10, 130)
    Musician.objects.create(first_name=f_name, last_name=l_name, age=m_age)

Here, first_name and last_name are standard providers. To reference the Musician rows from the other tables, we need their primary keys. We will collect all the primary keys into a tuple with the following query:

Muzician_ids=tuple(Musician.objects.values_list('id', flat=True))

 

Next, we will write the code that populates the Album table. There is no standard provider for album names, so we have to write a custom provider. First, make a list of some album names:

albums=[ 'Justice To All', 'Behind The Scene Mate/s', 'Suffering Is Optional', 'Typical of Friends', 'A dance to my sorrows', 'Cheerful giving', 'Holy Life', 'The Grand Creator', 'Disconnected', 'Love has no ending', 'A Poison To My Heart', 'The Beauty Of Life', 'Plight of a widow', 'It Was Too Good To Be True', 'Its An Old Story', 'Fearing God', 'the little things', 'Be Strong & Courageous', 'Underestimated By Many', 'Talents', 'My Hidden Pain', 'You Were A True Friend' ]

 

Create a new class that extends providers.BaseProvider, then add a method to that class that returns an album name:

class Provider(providers.BaseProvider):
    def album_names(self):
        return self.random_element(albums)

Here, random_element() returns one randomly chosen item from the albums list.



We need to register the custom provider with Faker. Inside the handle method, add the following line after Faker() is instantiated:

fake.add_provider(Provider)

 

Now write the snippet below to populate the Album table:

for _ in range(len(albums)):
    album_name = fake.unique.album_names()
    date = fake.date()
    Album.objects.create(name=album_name, release_date=date, artist_id=fake.random_element(elements=Muzician_ids))

In a many-to-one relationship, each album in the Album table belongs to exactly one artist in the Musician table, while one artist can have many albums. Here, we assign one random id from the Muzician_ids tuple to the artist_id field.

Now we need to populate the last table, Concert.

There is no standard provider for concert names either, so we will write a custom provider in the same way as for album names. First, we will make a list of concert names:

concerts=[ 'Annual Sin', 'Annual Wonderland', 'Boogie Horizon', 'Boogie Paradise', 'Danceex', 'Dance Paradise', 'Festivscape', 'Festival Oasis', 'Fest Invasion', 'Fest Kingdom', 'Fest VIP', 'Gala Jungle', 'Gala Temple', 'Gigscape', 'Gig Glory', 'Midsummer Splendour', 'Music Dreamland', 'Music Heritage', 'Party Playground', 'Venue Beast']

 

Then declare a method inside the Provider class that returns a random concert name:

class Provider(providers.BaseProvider):
    def album_names(self):
        return self.random_element(albums)

    def concert_names(self):
        return self.random_element(concerts)

 

Now write the snippet below to populate the Concert table:

# Loop assumed here, mirroring the Album loop: one concert per name in the list
for _ in range(len(concerts)):
    concert_name = fake.unique.concert_names()
    concert_location = fake.city()
    artists_id = fake.random_choices(elements=Muzician_ids)
    c1 = Concert.objects.create(name=concert_name, location=concert_location)
    # Attach each chosen musician to the concert
    for i in artists_id:
        m1 = Musician.objects.get(pk=i)
        c1.artists.add(m1)

Here, we use the random_choices() method to pick a variable number of ids from the Muzician_ids tuple, which lets us create the many-to-many relationships.

 

As the last step, run the command below in your terminal to populate the database:

python manage.py populate_database

 

Find the full code here.

Happy coding!