Dealing with data is essential for any Pythonista,
but sometimes that data is just not very pretty.
Computers don’t care about formatting,
but without good formatting, humans may find
something hard to read.
The output isn’t pretty when you use print() on large dictionaries or
long lists—it’s efficient, but not pretty.
The pprint module in Python
is a utility module that you can use
to print data structures in a readable, pretty way.
It’s a part of the standard library
that’s especially useful
for debugging code dealing with API requests,
large JSON files, and data in general.
By the end of this tutorial, you’ll:
- Understand why the pprint module is necessary
- Know how to use pprint(), PrettyPrinter, and their parameters
- Be able to create your own instance of PrettyPrinter
Along the way, you’ll also see an HTTP request to a public API and JSON parsing in action.
The Python pprint module is helpful in many situations.
It comes in handy when making API requests,
dealing with JSON files,
or handling complicated and nested data.
You’ll probably find that using the normal
print() function
isn’t adequate to efficiently explore your data
and debug
your application.
When you use print() with dictionaries
and lists,
the output doesn’t contain any newlines.
Before you start exploring pprint,
you’ll first use urllib to make a request to get some data.
You’ll make a request to
{JSON} Placeholder
for some mock user information.
The first thing to do is to make the HTTP GET request
and put the response into a dictionary:
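A minimal sketch of that request, assuming the mock data is served at https://jsonplaceholder.typicode.com/users. The URL and the helper name fetch_users are assumptions for illustration. The last lines try the parsing step offline on a tiny hand-written payload:

```python
import json
from urllib import request

# Assumed endpoint for the mock user data
USERS_URL = "https://jsonplaceholder.typicode.com/users"


def fetch_users(url=USERS_URL):
    """Make a GET request and parse the JSON body into Python objects."""
    with request.urlopen(url) as response:
        return json.loads(response.read())


# json.loads() accepts bytes as well as str, so parsing can be tried offline
sample = json.loads(b'[{"id": 1, "name": "Example User"}]')
print(type(sample).__name__, len(sample))
```

With network access, users = fetch_users() should give you the data that the rest of the tutorial explores.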
Here, you make a basic GET request
and then parse the response into a dictionary with json.loads().
With the dictionary now in a variable,
a common next step is to print the contents with print():
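As a rough illustration, here is print() applied to a small hand-written stand-in for users. The values are placeholders, not the real response:

```python
# Hand-written stand-in for the parsed response (placeholder values)
users = [
    {"id": 1, "name": "Example User", "address": {"city": "Springfield"}},
    {"id": 2, "name": "Another User", "address": {"city": "Shelbyville"}},
]

print(users)  # everything lands on a single line

# For built-in containers of plain values, the text has no line breaks at all
print("\n" in str(users))  # False
```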
Oh dear! One huge line with no newlines. Depending on your console settings, this might appear as one very long line. Alternatively, your console output might have its word-wrapping mode on, which is the most common situation. Unfortunately, that doesn’t make the output much friendlier!
If you look at the first and last characters, you can see that this appears to be a list. You might be tempted to start writing a loop to print the items:
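A sketch of that loop, again using a hand-written stand-in for users with placeholder values:

```python
# Hand-written stand-in for the parsed response (placeholder values)
users = [
    {"id": 1, "name": "Example User", "address": {"city": "Springfield"}},
    {"id": 2, "name": "Another User", "address": {"city": "Shelbyville"}},
]

for user in users:
    print(user)  # one dictionary per line, still unformatted
```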
This for loop would print each object on a separate line,
but even then, each object takes up way more space than can fit on a single line.
Printing in this way does make things a bit better,
but it’s by no means ideal.
The above example is a relatively simple data structure,
but what would you do with a deeply nested dictionary 100 times the size?
Sure, you could write a function that uses recursion to find a way to print everything. Unfortunately, you’ll likely run into some edge cases where this won’t work. You might even find yourself writing a whole module of functions just to get to grips with the structure of the data!
Enter the pprint module!
pprint
pprint is a Python module made to print data structures in a pretty way.
It has long been part of the Python standard library,
so installing it separately isn’t necessary.
All you need to do is import its pprint() function by running from pprint import pprint.
Then,
instead of going with the normal print(users) approach as you did in the example above,
you can call your new favorite function to make the output pretty:
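A minimal sketch of the import and the call, using a hand-written stand-in for users (placeholder values, not the real response):

```python
from pprint import pprint

# Hand-written stand-in for the parsed response (placeholder values)
users = [
    {"id": 1, "name": "Example User", "address": {"city": "Springfield"}},
    {"id": 2, "name": "Another User", "address": {"city": "Shelbyville"}},
]

# The data is too long for one line, so pprint() spreads it over several
pprint(users)
```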
This function prints users, but in a new-and-improved pretty way.
How pretty! The keys of the dictionaries are even visually indented! This output makes it so much more straightforward to scan and visually analyze data structures.
If you’re a fan of typing as little as possible,
then you’ll be pleased to know that pprint() has an alias, pp():
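A short sketch of the alias, assuming Python 3.8 or later, where pp() was added. One detail worth checking for yourself: pp() leaves sort_dicts off by default, so it keeps insertion order while pprint() sorts the keys. The user dictionary is a made-up example:

```python
from pprint import pp, pprint

# Made-up example with keys deliberately out of alphabetical order
user = {"name": "Example User", "id": 1}

pprint(user)  # {'id': 1, 'name': 'Example User'}
pp(user)      # {'name': 'Example User', 'id': 1}
```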
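pp() is just a thin wrapper around pprint() that was added in Python 3.8.
It takes the same arguments, but its sort_dicts parameter defaults to False,
so it prints dictionary keys in insertion order instead of sorting them.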
However, even the default output may be too much information to scan at first. Maybe all you really want is to verify that you’re dealing with a list of plain objects. For that, you’ll want to tweak the output a little.
For these situations,
there are various parameters you can pass to pprint() to make even the tersest data structures pretty.
pprint()
In this section,
you’ll learn about all the parameters available for pprint().
There are seven parameters that you can use to configure your Pythonic pretty printer.
You don’t need to use them all, and some will be more useful than others.
The one you’ll find most valuable will probably be depth.
depth
One of the handiest parameters to play around with is depth.
When you pass depth, pprint() only prints the full contents of users
down to the specified level of nesting,
all while keeping things pretty, of course.
The contents of deeper data structures are replaced with three dots:
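A small sketch of depth=1 on a hand-written stand-in for users (placeholder values):

```python
from pprint import pprint

# Hand-written stand-in for the parsed response (placeholder values)
users = [
    {"id": 1, "address": {"city": "Springfield"}},
    {"id": 2, "address": {"city": "Shelbyville"}},
]

# Only the outermost list is expanded; each dictionary is abbreviated
pprint(users, depth=1)  # [{...}, {...}]
```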
Now you can immediately see that this is indeed a list of dictionaries.
To explore the data structure further, you can increase the depth by one level,
which will print all the top-level keys of the dictionaries in users:
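The same idea one level deeper, sketched on a hand-written stand-in for users (placeholder values):

```python
from pprint import pprint

# Hand-written stand-in for the parsed response (placeholder values)
users = [
    {"id": 1, "address": {"city": "Springfield"}},
    {"id": 2, "address": {"city": "Shelbyville"}},
]

# Top-level keys are visible now, while nested dictionaries stay abbreviated
pprint(users, depth=2)
# [{'address': {...}, 'id': 1}, {'address': {...}, 'id': 2}]
```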
Now you can quickly check whether all the dictionaries share their top-level keys. This is a valuable observation to make, especially if you’re tasked with developing an application that consumes data like this.
indent
The indent parameter
controls how indented each level of the pretty-printed representation will be in the output.
The default indent is just 1,
which translates to one space character:
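A sketch comparing the default indent with indent=4 on a single made-up user. depth=1 keeps the nested dictionaries out of the way so that only the indentation differs:

```python
from pprint import pprint

# Made-up user record (placeholder values)
user = {
    "id": 1,
    "name": "Example User",
    "email": "user@example.com",
    "address": {"city": "Springfield"},
    "company": {"name": "Example Corp"},
}

# Default indent of 1: the first key sits right after the opening brace
pprint(user, depth=1)

# indent=4: the brace plus three spaces make up the four-character indent
pprint(user, depth=1, indent=4)
```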
The most important part of the indenting behavior of pprint()
is keeping all the keys aligned visually.
How much indentation is applied depends on both the indent parameter
and where the key is.
Since there’s no nesting in the examples above,
the amount of indentation is based completely on the indent parameter.
In both examples, note how
the opening curly bracket ({) is counted as a unit of indentation for the first key.
In the first example,
the opening single quote for the first key comes right after { without any spaces in between
because the indent is set to 1.
When there is nesting, however,
the indentation is applied to the first element in-line,
and pprint() then keeps all following elements aligned with the first one.
So if you set your indent to 4 when printing users,
the first element will be indented by four characters,
while the nested elements will be indented by more than eight characters
because the indentation starts from the end of the first key:
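To see the nested alignment for yourself, here is a sketch with indent=4 on a hand-written stand-in for users. The values are placeholders chosen to be long enough that every level wraps at the default width:

```python
from pprint import pformat, pprint

# Hand-written stand-in for the parsed response (placeholder values)
users = [
    {
        "id": 1,
        "address": {
            "street": "1 Example Street",
            "city": "Springfield",
            "geo": {"lat": "0.0", "lng": "0.0"},
        },
    }
]

pprint(users, indent=4)

# Measure how far the nested keys are pushed to the right
lines = pformat(users, indent=4).splitlines()
nested_indent = len(lines[1]) - len(lines[1].lstrip())
print(nested_indent)  # 23
```

The nested keys start at column 23 rather than 8 because they line up after the text "'address': " plus that dictionary's own four-character indent.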
This is just another part of the pretty in Python’s pprint()!
width
By default, pprint() will
only output up to eighty characters per line.
You can customize this value by passing in a width argument.
pprint() will make an effort to fit the contents on one line.
If the contents of a data structure go over this limit,
then it’ll print every element of the current data structure on a new line:
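A sketch of the default width in action on a made-up user. All values are placeholders except the catchPhrase, which is the one quoted in this tutorial:

```python
from pprint import pformat, pprint

# Made-up user record; only catchPhrase is taken from the tutorial text
user = {
    "address": {"city": "Springfield", "geo": {"lat": "0.0", "lng": "0.0"}},
    "company": {
        "name": "Example Corp",
        "catchPhrase": "Multi-layered client-server neural-net",
        "bs": "placeholder business slogan text",
    },
}

pprint(user)

lines = pformat(user).splitlines()
print(len(lines))  # 4: one line for address, three for company
```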
The dictionary at users[0]['address']['geo']
only contains a 'lat' and a 'lng' key.
When you leave the width at the default of eighty characters,
the sum of the indent
and the number of characters needed to print out the dictionary,
including the spaces in between,
comes to less than eighty characters,
so pprint() puts it all on one line.
However,
the dictionary at users[0]['company'] would go over the default width,
so pprint() puts each key on a new line.
This is true of dictionaries, lists, tuples, and sets.
If you set the width to a large value like 160,
then all the nested dictionaries fit on one line.
You can even take it to extremes and use a huge value like 500,
which, for this example, prints the whole dictionary on one line:
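A sketch of both large widths on a made-up user. The values are placeholders except the catchPhrase quoted in this tutorial, so the exact line counts apply to this stand-in only:

```python
from pprint import pformat, pprint

# Made-up user record; only catchPhrase is taken from the tutorial text
user = {
    "address": {"city": "Springfield", "geo": {"lat": "0.0", "lng": "0.0"}},
    "company": {
        "name": "Example Corp",
        "catchPhrase": "Multi-layered client-server neural-net",
        "bs": "placeholder business slogan text",
    },
}

pprint(user, width=160)  # each nested dictionary fits on its own line
pprint(user, width=500)  # the whole record fits on one line

print(len(pformat(user, width=160).splitlines()))  # 2
print(len(pformat(user, width=500).splitlines()))  # 1
```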
Here, you get the effects of setting width
to a relatively large value.
You can go the other way and set width to a low value such as 1.
However, the main effect that this will have is making sure
every data structure will display its components on separate lines.
You’ll still get the visual indentation that lines up the components:
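A sketch of width=1 on a small made-up dictionary. The company name is a placeholder, while the catchPhrase is the one quoted in this tutorial:

```python
from pprint import pformat, pprint

# Placeholder name; catchPhrase is taken from the tutorial text
company = {
    "name": "Example Corp",
    "catchPhrase": "Multi-layered client-server neural-net",
}

# Every key gets its own line, and long strings are split at spaces
pprint(company, width=1)

lines = pformat(company, width=1).splitlines()
print(lines[0])  # {'catchPhrase': 'Multi-layered '
```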
It’s hard to get Python’s pprint() to print ugly.
It’ll do everything it can to be pretty!
In this example, on top of learning about width, you’re also exploring how the printer splits up long lines of text.
Note how users[0]["company"]["catchPhrase"], which was initially
'Multi-layered client-server neural-net', has been split on each space.
The printer avoids dividing this string mid-word because that would make it hard to read.
compact
You might think that compact refers
to the behavior you explored in the section about width, that is,
whether data structures appear on one line or on separate lines.
However, compact only affects the output once a sequence goes over the width.
If compact is True, then pprint() fits as many elements as it can on each line
before wrapping onto the next one.
The default behavior is for each element to appear on its own line
if the data structure is longer than the width:
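A sketch of the three cases, using a generated list of ten tiny dictionaries as a stand-in for users and depth=1 to abbreviate each one:

```python
from pprint import pformat, pprint

# Stand-in for users: ten small dictionaries (placeholder data)
users = [{"id": number} for number in range(1, 11)]

pprint(users, depth=1)                          # fits on one line
pprint(users, depth=1, width=40)                # one element per line
pprint(users, depth=1, width=40, compact=True)  # wraps at forty characters

default_lines = pformat(users, depth=1).splitlines()
narrow_lines = pformat(users, depth=1, width=40).splitlines()
compact_lines = pformat(users, depth=1, width=40, compact=True).splitlines()
print(len(default_lines), len(narrow_lines), len(compact_lines))  # 1 10 2
```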
Pretty-printing this list using the default settings
prints out the abbreviated version on one line.
Limiting width to 40 characters,
you force pprint() to output all the list’s elements on separate lines.
If you then set compact=True, then the list will wrap at forty characters
and be more compact than it would typically look.
compact is useful for long sequences with short elements
that would otherwise take up many lines and make the output less readable.
stream
The stream parameter refers to the output of pprint().
By default, it goes to the same place that print() goes to.
Specifically, it goes to
sys.stdout,
which is actually a
file object
in Python.
However, you can redirect this to any file object,
just like you can with print():
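A sketch of writing to output.txt through the stream parameter. The temporary directory is an addition made here so that the example cleans up after itself, and users is a hand-written stand-in with placeholder values:

```python
import tempfile
from pathlib import Path
from pprint import pprint

# Hand-written stand-in for the parsed response (placeholder values)
users = [
    {"id": 1, "name": "Example User", "address": {"city": "Springfield"}},
    {"id": 2, "name": "Another User", "address": {"city": "Shelbyville"}},
]

# The temporary directory keeps the sketch from cluttering your working folder
with tempfile.TemporaryDirectory() as tmp_dir:
    output_path = Path(tmp_dir) / "output.txt"
    with open(output_path, mode="w") as file_object:
        pprint(users, stream=file_object)
    written = output_path.read_text()

print(written)
```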
Here you create a file object with
open(),
and then you set the stream parameter in pprint() to that file object.
If you then open the output.txt file,
you should see that you’ve pretty-printed everything in users there.
Python does have its own
logging module.
However, you can also use pprint() to send pretty outputs to files
and have these act as logs if you prefer.
sort_dicts
Dictionaries used to be unordered data structures, but they preserve insertion order as an implementation detail of CPython 3.6 and as a language guarantee since Python 3.7.
Even so, pprint() orders the keys alphabetically for printing:
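A sketch of both behaviors on a made-up dictionary, assuming Python 3.8 or later, where sort_dicts was added:

```python
from pprint import pprint

# Made-up record with keys deliberately out of alphabetical order
user = {"username": "example", "id": 1, "email": "user@example.com"}

pprint(user)
# {'email': 'user@example.com', 'id': 1, 'username': 'example'}

pprint(user, sort_dicts=False)
# {'username': 'example', 'id': 1, 'email': 'user@example.com'}
```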
Unless you set sort_dicts to False, Python’s pprint() sorts the keys alphabetically. It keeps the output for dictionaries consistent,
readable, and—well—pretty!
When pprint() was first implemented, dictionaries were unordered. Without alphabetically ordering the keys, a dictionary’s keys could have theoretically differed at each print.
underscore_numbers
The underscore_numbers parameter is a feature introduced in Python 3.10 that makes long numbers more readable.
Considering that the example you’ve been using so far doesn’t contain any long numbers,
you’ll need a new example to try it out:
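A sketch with a couple of made-up long numbers. The values in number_list are assumptions, and the version check is added here so that the snippet also runs on interpreters where the direct call isn't supported:

```python
import sys
from pprint import pprint

# Made-up long numbers for illustration
number_list = [123456789, 10000000000000]

if sys.version_info >= (3, 10, 1):
    # Prints [123_456_789, 10_000_000_000_000]
    pprint(number_list, underscore_numbers=True)
else:
    # Older interpreters: fall back to the plain representation
    pprint(number_list)
```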
If you tried running this call to pprint() and got an error,
you’re not alone. In Python 3.10.0, released in October 2021, this argument doesn’t work when calling pprint() directly.
The Python community noticed this quickly, and
it was fixed
in the 3.10.1 bugfix release in December 2021.
The folks at Python care about their pretty printer!
If underscore_numbers doesn’t work when you call pprint() directly
and you really want pretty numbers,
there is a workaround:
When you create your own PrettyPrinter object,
this parameter should work just like it does in the example above.
Next, you’ll cover how to create a PrettyPrinter object.
PrettyPrinter Object
It’s possible to create an instance of PrettyPrinter
that has defaults you’ve defined.
Once you have this new instance of your custom PrettyPrinter object,
you can use it by calling the .pprint() method on the PrettyPrinter instance:
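A sketch of a custom printer. The specific option values, the name custom_printer, and the sample data are assumptions for illustration, and underscore_numbers is only passed on Python 3.10 or later:

```python
import sys
from pprint import PrettyPrinter

# Illustrative presets; pick whatever defaults suit your data
options = {"indent": 4, "width": 100, "depth": 2, "compact": True, "sort_dicts": False}
if sys.version_info >= (3, 10):
    options["underscore_numbers"] = True

custom_printer = PrettyPrinter(**options)

# Hand-written stand-ins (placeholder values)
users = [{"id": 1, "name": "Example User", "address": {"geo": {"lat": "0.0"}}}]
number_list = [123456789, 10000000000000]

custom_printer.pprint(users[0])
custom_printer.pprint(number_list)
```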
With these commands, you:
- Import PrettyPrinter, which is a class definition
- Print data from users
- Print number_list, which also demonstrates underscore_numbers in action
Note that the arguments you pass to PrettyPrinter
are exactly the same as the pprint() arguments,
except that you skip the first parameter.
In pprint(), this is the object you want to print.
This way, you can have various printer presets—perhaps some going to different streams—and call them when you need them.
pformat()
What if you don’t want to send the pretty output of pprint() to a stream?
Perhaps you want to do some
regex
matching and replace certain keys.
For plain dictionaries, you might find
yourself wanting to remove the brackets and quotes
to make them look even more human-readable.
Whatever it is that you might want to do with the string pre-output,
you can get the string by using
pformat():
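A sketch of capturing the string and post-processing it. The address dictionary and the choice of regular expression are assumptions for illustration:

```python
import re
from pprint import pformat

# Made-up plain dictionary (placeholder values)
address = {"city": "Springfield", "zipcode": "00000"}

formatted = pformat(address)
print(type(formatted).__name__)  # str

# One possible clean-up: drop the braces and quotes before display
plain = re.sub(r"[{}']", "", formatted)
print(plain)  # city: Springfield, zipcode: 00000
```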
pformat() is a tool you can use
to get between the pretty printer and the output stream.
Another use case for this might be if you’re building an API and want to send a pretty string representation of the JSON string. Your end users would probably appreciate it!
Python’s pprint() is recursive,
meaning it’ll pretty-print all the contents of a dictionary,
all the contents of any child dictionaries, and so on.
Ask yourself
what happens when a recursive function runs into a recursive data structure.
Imagine that you have dictionary A and dictionary B:
- A has one attribute, .link, which points to B.
- B has one attribute, .link, which points to A.
If your imaginary recursive function has no way to handle this circular reference,
it’ll never finish printing!
It would print A and then its child, B. But B also has A as a child,
so it would go on into infinity.
Luckily, both the normal print() function and the pprint() function handle this gracefully:
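A sketch of the two dictionaries described above, with the .link attribute modeled as a "link" key (that modeling choice is an assumption):

```python
from pprint import pformat, pprint

# Two dictionaries that refer to each other
A = {}
B = {"link": A}
A["link"] = B

print(A)   # {'link': {'link': {...}}}
pprint(A)  # {'link': {'link': <Recursion on dict with id=...>}}

# The ID in the pprint() output is the id() of the dictionary that repeats
print(str(id(A)) in pformat(A))  # True
```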
While Python’s regular print() just abbreviates the output,
pprint() explicitly notifies you of recursion
and also adds the ID of the dictionary.
If you want to explore why this structure is recursive, you can learn more about passing by reference.
You’ve explored
the primary usage of the pprint module in Python
and some ways to work with pprint() and PrettyPrinter.
You’ll find that pprint() is especially handy
whenever you’re developing something that deals with complex data structures.
Maybe you’re developing an application
that uses an unfamiliar API.
Perhaps you have a data warehouse full of deeply-nested JSON files.
These are all situations where pprint can come in handy.
In this tutorial, you’ve learned how to:
- Import pprint for use in your programs
- Use pprint() in place of the regular print()
- Create a custom instance of PrettyPrinter
- Recognize recursive data structures and how pprint() handles them
To help you get to grips with the function and parameters,
you used an example of a data structure representing some users.
You also explored some situations where you might use pprint().
Congratulations! You’re now better equipped to deal with complex data by using Python’s pprint module.