Start using named tuples instead of regular Python tuples
If you're a Python developer, you've almost certainly seen code like this:
height = person[2]
This snippet is not easy to understand. What is person? Is it a list, a tuple, something else? What does the value stored at index 2 represent? You can probably figure out fairly quickly, from skimming the surrounding code, whether person is a tuple. But still, what is stored at index 2 of that tuple? In this particular case we were lucky that the variable was named height, which lets us guess that index 2 holds the person's height. More often than not, though, you'll see nondescript code such as h = person[2]. Can you still guess correctly that index 2 holds the person's height? Can you rule out with absolute certainty that it isn't a Boolean that specifies whether the person is hungry? To be sure, you'd have to comb through the code.
That's why the snippet above is hard to read. Wouldn't it be much easier if the author had stated explicitly what person[2] is meant to represent, by writing person.height instead of person[2]? You can do exactly that with named tuples.
Regular (unnamed) tuples vs named tuples
Instead of using a regular tuple for an example person called John and then accessing his attributes by index, which is neither immediately understandable nor safe from mistakes …
john = ("John", 21, 180, 80)
height = john[2]
… we first define a subclass of NamedTuple that we name Person, and then create an instance of that class that we name john:
from typing import NamedTuple

class Person(NamedTuple):
    name: str
    age: int
    height: int
    weight: int

john = Person("John", 21, 180, 80)
height = john.height
This way we can access John's third attribute by its name (height) rather than by its index (2). Yes, it's more code to do the same thing, but it makes the code much easier to understand. Another benefit of the NamedTuple syntax is that we can declare that name is supposed to be a string and that the other attributes are supposed to be integers. This helps with auto-completion in editors and with error detection if we use a type checker such as mypy.
Being able to write john.height instead of john[2] is not only easier to read, it also has the potential to save you from hours of frustration. Imagine mistakenly writing john[1] or john[3] instead of john[2] without noticing it and then wondering why your code isn't working as expected. Accidentally writing john.age when you meant john.height doesn't happen as easily as confusing indices.
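The contrast between the two kinds of mistake can be sketched as follows, again reusing the Person class from above. A wrong index still returns a value, so the bug stays silent, whereas a misspelled attribute name (heigth below is a deliberate typo) fails immediately with an AttributeError. The sketch also shows that a named tuple remains a regular tuple, so index access and unpacking keep working.

```python
from typing import NamedTuple


class Person(NamedTuple):
    name: str
    age: int
    height: int
    weight: int


john = Person("John", 21, 180, 80)

# A wrong index is silently accepted: this is John's age, not his height.
wrong_value = john[1]

# A misspelled attribute name fails loudly instead of returning wrong data.
try:
    john.heigth  # deliberate typo for "height"
    typo_detected = False
except AttributeError:
    typo_detected = True

# A named tuple is still a tuple: indexing and unpacking work as before.
is_plain_tuple = isinstance(john, tuple)
name, age, height, weight = john
```

Because named tuples stay compatible with plain tuples, you can usually introduce them into existing code without rewriting every place that still uses indices.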
NamedTuple vs namedtuple
NamedTuple, which lives in the typing module, is a typed version of the older namedtuple, which you find in the collections module. In case you stumble upon namedtuple in someone else's code, here's what the older version of the same snippet looks like:
from collections import namedtuple
Person = namedtuple("Person", "name age height weight")
john = Person("John", 21, 180, 80)
height = john.height
Notice that namedtuple takes only two strings here, not five as it might seem at first glance. The second string contains four field names separated by spaces.
Only the line that defines Person is fundamentally different. The import line pulls in namedtuple from collections instead of NamedTuple from typing, and the last two lines are identical.
This older namedtuple syntax (the Person = namedtuple(...) line) is not as intuitively clear as the newer NamedTuple syntax. It also gives up the benefit of type checking on the fields, since the field names are just words inside a string and carry no type annotations, so I recommend using NamedTuple instead of namedtuple.
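As a rough side-by-side, the sketch below defines the same record both ways. The class names TypedPerson and UntypedPerson are made up for this comparison. Both definitions produce tuple subclasses with the same field names; only the NamedTuple version carries a type for each field.

```python
from collections import namedtuple
from typing import NamedTuple


class TypedPerson(NamedTuple):
    name: str
    age: int
    height: int
    weight: int


# namedtuple also accepts the field names as a list of strings,
# which some find easier to read than one space-separated string.
UntypedPerson = namedtuple("UntypedPerson", ["name", "age", "height", "weight"])

typed_john = TypedPerson("John", 21, 180, 80)
untyped_john = UntypedPerson("John", 21, 180, 80)

# Both are tuple subclasses with identical field names and values.
same_fields = TypedPerson._fields == UntypedPerson._fields
same_values = tuple(typed_john) == tuple(untyped_john)

# Only the NamedTuple version records a type for each field.
typed_annotations = TypedPerson.__annotations__
untyped_annotations = getattr(UntypedPerson, "__annotations__", {})
```

If you inherit code that uses namedtuple, the two forms are close enough that migrating one class at a time is usually straightforward.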
Final words
I really like using NamedTuple, and I hope you'll use it more often too, now that you've seen its benefits. It will help others understand your code more easily.
If you like this tip, I'd love to read your comment in the box down below. Be well.