
Start using named tuples instead of regular Python tuples

If you're a Python developer, you have surely seen code like this:

height = person[2]

This snippet is hard to read. What is person? A list, a tuple, something else? And what does the value at index 2 represent? Skimming the surrounding code will probably tell you fairly quickly whether person is a tuple, but that still doesn't tell you what is stored at index 2. Here we were lucky: the variable is named height, so we can guess that index 2 holds the person's height.

More often than not, though, you'll see nondescript code such as h = person[2]. Can you still be sure that index 2 holds the person's height? Can you rule out that it is a Boolean that specifies whether the person is hungry? To be sure, you'd have to comb through the code. Wouldn't it be much easier if the author had stated explicitly what person[2] is meant to represent, by writing person.height instead of person[2]? You can do exactly that with named tuples.

Regular (unnamed) tuples vs named tuples

Instead of using a regular tuple for an example person called John and then accessing his attributes by index, which is not immediately understandable and also error-prone …

Regular tuple
john = ("John", 21, 180, 80)
height = john[2]

… we instead first define a subclass of NamedTuple that we name Person and then create an instance of that class that we name john:

Recommended NamedTuple
from typing import NamedTuple

class Person(NamedTuple):
    name: str
    age: int
    height: int
    weight: int

john = Person("John", 21, 180, 80)
height = john.height

This way we can access John's third attribute by its name (height) rather than by its index (2). Yes, it's more code to do the same thing, but it makes the code much easier to understand. Another benefit of the NamedTuple syntax is that we can declare that the attribute name is supposed to be a string and that the other attributes are all supposed to be integers. This helps immensely with auto-completion, and with error detection if we use a type checker such as mypy.
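To make the point about type hints concrete, here is a minimal sketch. It reuses the Person class from above and adds a second, hypothetical person called Jane purely for illustration. It shows two things: a named tuple still behaves like a regular tuple, and the annotations are hints for tools rather than runtime checks, so it is the type checker, not Python itself, that would be expected to complain about the wrong type.

```python
from typing import NamedTuple


class Person(NamedTuple):
    name: str
    age: int
    height: int
    weight: int


john = Person("John", 21, 180, 80)

# A named tuple is still a tuple, so index access and unpacking keep working.
assert john[2] == john.height
name, age, height, weight = john

# The annotations are not enforced at runtime: Python accepts the next line,
# while a type checker such as mypy is expected to report the string that is
# passed for the int field `age`. (Jane is a made-up example.)
jane = Person("Jane", "22", 165, 60)
```

Because index access keeps working, existing code that uses john[2] should not break when a plain tuple is swapped for a named tuple.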

Being able to write john.height instead of john[2] is not only easier to read, it also has the potential to save you from hours of frustration. Imagine mistakenly writing john[1] or john[3] instead of john[2] without noticing it and then wondering why your code isn't working as expected. Accidentally writing john.age when you meant john.height doesn't happen as easily as confusing indices.
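The difference in failure modes can be sketched as follows. This is an illustration, not a benchmark of how often such mistakes happen: a wrong index runs silently and returns a plausible-looking value, whereas a misspelled attribute name (the deliberate typo heigth below) fails right where the mistake is made.

```python
from typing import NamedTuple


class Person(NamedTuple):
    name: str
    age: int
    height: int
    weight: int


john = Person("John", 21, 180, 80)

# Off-by-one index: runs without complaint and quietly yields the age.
wrong_height = john[1]

# Misspelled attribute name: raises AttributeError at the point of the typo.
try:
    john.heigth  # deliberate typo for illustration
    typo_detected = False
except AttributeError:
    typo_detected = True
```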

NamedTuple vs namedtuple

NamedTuple, which lives in typing, is a typed version of the older namedtuple, which you find in collections. In case you stumble upon namedtuple in someone else's code, here's what an older version of the same code snippet would look like:

Older namedtuple
from collections import namedtuple

Person = namedtuple("Person", "name age height weight")

john = Person("John", 21, 180, 80)
height = john.height

Notice that namedtuple takes only two strings, not five as it might seem at first glance. The second string contains four words that are separated by a space character.

Only line 3, the one that defines Person, is fundamentally different. The first line imports from the appropriate module, and the last two lines are identical to the NamedTuple version.
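If the space-separated string looks odd, namedtuple also accepts the field names as a sequence of strings. The sketch below uses two hypothetical variable names, PersonFromString and PersonFromList, to show that both spellings produce the same fields:

```python
from collections import namedtuple

# Field names given as one space-separated string ...
PersonFromString = namedtuple("Person", "name age height weight")

# ... or as a list of strings; both define the same four fields.
PersonFromList = namedtuple("Person", ["name", "age", "height", "weight"])

same_fields = PersonFromString._fields == PersonFromList._fields
```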

This older namedtuple syntax (line 3) is not as intuitively clear as the newer NamedTuple syntax. It also lacks the benefit of type checking: the field names are passed as a plain string, so there is nowhere to attach a type to each field. That's why I recommend using NamedTuple instead of namedtuple.
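One way to see the missing type information is to ask each class for its type hints. This is a small sketch using typing.get_type_hints; the class names TypedPerson and UntypedPerson are made up for the comparison.

```python
from collections import namedtuple
from typing import NamedTuple, get_type_hints


class TypedPerson(NamedTuple):
    name: str
    age: int
    height: int
    weight: int


UntypedPerson = namedtuple("UntypedPerson", "name age height weight")

# The NamedTuple class carries one annotation per field ...
typed_hints = get_type_hints(TypedPerson)

# ... while the collections.namedtuple class carries none.
untyped_hints = get_type_hints(UntypedPerson)
```

A type checker has the same view: it can check each field of TypedPerson against its declared type, but has no declared types to check the fields of UntypedPerson against.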

Final words

I really like using NamedTuple and I hope you'll use it more often too, now that you've seen its benefits. It will help others understand your code more easily.

If you like this tip, I'd love to read your comment in the box down below. 🙂 Be well.
