python-course.eu

5. Dataclasses In Python

By Bernd Klein. Last modified: 19 Feb 2024.

Journey through Dataclasses: Advanced Techniques with Python Dataclasses

In this chapter of our Python tutorial, we delve into Python dataclasses. Dataclasses were introduced in Python 3.7 to simplify the creation and management of classes that are primarily used to store data. The chapter begins by contrasting traditional class structures for representing data with dataclasses as an attractive alternative. We will explore the syntax and functionality of dataclasses and demonstrate how they enhance code readability.

Readers will gain a solid understanding of the principles underlying dataclasses and will be able to apply them in their Python projects.

Live Python training

instructor-led training course

Enjoying this page? We offer live Python training courses covering the content of this site.

See: Live Python courses overview

Enrol here

First Examples

In our first example, we remain true to our beloved robot Marvin. So, we start with a "traditional" Python class Robot_traditional representing a Robot:

class Robot_traditional:
    
    def __init__(self, model, serial_number, manufacturer):
        self.model = model
        self.serial_number = serial_number
        self.manufacturer = manufacturer

The boilerplate code within the __init__ method follows the same pattern in most class definitions of this kind: each parameter is assigned to an attribute of the same name, and only the names vary. This becomes particularly tedious as the number of attributes grows.

If we have a look at the same example in dataclass notation, we see that the code gets a lot leaner. But before we can use dataclass as a decorator for our class, we have to import it from the module dataclasses.

from dataclasses import dataclass

@dataclass
class Robot:
    model: str
    serial_number: str
    manufacturer: str

Here, the dataclass decorator automates the generation of special methods like __init__, reducing the need for boilerplate code. The class definition is concise, making it clearer and more maintainable, especially as the number of attributes increases.
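As a rough illustration of what the decorator derives from the annotations, the following sketch lists the fields of Robot with the function fields from the dataclasses module and creates an instance with keyword arguments. The robot values are the ones used elsewhere in this chapter.

```python
from dataclasses import dataclass, fields

@dataclass
class Robot:
    model: str
    serial_number: str
    manufacturer: str

# fields() returns one Field object per annotated attribute,
# in the order in which they were defined
for field in fields(Robot):
    print(field.name, field.type)

# the generated __init__ accepts the fields as positional
# or as keyword arguments
y = Robot(model="MachinaMaster MM-42",
          serial_number="986-42",
          manufacturer="Quantum Automations Inc.")
print(y.model)
```

The order of the annotations determines the order of the parameters of the generated __init__ method.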

This example demonstrates how using dataclasses for a class primarily used to store data, such as a robot representation, offers a more streamlined and readable alternative to traditional class structures.

Initializing robots of both classes is the same:

x = Robot_traditional("NanoGuardian XR-2000", "234-76", "Cyber Robotics Co.")
y = Robot("MachinaMaster MM-42", "986-42", "Quantum Automations Inc.")

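Yet, there are other differences between these classes. The class decorator dataclass has not only created the special method __init__ but also __repr__ and __eq__. Python derives the != operator from __eq__, so no separate __ne__ method is needed. With the default settings, the decorator sets __hash__ to None, so the instances are not hashable; a __hash__ method is generated when the class is frozen, as we will see in the section on immutable classes. These are methods which you would have to add manually to the class Robot_traditional.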
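The effect of the generated __eq__ method can be seen in a small comparison. The following sketch creates two robots with identical values for each of the two classes. A class without its own __eq__ compares instances by identity, whereas the dataclass compares them field by field.

```python
from dataclasses import dataclass

class Robot_traditional:

    def __init__(self, model, serial_number, manufacturer):
        self.model = model
        self.serial_number = serial_number
        self.manufacturer = manufacturer

@dataclass
class Robot:
    model: str
    serial_number: str
    manufacturer: str

t1 = Robot_traditional("NanoGuardian XR-2000", "234-76", "Cyber Robotics Co.")
t2 = Robot_traditional("NanoGuardian XR-2000", "234-76", "Cyber Robotics Co.")
d1 = Robot("NanoGuardian XR-2000", "234-76", "Cyber Robotics Co.")
d2 = Robot("NanoGuardian XR-2000", "234-76", "Cyber Robotics Co.")

print(t1 == t2)   # False: no __eq__ defined, so identity is compared
print(d1 == d2)   # True: the generated __eq__ compares the fields
print(d1 != d2)   # False: != is derived from __eq__
```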

Let's have a look at __repr__ and compare it to the traditional class definition:

print(repr(x))    
print(repr(y))    #  uses __repr__

OUTPUT:

<__main__.Robot_traditional object at 0x7f21c0acb410>
Robot(model='MachinaMaster MM-42', serial_number='986-42', manufacturer='Quantum Automations Inc.')

We can see that __repr__ has also been implicitly created by dataclass. To get a comparable result for Robot_traditional, we have to implement the method explicitly:

class Robot_traditional:
    
    def __init__(self, model, serial_number, manufacturer):
        self.model = model
        self.serial_number = serial_number
        self.manufacturer = manufacturer

    def __repr__(self):
        return f"Robot_traditional(model='{self.model}', serial_number='{self.serial_number}', manufacturer='{self.manufacturer}')"

x = Robot_traditional("NanoGuardian XR-2000", "234-76", "Cyber Robotics Co.")

print(repr(x))

OUTPUT:

Robot_traditional(model='NanoGuardian XR-2000', serial_number='234-76', manufacturer='Cyber Robotics Co.')

Immutable Classes

In the chapter "Creating Immutable Classes in Python" of our tutorial, we discuss the rationale for immutability and explore different ways of creating immutable classes.

It's very simple with dataclasses. All you have to do is to call the decorator with frozen set to True:

from dataclasses import dataclass

@dataclass(frozen=True)
class ImmutableRobot:
    name: str
    brandname: str
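To see what frozen=True means in practice, the following sketch tries to change an attribute of an instance. The assignment raises a FrozenInstanceError, which is a subclass of AttributeError. The function replace from the dataclasses module is shown as one possible way to obtain a modified copy; the variable names are only chosen for this illustration.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class ImmutableRobot:
    name: str
    brandname: str

robot = ImmutableRobot("Marvin", "NanoGuardian XR-2000")

try:
    robot.name = "Marva"
except FrozenInstanceError as error:
    print("Assignment rejected:", error)

# the original object is unchanged
print(robot.name)

# replace() creates a new instance with some fields changed
renamed = replace(robot, name="Marva")
print(renamed)
```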

We can use this class and test two robots for equality. They are equal if all their attributes are the same. Remember that __eq__ has been created automatically.

x1 = ImmutableRobot("Marvin", "NanoGuardian XR-2000")
x2 = ImmutableRobot("Marvin", "NanoGuardian XR-2000")

print(x1 == x2)

OUTPUT:

True

Let's look at the hash values:

print(x1.__hash__(), x2.__hash__())

OUTPUT:

-7581403571593916930 -7581403571593916930

First, we can see that __hash__ was automatically created, and it was implemented correctly: robots which are equal produce identical hash values. The concrete numbers will differ from run to run, because Python randomizes the hashes of strings per process by default.
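The hash method is tied to the frozen setting. The following sketch contrasts the frozen class with a hypothetical non-frozen class MutableRobot, which has the same fields and is only introduced for this comparison. With the default settings of the decorator, __hash__ is set to None, so calling hash on an instance raises a TypeError.

```python
from dataclasses import dataclass

@dataclass
class MutableRobot:          # hypothetical class, default settings
    name: str
    brandname: str

@dataclass(frozen=True)
class ImmutableRobot:
    name: str
    brandname: str

frozen1 = ImmutableRobot("Marvin", "NanoGuardian XR-2000")
frozen2 = ImmutableRobot("Marvin", "NanoGuardian XR-2000")
mutable = MutableRobot("Marvin", "NanoGuardian XR-2000")

# equal frozen instances have equal hash values
print(hash(frozen1) == hash(frozen2))   # True

# a dataclass with default settings is not hashable
print(MutableRobot.__hash__ is None)    # True
try:
    hash(mutable)
except TypeError as error:
    print("Not hashable:", error)
```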

Now, let's implement a similar class without utilizing dataclass. In our previously defined class Robot_traditional, the instances are mutable, as we have the ability to modify the attributes. The following example is an immutable version. We can see immediately that there is a lot more coding involved. We have to implement __init__, the getter properties, __eq__ and __hash__:

class ImmutableRobot_traditional:
    
    def __init__(self, name: str, brandname: str):
        self._name = name
        self._brandname = brandname

    @property
    def name(self) -> str:
        return self._name

    @property
    def brandname(self) -> str:
        return self._brandname

    def __eq__(self, other):
        if not isinstance(other, ImmutableRobot_traditional):
            # let the other operand try the comparison
            return NotImplemented
        return self.name == other.name and self.brandname == other.brandname

    def __hash__(self):
        return hash((self.name, self.brandname))

x1 = ImmutableRobot_traditional("Marvin", "NanoGuardian XR-2000")
x2 = ImmutableRobot_traditional("Marvin", "NanoGuardian XR-2000")

print(x1 == x2)

OUTPUT:

True

print(x1.__hash__(), x2.__hash__())

OUTPUT:

-7581403571593916930 -7581403571593916930
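It is worth noting how the immutability of the traditional version comes about. The following sketch repeats the class with __eq__ and __hash__ left out for brevity. Because the properties have no setters, an assignment to name raises an AttributeError. The underlying attribute _name is only protected by convention and can still be changed.

```python
class ImmutableRobot_traditional:

    def __init__(self, name: str, brandname: str):
        self._name = name
        self._brandname = brandname

    @property
    def name(self) -> str:
        return self._name

    @property
    def brandname(self) -> str:
        return self._brandname

x1 = ImmutableRobot_traditional("Marvin", "NanoGuardian XR-2000")

# a property without a setter cannot be assigned to
try:
    x1.name = "Marva"
except AttributeError as error:
    print("Assignment rejected:", error)

print(x1.name)    # Marvin

# the underscore attribute is not protected by the language
x1._name = "Marva"
print(x1.name)    # Marva
```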

Immutable Classes for mappings

Having an immutable class with a __hash__ method means that we can use instances of our class as elements of sets and as keys of dictionaries. We illustrate this in the following example:

from dataclasses import dataclass

@dataclass(frozen=True)
class ImmutableRobot:
    name: str
    brandname: str

robot1 = ImmutableRobot("Marvin", "NanoGuardian XR-2000")
robot2 = ImmutableRobot("R2D2", "QuantumTech Sentinel-7")
robot3 = ImmutableRobot("Marva", "MachinaMaster MM-42")

# we create a set of Robots:
robots = {robot1, robot2, robot3}

print("The robots in the set robots:")
for robo in robots:
    print(robo)

# now a dictionary with robots as keys:
activity = {robot1: 'activated', robot2: 'activated', robot3: 'deactivated'}

print("\nAll the activated robots:")
for robo, mode in activity.items():
    if mode == 'activated':
        print(f"{robo} is activated")

OUTPUT:

The robots in the set robots:
ImmutableRobot(name='Marva', brandname='MachinaMaster MM-42')
ImmutableRobot(name='R2D2', brandname='QuantumTech Sentinel-7')
ImmutableRobot(name='Marvin', brandname='NanoGuardian XR-2000')

All the activated robots:
ImmutableRobot(name='Marvin', brandname='NanoGuardian XR-2000') is activated
ImmutableRobot(name='R2D2', brandname='QuantumTech Sentinel-7') is activated
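Because __eq__ and __hash__ are both based on the field values, a lookup does not need the original object. The following sketch, which uses a reduced version of the dictionary from above, accesses an entry with a newly created but equal robot and shows that a set keeps only one of two equal robots. The name same_robot is only chosen for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ImmutableRobot:
    name: str
    brandname: str

robot1 = ImmutableRobot("Marvin", "NanoGuardian XR-2000")
activity = {robot1: 'activated'}

# a different object with the same field values
same_robot = ImmutableRobot("Marvin", "NanoGuardian XR-2000")

print(same_robot is robot1)     # False: two distinct objects
print(activity[same_robot])     # activated: found via equal hash and __eq__

# equal robots count as one element of a set
robots = {robot1, same_robot}
print(len(robots))              # 1
```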

Summary

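Using a dataclass provides several benefits over traditional class definitions in Python: the decorator generates __init__, __repr__ and __eq__ from the annotated fields, and with frozen=True it also provides immutable, hashable instances that can be used in sets and as dictionary keys.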

Overall, dataclass simplifies the process of creating classes in Python, making code more concise, readable, and maintainable, especially for classes that primarily serve as containers for data.

Exercises

Exercise 1: Book Information

Create a Book class using dataclass to represent information about books. Each book should have the following attributes: title, author, ISBN, publication year, and genre.

Write a program that does the following:

  1. Define the Book class using dataclass.
  2. Create instances of several books.
  3. Print out the details of each book, including its title, author, ISBN, publication year, and genre.

You can use this exercise to practice defining dataclass, creating instances, and accessing attributes of dataclass objects. Additionally, you can explore how to add methods or customizations to the Book class, such as implementing a method to calculate the age of the book based on the publication year or adding validation for ISBN numbers.

Solutions

Solution 1

from dataclasses import dataclass

@dataclass
class Book:
    title: str
    author: str
    isbn: str
    publication_year: int
    genre: str

# Create instances of several books
book1 = Book("The Great Gatsby", "F. Scott Fitzgerald", "9780743273565", 1925, "Fiction")
book2 = Book("To Kill a Mockingbird", "Harper Lee", "9780061120084", 1960, "Fiction")
book3 = Book("1984", "George Orwell", "9780451524935", 1949, "Science Fiction")

# Print out the details of each book
for number, book in enumerate([book1, book2, book3], start=1):
    if number > 1:
        print()    # blank line between two books
    print(f"Book {number}:")
    print("Title:", book.title)
    print("Author:", book.author)
    print("ISBN:", book.isbn)
    print("Publication Year:", book.publication_year)
    print("Genre:", book.genre)

OUTPUT:

Book 1:
Title: The Great Gatsby
Author: F. Scott Fitzgerald
ISBN: 9780743273565
Publication Year: 1925
Genre: Fiction

Book 2:
Title: To Kill a Mockingbird
Author: Harper Lee
ISBN: 9780061120084
Publication Year: 1960
Genre: Fiction

Book 3:
Title: 1984
Author: George Orwell
ISBN: 9780451524935
Publication Year: 1949
Genre: Science Fiction
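The customizations suggested in the exercise could look like the following sketch. It is only one possible approach: the helper function is_valid_isbn13 and the method age with its parameter current_year are hypothetical names introduced here, and the validation only covers ISBN-13 numbers written without hyphens. The method __post_init__ is called by the generated __init__ method after the fields have been assigned.

```python
from dataclasses import dataclass
from datetime import date

def is_valid_isbn13(isbn):
    """Hypothetical helper: checks length, digits and the check digit."""
    if len(isbn) != 13 or not (isbn.isascii() and isbn.isdigit()):
        return False
    # the digits are weighted alternately with 1 and 3;
    # the sum has to be divisible by 10
    total = sum(int(digit) * (1 if index % 2 == 0 else 3)
                for index, digit in enumerate(isbn))
    return total % 10 == 0

@dataclass
class Book:
    title: str
    author: str
    isbn: str
    publication_year: int
    genre: str

    def __post_init__(self):
        # runs after the generated __init__ has set the fields
        if not is_valid_isbn13(self.isbn):
            raise ValueError(f"invalid ISBN-13: {self.isbn!r}")

    def age(self, current_year=None):
        """Age in years; current_year defaults to the current year."""
        if current_year is None:
            current_year = date.today().year
        return current_year - self.publication_year

book1 = Book("The Great Gatsby", "F. Scott Fitzgerald", "9780743273565", 1925, "Fiction")
print(book1.age(2024))    # 99

try:
    Book("The Great Gatsby", "F. Scott Fitzgerald", "9780743273566", 1925, "Fiction")
except ValueError as error:
    print(error)
```

Passing the year explicitly keeps the result reproducible; without an argument, the method uses the current year.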
