Why You Should Use Python Data Classes

Introduction

Whether you are new to Python or have been coding in it for a while, if you like object-oriented programming but aren't familiar with the dataclasses module, you've come to the right place!

In this article, we will learn:

  • What data classes are, and what their benefits are.
  • How exactly they differ from regular Python classes.
  • When you should use them.

Data Classes Background

Data classes are used mainly to model data in Python. The @dataclass decorator is applied to a regular Python class and places no restrictions on it, so a data class can still behave like any typical class.

A small example of a Data Class:

from dataclasses import dataclass

@dataclass
class Car:
   color: str
   manufacturer: str
   top_speed_km: int

The dataclasses module was introduced in Python 3.7 as part of PEP 557.
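
Because a data class is still a regular class, it can define ordinary methods alongside its fields. Here is a minimal sketch; the top_speed_mph method and its conversion factor are illustrative additions, not part of the example above:

```python
from dataclasses import dataclass

@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

    def top_speed_mph(self) -> float:
        # Hypothetical helper: convert km/h to mph (1 km is about 0.621371 miles).
        return self.top_speed_km * 0.621371

red_ferrari = Car(color='red', manufacturer='Ferrari', top_speed_km=320)
print(round(red_ferrari.top_speed_mph()))  # 199
```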

Let's dive into some code examples.

The Benefits of Data Classes

Built-in implementation of special methods

When we use the @dataclass decorator, we don't have to implement common special methods ourselves, which helps us avoid boilerplate code. By default it generates the init method (__init__), the string representation method (__repr__), and the equality method (__eq__). With order=True it also generates the methods used for ordering objects (__lt__, __le__, __gt__, and __ge__); these compare instances as if they were tuples of their fields, in order.
Read about the other generated methods in the official documentation.
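
As a small sketch of what the default decorator provides, the following compares two instances that hold the same field values and prints one of them. It reuses the Car fields from the earlier example and relies only on default behavior:

```python
from dataclasses import dataclass

@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

first = Car('red', 'Ferrari', 320)
second = Car('red', 'Ferrari', 320)

# The generated __eq__ compares field values, not object identity.
print(first == second)  # True
print(first is second)  # False

# The generated __repr__ lists every field by name.
print(repr(first))  # Car(color='red', manufacturer='Ferrari', top_speed_km=320)
```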

Here is how it looks with a regular class:

class Car:
  color: str
  manufacturer: str
  top_speed_km: int

  def __init__(self, color: str, manufacturer: str, top_speed_km: int):
    self.color = color
    self.manufacturer = manufacturer
    self.top_speed_km = top_speed_km

  def __lt__(self, other_car):
    return self.top_speed_km < other_car.top_speed_km

red_ferrari = Car(color='red', manufacturer='Ferrari', top_speed_km=320)
print(red_ferrari) # <__main__.Car object at 0x7f218789ca00>
black_ferrari = Car(color='black', manufacturer='Ferrari', top_speed_km=347)
print(red_ferrari < black_ferrari) # True

Note these two points:

  • Because we didn't implement the __repr__ special method, printing the Car instance shows only the name of the class and the object's address.
  • To compare two Car instances, we had to implement the "less than" (__lt__) method ourselves.

Example with dataclass decorator:

from dataclasses import dataclass

@dataclass(order=True)
class Car:
  color: str
  manufacturer: str
  top_speed_km: int

slow_tesla = Car(top_speed_km=261, color='white', manufacturer='Tesla')
print(slow_tesla) # Car(color='white', manufacturer='Tesla', top_speed_km=261)
fast_tesla = Car(top_speed_km=280, color='white', manufacturer='Tesla')
print(slow_tesla < fast_tesla) # True

Note: it's necessary to set order=True if we want the ordering special methods (e.g. __lt__) to be generated for the data class.

  • When we print the slow_tesla object, we see the actual field values rather than the object's address, unlike in the previous example.
  • We can compare two objects without implementing any special methods ourselves.
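
Because the generated ordering compares fields as a tuple, in declaration order, color is compared before top_speed_km. The sketch below shows that effect, along with one possible way to order by top speed only, using field(compare=False) from the dataclasses module. The SpeedRankedCar class name is made up for illustration:

```python
from dataclasses import dataclass, field

@dataclass(order=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

black_tesla = Car(color='black', manufacturer='Tesla', top_speed_km=280)
white_tesla = Car(color='white', manufacturer='Tesla', top_speed_km=261)

# 'black' < 'white' decides the result before top_speed_km is ever looked at.
print(black_tesla < white_tesla)  # True

@dataclass(order=True)
class SpeedRankedCar:
    # compare=False excludes a field from the generated comparison methods
    # (this affects == as well as the ordering methods).
    color: str = field(compare=False)
    manufacturer: str = field(compare=False)
    top_speed_km: int

fast = SpeedRankedCar(color='black', manufacturer='Tesla', top_speed_km=280)
slow = SpeedRankedCar(color='white', manufacturer='Tesla', top_speed_km=261)
print(slow < fast)  # True
print(max([slow, fast]).top_speed_km)  # 280
```

Keep in mind that with this approach two cars with the same top speed also compare as equal, regardless of color or manufacturer.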

Inheritance

As with regular Python classes, inheritance works to our advantage here too, and there is no need to deal with the parent class construction:

from dataclasses import dataclass

@dataclass
class Car:
  color: str
  manufacturer: str
  top_speed_km: int

@dataclass
class ElectricCar(Car):
  battery_capacity_kwh: int
  maximum_range_km: int

white_tesla_model_3 = ElectricCar(color='white', manufacturer='Tesla', top_speed_km=261, battery_capacity_kwh=50, maximum_range_km=455)

print(white_tesla_model_3)
# ElectricCar(color='white', manufacturer='Tesla', top_speed_km=261, battery_capacity_kwh=50, maximum_range_km=455)

Just for reference, here is how it looks using a regular class:

class Car:
  color: str
  manufacturer: str
  top_speed_km: int

  def __init__(self, color: str, manufacturer: str, top_speed_km: int):
    self.color = color
    self.manufacturer = manufacturer
    self.top_speed_km = top_speed_km

class ElectricCar(Car):
  battery_capacity_kwh: int
  maximum_range_km: int

  def __init__(self, color: str, manufacturer: str, top_speed_km: int, battery_capacity_kwh: int, maximum_range_km: int):
    super().__init__(color, manufacturer, top_speed_km)
    self.battery_capacity_kwh = battery_capacity_kwh
    self.maximum_range_km = maximum_range_km

white_tesla_model_3 = ElectricCar(color='white', manufacturer='Tesla', top_speed_km=261, battery_capacity_kwh=50, maximum_range_km=455)
print(white_tesla_model_3)

I hope you can see that, even in this small snippet, the data class saved us a lot of boilerplate code: we didn't have to repeat the initialization of every parameter.
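
To see how the data class version assembles its constructor, we can inspect the fields with the fields function from the dataclasses module. This short sketch shows that the parent's fields come first, followed by the child's:

```python
from dataclasses import dataclass, fields

@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

@dataclass
class ElectricCar(Car):
    battery_capacity_kwh: int
    maximum_range_km: int

# fields() returns the Field objects in the order used by the generated __init__.
print([f.name for f in fields(ElectricCar)])
# ['color', 'manufacturer', 'top_speed_km', 'battery_capacity_kwh', 'maximum_range_km']

# Positional arguments follow that same order.
tesla = ElectricCar('white', 'Tesla', 261, 50, 455)
print(isinstance(tesla, Car))  # True
```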

Frozen Instances

Passing frozen=True to the dataclass decorator lets us create immutable Python objects.

from dataclasses import dataclass

@dataclass(frozen=True)
class Car:
  color: str
  manufacturer: str
  top_speed_km: int


white_tesla = Car(color='white', manufacturer='Tesla', top_speed_km=261)
white_tesla.color = 'Red'

Trying to change white_tesla into a red Tesla raises a FrozenInstanceError:

dataclasses.FrozenInstanceError: cannot assign to field 'color'

Note: frozen instances carry a small performance penalty, because __init__ has to use object.__setattr__ instead of simple assignment, so use them deliberately.
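
When a frozen instance needs to "change", one common approach is to build a modified copy instead. The sketch below catches the error and then uses replace from the dataclasses module. The variable names red_tesla and garage are illustrative only:

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

white_tesla = Car(color='white', manufacturer='Tesla', top_speed_km=261)

try:
    white_tesla.color = 'red'
except FrozenInstanceError as error:
    print(f"Could not modify the car: {error}")

# replace() builds a new instance with the given fields changed;
# the original object is left untouched.
red_tesla = replace(white_tesla, color='red')
print(red_tesla)    # Car(color='red', manufacturer='Tesla', top_speed_km=261)
print(white_tesla)  # Car(color='white', manufacturer='Tesla', top_speed_km=261)

# With the default eq=True, frozen=True also makes instances hashable,
# so they can be stored in sets or used as dictionary keys.
garage = {white_tesla, red_tesla}
print(len(garage))  # 2
```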

Now, you:

  • Are familiar with what a data class is and how to use it.
  • Have learned about the benefits and use cases of data classes.
  • Have some code examples that can help you get started.
  • Can start using them in your projects.

Conclusion

dataclasses is a powerful module that helps us, Python developers, model our data, avoid writing boilerplate code, and write much cleaner and more elegant code.
I encourage you to explore and learn more about the special features of data classes. I use them in all of my projects, and I recommend that you do too.
