Why You Should Use Python Data Classes
Whether you are new to Python or have been coding in it for a while, if you like object-oriented programming but aren't familiar with the dataclasses module, you came to the right place! In this article, you will learn:
- What data classes are and what their benefits are.
- How exactly they differ from regular Python classes.
- When you should use them.
Data classes are mainly used to model data in Python. The @dataclass decorator is applied to a regular Python class and imposes no restrictions, which means the decorated class can still behave like a typical class.
A small example of a Data Class:
from dataclasses import dataclass

@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int
When we use the @dataclass decorator, we don't have to implement special methods ourselves, which helps us avoid boilerplate code. By default it generates the init method (__init__), the string representation method (__repr__), and the equality method (__eq__). With order=True, it also generates the methods used for ordering objects (__lt__, __le__, __gt__, and __ge__), which compare instances as if they were tuples of their fields, in order.
You can read about a few other generated methods in the official documentation.
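To make the list of generated methods concrete, here is a minimal sketch that reuses the Car class from above. The instance names first and second are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int


# Two separate instances that hold the same field values.
first = Car("red", "Ferrari", 320)
second = Car("red", "Ferrari", 320)

# The generated __repr__ shows the field values.
print(first)  # Car(color='red', manufacturer='Ferrari', top_speed_km=320)

# The generated __eq__ compares field values, not object identity.
print(first == second)  # True
print(first is second)  # False

# Ordering methods are not generated unless order=True is passed.
try:
    first < second
except TypeError:
    print("ordering is not available without order=True")
```

Note that equality works out of the box, while comparisons such as < need order=True.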
Here is how it looks with a regular class:
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

    def __init__(self, color: str, manufacturer: str, top_speed_km: int):
        self.color = color
        self.manufacturer = manufacturer
        self.top_speed_km = top_speed_km

    def __lt__(self, other_car):
        # Compare cars by top speed only.
        return self.top_speed_km < other_car.top_speed_km

red_ferrari = Car(color='red', manufacturer='Ferrari', top_speed_km=320)
print(red_ferrari) # <__main__.Car object at 0x7f218789ca00>
black_ferrari = Car(color='black', manufacturer='Ferrari', top_speed_km=347)
print(red_ferrari < black_ferrari) # True
Note these two points:
- Because we didn't implement the __repr__ special method, printing the Car instance shows only the name of the class and the object's address.
- To compare two Car instances, I had to implement the "less than" (__lt__) method myself.
Example with dataclass decorator:
from dataclasses import dataclass

@dataclass(order=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

slow_tesla = Car(top_speed_km=261, color='white', manufacturer='Tesla')
print(slow_tesla) # Car(color='white', manufacturer='Tesla', top_speed_km=261)
fast_tesla = Car(top_speed_km=280, color='white', manufacturer='Tesla')
print(slow_tesla < fast_tesla) # True
Note these points:
- It's necessary to set order=True if we want the special ordering methods (e.g. __lt__) to be generated for the data class.
- When we print the slow_tesla object, we see the actual field values rather than the object's address, unlike in the previous example.
- We can compare two objects without implementing any special methods ourselves. The fields are compared in declaration order, so here color and manufacturer are equal and top_speed_km decides the result.
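Because the generated ordering compares all fields as a tuple, the result can be surprising when the first fields differ. The sketch below illustrates this and shows one possible way to rank cars by top speed only, using field(compare=False) from the dataclasses module. The class name SpeedRankedCar and the instance names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int


# Fields are compared in declaration order, so color is compared first.
fast_black = Car(color="black", manufacturer="Tesla", top_speed_km=280)
slow_white = Car(color="white", manufacturer="Tesla", top_speed_km=261)
print(fast_black < slow_white)  # True, because "black" < "white"


@dataclass(order=True)
class SpeedRankedCar:  # hypothetical class, for illustration only
    # compare=False excludes a field from the generated comparison methods.
    color: str = field(compare=False)
    manufacturer: str = field(compare=False)
    top_speed_km: int


fast = SpeedRankedCar(color="black", manufacturer="Tesla", top_speed_km=280)
slow = SpeedRankedCar(color="white", manufacturer="Tesla", top_speed_km=261)
print(slow < fast)  # True, only top_speed_km takes part in the comparison
```

Keep in mind that compare=False also removes the field from the generated __eq__, so two SpeedRankedCar instances with the same top speed are considered equal regardless of color.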
As with regular Python classes, inheritance works to our advantage here too, and there is no need to deal with constructing the parent class:
from dataclasses import dataclass

@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

@dataclass
class ElectricCar(Car):
    battery_capacity_kwh: int
    maximum_range_km: int

white_tesla_model_3 = ElectricCar(color='white', manufacturer='Tesla', top_speed_km=261, battery_capacity_kwh=50, maximum_range_km=455)
print(white_tesla_model_3)
# ElectricCar(color='white', manufacturer='Tesla', top_speed_km=261, battery_capacity_kwh=50, maximum_range_km=455)
Just for reference, here is how it looks using a regular class:
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

    def __init__(self, color: str, manufacturer: str, top_speed_km: int):
        self.color = color
        self.manufacturer = manufacturer
        self.top_speed_km = top_speed_km

class ElectricCar(Car):
    battery_capacity_kwh: int
    maximum_range_km: int

    def __init__(self, color: str, manufacturer: str, top_speed_km: int, battery_capacity_kwh: int, maximum_range_km: int):
        # The parent constructor has to be called explicitly.
        super().__init__(color, manufacturer, top_speed_km)
        self.battery_capacity_kwh = battery_capacity_kwh
        self.maximum_range_km = maximum_range_km

white_tesla_model_3 = ElectricCar(color='white', manufacturer='Tesla', top_speed_km=261, battery_capacity_kwh=50, maximum_range_km=455)
print(white_tesla_model_3)
I hope you can see that even in this small code snippet we saved a lot of boilerplate code and didn't have to repeat every parameter assignment.
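To see how the dataclass version of ElectricCar picks up its parent's fields, the following sketch uses the fields() helper from the dataclasses module to list them. It redefines the same two classes so that it runs on its own; the variable name car is just for this example.

```python
from dataclasses import dataclass, fields


@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int


@dataclass
class ElectricCar(Car):
    battery_capacity_kwh: int
    maximum_range_km: int


# Parent fields come first, followed by the fields of the subclass.
print([f.name for f in fields(ElectricCar)])
# ['color', 'manufacturer', 'top_speed_km', 'battery_capacity_kwh', 'maximum_range_km']

# The generated __init__ accepts the fields in that same order.
car = ElectricCar("white", "Tesla", 261, 50, 455)
print(isinstance(car, Car))  # True
```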
Passing frozen=True to the dataclass decorator lets us create immutable Python objects:
from dataclasses import dataclass

@dataclass(frozen=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

white_tesla = Car(color='white', manufacturer='Tesla', top_speed_km=261)
white_tesla.color = 'Red'
Trying to change white_tesla into a red Tesla raises a FrozenInstanceError:
dataclasses.FrozenInstanceError: cannot assign to field 'color'
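If you still need a modified version of a frozen object, one option is to build a new instance instead of mutating the old one. The sketch below catches the error and then uses the replace() helper from the dataclasses module. The names red_tesla and garage are made up for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int


white_tesla = Car(color="white", manufacturer="Tesla", top_speed_km=261)

# Assigning to a field of a frozen instance raises FrozenInstanceError.
try:
    white_tesla.color = "red"
except FrozenInstanceError as error:
    print(f"Could not modify the car: {error}")

# replace() returns a new instance with the given fields changed.
red_tesla = replace(white_tesla, color="red")
print(red_tesla)  # Car(color='red', manufacturer='Tesla', top_speed_km=261)
print(white_tesla.color)  # white

# A frozen data class with the default eq=True is hashable,
# so its instances can be used as dictionary keys.
garage = {white_tesla: "bay 1", red_tesla: "bay 2"}
print(garage[red_tesla])  # bay 2
```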
After reading this article, you:
- Are familiar with what a data class is and how to use it.
- Learned about the benefits and use cases of data classes.
- Have some code examples that can help you get started.
- Can start using data classes in your projects.
dataclasses is a powerful module that helps us, Python developers, model our data, avoid writing boilerplate code, and write much cleaner and more elegant code.
I encourage you to explore and learn more about the special features of data classes. I use them in all of my projects, and I recommend that you do too.