Introducing Python Classes and Dataclasses
Quick Success Data Science

If you're going to do any serious programming with Python, you'll need to understand object-oriented programming and the concepts of a class and a dataclass. In this Quick Success Data Science article, you'll get a quick and painless introduction to all three, including what they're for, how you use them, and why you need them.
A Brief Introduction to Classes and OOP
Object-oriented programming (OOP) is a programming paradigm that reduces code duplication and makes code easier to update, maintain, and reuse. As a result, most commercial software is now built using OOP.
Whereas procedural programming is built around actions and logic, OOP is built around data structures, known as objects, that consist of data and functions (called methods) that act on the data. Objects are built from classes, which are like blueprints for the objects.
A class is a data type, and an object of that data type is also known as an instance of that class. The process of creating an instance and setting its initial values is called instantiation.
As instances of a class, objects allow you to create multiple copies with the same structure but potentially different data. For example, if you're building a space combat game, you can conveniently bundle the attributes of a certain spaceship, like its size, speed, and armament, with the methods that control its flight and weapons operation. Then, when you create a new spaceship of that type, you only need to worry about giving it a unique name.
Because Python is an object-oriented programming language, you've already been using objects and methods defined by other people. But unlike languages such as Java, Python doesn't force you to use OOP for your programs. It provides ways to encapsulate and separate abstraction layers using other approaches such as procedural or functional programming.
Having this choice is important. If you implement OOP in small programs, most of them will feel over-engineered. To paraphrase computer scientist Joe Armstrong, "The problem with object-oriented languages is they've got all this implicit environment that they carry around with them. You wanted a banana, but what you got was a gorilla holding the banana and the whole damn jungle!"
If you're a scientist or engineer, you can get a lot done without OOP, but that doesn't mean you should ignore it. OOP makes it easy to simulate many objects at a time, such as a flock of birds, a network of power plants, or a cluster of galaxies. It's also important when things that are manipulated, like a GUI button or window, must persist for a long time in the computer's memory.
Since it's easier to demonstrate OOP than it is to talk about it, let's look at an example using a Dungeons and Dragons–type board game in which players can be different characters, such as dwarves, elves, and wizards. These games use character cards to list important information for each character type. If you let your playing piece represent a dwarf, it inherits the characteristics on the card.

The Dwarf and Elf Classes
The following code reproduces board game–style play, letting you create virtual cards for a dwarf and an elf, name your characters, and have them fight. The outcome of the fight will impact one of the character's body points, which represents the character's health. Be sure to note how OOP allows you to easily create many identical objects – in this case, dwarves or elves – by "stamping" them out of the predefined template, called a class.
import random

class Dwarf(object):
    def __init__(self, name):
        self.name = name
        self.attack = 3
        self.defend = 4
        self.move = 2
        self.body = 5

    def talk(self):
        print("I'm a blade-man, I'll cut ya!!!")
Here's the same code annotated with key components we'll discuss below:

Figure: The annotated Dwarf class (by the author)
We started by importing random to simulate rolling a die; this is how your character will fight. Then we defined a class for a Dwarf character, capitalizing the first letter of the class name, and passed it an object argument. This object argument represents the base class of all types in Python.
TIP: Because object is the default base class in Python 3, you don't have to state it explicitly when defining a class. I used it here for clarity.
As mentioned previously, a class is a template for creating objects of a certain type. For example, when you create a list or dictionary in Python, you are creating them from a class.
The Dwarf
class definition is like the card in the previous figure; it's the "genetic" blueprint for a dwarf. It will assign attributes, like strength and vitality, and methods, like how the character moves or talks. Attributes are variables associated with an object, and methods are attributes that also happen to be functions, which are passed a reference to their instance when they run.
Immediately after the class
definition, we defined a constructor method, also referred to as the initialization method. It sets up the initial attribute values for the object. The __init__()
method is a special built-in method that Python automatically invokes as soon as a new object is created. In this case, we passed two arguments: self
and the name
of the object.
The
__init__()
method is a dunder (double underscore) method, meaning its name is preceded and followed by double underscores. Also called magic or special methods, they let you create classes that behave like native Python data structures such as lists, tuples, and sets.
The self
parameter is a reference to the instance of the class that is being created, or a reference to the instance a method was invoked on, technically referred to as a context instance. You can think of it as a placeholder for the actual name you will give the object.
If you create a new dwarf and name it "Steve," self
will become Steve behind the scenes. For example, self.attack
becomes "Steve's attack." If you create another dwarf named "Flint," self
for that object will become "Flint." This way, the scope of Steve's health attribute is kept separate from Flint's.
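To make the role of self concrete, here is a minimal sketch using a stripped-down Dwarf class (only two attributes, for brevity) and the example names Steve and Flint. It assumes nothing beyond the class syntax already shown:

```python
class Dwarf:
    def __init__(self, name):
        # self refers to whichever instance is being created
        self.name = name
        self.body = 5


steve = Dwarf("Steve")
flint = Dwarf("Flint")

# Changing Steve's health has no effect on Flint's
steve.body -= 2

print(steve.name, steve.body)  # Steve 3
print(flint.name, flint.body)  # Flint 5
```

Each instance carries its own copy of the attributes assigned through self, which is why the two body values diverge.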
Next, we listed some attributes for a dwarf beneath the constructor definition. We added a name so you can tell one dwarf from another, as well as the value of key combat characteristics. Notice how this list resembles the character card.
TIP: While it's possible to use methods to assign new attributes later, it's best to initialize them all within the
__init__
method. This way, all the available attributes are conveniently listed in an easy-to-find location.
We next defined a talk()
method and passed it self
. By passing it self
, you linked the method to the object. In more comprehensive games, methods might include behaviors like movement and the ability to disarm traps.
With the class definition complete, we'll use the code below to create an instance of the Dwarf class and assign this object to the local variable lenn, the dwarf's name. We'll print the name and attack attributes to demonstrate that we have access to them, and finish by invoking the talk() method. Note how both attributes and methods are accessed with dot notation (such as lenn.attack and lenn.talk()):
lenn = Dwarf("Lenn")
print(f"Dwarf name = {lenn.name}")
print(f"Lenn's attack strength = {lenn.attack}")
lenn.talk()
Dwarf name = Lenn
Lenn's attack strength = 3
I'm a blade-man, I'll cut ya!!!
Now we'll create an elf character, using the same process, and have it fight the dwarf. The elf's body attribute will be updated to reflect the outcome of the battle.
class Elf(object):
    pointed_ears = True

    def __init__(self, name):
        self.name = name
        self.attack = 4
        self.defend = 4
        self.move = 4
        self.body = 4

esseden = Elf("Esseden")
print(f"Elf name = {esseden.name}")
print(f"Esseden body value = {esseden.body}")

Elf name = Esseden
Esseden body value = 4
First, we defined an Elf
class and passed it object
, as we did with the Dwarf
class. Next, for instructional purposes, we did something different. We added an attribute, pointed_ears
, immediately after defining the class and before defining the initialization method.
Classes are objects, too, so they can have their own attributes. Class attributes are shared by all objects made from the class and behave sort of like global variables.
In this case, all the elves you build will have pointed ears. By placing this at the class level, you don't need to include it at the object level. Likewise, you can define class methods that act on all objects. For example, all pirate characters might say "Arrgh!" before they speak.
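As a rough sketch of the class attribute idea, the pirate example might look like the following. The Pirate class, its greeting attribute, and the character names are hypothetical and are used only for illustration:

```python
class Pirate:
    # Class attribute: stored once on the class and shared by every instance
    greeting = "Arrgh!"

    def __init__(self, name):
        # Instance attribute: each pirate gets its own value
        self.name = name

    def talk(self, words):
        return f"{self.greeting} {words}"


anne = Pirate("Anne")
jack = Pirate("Jack")

print(anne.talk("Hand over the map."))  # Arrgh! Hand over the map.

# Rebinding the attribute on the class changes it for all pirates at once
Pirate.greeting = "Yo ho!"
print(jack.talk("Set sail."))  # Yo ho! Set sail.
```

Because greeting lives on the class, changing it once affects every existing and future pirate, which is the "global variable" behavior described above.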
Next, we defined the initialization method and assigned some attributes. We made them slightly different from the dwarf's and well-balanced, like an elf. We then instantiated an elf named Esseden and accessed his name and body attributes using print()
.
Next, we'll have our two characters interact using the roll of a virtual die with a maximum value equal to the character's attack or defend value. We'll use the random module to choose a roll value in a range of 1 to Lenn's attack attribute plus 1, then repeat this process to get Esseden's defense.
We'll calculate the damage to Esseden by subtracting Esseden's roll value from Lenn's roll value, and if the damage is a positive number, subtract it from Esseden's body attribute. We'll use print()
to confirm the elf's current health.
lenn_attack_roll = random.randrange(1, lenn.attack + 1)
print(f"Lenn attack roll = {lenn_attack_roll}")
esseden_defend_roll = random.randrange(1, esseden.defend + 1)
print(f"Esseden defend roll = {esseden_defend_roll}")
damage = lenn_attack_roll - esseden_defend_roll
if damage > 0:
    esseden.body -= damage
print(f"Esseden current body value = {esseden.body}")

Lenn attack roll = 3
Esseden defend roll = 1
Esseden current body value = 2
The roll results here are random, so you may get a different outcome.
As you can imagine, building many similar characters and keeping track of their changing attributes could quickly get complicated with procedural programming. OOP provides a modular structure for your program, makes it easy to hide complexity and ownership of scope with encapsulation, permits problem-solving in bite-sized chunks, and produces sharable templates that can be modified and used elsewhere.
Adding a Class with Inheritance
Inheritance, a key concept in OOP, lets you define a new child class based on an existing parent or ancestor class. (Technically, the original class is called a base class or superclass. The new class is called a derived class or subclass.)
The new subclass inherits all of the attributes and methods of the existing superclass. This makes it easy to copy and extend an existing base class by adding new attributes and methods specific to the subclass.
Let's make a new elf class called Elden
that inherits from and modifies our current Elf
class. We'll assume the Elden are "high elves" which are archers and come with a quiver of arrows. Otherwise, they have the same attributes as a common elf.
class Elden(Elf):
    def __init__(self, name):
        Elf.__init__(self, name)
        self.arrows = 24

    def fire_arrow(self):
        if self.arrows > 0:
            self.arrows -= 1
            print("\nPphssssstttttt!")
            print(f"{self.name} arrows remaining = {self.arrows}")
        else:
            print("\nArrows depleted")

To create a child class, we passed the class statement the name of the parent, or superclass, which in this case is Elf. Remember that, when you first defined Elf, you passed it object. This meant that the Elf class inherited from the object class, which is the root of all Python objects. The object class provides the default implementation of common methods that all derived classes might need. By passing Elf instead of object, you got the attributes and methods under object as well as the new ones you added to the Elf class.
Next, we defined the __init__() initialization method for the Elden class, which, like the Elf class, has a self and name parameter. Immediately beneath it, we called the initialization method from the Elf class and passed it self along with the name parameter. Because we called the method on the Elf class rather than on an instance, we had to pass self explicitly. (Passing Elf instead of self would store the attributes on the Elf class itself, where every elf would share them.) This runs the code in Elf.__init__() on the new Elden instance, setting its attack, defend, and body attributes, so you don't need to duplicate any code.
If you don't define an __init__()
method for a child class, it will use the __init__()
method from the parent class. If you want to override some of the attribute values in the parent class or add new attributes, you'll need to include an __init__()
method for the child class, as we did in this example.
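The first case, a child class with no __init__() of its own, can be sketched as follows. The WoodElf class and its hide() method are hypothetical, and the Elf class is trimmed to two attributes to keep the example short:

```python
class Elf:
    def __init__(self, name):
        self.name = name
        self.body = 4


class WoodElf(Elf):
    # No __init__() here, so Elf.__init__() runs when a WoodElf is created
    def hide(self):
        return f"{self.name} vanishes into the trees."


tauriel = WoodElf("Tauriel")
print(tauriel.body)    # 4
print(tauriel.hide())  # Tauriel vanishes into the trees.
```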
Our original Elf class did not allow for arrows, so we added a new self.arrows attribute. We set the complement of arrows to 24. The Elden elf will need a way to fire the arrows, so we defined a new method called fire_arrow(). If we were writing a complete game, this would include an advantage such as rolling an extra attack die or being able to attack from a distance.
Now, let's instantiate an Elden named Legolas, access their name, and shoot an arrow.
legolas = Elden("Legolas")
print(f"Elden name = {legolas.name} ")
legolas.fire_arrow()
Elden name = Legolas
Pphssssstttttt!
Legolas arrows remaining = 23
By using inheritance, we reduced the amount of code we needed to write for the new Elden
class by "borrowing" from the existing Elf
class. And it gets better. In the next section, we'll look at another way to reduce the amount of code needed to define classes.
Using the super() Function for Inheritance
The super()
built-in function removes the need for an explicit call to a base class name when invoking base class methods. It works with both single and multiple inheritance.
For example, in the Elden
class definition, you called the Elf
class's __init__()
method within the Elden
class's __init__()
method, as follows:
class Elden(Elf):
    def __init__(self, name):
        Elf.__init__(self, name)

This lets the Elden class reuse the initialization code in Elf. Alternatively, you could have used the super() function, which returns a proxy object that allows access to methods of the base class:
class Elden(Elf):
    def __init__(self, name):
        super().__init__(name)
In this case, super()
removes the need for an explicit call to the Elf
class. When using single inheritance, super()
is just a fancier way to refer to the base type. It makes the code a bit more maintainable.
For example, if you are using super() everywhere and want to change the name of the base class (such as from Elf to CommonElf), you only need to update the class statements, not every method that calls the base class.
Another use for super()
is to access inherited methods that have been overridden in a new class. Let's assume we've made a new HighElf
class where the elf character uses the inherited Elden
class's fire_arrow()
method to fire two arrows at a time instead of one. We've overridden the method, but if we run into a situation where we want to fire a single arrow, we can call the base class's method by using super().fire_arrow()
. This references the original method, which fires a single arrow.
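The HighElf scenario could be sketched like this. The class bodies are simplified (no printing), and the fire_single_arrow() method name is a hypothetical addition used to show the call to the base class's version:

```python
class Elf:
    def __init__(self, name):
        self.name = name


class Elden(Elf):
    def __init__(self, name):
        super().__init__(name)
        self.arrows = 24

    def fire_arrow(self):
        # Original behavior: one arrow per shot
        if self.arrows > 0:
            self.arrows -= 1


class HighElf(Elden):
    def fire_arrow(self):
        # Overridden behavior: call the inherited method twice
        super().fire_arrow()
        super().fire_arrow()

    def fire_single_arrow(self):
        # Bypass the override and use Elden's version directly
        super().fire_arrow()


galadriel = HighElf("Galadriel")
galadriel.fire_arrow()
print(galadriel.arrows)  # 22
galadriel.fire_single_arrow()
print(galadriel.arrows)  # 21
```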
The use of super()
is somewhat controversial. On one hand, it makes code more maintainable. On the other, it makes it less explicit, which violates the Zen of Python edict "Explicit is better than implicit."
The Dataclass
The built-in dataclasses module, introduced in Python 3.7, provides a convenient way to reduce code redundancy by making classes less verbose. Although primarily designed for classes that store data, data classes work just like regular classes and can include methods that interact with the data. Some use cases include classes for bank accounts, the content of scientific articles, and employee information.
A dataclass comes with basic "boilerplate" functionality already implemented. You can instantiate, print, and compare dataclass instances straight out of the box. Many of the common things you do in a class can be reduced to a few basic instructions.
Dataclasses are implemented using a helpful and powerful Python tool called a decorator. A decorator is a function designed to wrap around (encapsulate) another function or class to alter or enhance the wrapped object's behavior. It lets you modify the behavior without permanently changing the object.
Decorators also let you avoid duplicating code when you're running the same process on multiple functions, such as checking memory use, adding logging, or testing performance.
Decorator Basics
To see how decorators work, let's define a function that squares a number. Then, we'll define a decorator function that squares that result. Enter the following in a text editor:
def square_it(x):
    return x**2

def square_it_again(func):
    def wrapper(*args, **kwargs):
        result = (func(*args, **kwargs))**2
        return result
    return wrapper
The first function, square_it()
, takes a number, represented by x, and returns its square. The second function, square_it_again()
, will serve as a decorator to the first function and is a little more complicated.
The decorator function has a func
parameter, representing a function. Because functions are objects, you can pass a function to another function as an argument and even define a function within a function. When we call this decorator function, we'll pass it the square_it()
function as an argument.
Next, we defined an inner function, which we called wrapper()
. Because square_it()
takes an argument, we need to set up the inner function to handle arguments by using the special positional and keyword arguments *args
and **kwargs
.
The *args and **kwargs syntax provides flexibility in handling variable numbers of arguments, both positional and keyword, in Python functions. The *args syntax in a function definition allows the function to accept any number of positional arguments. The **kwargs syntax allows a function to accept any number of keyword arguments. Combining both *args and **kwargs allows a function to accept any combination of positional and keyword arguments.
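A quick way to see what these two parameters collect is to return them unchanged. The show_arguments() function below is a hypothetical helper written only for this demonstration:

```python
def show_arguments(*args, **kwargs):
    # args is a tuple of the positional arguments
    # kwargs is a dict of the keyword arguments
    return args, kwargs


positional, keyword = show_arguments(1, 2, speed=3, name="Garcia")
print(positional)  # (1, 2)
print(keyword)     # {'speed': 3, 'name': 'Garcia'}
```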
Within the wrapper()
function, we called the function we passed to the decorator (func
), squared its output, assigned the resulting number to the result
variable, and returned result
. Finally, we returned the wrapper()
function.
To use the square_it_again()
decorator, call it, pass it the function that you want to decorate (square_it()
), and assign the result to a variable (square
), which also represents a function:
square = square_it_again(square_it)
print(type(square))
You can now call the new function and pass it an appropriate argument:
print(square(3))
81
In this example, we manually called the decorator function. This demonstrated how decorators work, but it's a bit verbose and contorted. In the next section, we'll look at a more convenient method for using a decorator.
Decorator Syntactic Sugar
In computer science, syntactic sugar is clear, concise syntax that simplifies the language and makes it "sweeter" for human use. The syntactic sugar for a decorator is the @ symbol, which must be immediately followed by the name of the decorator function. The next line must be the definition statement for the function or class being wrapped, as follows:
@decorator_func_name
def new_func():
    do something
In this case, decorator_func_name
represents the decorator function, and new_func()
is the function being wrapped. A class definition can be substituted for the def
statement.
To see how it works, let's re-create our number-squaring example from the beginning. We'll leave off assigning the square
variable, as we don't need it anymore:
def square_it_again(func):
    def wrapper(*args, **kwargs):
        result = (func(*args, **kwargs))**2
        return result
    return wrapper

@square_it_again
def square_it(x):
    return x**2

print(square_it(3))

81
After defining our square_it_again()
function again, we added the decorator and defined the square_it()
function. After that, we called the square_it()
function the same way we would if the decorator didn't exist.
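To illustrate how one decorator can serve several functions, here is a hedged sketch of a logging-style decorator. The log_calls() name, its calls list, and the cube_it() function are hypothetical; functools.wraps is a standard library helper that preserves the wrapped function's name and docstring:

```python
import functools


def log_calls(func):
    """Decorator that records each call to func in a list."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        wrapper.calls.append((args, kwargs, result))
        return result
    wrapper.calls = []  # each decorated function gets its own log
    return wrapper


@log_calls
def square_it(x):
    return x**2


@log_calls
def cube_it(x):
    return x**3


square_it(3)
cube_it(2)
print(square_it.calls)     # [((3,), {}, 9)]
print(cube_it.calls)       # [((2,), {}, 8)]
print(square_it.__name__)  # square_it
```

The logging code is written once and applied to both functions with a single line each, which is the duplication savings that decorators offer.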
NOTE: When using the @ symbol, use the decorator function name without parentheses.
If decorators make your head spin a little, don't worry. If you can type @dataclass, you can use dataclasses. This decorator modifies regular Python classes so that you can define them using shorter and sweeter syntax.
Demonstrating Dataclasses
To see the benefits of dataclasses, let's define a regular class and then repeat the exercise using a dataclass. Our goal will be to make generic ship objects that we can track on a simulation grid. For each ship, we'll need to supply a name, a classification (like "frigate"), a country of registry, and a location.

Defining Ship as a Regular Class
To define a regular class called Ship, in a text editor, enter the following and then save it as ship_tracker.py:
class Ship:
    def __init__(self, name, classification, registry, location):
        self.name = name
        self.classification = classification
        self.registry = registry
        self.location = location
        self.obj_type = 'ship'
        self.obj_color = 'black'
The initialization method contains multiple parameters, such as a name
and registry
. These will need to be passed as arguments when instantiating an object based on this class.
Note how we're forced to duplicate code by repeating each parameter name, like classification
, three times: once as a parameter and twice when assigning the instance attribute. The more data you need to pass to the method, the greater this redundancy.

In addition to the parameters passed to the initialization method, the Ship class includes two "fixed" attributes representing the object type and color. These are assigned directly within the initialization method. Because these attributes are always the same for every ship, there's no need to pass them as arguments.
Now, let's instantiate a new ship object. Enter the following, save the file, and run it:
garcia = Ship('Garcia', 'frigate', 'USA', (20, 15))
print(garcia)
This created a US frigate named garcia
at grid location (20, 15)
. But when you print the object, the output isn't very helpful:
<__main__.Ship object at 0x0000021F5FF501F0>
The issue here is that printing information on an object requires you to define additional dunder methods, like __str__
and __repr__
, that return string representations of objects for informational and debugging purposes.
Another useful method is __eq__
, which lets you compare instances of a class. The list of special methods in Python is long, but a few basic examples are listed in the following table:

Defining these methods for each class you write can become a burden, which is where dataclasses come in. Dataclasses automatically handle the redundancy issues around attributes and dunder methods.
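To get a sense of that burden, here is a sketch of the regular Ship class with hand-written __repr__() and __eq__() methods. The method bodies are one reasonable way to write them, not the only way, and the twin variable is a hypothetical second ship used for the comparison:

```python
class Ship:
    def __init__(self, name, classification, registry, location):
        self.name = name
        self.classification = classification
        self.registry = registry
        self.location = location

    def __repr__(self):
        # Readable representation used by print() and the interactive shell
        return (f"Ship(name={self.name!r}, "
                f"classification={self.classification!r}, "
                f"registry={self.registry!r}, location={self.location!r})")

    def __eq__(self, other):
        # Two ships are equal if all of their attributes match
        if not isinstance(other, Ship):
            return NotImplemented
        return vars(self) == vars(other)


garcia = Ship('Garcia', 'frigate', 'USA', (20, 15))
twin = Ship('Garcia', 'frigate', 'USA', (20, 15))
print(garcia)
print(garcia == twin)  # True
```

Every attribute name now appears in three or four places, and each new attribute means touching all of these methods.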
Defining Ship as a Dataclass
Now, let's define the Ship class again as a dataclass. Do this in a new file named ship_tracker_dc.py (for "ship tracker dataclass"):
from math import dist
from dataclasses import dataclass

@dataclass
class Ship:
    name: str
    classification: str
    registry: str
    location: tuple
    obj_type = 'ship'
    obj_color = 'black'
We started by importing the dist function from the math module and the dataclass decorator from the dataclasses module. We'll use dist to calculate the distance between ships, and dataclass to decorate our Ship class. (To use dist, you'll need Python 3.8 or higher.)
Next, we prefixed dataclass
with the @ symbol, to make it a decorator, and started defining the Ship
class on the following line.
Normally, the next step would be to define the __init__()
method with self
and other parameters, but dataclasses don't need this. The initialization is handled behind the scenes, removing the need for this code. You'll still need to list the attributes, however, but with a lot less redundancy than before.
For each attribute that must be passed as an argument, we entered the attribute name, followed by a colon, followed by a type hint. A type hint, or type annotation, tells people reading your code what types of data to expect. Static analysis tools can also use type hints to check your code for errors.
A class variable with a type hint is called a field. The @dataclass
decorator examines classes to find fields. Without a type hint, the attribute won't become a field in the dataclass. In this example, all the fields in the Ship
class use the string data type (str
), except for location, which uses a tuple (for a pair of x and y coordinates).
TIP: You can use default values with the type annotations. For example, location: tuple = (0, 0) will place new Ship objects at coordinates x = 0, y = 0 if none are specified when the object is created. When you use a default parameter, however, all subsequent parameters must have default values.
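A minimal sketch of a default value in action follows. This trimmed version of Ship omits the obj_type and obj_color attributes, and the unplaced and placed variable names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Ship:
    name: str
    classification: str
    registry: str
    # Fields with defaults must come after fields without them
    location: tuple = (0, 0)


unplaced = Ship('Garcia', 'frigate', 'USA')
placed = Ship('Ticonderoga', 'destroyer', 'USA', (5, 10))
print(unplaced.location)  # (0, 0)
print(placed.location)    # (5, 10)
```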
Because we don't need to pass the obj_type
and obj_color
attributes as arguments when creating a new object, we defined them using an equal sign rather than a colon, and with no type hints. By assigning them as we would in a regular class, every Ship
object will, by default, be designated a "ship" and have a consistent color attribute for plotting.
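One way to confirm which attributes became fields is the fields() function from the dataclasses module, which returns the fields of a dataclass. The sketch below assumes the same Ship definition used in this section:

```python
from dataclasses import dataclass, fields


@dataclass
class Ship:
    name: str
    classification: str
    registry: str
    location: tuple
    obj_type = 'ship'      # no type hint, so a plain class attribute
    obj_color = 'black'    # no type hint, so a plain class attribute


field_names = [f.name for f in fields(Ship)]
print(field_names)    # ['name', 'classification', 'registry', 'location']
print(Ship.obj_type)  # ship
```

Only the annotated attributes show up as fields, so only they become parameters of the generated initialization method.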
Dataclasses can have methods, just like regular classes. Next, we'll define a method that calculates the Euclidean distance between two ships. Note that the def statement below is indented four spaces relative to the class definition:
    def distance_to(self, other):
        distance = round(dist(self.location, other.location), 2)
        return str(distance) + ' ' + 'km'
The distance_to() method takes the current ship object and another ship object as arguments. It then uses the dist function from the math module to get the distance between them. This function returns the Euclidean distance between two points, each given as a sequence of coordinates (here, x and y). The distance is returned as a string, so we can include a reference to kilometers.
Now, in the global scope with no indentation, create three ship objects, passing them the following information:
garcia = Ship('Garcia', 'frigate', 'USA', (20, 15))
ticonderoga = Ship('Ticonderoga', 'destroyer', 'USA', (5, 10))
kobayashi = Ship('Kobayashi', 'maru', 'Federation', (10, 22))
If you're working in an IDE, such as Spyder, as soon as you begin entering the Ship()
class arguments, a window should appear, prompting you on the proper inputs.

Because classes you create are legitimate datatypes in Python, they behave like built-in datatypes. As a result, IDEs like Spyder will use the type hints to guide you when creating the ship objects.
It's also worth noting that you don't need to use the correct data type for a parameter. Because Python is a dynamically typed language (meaning that variable types are inferred at runtime, not at compile-time, based on the value assigned) you can assign an integer as the classification argument, and the program will still run.
TIP: Even though the Python interpreter ignores type hints, you can use third-party static type-checking tools, like Mypy, to analyze your code and check for errors before the program runs.
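The following sketch shows the interpreter accepting a mismatched type. It uses a trimmed Ship dataclass, and odd_ship is a hypothetical variable name:

```python
from dataclasses import dataclass


@dataclass
class Ship:
    name: str
    classification: str
    registry: str
    location: tuple


# classification is annotated as str, but an int is accepted at runtime
odd_ship = Ship('Garcia', 42, 'USA', (20, 15))
print(odd_ship.classification)        # 42
print(type(odd_ship.classification))  # <class 'int'>
```

A static type checker would be expected to flag the 42 before the program runs, even though the program itself executes without complaint.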
The @dataclass
decorator is a code generator that automatically adds methods under the hood. This includes the __repr__
method. This means that you now get useful information when you call print(garcia)
:
print(garcia)
Ship(name='Garcia', classification='frigate', registry='USA', location=(20, 15))
Now, let's check that our data is there and that the method works. Add the following lines and rerun the script:
ships = [garcia, ticonderoga, kobayashi]
for ship in ships:
    print(f"The {ship.classification} {ship.name} is visible.")
    print(f"{ship.name} is a {ship.registry} {ship.obj_type}.")
    print(f"The {ship.name} is currently at grid position {ship.location}\n")
print(f"Garcia is {garcia.distance_to(kobayashi)} from the Kobayashi")

The frigate Garcia is visible.
Garcia is a USA ship.
The Garcia is currently at grid position (20, 15)

The destroyer Ticonderoga is visible.
Ticonderoga is a USA ship.
The Ticonderoga is currently at grid position (5, 10)

The maru Kobayashi is visible.
Kobayashi is a Federation ship.
The Kobayashi is currently at grid position (10, 22)

Garcia is 12.21 km from the Kobayashi
By putting the ship objects in a list, we were able to loop through the list, access attributes using dot notation, and print the results.
The Ship
dataclass lets you instantiate a ship object and store data such as the ship's name and location in type-annotated fields. By reducing redundancy and automatically generating required class methods such as __init__()
and __repr__()
, the @dataclass
decorator lets you produce code that's easier to read and write.
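The @dataclass decorator also generates an __eq__() method by default, so instances can be compared without extra code. Here is a brief sketch, with garcia_copy as a hypothetical second object holding the same data:

```python
from dataclasses import dataclass


@dataclass
class Ship:
    name: str
    classification: str
    registry: str
    location: tuple


garcia = Ship('Garcia', 'frigate', 'USA', (20, 15))
garcia_copy = Ship('Garcia', 'frigate', 'USA', (20, 15))
ticonderoga = Ship('Ticonderoga', 'destroyer', 'USA', (5, 10))

# The generated __eq__() compares field values, not object identity
print(garcia == garcia_copy)  # True
print(garcia is garcia_copy)  # False
print(garcia == ticonderoga)  # False
```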
FYI: The @classmethod and @staticmethod decorators let you define methods inside a class namespace that are not connected to a particular instance of that class. Neither of these is commonly used, and both can often be replaced with regular functions. You should be aware of their existence, however, as they're often mentioned in OOP tutorials and can be useful in some cases.
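For reference, here is a rough sketch of both decorators applied to the Ship dataclass. The from_csv() and on_grid() methods, the comma-separated input format, and the grid size of 30 are all assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass
class Ship:
    name: str
    classification: str
    registry: str
    location: tuple

    @classmethod
    def from_csv(cls, line):
        # Receives the class (cls) rather than an instance and builds a Ship
        name, classification, registry, x, y = line.split(',')
        return cls(name, classification, registry, (int(x), int(y)))

    @staticmethod
    def on_grid(location, size=30):
        # Receives neither the class nor an instance
        return all(0 <= coord <= size for coord in location)


garcia = Ship.from_csv('Garcia,frigate,USA,20,15')
print(garcia)
print(Ship.on_grid(garcia.location))  # True
print(Ship.on_grid((45, 10)))         # False
```

Both methods are called on the class itself. As the note says, on_grid() could just as easily be a regular function, since it never touches the class or an instance.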
Plotting with the Ship Dataclass
To get a better feel for how you might use OOP, let's take this project a step further and plot our ship objects on a grid. To plot the ships, we'll use the Matplotlib plotting library (see the Matplotlib documentation for installation instructions).
In a text editor, save or copy your ship_tracker_dc.py file to a new file called ship_display.py and edit it as follows:
from math import dist
from dataclasses import dataclass
import matplotlib.pyplot as plt

@dataclass
class Ship:
    name: str
    classification: str
    registry: str
    location: tuple
    obj_type = 'ship'
    obj_color = 'black'

    def distance_to(self, other):
        distance = round(dist(self.location, other.location), 2)
        return str(distance) + ' ' + 'km'

garcia = Ship('Garcia', 'frigate', 'USA', (20, 15))
ticonderoga = Ship('Ticonderoga', 'destroyer', 'USA', (5, 10))
kobayashi = Ship('Kobayashi', 'maru', 'Federation', (10, 22))
VISIBLE_SHIPS = [garcia, ticonderoga, kobayashi]

def plot_ship_dist(ship1, ship2):
    sep = ship1.distance_to(ship2)
    for ship in VISIBLE_SHIPS:
        plt.scatter(x=ship.location[0],
                    y=ship.location[1],
                    marker='d',
                    color=ship.obj_color)
        plt.text(ship.location[0], ship.location[1], ship.name)
    plt.plot([ship1.location[0], ship2.location[0]],
             [ship1.location[1], ship2.location[1]],
             color='gray',
             linestyle="--")
    plt.text((ship2.location[0]),
             (ship2.location[1] - 2),
             sep,
             c='gray')
    plt.xlim(0, 30)
    plt.ylim([0, 30])
    plt.show()

plot_ship_dist(kobayashi, garcia)
We started by adding a line to import Matplotlib. After instantiating the three ship objects, we replaced the remaining code starting at VISIBLE_SHIPS
. This line assigned a list of the three ship objects that represent the ships you can see on the simulation grid. We treated this as a constant, hence the all-caps format.
Next, we defined a function for calculating the distance between two ships (ship1 and ship2) and for plotting all the visible ships. We called the Ship class's distance_to() method on the two ships, assigned the result to a variable named sep (for separation), and then looped through the VISIBLE_SHIPS list, plotting each ship in a scatterplot and labeling it with its name. For the scatterplot, Matplotlib needs the ship's x and y locations, a marker style ('d' represents a diamond shape), and a color (the ship.obj_color attribute).
Next, we used Matplotlib's plt.plot()
method to draw a dashed line between the ships used for the distance measurement. This method takes the x–y locations of each ship, a color, and a line style. We followed this with the plt.text()
method, for adding text to the plot. As arguments, we passed it a location, the sep
variable, and a color.
We completed the function by setting x and y limits to the plot and then calling the plt.show()
method to display the plot.
Back in the global scope, we called the plot_ship_dist() function and passed it the kobayashi and garcia ship objects.
After saving and running the file, you should see the plot shown below:

Bundling data and methods into dataclasses produces compact, intuitive objects that you can manipulate en masse. Thanks to OOP, we could easily generate and track a multitude of ship objects on our grid.
Using Fields and Post-Init Processing
Sometimes you'll want to initialize an attribute that depends on the value of another attribute. Because this other attribute must already exist, you'll need to initialize the second attribute after the __init__() method has run. Fortunately, dataclasses support a special __post_init__() method that's expressly designed for this purpose.
Let's look at an example based on a naval war game simulation. Because alliances can change through time, a ship registered to a certain country might switch from ally to enemy. Although the ship's registry attribute is fixed, its allegiance is uncertain, and you might want to evaluate its friend-or-foe status post-initialization.
To create a version of the Ship dataclass that accommodates this need, in the text editor, enter the following and then save it as ship_allegiance_post_init.py:
from dataclasses import dataclass, field

@dataclass
class Ship:
    name: str
    classification: str
    registry: str
    location: tuple
    obj_type = 'ship'
    obj_color = 'black'
    # Excluded from __init__(); set in __post_init__() instead.
    friendly: bool = field(init=False)

    def __post_init__(self):
        # The trailing comma makes this a one-item tuple, not a string.
        unfriendlies = ('IKS',)
        self.friendly = self.registry not in unfriendlies
In this case, we started by importing both dataclass and field from the dataclasses module. The field() function lets you customize individual attributes in the dataclass, such as by giving them default values or by excluding them from the generated __init__() method.
Next, we defined the Ship class like we did in the _ship_trackerdc.py program, except that we added a new attribute, friendly, annotated as a Boolean data type. Note that this attribute has no default value. Instead, we called the field() function with the keyword argument init=False, which leaves friendly out of the parameters of the generated __init__() method so that it can be set later.
We defined the __post_init__() method with self as a parameter. We then assigned a tuple of unfriendly registry designations to a variable named unfriendlies.
Finally, we assigned True or False to the self.friendly attribute by checking whether the current object's self.registry attribute is absent from the unfriendlies tuple.
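A side note on this kind of membership test: Python treats parentheses around a single string as plain grouping, so a one-item tuple needs a trailing comma. The short sketch below shows how the two forms behave. The registry 'IK' is a hypothetical value used only for illustration.

```python
# Parentheses alone do not make a tuple; the trailing comma does.
as_string = ('IKS')    # just the string 'IKS'
as_tuple = ('IKS',)    # a one-item tuple

print(type(as_string).__name__)   # str
print(type(as_tuple).__name__)    # tuple

# 'IK' is a hypothetical registry used only for illustration.
# Against a string, `in` performs a substring search...
print('IK' in as_string)   # True
# ...whereas against a tuple it compares whole items.
print('IK' in as_tuple)    # False
```

With a genuine tuple, only an exact registry match counts as unfriendly.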
Let's test it out by making two ships, one friendly and one unfriendly. Note that you don't pass the Ship class an argument for the friendly attribute; this is because the attribute is excluded from __init__() and its value is determined by the __post_init__() method:
homer = Ship('Homer', 'tug', 'USA', (20, 9))
bortas = Ship('Bortas', 'D5', 'IKS', (15, 25))
print(homer)
print(bortas)
This produces the following result:
Ship(name='Homer', classification='tug', registry='USA', location=(20, 9), friendly=True)
Ship(name='Bortas', classification='D5', registry='IKS', location=(15, 25), friendly=False)
You may have noticed that you didn't need to explicitly call the __post_init__() method. This is because the dataclass-generated __init__() code calls the method automatically if it's defined in the class.
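As a further illustration, here's a minimal sketch of the same pattern applied to a derived numeric attribute. The trimmed-down Ship class and the range_from_origin field are hypothetical, made up for this example rather than taken from the programs above. The sketch also shows what happens if you try to pass a value for a field declared with init=False.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Ship:
    name: str
    location: tuple
    # Hypothetical derived field, excluded from __init__().
    range_from_origin: float = field(init=False)

    def __post_init__(self):
        # Runs automatically at the end of the generated __init__().
        x, y = self.location
        self.range_from_origin = hypot(x, y)


homer = Ship('Homer', (3, 4))
print(homer.range_from_origin)   # 5.0

# Because of init=False, the field is not an accepted argument.
try:
    Ship('Homer', (3, 4), range_from_origin=1.0)
except TypeError as err:
    print('Rejected:', err)
```

The derived value is computed once, at creation. If location changes later, range_from_origin won't update on its own.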
TIP: Inheritance (mostly) works the same with dataclasses as with regular classes. One thing to be careful of is that dataclasses combine attributes in a way that prevents the use of attributes with defaults in a parent class when a child contains attributes without defaults. So, you'll want to avoid setting field defaults on classes that are to be used as base classes.
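To make the tip concrete, here's a small sketch of the failure and one possible workaround. The Vessel, Tug, and SafeTug classes and the tow_capacity field are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Vessel:
    name: str
    registry: str = 'USA'   # default set in the base class


# The child adds a field without a default, which would place it
# after a defaulted parameter in the generated __init__().
try:
    @dataclass
    class Tug(Vessel):
        tow_capacity: int
except TypeError as err:
    error_message = str(err)
    print(error_message)


# One workaround: give the child field a default as well.
@dataclass
class SafeTug(Vessel):
    tow_capacity: int = 0


print(SafeTug('Homer'))
```

On Python 3.10 and later, keyword-only fields (the kw_only option) offer another way around this restriction.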
Optimizing Dataclasses with __slots__
If you're using a dataclass for storing lots of data, or if you expect to instantiate thousands to millions of objects from a single class, you should consider using the class variable __slots__. This special attribute optimizes the performance of a class by decreasing both memory consumption and the time it takes to access attributes.
A regular class stores instance attributes in an internally managed dictionary named __dict__. The __slots__ variable stores them using highly efficient, array-related data structures implemented in the C programming language.
Here's an example using a standard dataclass called Ship, followed by a ShipSlots dataclass that uses __slots__. Enter this code in a text editor and save it as _shipslots.py:
from dataclasses import dataclass

@dataclass
class Ship:
    name: str
    classification: str
    registry: str
    location: tuple

@dataclass
class ShipSlots:
    __slots__ = 'name', 'classification', 'registry', 'location'
    name: str
    classification: str
    registry: str
    location: tuple
The only difference between the two class definitions is the assignment of a tuple of attribute names to the __slots__ variable. This variable lets you explicitly state which instance attributes you expect your objects to have.
Now, instead of having a dynamic dictionary (__dict__) that permits you to add attributes to objects after the creation of an object, you have a static structure that saves the overhead of one dictionary for every object that uses __slots__. Because it's considered good practice to initialize all of an object's attributes at once, the inability to dynamically add attributes with __slots__ is not necessarily a detriment.
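The difference is easy to observe directly. The following sketch uses shortened two-field versions of the classes, and the captain attribute is a hypothetical add-on used only to trigger the error.

```python
from dataclasses import dataclass


@dataclass
class Ship:
    name: str
    registry: str


@dataclass
class ShipSlots:
    __slots__ = 'name', 'registry'
    name: str
    registry: str


homer = Ship('Homer', 'USA')
homer_slots = ShipSlots('Homer', 'USA')

# A regular instance carries a per-object dictionary...
print(hasattr(homer, '__dict__'))         # True
# ...while a slotted instance does not.
print(hasattr(homer_slots, '__dict__'))   # False

# New attributes can be added to the regular instance only.
homer.captain = 'Lee'
try:
    homer_slots.captain = 'Lee'
except AttributeError as err:
    print('Rejected:', err)
```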
Using __slots__ with multiple inheritance can become problematic, however. Likewise, you'll want to avoid using it when providing default values via class attributes for instance variables. You can find more caveats in the official docs and this Stack Overflow [answer](https://stackoverflow.com/questions/472000/usage-of-slots/).
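Here's a rough sketch of the default-value caveat, using shortened two-field classes. The ShipAuto class is a hypothetical name, and the second half assumes Python 3.10 or later, where the @dataclass decorator accepts a slots=True argument that generates __slots__ for you.

```python
import sys
from dataclasses import dataclass

# A class-level default clashes with a slot of the same name,
# so Python refuses to create the class at all.
try:
    @dataclass
    class ShipSlots:
        __slots__ = 'name', 'registry'
        name: str
        registry: str = 'USA'
except ValueError as err:
    conflict = str(err)
    print(conflict)

# Python 3.10+ can build __slots__ itself and still honor defaults.
if sys.version_info >= (3, 10):
    @dataclass(slots=True)
    class ShipAuto:
        name: str
        registry: str = 'USA'

    print(ShipAuto('Homer'))
```

If you need both defaults and slots on an older Python version, you'll have to choose between them or supply the defaults some other way.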
The Recap
Object-oriented programming helps you organize code while reducing its redundancy. Classes let you combine related data – and functions that act on that data – into new custom data types.
Functions in OOP are called methods. When you define a class using a class statement, you couple related elements together so that the relationship between the data and the methods is clear, and so the proper methods are used with the appropriate data. Consequently, you'll want to consider using classes when you have multiple kinds of data, multiple functions that go with each kind of data, and a growing codebase that's becoming increasingly complex.
A class serves as a template or factory for making objects, also called instances of a class. You create objects by calling the class's name using function notation. As with regular functions, this practice introduces a new local name scope, and all names assigned in the class statement generate object attributes shared by all instances of the class. Attributes store data, and each object's attributes might change over time to reflect changes in the object's state.
Classes can inherit attributes and methods from other classes, letting you reuse code. In this case, the new class is a child or subclass, and the preexisting class is the parent or base class. Inherited attributes and methods can be overridden in the subclass to modify or enhance the inherited behaviors.
The built-in super() function is a shorthand way to create subclasses that are easy to maintain. With super(), you can also call original methods from a base class if they've been modified in the subclass. Because Python lets classes inherit from multiple parents, this can result in complex code that's difficult to understand, so use super() with caution.
Decorators are functions that modify the behavior of another function without permanently changing the modified function. They also help you avoid duplicating code.
The @dataclass decorator decorates class statements and makes them more concise. Although dataclasses were designed for classes that mainly store data, they can still be used as regular classes. A nice feature is that IDEs like Spyder will use the dataclass fields to prompt users with the proper class names, arguments, methods, and documentation, removing the need to see all of the class definition code. A downside, however, is that the use of multiple inheritance can be more difficult with dataclasses than with regular classes.
The __slots__ class variable optimizes both memory usage and attribute access speeds. It comes with some limitations, however, such as, but not limited to, the inability to dynamically create attributes after initialization and increased complexity when using multiple inheritance.
One thing we didn't touch on here is that you can combine related class statements and save them as Python files. These class libraries then can be imported into other programs as modules, just like you imported the dataclasses module.
There's a lot more to OOP than what we've covered here; it is the whole damn jungle, after all. If you think your projects would benefit from OOP and want to explore the topic further, you can find the official Python tutorial on classes [here](https://docs.python.org/3/tutorial/classes.html), the official dataclass documentation [here](https://docs.python.org/3/library/dataclasses.html), and the PEP 557 dataclass enhancement proposal [here](https://peps.python.org/pep-0557/).
Thanks!
Thanks for reading and please follow me for more Quick Success Data Science projects in the future.