Class 3¶
Object Oriented Programming: Part 2¶
Inheritance¶
One of the most interesting aspects or features of object-oriented programming is inheritance. Inheritance is a specific type of relation between the objects we create that signifies that a “child” object inherits from its “parent”. In other words, the child object “is a” parent object. Our examples so far were of “real-life” objects, and so to continue this line we’d perhaps wish to create a class of a “Person”, which would be the parent class, and a child class of a “Student”. In this example it’s clear why a “Student” is a Person, so the inheritance holds. All Persons have a name, age, gender, etc. - but only Students have an associated school and a final grade. This means that a Student will have all attributes associated with all Persons - name, age, gender - and will also have a school and a grade. This is the meaning of “inheritance” in our lingo.
To solidify this point we’ll not use a contrived Person-Student example. Instead, we’ll use something a bit more typical to the work we do in our day to day life - a data reader. Say that we have some computational pipeline, that reads some data in, pre-processes it (denoising, filtering, changes its dimensions), then it computes some measurement of that dataset (number of cells in the image, mean calcium activity, min-max of electrical activity) and finally it writes it to disk. Let’s focus on the first step of this pipeline - reading in the data.
Our approach will be the following - we’ll build a parent class called DataReader, and its child classes will know how to read in specific data formats only. In the graph above we’ll have two child implementations: one that can read .tif images, and one for .png. This will allow us to “plug in” different classes into our pipeline, depending on the input, and the downstream functions won’t see a difference.
from pathlib import Path


class DataReader:
    """Parent class of DataReader objects.

    This class provides an "interface" for child classes to implement,
    if they wish to be a part of the larger data processing pipeline of this project.
    """

    def __init__(self, filename: Path):
        self.filename = filename
        self.data = None

    def read(self):
        """Reads data from disk and returns it. Should populate self.data."""
        raise NotImplementedError

    def summarize(self):
        """Computes a summary of self.data. Child classes must implement it."""
        raise NotImplementedError
This parent class could be initialized with:
data_reader = DataReader("/some/path.tiff")
and no harm will be done, but in practice there’s little one can achieve with this class. Any class that inherits from this base class has all the methods the parent class has, even though they currently simply raise an error (an exception) called NotImplementedError.
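To make this concrete, here is a minimal, self-contained sketch. BaseReader is a pared-down stand-in for DataReader, and EmptyReader is a hypothetical child that overrides nothing; both names are assumptions made for illustration only.

```python
class BaseReader:
    """Pared-down stand-in for the DataReader parent class."""

    def __init__(self, filename):
        self.filename = filename
        self.data = None

    def read(self):
        raise NotImplementedError


class EmptyReader(BaseReader):
    """Hypothetical child class that doesn't override anything."""


reader = EmptyReader("a.png")
print(reader.filename)  # attribute set by the inherited __init__

try:
    reader.read()  # inherited as-is, so it still raises
except NotImplementedError:
    outcome = "read() is not implemented yet"
print(outcome)
```

The child got __init__ and read() for free, but read() stays unusable until the child supplies its own version.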
fname = 'a.png'
data_reader = DataReader(fname)
print(f"data_reader: {data_reader}")
# ends_with isn't defined on DataReader, so this call raises AttributeError
extension = data_reader.ends_with()
print(f"Current extension: {extension}")
# read() exists, but calling it would raise NotImplementedError
data_reader.read()
data_reader: <__main__.DataReader object at 0x7f25b1a6f6d0>
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-3-9c623f8826a2> in <module>
4
5 # ends_with isn't defined on DataReader, so this call raises AttributeError
----> 6 extension = data_reader.ends_with()
7 print(f"Current extension: {extension}")
8
AttributeError: 'DataReader' object has no attribute 'ends_with'
The NotImplementedError raised by read() and summarize() tells the users of this class that they’ll have to implement these methods when they derive a child class if they wish to have anything done with this interface. Let’s see how we write a child class:
import tifffile  # Python's most popular tif processing library


class TifDataReader(DataReader):
    """A DataReader designated to read .tif files.

    Uses the tifffile package to do the heavy lifting.
    """

    def __init__(self, filename):
        super().__init__(filename)
        # self.name = 'abc'

    def verify_input(self):
        """Verifies that the given filename is valid."""
        pass

    def read(self):
        """Reads a tif image to self.data and returns it as well."""
        self.verify_input()
        self.data = tifffile.imread(self.filename)
        return self.data

    def summarize(self) -> float:
        """A simple mean of the data."""
        return self.data.mean()
Inheritance happens when you write a different class’ name between the parentheses at the top line. This tells Python to make the parent’s attributes and methods available to this child. The initialization (=magic) happens using the special syntax of super().__init__(arguments) - what it says is “find the super (i.e. parent) class and call its __init__ method on this instance with the given arguments”.
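A small sketch may help here. The classes Parent and Child below are hypothetical and exist only to show that super().__init__() runs the parent's initializer on the instance being created, after which the child is free to add attributes of its own.

```python
class Parent:
    def __init__(self, filename):
        self.filename = filename


class Child(Parent):
    def __init__(self, filename, scale):
        # Run Parent.__init__ on this very instance first...
        super().__init__(filename)
        # ...then add what only the child needs.
        self.scale = scale


child = Child("a.tif", scale=2)
print(child.filename, child.scale)
print(isinstance(child, Parent))
```

Had Child.__init__ skipped the super() call, child.filename would never have been set and accessing it would raise an AttributeError.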
Besides that, the only thing that might catch the eye is the new method verify_input(), which the parent class didn’t require. This function, which I left unimplemented, is meant to check that the given filename exists and that it’s indeed a tif file. The parent class didn’t mandate it, but I, as the implementer of the child class, deemed it necessary.
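One possible way to fill in that verification is sketched below. It's written as a standalone function (verify_tif_input, a hypothetical name) rather than as a method, so that it runs on its own. The accepted extensions and the choice of which exceptions to raise are assumptions, not something the class above dictates.

```python
from pathlib import Path


def verify_tif_input(filename):
    """Return filename as a Path, or raise if it can't be a readable .tif file."""
    path = Path(filename)
    # Check the extension first, since it doesn't require touching the disk.
    if path.suffix.lower() not in (".tif", ".tiff"):
        raise ValueError(f"Expected a .tif or .tiff file, got '{path.name}'.")
    if not path.is_file():
        raise FileNotFoundError(f"File '{path}' doesn't exist.")
    return path


try:
    verify_tif_input("a.png")
except ValueError as e:
    message = str(e)
print(message)
```

Inside TifDataReader the same checks would operate on self.filename, and read() would only call tifffile once they pass.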
Let’s see how to use this class. Don’t worry, everything should seem straight-forward from now on:
fname = 'a.tif'
tif_data_reader = TifDataReader(fname)
print(tif_data_reader.filename)
tif_data_reader.ends_with()
a.tif
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-5-5a9cdcd3ad5f> in <module>
2 tif_data_reader = TifDataReader(fname)
3 print(tif_data_reader.filename)
----> 4 tif_data_reader.ends_with()
AttributeError: 'TifDataReader' object has no attribute 'ends_with'
tif_data_reader has the .read() and .summarize() methods ready to use. We won’t read an actual .tif file but you can see that nothing is really special about this class and instance. Inheritance in this case helped us give a tag to this class, signaling to its users that it’s “special” and should be used in some specific contexts.
For completeness, let’s write the other child class of DataReader. A single parent can obviously have as many children as you wish.
import imageio


class PngDataReader(DataReader):
    """A DataReader designated to read .png files.

    Uses the imageio package to do the heavy lifting.
    """

    def __init__(self, filename):
        super().__init__(filename)

    def read(self):
        """Reads a PNG image to self.data and returns it as well.

        Also populates self.metadata with the image's metadata.
        """
        self.data = imageio.imread(self.filename)
        self.metadata = self.data.meta  # imageio reads metadata for us
        return self.data

    def summarize(self) -> float:
        """A simple mean of the data."""
        return self.data.mean()
Since imageio does input verification for me (kinda) we don’t need our own method to do it for us. Besides that, the read method also creates a self.metadata attribute, since imageio allows it. Again, this wasn’t required by the parent.
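The “plug in” idea from the beginning of this section can be sketched as follows. To keep the example runnable without any image files or third-party packages, it repeats the parent class and uses a hypothetical ListDataReader that fakes a file read; run_pipeline is likewise an assumed name for the downstream code.

```python
class DataReader:
    """Same interface as the parent class above, repeated so the sketch runs alone."""

    def __init__(self, filename):
        self.filename = filename
        self.data = None

    def read(self):
        raise NotImplementedError

    def summarize(self):
        raise NotImplementedError


class ListDataReader(DataReader):
    """Hypothetical reader that fakes a file read with a hard-coded list."""

    def read(self):
        self.data = [1.0, 2.0, 3.0]
        return self.data

    def summarize(self) -> float:
        return sum(self.data) / len(self.data)


def run_pipeline(reader: DataReader) -> float:
    """Downstream code only relies on the DataReader interface."""
    reader.read()
    return reader.summarize()


result = run_pipeline(ListDataReader("fake.txt"))
print(result)
```

run_pipeline never checks which child class it received, so a TifDataReader or a PngDataReader could be passed in without changing it.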
Inheritance - Why and When¶
Inheritance can facilitate code re-use and simpler, clearer mental models of the problem at hand. In the example above, we generated very clear and concise classes that do one thing, and do it well. People reading this code - including us in three months - will understand everything about it without any hassle.
A possible issue with inheritance is readability - finding the methods that are associated with the base class can be cumbersome when we start working with tens of attributes and multiple methods. This is why usually people try to avoid more than a single layer of inheritance.
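When the hierarchy does grow, Python can tell us where a method actually comes from. The sketch below uses three hypothetical classes (Base, Middle, Leaf) and two built-in attributes: __mro__, the order in which Python searches the classes, and __qualname__, which names the class that defined a function.

```python
class Base:
    def read(self):
        return "base read"

    def summarize(self):
        return "base summary"


class Middle(Base):
    def read(self):
        return "middle read"


class Leaf(Middle):
    pass


# The method resolution order lists the classes Python searches, in order.
mro_names = [cls.__name__ for cls in Leaf.__mro__]
print(mro_names)

# __qualname__ reveals which class actually defined each method.
print(Leaf.read.__qualname__)
print(Leaf.summarize.__qualname__)
```

Here Leaf defines nothing itself: its read() comes from Middle and its summarize() from Base, which is exactly the kind of lookup that gets tedious to do by eye.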
Exercise¶
Smartphones¶
Model both a smartphone and a label-specific phone - an iPhone in our case - by using a parent and child class. Have at least one method and one attribute for the base class, and at least one unique method for the child class.
One of the methods has to be a call(phone) method, designed to call from one phone to the other. When call()ing between iPhones, the method should use the FaceTime interface of the two iPhones. Make sure to keep a log of the calls on both phones.
Exercise solution below…¶
import time


class Phone:
    """ Base class for all types of mobile phones """

    def __init__(self, name, screen_size, num_camera=2):
        self.name = name
        self.screen_size = screen_size
        self.num_of_camera = num_camera
        self.is_on = True  # power switch
        self.photos = []  # list of pictures taken
        self.calls = {}  # call log

    def switch_power(self):
        # self.is_on = False if self.is_on else True
        if self.is_on:
            self.is_on = False
        else:
            self.is_on = True

    def take_photo(self):
        """ Take a photo and append it to the photo album """
        self.photos.append([[1, 0], [0, 1]])

    def call(self, other):
        """ Call another Phone instance """
        if self.is_on and other.is_on:
            self.calls[other.name] = time.time()
            other.calls[self.name] = time.time()
        else:
            print(f"Phone {other.name} is off.")
        return other


class IPhone(Phone):
    """ A more expensive phone, that can call other iPhones using a special call method """

    def __init__(self, name, screen_size, num_camera, apple_id):
        super().__init__(name, screen_size, num_camera)
        self.apple_id = apple_id
        self.facetime_calls = {}  # FaceTime call log

    def call(self, other):
        """ Overrides the call method from the parent class """
        if self.is_on and other.is_on:
            try:
                # Only iPhones have an apple_id
                self.facetime_calls[other.apple_id] = time.time()
            except AttributeError:
                # The other phone isn't an iPhone - fall back to a regular call
                self.calls[other.name] = time.time()
                other.calls[self.name] = time.time()
            else:
                other.facetime_calls[self.apple_id] = time.time()
        else:
            print(f"Phone {other.name} is off.")
        return other
regular = Phone(name='lg_v10', screen_size=6)
iphone = IPhone(name='iphone_8', screen_size=5.5, num_camera=3, apple_id='first_iphone_8')
iphone2 = IPhone(name='iphone_X', screen_size=6, num_camera=3, apple_id='second_iphone_X')
# Call from regular phone to iPhone
print(f"Before calling, the log for regular shows: {regular.calls}")
iphone = regular.call(iphone)
print(f"After the call, regular shows {regular.calls} and the iPhone shows {iphone.calls}")
Before calling, the log for regular shows: {}
After the call, regular shows {'iphone_8': 1614777148.970501} and the iPhone shows {'lg_v10': 1614777148.9705017}
# Two iPhones:
print(f"Before calling, the log for first iPhone shows: {iphone.facetime_calls}")
iphone2 = iphone.call(iphone2)
print(f"After the call, the first iPhone shows {iphone.facetime_calls} and the second shows {iphone2.facetime_calls}")
Before calling, the log for first iPhone shows: {}
After the call, the first iPhone shows {'second_iphone_X': 1614777148.9768183} and the second shows {'first_iphone_8': 1614777148.97682}
Object-oriented design requires you to think about the code you’re about to write - how to model each object, how to deal with the interfaces between them, how to verify the types of each input, etc.
Because we’re trying to model complex structures, we usually don’t succeed on the first try. That’s because we become smarter and understand our needs from the model better only after we’ve used it. Premeditating and debating on the exact way through which two Phone() instances will call each other is important, but we’ll usually just refactor our initial model in favor of a better one after a few days of “usage”. That’s the underlying reason for “alpha” and “beta” versions of software.
In short, rewriting large parts of an application you designed is expected, since it’s a natural and important part of software design - a luxury other engineers rarely have.
Errors and Exceptions¶
A much-debated feature of Python (and other scripting languages) is its fear of failing. Python tries to coerce unknown commands into something familiar that it can work with. For example, arithmetic between bools and other numeric types is fully supported, since bool values are treated as 0 (for False) and 1 (for True).
True - 1.0 # True is also 1
0.0
False + 10 # False is also 0
10
However, many other statements will result in an error, or Exception in Python’s terms:
'3' + 3 # TypeError
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-12-977675d93a78> in <module>
----> 1 '3' + 3 # TypeError
TypeError: can only concatenate str (not "int") to str
a = [0, 1, 2]
a[3] # IndexError
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-13-4076b5525985> in <module>
1 a = [0, 1, 2]
----> 2 a[3] # IndexError
IndexError: list index out of range
camel # NameError
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-14-30dc50b6b31d> in <module>
----> 1 camel # NameError
NameError: name 'camel' is not defined
There are many built-in exceptions in Python, and most modules you’ll use define their own designated exceptions. Modules and packages do this because the exception is meaningful - each exception conveys information about what went wrong during runtime. Since it’s not a simple error, we can use this information by predetermining the course of action when an exception occurs. This is called catching an exception.
The keywords involved are: try, except, else and finally. An example might consist of interacting with the file system:
try:
    # Do something that might fail
    file.write()
except PermissionError:
    # If we don't have permission to do the operation (e.g. write to protected disk), do the following
    ...
except IsADirectoryError:
    # Trying to do a file operation on a directory - so do the following
    ...
except (NameError, TypeError):
    # If we encounter either a non-existent variable or an unsupported operation on variables, do the following
    ...
except Exception:
    # General error, not caught by previous exceptions
    ...
else:
    # If the operation under "try" succeeded, do the following
    ...
finally:
    # Regardless of the result - success or failure - do this.
    ...
Let’s break it down:
# Simplest form of exception handling:
a = 2
try:
    b = a + 1
except NameError:  # a isn't defined
    a = 1
    b = 2

# We could catch other exceptions
a = '3'
try:
    b = a + 1
except TypeError:  # a isn't a float/int
    a = int(a)
    b = a + 1
# With the else clause
current_key = 'Mike'
default_val = 'Cohen'
dict_1 = {'John': 'Doe', 'Jane': 'Doe'}
try:
    johns = dict_1.pop(current_key)
except KeyError:  # Non-existent key
    dict_1[current_key] = default_val
    print(f"{len(dict_1)} remaining key(s) in the dictionary")
else:
    print(f"{len(dict_1)} remaining key(s) in the dictionary")
print(dict_1)
3 remaining key(s) in the dictionary
{'John': 'Doe', 'Jane': 'Doe', 'Mike': 'Cohen'}
# Another else example
tup = (1,)
try:
    a, b = tup[0], tup[1]
except IndexError as e:
    print("IndexError")
    print(f"Exception: {e}; tup: {tup}")
    raise
else:
    # process_data(a, b)
    print(a, b)
IndexError
Exception: tuple index out of range; tup: (1,)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-17-79278fbd754c> in <module>
2 tup = (1,)
3 try:
----> 4 a, b = tup[0], tup[1]
5 except IndexError as e:
6 print("IndexError")
IndexError: tuple index out of range
We use the else clause because we wish to catch a specific IndexError: the one raised during the tuple unpacking (a, b = tup[0], tup[1]). process_data(a, b) can raise other IndexErrors, which we’ll deal with inside the function. Keeping that call under else, outside the try block, ensures that the only IndexError caught here is the tuple destructuring one.
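The sketch below illustrates that separation. process_data here is a hypothetical stand-in with a deliberate indexing bug of its own, and the inner try exists only so that the example runs to completion.

```python
def process_data(a, b):
    """Hypothetical downstream step with an unrelated indexing bug of its own."""
    return [a, b][2]  # always raises IndexError


tup = (1, 2)
try:
    a, b = tup[0], tup[1]
except IndexError:
    print("The tuple was too short")
else:
    # Not guarded by the except above, so a bug in process_data surfaces
    # as its own error instead of being misreported as a short tuple.
    try:
        process_data(a, b)
    except IndexError as e:
        inner_error = str(e)
    print(f"process_data failed on its own: {inner_error}")
```

Had process_data(a, b) been placed inside the first try block, its IndexError would have triggered the “tuple was too short” branch, which would be misleading.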
# With the finally clause
def divisor(a, b):
    """
    Divides two numbers.
    a, b - numbers (int, float)
    returns a tuple of the result and a possible error.
    """
    try:
        ans = a / b
    except ZeroDivisionError as e:
        ans = None
        err = e
    except TypeError as e:
        ans = None
        err = e
    else:
        err = None
    finally:
        return ans, err

# Should work:
ans, err = divisor(1, 2)
print(ans, "----", err)
# ZeroDivisionError:
ans, err = divisor(1, 0)
print(ans, "----", err)
# TypeError
ans, err = divisor(1, 'a')
print(ans, "----", err)
0.5 ---- None
None ---- division by zero
None ---- unsupported operand type(s) for /: 'int' and 'str'
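A more common use of finally is cleanup that must happen whether or not the operation succeeded. The sketch below, built around a hypothetical read_first_line function, shows that the finally block runs in both cases while still letting an uncaught exception reach the caller. Note that a return statement placed inside finally, as in divisor above, discards any exception that is still propagating, so it's best reserved for cases where every expected error was already handled.

```python
log = []


def read_first_line(lines):
    """Return the first item of lines, logging the cleanup step either way."""
    try:
        return lines[0]
    finally:
        # Runs on success and on failure, but lets the exception propagate.
        log.append("cleanup")


first = read_first_line(["header", "row"])

try:
    read_first_line([])
except IndexError:
    log.append("caller saw IndexError")

print(first, log)
```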
Exception handling is used almost everywhere in the Python world. We always expect our operations to fail, and catch the errors as our backup plan. This is considered more Pythonic than other options. Here’s a “real-world” example:
# Integer conversion. We check before doing it to make sure it won't raise errors
def int_conversion(s):
    """ Convert a string to int """
    if not isinstance(s, str) or not s.isdigit():
        return None
    elif len(s) > 10:  # too many digits for our purposes
        return None
    else:
        return int(s)

# Same purpose - more Pythonic
def pythonic_int_conversion(s):
    """ Convert a string to int """
    try:
        return int(s)
    except (TypeError, ValueError, OverflowError):
        return None
# This is also sometimes phrased as "easier to ask for forgiveness than permission"
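As mentioned earlier, packages usually define their own exceptions so that callers can tell what went wrong. A minimal sketch of how that might look for our data readers follows; ReaderError, UnsupportedFormatError and pick_reader are all hypothetical names introduced for illustration.

```python
class ReaderError(Exception):
    """Hypothetical base class for all errors raised by our readers."""


class UnsupportedFormatError(ReaderError):
    """Raised when a reader is handed a file type it can't handle."""


def pick_reader(filename):
    """Return the name of a reader for the file, based on its extension."""
    if filename.endswith((".tif", ".tiff")):
        return "TifDataReader"
    if filename.endswith(".png"):
        return "PngDataReader"
    raise UnsupportedFormatError(f"No reader available for '{filename}'.")


try:
    pick_reader("movie.avi")
except ReaderError as e:  # also catches every subclass of ReaderError
    caught = f"{type(e).__name__}: {e}"
print(caught)
```

Since exceptions are classes, inheritance applies to them as well: catching the parent ReaderError also catches its children, while an unrelated error such as a KeyError would pass through untouched.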
Exercise - User Input Verification¶
The user’s input is always a very error-prone area in an application. A famous joke describes this situation in the following manner:
A Quality Assurance (QA) Engineer walks into a bar. Orders a beer. Orders 0 beers. Orders 999999999 beers. Orders a lizard. Orders -1 beers. Orders a sfdeljknesv.
A decent application should not only handle all possible incoming inputs, but should also convey back to the user the information of what went wrong. In this exercise you’ll write a verify_input function that handles file and folder names.
Short Intro - pathlib¶
For file I/O and other disk operations, some of which are required in this exercise, Pythonistas use pathlib, a module in the Python standard library designated to work with files and folders (pathlib2 in Python 2). Its basic premise is that files and folders are objects themselves, and certain operations are allowed between these objects.
from pathlib import Path
p_win = Path(r'C:/Users/Hagai/Documents\Classes\python-course-for-students')  # notice the "raw" string r'',
# it stops Python from treating backslashes as the start of escape sequences
p1 = Path('/home/hagai/Teaching/python_students')
p1
PosixPath('/home/hagai/Teaching/python_students')
p1.parent
PosixPath('/home/hagai/Teaching')
list(p1.parents)
[PosixPath('/home/hagai/Teaching'),
PosixPath('/home/hagai'),
PosixPath('/home'),
PosixPath('/')]
p1.exists() # is it actually a folder\file?
False
p1.parts
('/', 'home', 'hagai', 'Teaching', 'python_students')
p1.name
'python_students'
for file in p1.iterdir():
    print(file)
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-30-62f9f5df060a> in <module>
----> 1 for file in p1.iterdir():
2 print(file)
/opt/hostedtoolcache/Python/3.8.7/x64/lib/python3.8/pathlib.py in iterdir(self)
1119 if self._closed:
1120 self._raise_closed()
-> 1121 for name in self._accessor.listdir(self):
1122 if name in {'.', '..'}:
1123 # Yielding a path object for these makes little sense
FileNotFoundError: [Errno 2] No such file or directory: '/home/hagai/Teaching/python_students'
# Traversing the file system
p2 = Path('C:/Users/Hagai/Documents')
p2 / 'Classes' / 'python-course-for-students'
# Operator overloading
PosixPath('C:/Users/Hagai/Documents/Classes/python-course-for-students')
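A few more path operations are sketched below using the “pure” path classes, which manipulate paths without touching the disk and therefore behave the same on every operating system. The file name class_3.ipynb is a made-up example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# "Pure" paths never touch the disk, so this runs on any operating system.
posix = PurePosixPath("/home/hagai/Teaching") / "python_students" / "class_3.ipynb"
print(posix.name, posix.stem, posix.suffix)
print(posix.with_suffix(".py"))

# Windows paths keep the drive and root together as the first part.
windows = PureWindowsPath(r"C:\Users\Hagai\Documents") / "Classes"
print(windows.parts)
```

Path itself picks the flavor that matches the machine the code runs on, which is why the Windows-style path above was displayed as a PosixPath.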
The exercise:¶
class UserInputVerifier:
    """
    Assert that the input from a user is a valid folder name. A valid folder is a folder
    containing the following files: "a.py", "b.py", "c.py", and the data file "data.txt". However, the class
    should be able to deal with any arbitrary filename, or an iterable of them.
    If the given folder doesn't contain them, it's possible the user gave us a parent folder of the
    folder that contains these Python files. Look into any sub-folders for these files, and return the
    "actual" true folder, i.e. the top-most folder containing all the files.
    Input - Foldername, string
    Output - A pathlib object. If the input isn't valid, i.e. the files weren't found,
    the class should raise an exception.
    """
Exercise solution below…¶
class UserInputVerifier:
    """
    Assert that the given foldername contains files in "filenames".
    """

    def __init__(self, foldername, filenames=('a.py', 'b.py', 'c.py', 'data.txt')):
        self.raw_folder = Path(str(foldername))  # first possible error
        self.filenames = self._verify_filenames(filenames)

    def _verify_filenames(self, filenames):
        """ Verify the input filenames, and return it as an iterable. """
        # isinstance also accepts subclasses, e.g. the PosixPath that Path() creates
        if isinstance(filenames, (str, Path)):
            return [filenames]
        if not isinstance(filenames, (list, tuple, set)):
            raise TypeError("Filenames should be an iterable, a Path object or a string.")
        return filenames

    def check_folder(self):
        """ Assert that the files are indeed in the folder or in one of its subfolders """
        existing_files = []
        missing_files = []
        if not self.raw_folder.exists():
            raise UserWarning(f"Folder {self.raw_folder} doesn't exist.")
        # Make sure that each file we're looking for doesn't go missing or appear more than once
        for file_to_look in self.filenames:
            found_files = [str(file) for file in self.raw_folder.rglob(str(file_to_look))]
            if len(found_files) == 0:
                raise UserWarning(f"File '{file_to_look}' was missing from folder '{self.raw_folder}'.")
            if len(found_files) > 1:
                raise UserWarning(f"More than one file named '{file_to_look}' was found in '{self.raw_folder}'.")
        return True
foldername = r'./mock'
verifier = UserInputVerifier(foldername)
verifier.check_folder()
---------------------------------------------------------------------------
UserWarning Traceback (most recent call last)
<ipython-input-34-57c6abe60b2f> in <module>
1 foldername = r'./mock'
2 verifier = UserInputVerifier(foldername)
----> 3 verifier.check_folder()
<ipython-input-33-803863538e46> in check_folder(self)
23 missing_files = []
24 if not self.raw_folder.exists():
---> 25 raise UserWarning(f"Folder {self.raw_folder} doesn't exist.")
26
27 # Make sure that each file we're looking for doesn't
UserWarning: Folder mock doesn't exist.
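The solution above verifies that the files exist but doesn't return the “actual” folder the exercise asks for. One possible way to locate it is sketched below. find_containing_folder is a hypothetical helper, not part of the solution, and the temporary directory tree exists only so that the example can run anywhere.

```python
import tempfile
from pathlib import Path


def find_containing_folder(root, filenames):
    """Return the top-most folder under root that directly holds all filenames."""
    root = Path(root)
    # Sorting by depth makes the shallowest matching folder come first.
    candidates = [root] + sorted(
        (p for p in root.rglob("*") if p.is_dir()), key=lambda p: len(p.parts)
    )
    for folder in candidates:
        if all((folder / name).is_file() for name in filenames):
            return folder
    raise FileNotFoundError(f"No folder under '{root}' contains all of {list(filenames)}.")


# Build a throwaway directory tree to try it on.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "project" / "src"
    target.mkdir(parents=True)
    for name in ("a.py", "b.py"):
        (target / name).touch()
    found = find_containing_folder(tmp, ["a.py", "b.py"])
    relative = found.relative_to(tmp).as_posix()

print(relative)
```

Raising FileNotFoundError when nothing matches is a design choice made for this sketch; the solution above uses UserWarning for the same purpose.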