# Class 1


## Conditionals and Iteration

Even though you've already learned these subjects by yourself in the first class and while doing homework, we'll go through them (quickly) one more time, for the sake of completeness.

### The `if` Statement

Minimal example:

In [1]:
x = 10
if x > 10:
    print("x is bigger than 10")

Using `else`:

In [2]:
y = 11
if x > y:
    print("x")
else:
    print("y")

y


Multiple conditions (`elif`):

In [3]:
z = 12
if x > y:
    print("x")
elif x > z:
    print("x > z")
elif z < y:
    print("z is small")
    if z < x:
        print("wow, z IS small")

### Iteration

One of Python's strongest features.

### `for` loop

We'll compare a naive script adding up numbers in an array using MATLAB and Python:

```matlab
% Sum all items in array
data = [10, 20, 30];
result = 0;
for idx = 1:numel(data)
    result = result + data(idx);
end
```
    

In Python we can iterate over the values themselves with the `for ... in` syntax:

In [4]:
data = [10, 20, 30]
result = 0
for value in data:
    print(value)
    result += value

print("The result is:", result)

10
20
30
The result is: 60


:::{note}
Newer versions of MATLAB technically enable iterating values, but it's still very clunky (row vs. column vectors) and doesn't work in some important use-cases (e.g. ``parfor``).
:::

We can iterate over nearly anything:


In [5]:
tup = (1, 2, True, 3.0, 'four')
for item in tup:
    print(item)

1
2
True
3.0
four


In [6]:
string = "abcdef"
for letter in string[::-2]: # Strings are also sliceable!
    print(letter)

f
d
b


To get both the index and the value of some sequence, use the built-in `enumerate` function:

In [7]:
my_tuple = 1, 2, True, 3.0, 'four'
for index, item in enumerate(my_tuple):
    print(index, item)
    

0 1
1 2
2 True
3 3.0
4 four


By default, iterating a dictionary returns its keys:

In [8]:
print('Key iteration:')
dict1 = {'a': 1, 'b': 2, 'c': 3}
for key in dict1:
    print(key)

Key iteration:
a
b
c


However, iterating a dictionary's values or items (key and value pairs represented as tuples) is just as easy:

In [9]:
print('Value iteration:')
for val in dict1.values():
    print(val)

print('Pairs iteration:')
for key, val in dict1.items():
    print(key, val)

Value iteration:
1
2
3
Pairs iteration:
a 1
b 2
c 3


### `while` Loop

In [10]:
def countdown(n):
    """ Explodes a bomb when n is zero. """
    while n > 0:
        print("{}...".format(n))
        n = n - 1
    print("BOOM!")
    return True

In [11]:
n = 10
val = countdown(n)

print(val)

10...
9...
8...
7...
6...
5...
4...
3...
2...
1...
BOOM!
True


:::{note}
Generally speaking, the usage of `while` loops in Python is discouraged. Whenever you're tempted to use one, consider the possibility of replacing it with a `for` loop. If that's at all possible, the `for` loop is probably the preferable solution.
:::
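
As a sketch of that advice, the countdown above can be written with a `for` loop over a descending `range`. The name `countdown_for` is hypothetical and only used for illustration:

```python
def countdown_for(n):
    """Counts down from n to 1 with a for loop, then 'explodes'."""
    # range(n, 0, -1) yields n, n-1, ..., 1, so no manual decrement is needed
    for remaining in range(n, 0, -1):
        print(f"{remaining}...")
    print("BOOM!")
    return True


countdown_for(3)
```

The loop bounds are stated once, inside `range`, so there is no counter to forget to update and no risk of an endless loop.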

### `break` and `continue`

Stopping a loop is done with `break`, and to skip the rest of the current iteration and jump straight to the next one we use `continue`:

In [12]:
data = [1., 2., 1., 1., 4., 1.]
for datum in data:
    if datum == 2.:
        continue
    if datum != 1.:
        print(datum)
        break
    print("Still 1...")

Still 1...
Still 1...
Still 1...
4.0


## Formatting

We can combine variables and text when printing in several different ways:

In [13]:
# Not recommended as it doesn't allow for customizations
a = 42
print("The value of a is", a)

The value of a is 42


In [14]:
# Older version, similar to other languages
a = 42
b = 32
print("The value of a is %d, while the value of b is %d" % (a, b))

The value of a is 42, while the value of b is 32


In [15]:
# A decent option
a = 42
b = 32
print("The value of a is {}, while the value of b is {}".format(a, b))

The value of a is 42, while the value of b is 32


In [16]:
# Only for Python 3.6+ - but it's pretty cool. It's called f-strings
a = 42
b = 32
print(f"The value of a is {a}, while the value of b is {b}")

The value of a is 42, while the value of b is 32


In [17]:
# Another example
a = 42
b = 32
print(f"The value of a is {a:.2f}, while the value of b is not {b + 1}.")
# You can write any expression you'd like inside the curly brackets.

The value of a is 42.00, while the value of b is not 33.


Throughout the course you'll see me using mostly the f-string option, which is the most readable assuming you only work on Python 3.6+.

## Comprehensions

Comprehensions are a set of fast and delightfully readable methods for creating different types of iterables. 

Assume we wish to create a list with the squared values of the numbers in the range \[0, 10). Instead of the old:

In [18]:
squares = []
for item in range(10):
    squares.append(item ** 2)
    
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


we could use a list comprehension:

In [19]:
squares = [x ** 2 for x in range(10)]

print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


This amazing piece of software iterates over the range with the variable `x`, and places its square inside a list, which is then assigned to the `squares` variable.

What other goodies do comprehensions allow? Filtering.

In [20]:
squares = [
    x ** 2 
    for x in range(10) 
    if x != 8
]

print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 81]


The `if` clause is evaluated for each value of `x`. Notice how similar to English this expression is?

Funnily enough, performance isn't hindered:

In [21]:
%timeit squares = [x ** 2 for x in range(100) if x != 8]

22.8 µs ± 278 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [22]:
%%timeit
squares = []
for item in range(100):
    if item != 8:
        squares.append(item ** 2)

25.6 µs ± 324 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


Another example:

In [23]:
strings = [cons.upper()
           for cons in 'abcdefghij' 
           if cons not in 'aeiou']

print(strings)

['B', 'C', 'D', 'F', 'G', 'H', 'J']


Comprehensions aren't limited to lists! You can also comprehend dictionaries and sets (wrapping a comprehension in parentheses creates a lazy generator expression rather than a tuple). To demonstrate a `dict` comprehension easily we will use the built-in `zip` function:

In [24]:
keys = 'abcde'
vals = [1, 2, 3, 4, 5]
bools = [True, True, False]

# Zip bundles up these iterables together, allowing iteration
for a, b, c in zip(keys, vals, bools):
    print(a, b, c)

a 1 True
b 2 True
c 3 False


With `zip` we can comfortably bundle up the key-value pairs:

In [25]:
keys = 'abcde'
values = [1, 2, 3, 4, 5]
d = {key: value 
     for key, value in zip(keys, values) 
     if key != 'c'}

print(d)

{'a': 1, 'b': 2, 'd': 4, 'e': 5}


If we don't need the "if" part of the comprehension we can even skip the comprehension altogether and pass the zipped pairs directly to `dict`:

In [26]:
d = dict(zip(keys, values))
print(d)

{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}


The same principle applies for set comprehensions:

In [27]:
se = {number % 7 for number in range(100)}

print(se)

{0, 1, 2, 3, 4, 5, 6}
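
Parentheses are a special case worth a quick sketch: a comprehension wrapped in parentheses produces a generator expression rather than a tuple, so building a tuple takes an explicit `tuple(...)` call. The variable names below are only illustrative:

```python
# Parentheses give a generator expression, which is evaluated lazily
lazy_squares = (x ** 2 for x in range(5))
print(type(lazy_squares).__name__)

# An explicit call to tuple() consumes the generator and builds the tuple
squares_tuple = tuple(x ** 2 for x in range(5))
print(squares_tuple)
```

This should print `generator` followed by `(0, 1, 4, 9, 16)`.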


Lastly, list comprehensions can iterate over more than one iterable:

In [28]:
summed_list = [outer + inner 
               for outer in range(10) 
               for inner in range(10, 20)]

This is identical to:

In [29]:
summed_list = []
for outer in range(10):
    for inner in range(10, 20):
        summed_list.append(outer + inner)

Once you get used to comprehensions, they quickly become the most "natural" way to iterate. Their brevity on the one hand, and readability on the other, are their most important features. That's why comprehensions are strongly encouraged when writing truly "Pythonic" code.

## Function Default Arguments and Types

One thing we didn't show in our introduction to functions was the default argument feature of function definitions, as well as the optional typing syntax. We'll start by showing what default arguments are.

Say you have a function which calculates the `n`th power of any number it's given. It obviously has to be given two inputs - the number and the power to which it will be raised. For example:

In [30]:
def calculate_power(x, power):
    """ Raises *x* to the power of *power* """
    return x ** power

This function works wonderfully, but we're assuming that most of the time users will raise `x` to the power of two. If we want to save the hassle for the caller, we can use default arguments when defining the function:

In [31]:
def calculate_power(x, power=2):
    """ Raises *x* to the power of *power* """
    return x ** power

We can now call this function either with the power argument or without it, which will default it to 2:

In [32]:
print(calculate_power(10))
print(calculate_power(10, 3))

100
1000
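
Arguments that have defaults can also be passed by name, which tends to make the call site easier to read. A small sketch, repeating the definition so the snippet runs on its own:

```python
def calculate_power(x, power=2):
    """ Raises *x* to the power of *power* """
    return x ** power


# The default is overridden by naming the argument explicitly
print(calculate_power(10, power=3))

# When every argument is named, the order no longer matters
print(calculate_power(power=0.5, x=16))
```

The two calls should print `1000` and `4.0`.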


### Type hinting

Another interesting feature of Python (3.6+) is its optional typing capabilities. These allow us coders to 'hint' at the intended types of certain variables. We say 'hint' since these annotations have no effect at runtime, i.e. a variable can be marked as a string but later receive an integer value without the Python interpreter caring. For functions, Python added a special "arrow" sign, `->`, that marks the return type. Let's see how it works:

In [33]:
def typed_add(a: int, b: int = 0) -> int:  # combination of type hinting and default arguments
    """Adds the two inputs together"""
    return a + b

In [34]:
a: float = 0.25
b: str = 'a'

a = True  # Python doesn't care that we've overridden the type hint

Type hinting is used for documentation purposes mostly. Later in the course we'll discuss its more advanced features by utilizing the `mypy` package.
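
To make the "no effect during runtime" point concrete, here is a minimal sketch that repeats `typed_add` and then deliberately calls it with the "wrong" types. Nothing beyond the standard interpreter is assumed:

```python
def typed_add(a: int, b: int = 0) -> int:
    """Adds the two inputs together"""
    return a + b


# The hints are stored on the function object, for tools and readers to inspect
print(typed_add.__annotations__)

# Nothing checks them at runtime: strings are accepted and simply concatenated
print(typed_add("type ", "hints"))
```

The second call runs without complaint and prints `type hints`, which is exactly the kind of mismatch an external type checker is meant to flag.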

## File Input/Output

Yet another error-prone area in applications is their I/O (input-output) module. Interfacing with objects outside the scope of your own project should always be handled carefully. You never know what's really out there.

Assume we wish to write some data to a file - a list filled with counts of some sort, for example.

To write to (or read from) a file, you have to perform several operations:
1. Define the file path and name.
2. Open the file with the appropriate mode - read, write, etc.
3. Flush out the data.
4. Close the file.

Here's a mediocre example of how it's done:

In [35]:
data_to_write = 'A B C D E F'
filename = 'data.txt'
file = open(filename, 'w')  # w is write, 'open' is a built-in function
file.write(data_to_write)
file.close()

The variable `file` is a file object, and it has many useful methods, such as:
* `.read()` - reads the entire file.
* `.readline()` - reads a single line.
* `.readlines()` - reads the entire file into a list of strings, one per line.
* `.seek(offset)` - goes to the `offset` position in the file.
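
The methods listed above can be sketched on a small scratch file. The file name `lines.txt` and the temporary folder are assumptions made for the example, so that no real file is overwritten:

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()  # scratch folder, removed at the end
filename = os.path.join(folder, "lines.txt")  # hypothetical file name

file = open(filename, "w")
file.write("first\nsecond\nthird\n")
file.close()

file = open(filename)         # the default mode is 'r' (read text)
whole = file.read()           # the entire content as a single string
file.seek(0)                  # jump back to the beginning of the file
first_line = file.readline()  # a single line, newline included
rest = file.readlines()       # the remaining lines, as a list of strings
file.close()

print(repr(whole))
print(repr(first_line))
print(rest)

shutil.rmtree(folder)  # clean up the scratch folder
```

Note that `.readlines()` only returns what is left after the current position, which is why `.seek(0)` is needed to start over after `.read()`.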

File objects in Python can be opened as text files (the default) or as binary files (by adding `'b'` to the mode, e.g. `open(filename, 'rb')`), in which case their content will be interpreted as bytes rather than text.
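
A short sketch of binary mode, assuming a scratch file named `data.bin` in a temporary folder. The `'b'` flag is combined with a regular mode letter, giving `'wb'` for writing and `'rb'` for reading:

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()  # scratch folder, removed at the end
filename = os.path.join(folder, "data.bin")  # hypothetical file name

file = open(filename, "wb")      # write + binary
file.write(bytes([65, 66, 67]))  # the byte values of 'A', 'B' and 'C'
file.close()

file = open(filename, "rb")      # read + binary
raw = file.read()                # a bytes object, not a str
file.close()

print(raw)
print(raw.decode())  # bytes become text only once they are decoded

shutil.rmtree(folder)
```

This should print `b'ABC'` and then `ABC`.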

When dealing with files, we generally first `open()` them, `read()` / `write()` something, and `close()` them. The real issue stems from the fact that these steps are very error-prone. For example, you can open a file to write something to it, but while the file is open someone else (or some other Python process) can modify or even delete the file.

Another example - some connection error might occur after you've flushed the data into the file, but before you managed to close it, leading to a file that can't be accessed by the operating system.

Luckily, Python is here to help, and its main method of doing so is context managers, invoked with the `with` keyword. Context managers are awesome, and I'll only briefly describe their capabilities. That being said, they shine the most when doing I/O, like in the following example:

In [36]:
data_to_write = 'A B C D E F'
filename = 'data.txt'
with open(filename, 'w') as file:
    file.write(data_to_write)
    file.write('abc')

The unique thing here is that once we've opened the file, the `with` block guarantees that the file will be closed, regardless of what code is executed. 

Even if an error occurs while the file is open - the context manager will ensure proper handling of the file and prevent our data from disappearing into the void of the file system.
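
This guarantee can be checked with a small experiment: raise an error on purpose inside the `with` block and inspect the file afterwards. The scratch folder and the `RuntimeError` are assumptions made for the demonstration:

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()  # scratch folder, removed at the end
filename = os.path.join(folder, "data.txt")

try:
    with open(filename, "w") as file:
        file.write("A B C D E F")
        # Simulate a failure while the file is still open
        raise RuntimeError("something went wrong mid-write")
except RuntimeError as error:
    print(f"Caught: {error}")

# The context manager closed the file on the way out of the block
print(file.closed)

# What was written before the error was flushed to disk when the file closed
with open(filename) as file:
    content = file.read()
print(content)

shutil.rmtree(folder)
```

The file reports `closed` as `True`, and the text written before the error is still there when the file is reopened.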

## The Python Stack

### How to Run Python?

#### How does MATLAB do it?

MATLAB has its excellent application GUI which essentially everyone uses all the time.

But all interpreted programming languages, including Python and MATLAB, can be run from the command line. 

In the case of MATLAB, though, rarely do you see people running it from the command line:

![MATLAB from the CL](matlab_cl.png)

It's obviously possible, but less comfortable than the standard GUI we all know. If all you wish to do is run a MATLAB `.m` file, you can also do it from the command line by simply writing `matlab -r myfile` (the script's name, without the `.m` extension).

The MATLAB application we're familiar with combines a few sub-applications:
1. Text editor
2. Debugger and variable explorer
3. Command prompt (REPL, a place to write and immediately evaluate MATLAB expressions)
4. MATLAB's engine or interpreter  (the program that actually does the job of reading the source files and 'computing' them)

Python is very similar. Just like MATLAB, the quickest option to run Python is from the command line, by simply writing `python`:

![Python CL](python_cl.png)

Running Python scripts, like `myfile.py`, is as easy as `python myfile.py`.

### Integrated Development Environment (IDE)


More often than not, we wish to write a script, experiment with it a little, and then run it, just as some of us are used to doing in the MATLAB environment. Python, being non-proprietary, has several such solutions. Generally speaking, software of this type is referred to as an [Integrated Development Environment (IDE)](https://en.wikipedia.org/wiki/Integrated_development_environment), and is a basic tool every programmer uses, with the choice varying greatly depending on the languages used as well as personal preference.

Some popular options for Python are:

#### Spyder

Spyder is an open-source, science-oriented IDE that was designed with MATLAB's GUI in mind. It contains many similar functions and might look very similar at a quick glance:

![Spyder IDE](spyder.png)

Unfortunately, Spyder's main financial support was cut off in November 2017. It's still in active development, but at a much slower pace, and its future is unclear at the moment.

#### PyCharm
PyCharm is a full-blown IDE which contains many advanced features that any modern IDE has, like refactoring capabilities, testing suites and more. It has a free community edition, and a paid professional edition - which is actually free for poor students like us:

![PyCharm IDE](pycharm.jpg)

#### Visual Studio Code
VSCode is a free, open-source editor for nearly all existing programming languages. It's relatively lightweight, and relies mostly on its community-driven extensions marketplace. There's an extension for just about anything you might want to do with a code editor, and both the editor itself and its popular extensions are well maintained and in active development.

![VS Code](vs_code.gif)

### Jupyter Notebook
While not technically an IDE, Jupyter is designed with data exploration in mind. It's less suited for writing long, complex applications, but great when it comes to a quick "plot-n-go" on some data you recently acquired.

## Version Control

### Introduction

Version control is the active management of the history of your source code. It is an essential part of every developer's work cycle, for both small and large projects.

With version control, at any point in time during the work on your code you can decide to "commit" the change. Committing your code means that the system will remember the current state of your work (all files in a folder), and will allow you to return to this exact state of your codebase whenever you wish.

![Final.doc, from PhD Comics](vcs_final.png)

Using version control is orthogonal to the traditional save operation. Saving records the current state of your codebase, but usually doesn't allow you to "go back in time" to previous versions.

This property is useful on many occasions. For example, say you have a working version of some function, but you wish to make it better - add a feature, or change its internal structure (refactoring). Version control allows you to record this point in time - when you have a good, functioning function - and change the working copy of the function however you'd like. If you fail to refactor the function you can simply jump back to the latest working version.

Another important version control use case is collaborative work. Version control systems (VCS) can help you communicate changes to the codebase between developers, without having to somehow transfer updated versions of files from one person to the other.

```{margin}
[![Linus Torvalds](https://upload.wikimedia.org/wikipedia/commons/thumb/0/01/LinuxCon_Europe_Linus_Torvalds_03_%28cropped%29.jpg/220px-LinuxCon_Europe_Linus_Torvalds_03_%28cropped%29.jpg)](https://en.wikipedia.org/wiki/Linus_Torvalds)
```

There are many version control applications, but the most popular one is Git, developed by [Linus Torvalds](https://en.wikipedia.org/wiki/Linus_Torvalds)
(the Linux guy) in 2005. In this course we'll be using Git with GitHub.com - an online backup site for your code.

There are numerous great Git and GitHub tutorials ([here's one](https://github.com/pluralsight/git-internals-pdf)), but for our course we'll be using only the most basic features of Git and GitHub, so going through all features and intricacies of the software is unnecessary. {ref}`general-setup` and {ref}`create-git-repo` contain instructions on how to set up Git and GitHub, which should be enough for homework submission.

### Git Fundamentals (Using VSCode and GitHub)

For the purposes of this walkthrough we'll assume we have some code inside a *my_project* directory, looking something like:
- my_project/
    - \_\_init\_\_.py
    - code.py
    - requirements.txt
    - README.md

Git can be run from the command line, but in this quick tutorial we'll be using the built-in Git interface in VSCode, which should be good enough for 99% of your Git needs. 

To start a new Git repository and host it on GitHub:

1. **Open the project**: Open VSCode, click *File --> Open Folder...* (or Ctrl+K Ctrl+O) and select the *my_project* directory. ![VSC My Project](vsc_my_project.png) <br />
2. **Initialize Git**: Click the <img src="vsc_vcs.png" alt="VSC Version Control" width="30" style="display:inline"></img> icon (or Ctrl+Shift+G G), and then "Initialize Repository".<br />![VSC Git init](vsc_git_init.png)<br />
3. **Stage**: Next, we need to tell Git which files to track, which is called "staging" the files. Currently, we see our files and folders under the "Changes" headline, which means that these files are new to Git (since its previous snapshot of our repository is really a "clean slate" without any files).<br />![VSC VCS Changes](vsc_git_changes.png)<br />There's a <span style="color:green">**U**</span> next to each file signifying that it's "<span style="color:green">**U**</span>ntracked", i.e. Git has no prior records of this file. To start tracking the files, we can click the "+" button appearing next to the <span style="color:green">**U**</span> when we hover over the file. ![VSC Git Stage](vsc_git_stage.png)<br />Alternatively, if we just want to add all files to the Git archive, we can click the "+" on the "Changes" line (again, visible when we hover with the mouse on it). Go ahead and stage all files. You should see them under a new headline: "Staged Changes", and showing an <span style="color:green">**A**</span> for "<span style="color:green">**A**</span>dded". ![VSC Git Staged](vsc_git_staged.png) Now Git knows what's inside the `my_project` folder, all files are tracked, but it hasn't captured a snapshot of the repository status quite yet. To do that, we'll need to "commit" the current state of the repository (repo). <br /><br />
4. **Commit**: Committing essentially means creating a snapshot of our codebase which we can name, label, describe, and go back to. We always commit changes along with a concise message. The message goes in the line above "Staged Changes" - write "Initial commit" (it's good enough for the purposes of the tutorial). ![VSC Initial Commit](vsc_first_commit.png) Now we can commit by clicking the ✓ checkmark in the line above (or Ctrl+Enter while in the commit message textbox). So far, we merely created a local repo on our computer. If the whole folder is erased, our backups and commit history go away with it as well. <br /><br />
5. **Push**: To make an online backup of a repository, we need to first set a "remote" (i.e. a remote location containing a copy of the repository). If you don't already have a [GitHub](https://www.github.com) account, create one, and then create a new repository: ![GitHub New Repo](github_new_repo.png) Give the repo a descriptive name and create it. Now that you have a remote location to use (should be: https:\/\/github.com\/\<your-user-name\>/\<repo-name\>), head back to VSCode and in the "Source Control" options select "Add remote": ![VSC Add Remote](vsc_add_remote.png) and finally we are ready to "push" our changes: ![VSC Git Push](vsc_git_push.png)

Congratulations! You've published your first code repository! Whether you find it trivial or confusing, you've made a huge step towards creating software that will not only work better for you, but also for others.

## The Module System

### Namespaces

One of the largest differences between Python and MATLAB is the concept of namespaces. 

Namespaces provide a system to avoid ambiguity. We have many Hagais, but far fewer Hagai Har-Gils. In computers, we can have many files with the name `test1.py` saved in _different folders_ of our hard drive. In contrast - we can only have one `test1.py` per folder, since the file system dislikes these collisions.

Similarly, in Python we have to declare which namespace our function belongs to. Usually functions aren't included in the scope of the program until we `import` them. After importing we can use the function with its "pathname", i.e. the module it belongs to. This way, the same identifier (function name, class name, etc.) can be used multiple times in different modules.

You might have realised that you're already familiar with this topic from your previous usage of functions. The following example should be clear:

In [37]:
a = 1
def f(a):
    """ Scopes and namespaces exemplified """
    a = 2
    print(f"Inside the function, a={a}")
f(a)
print(f"But outside of it, a={a}")

Inside the function, a=2
But outside of it, a=1


Namespaces for modules work the same way. A Python module is an object with variables, classes and functions in it. A module can be a single file, or a folder containing files and other sub-folders (usually called a package). A module is brought into scope (into our namespace) with the `import` statement:

In [38]:
# The objects "math" or "pi" do not exist here yet.
import math
math.pi

3.141592653589793

`pi` is only defined in the context of the `math` module. Without the import statement there's no special meaning attached to either `math` or `pi`, and they can be used as normal variables.
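
Because names live inside their modules, two modules can offer a function with the same name without clashing. A brief sketch using the standard library's `math` and `cmath` modules, both of which define `sqrt`:

```python
import cmath
import math

# Same identifier, two namespaces: the module prefix says which one we mean
real_root = math.sqrt(4)        # works on real numbers
complex_root = cmath.sqrt(-4)   # works on complex numbers

print(real_root)
print(complex_root)
```

This should print `2.0` and `2j`. Calling `math.sqrt(-4)` instead would raise a `ValueError`, so the prefix is not just decoration.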

In [39]:
import os
os.sep

'/'

When we try to use a function, class or constant from a package (module) we didn't import, we'll receive a `NameError`. That's also the exception raised when we try to use a variable that didn't exist beforehand. This is one of the reasons we always keep our imports at the top of the file.

In [40]:
cos  # we probably want math.cos, but we didn't import it, so we got a NameError

NameError: name 'cos' is not defined

### The Default Python Namespace

Python comes with a number of built-in functions and reserved keywords:

![Python Built-in Functions](python_builtin.png)
<p style="text-align: center; font-style: italic;">
    <a href="https://docs.python.org/3/library/functions.html">
        Built-in functions
    </a>
    in Python.
</p>

![Python Keywords](python_keyword.png)
<p style="text-align: center; font-style: italic;">
    <a href="https://docs.python.org/3.8/reference/lexical_analysis.html#keywords">
        Reserved keywords
    </a>
    in Python.
</p>

The number of functions and reserved keywords inside the default Python namespace is extremely limited on purpose, definitely in comparison to MATLAB's vast array of default functions - the true power of Python comes from its ecosystem. It's one of the largest and most comprehensive around, certainly for a scripting language, and allows you to do basically whatever you want with minimum effort.

The standard library of Python includes the packages that come with every installation of Python. I didn't have to do anything special to import the `math` module - it was just there, "waiting" for me to import it. 

Most of the functions you'd expect a programming language to have are indeed included in the standard library, available automatically to everyone who downloaded Python. Other modules, including many popular ones, are available online. We'll discuss later how to import them.

I'll let you discover by yourself what is included in the standard library, but some of the highlights include:

In [41]:
import pathlib
p = pathlib.Path(".")
print(p.absolute())

/home/flavus/Projects/textbook/textbook/source/classes/class_1


In [42]:
import urllib.request
url = urllib.request.urlopen("https://www.google.com")
print(url.read())

b'<!doctype html><html dir="rtl" itemscope="" itemtype="http://schema.org/WebPage" lang="iw"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"> ... (output truncated)

It can be quite irritating to write the full namespace address of all functions. To this end we have the `from` and `as` keywords:

In [43]:
from math import cos as cosine
cosine(1.)

0.5403023058681398

Enumerations are great, use this opportunity to familiarize yourself with them:

In [45]:
from enum import Enum
from pprint import pprint

class Day(Enum):
    SUNDAY = 1
    MONDAY = 2
    TUESDAY = 3
    WEDNESDAY = 4
    THURSDAY = 5
    FRIDAY = 6
    SATURDAY = 7

d = Day.MONDAY

print(d)
pprint(d)

Day.MONDAY
<Day.MONDAY: 2>
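
A few more things worth trying with enumerations, as a minimal sketch. The class is repeated in a shortened form so the snippet runs on its own:

```python
from enum import Enum


class Day(Enum):
    SUNDAY = 1
    MONDAY = 2
    TUESDAY = 3


d = Day.MONDAY

# Every member carries a name and a value
print(d.name, d.value)

# Members can be looked up by value or by name
print(Day(3))
print(Day["SUNDAY"])

# Members are singletons, so identity comparison is safe
print(d is Day.MONDAY)

# The class itself is iterable, in definition order
print([day.name for day in Day])
```

Looking up a value that has no member, such as `Day(42)`, raises a `ValueError`, which is a large part of why enumerations are safer than bare integers or strings.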


In [None]:
from random import randint, randrange

In [None]:
# Instead (or on top) of from:
import multiprocessing.pool as mpool

In [None]:
# A non-recommended version of importing is the star import
from math import *
print(pi)
print(e)
print(cos(pi))

The list of all standard library modules in Python 3 is [here](https://docs.python.org/3/library/index.html). However, it's just a drop in the ocean that is the Python ecosystem. 

## `pip` and External Modules

As noted before, Python's gigantic ecosystem is one of its biggest strengths. And the fact that you can easily install many of these packages with a simple command-line instruction is even more important. The standard tool to do that is `pip`, a recursive acronym that stands for "`pip` installs packages".

`pip` itself is a Python program, __but it's not run from inside the Python interpreter.__ Instead it's run from a shell - the Windows command line (or PowerShell), for example, or the Mac's terminal - making it an external application to Python. Happily enough, basically all Python distributions come with `pip` pre-installed, so you don't have to install it yourself.

To install a library with pip, open a command line (in VSCode you can simply press Ctrl + ~) and type `pip install package_name`. 
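
Back inside Python, one way to check whether a package is already importable, before reaching for `pip`, is `importlib.util.find_spec` from the standard library. A minimal sketch; the helper `is_installed` and the second package name are hypothetical:

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Returns True if *module_name* can be imported in this environment."""
    # find_spec returns None when no top-level module with that name is found
    return find_spec(module_name) is not None


print(is_installed("math"))
print(is_installed("surely_not_a_real_package_123"))
```

The first call should report `True`, since `math` ships with Python, and the second `False`, assuming nobody has installed a package with that unlikely name.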

`pip`, like any package manager, has two main jobs. The first is to provide a convenient API to a package repository. In essence, it's a download manager for a single site - the [Python Package Index](https://pypi.python.org/pypi) (PyPI, pronounced **pai-pee-eye**), the official Python repository for packages.

PyPI holds installation files for the packages hosted in it, along with some metadata, like version number and dependencies. `pip` (and other tools we'll discuss soon) downloads the wanted package from PyPI, together with its dependencies, and installs them in a pre-defined location on our computer.

`pip`'s second important job is handling dependencies. Many packages rely on other packages, which in turn rely on other, more basic packages, finally leading to the basic Python interpreter. `pip` has to make sure to install all dependencies of the package you're currently after, and to avoid any collisions with other installed packages. For example, a common problem in the Python world is the Python 2-3 schism, which means that packages written for Python 2 can't run on a Python 3 interpreter, and vice versa. The package manager's job is to grant you the right version of the package you're looking for.

### Comparison with MATLAB's approach

We should take a minute here to contrast Python's approach with that of MATLAB. In MATLAB, once we've added a directory to the path using `pathtool`, each file in that directory is directly in the MATLAB namespace. This means that we don't have to `import` anything - adding something to `pathtool` is essentially `import`ing the entire folder into the general namespace, which is the only namespace in the MATLAB world. This is a pretty straightforward approach, but it's also one that almost no other programming language, especially a modern one, uses. This is because cluttering the one and only namespace is a bad idea, since you can quite easily overwrite names of functions from files with names of variables you use, and you won't even know it. Moreover, you don't need **all** functions around **all** the time. Each file and project will usually need a few different functions from a couple of toolboxes, and that's it.

In Python we have another layer between our code and the `import`-able functions. Inside our code we can only use built-in functions (`list()`, `int()`, `print()`, etc.) and functions that we explicitly imported, which will vary between files and projects. In addition, we can only call `import` on files and packages which are in our Python's path. We'll discuss Python's equivalent of `pathtool` in the next class, but for now you should see the separation Python forces on us between the specific file's namespace and the general Python namespace (which packages we can import at all).

## File Structure

`import` statements are useful for more than just importing code - they're also our way of arranging our project's files. Here's the standard arrangement of files I'd like you to use throughout the course:

The base folder can contain many other files, including sample data, for example. The point here is that your actual code is confined to the `project_name` folder, which has an empty `__init__.py`. This file allows Python to import user-defined objects from that folder.

Thus, if you wish to use a function you defined in `class_1.py` in `class_2.py`, you should write inside `class_2.py` the following statement: `from class_1 import my_func` (note that the module is named without its `.py` extension).
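
The following sketch imitates that situation end to end. It builds a throwaway `project_name` folder in a temporary directory, writes two tiny files into it, and lets one import from the other. The file contents and the returned string are assumptions made purely for the demonstration:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    # Build a hypothetical project folder with two modules in it
    package = Path(base) / "project_name"
    package.mkdir()
    (package / "__init__.py").write_text("")
    (package / "class_1.py").write_text(
        "def my_func():\n    return 'hello from class_1'\n"
    )
    (package / "class_2.py").write_text(
        "from class_1 import my_func\n\nresult = my_func()\n"
    )

    # Running `python class_2.py` puts the script's folder on the import
    # path automatically; here that step is imitated by hand.
    sys.path.insert(0, str(package))
    try:
        class_2 = importlib.import_module("class_2")
        result = class_2.result
    finally:
        # Undo the changes so the temporary folder leaves no trace
        sys.path.remove(str(package))
        sys.modules.pop("class_1", None)
        sys.modules.pop("class_2", None)

print(result)
```

The module is referred to as `class_1`, never `class_1.py`: `import` works with module names, and the file extension is not part of the name.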

## Scripts and Functions 
*Code examples in `import_demonstration` folder*

If you're familiar with MATLAB there's a good chance that you've written a script before. A script is a file which is run sequentially, while using other functions and definitions. Python supports scripts as well, as can be seen in `main.py`.

However, in the Python world people usually prefer to stay away from scripts. This is due to a number of reasons, the most important one being that running a `main()` function is as easy as running a `main.py` script. You can see examples of a procedural and script-like approach in the `main.py` file, but keep in mind that the script version is discouraged.

If you wish to run a file full of functions from the command line or from your IDE, you should include the following lines:
```python
if __name__ == '__main__':
    run_main()
```

Every Python file which is being run has a `__name__`. If the file was run directly by the Python interpreter its `__name__` will be `'__main__'`, whereas if it was imported from another file its `__name__` will be the module's own name. This `if` statement basically tells the Python interpreter "Start from here", and is the conventional way to run Python procedures.
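
A minimal sketch of how `__name__` behaves, using the standard `math` module as the "imported" side. The function `run_main` is a hypothetical stand-in for a real entry point:

```python
import math


def run_main():
    """Stand-in for the real entry point of a program."""
    return "main procedure finished"


# An imported module knows itself by its module name...
print(math.__name__)

# ...while the file handed directly to the interpreter is named '__main__',
# so this block only runs when the file is executed, not when it is imported
if __name__ == "__main__":
    print(run_main())
```

Saved to a file and run with `python`, this prints `math` followed by the message from `run_main()`. Imported from another file, only the first line is printed.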

In this course you're highly encouraged to divide your code into many small functions and methods in well-defined compact classes. Each method should have a single purpose, documented in its docstring. Each class should have a logical structure that envelopes its methods and attributes in a sensible way.

Beware of God classes, or God scripts and functions. These are monolithic objects that encompass the entirety of your application, and are very hard to reason about. Simply create more files, each with a descriptive name and a bunch of related functions, and import these files into your other files and into the main file.

Another important reason to partition our code into many small bits is *unit testing* which we'll cover later on in the course.

Importing code from one file to the other isn't as easy as you'd like it to be, especially since each language deals with this issue in a different way. It's completely expected that you won't be able to 'nail' the first couple of import statements you write. Keep trying, verify that your code is in the right directory structure, and don't be afraid to ask friends, Google or me.