Creating and Publishing a Python Library#
Early in the course we showed how easy it is to download packages from PyPI (Python Package Index), the official repository for Python packages. This class will discuss the process of creating and publishing a library of your own, which is a bit more complicated than downloading a package, but not by much.
Naive Solution#
The simplest way to share our code is to send a `.py` file to someone by mail. If the receiver has a Python interpreter installed, they can just run the file by writing `python new_file.py` in their command line. This has a few shortcomings:
The script was written in a specific Python version. You might have to make sure that your receiver has the same Python version.
If it’s more than one script, you have to bundle scripts together and make sure they remain in the same directory when run.
Nearly all scripts have dependencies. There’s no way to verify that the receiver truly has all the dependencies and is using the same version of dependencies that you’re using.
If the original script uses code not written in Python, like Cython code or other C/C++ modules, you need to build these modules in place - you can’t rely on them working on the target computer.
These issues, along with the relative ease with which one can publish Python code online, call for a more robust approach to code sharing.
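To see just how clunky the manual approach gets, here is a sketch of a guard one might paste at the top of a shared script to address the version and dependency shortcomings. The required version and module names below are placeholders, not part of any real project:

```python
import importlib.util
import sys


# Hypothetical guard for a shared script: fail fast when the interpreter is
# too old or a dependency is missing, instead of crashing mid-run with a
# confusing traceback. The defaults below are placeholders.
def check_environment(required_python=(3, 6), required_modules=("json",)):
    problems = []
    if sys.version_info < required_python:
        problems.append(
            f"Python {required_python[0]}.{required_python[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    for module in required_modules:
        if importlib.util.find_spec(module) is None:
            problems.append(f"missing dependency: {module}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        sys.exit(problem)
```

Note that this only reports problems - it says nothing about dependency versions, bundled files, or compiled extensions, which is exactly the gap packaging tools fill.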
Python Libraries from the Ground Up#
There are a few “layers” for Python projects and libraries. The basic layers one can encounter in each Python (and software) project are source-code organization and version control. This tutorial will guide you through the process of creating a full-blown project, including PyPI publishing and test-suite support. If you’re aiming at something lighter you can just drop some of the components used here.
Features of this project include:
GitHub integration
Continuous Integration via GitHub Actions
Semantic versioning support
Easy PyPI uploading
Automatic documentation generation
Many of these features can be generated using `cookiecutter`, but we’ll avoid that for now. Once we understand what’s going on under the hood we might prefer using `cookiecutter` instead of going through these manual steps. This guide is heavily based on a famous blog post.
Scaffolding#
I’ll be assuming that we’re creating a new project, but these steps can easily be adapted if you wish to convert an existing project.
- Make a new directory with the project’s name. We’ll use “Parse Stuff” as our project name here, so this directory may be called `ParseStuff`, `parseStuff`, `parse_stuff` or really any other name.
- Open it in VSCode using “Open Folder…”.
- Generate several basic files that we’ll fill in later:
  - Create `README.md` and write inside it a single line describing the project.
  - Create an empty `CHANGELOG.md` file.
  - Create `.gitignore` and copy-paste the content from here.
  - Create a `src` folder, and inside it create a folder with the name of your project in snake case. In our case, the folder is `src/parse_stuff`. Inside that folder create an `__init__.py` file.
  - Create a file named `LICENSE` (no suffix needed) and copy the content from here. This is the MIT license, which gives other people permission to do basically whatever they want with your code. Make sure to replace the “year” and “name” placeholders.
  - Create a folder named `tests` and inside it create an empty `__init__.py` file.
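The file-creation steps above can also be sketched as a short script. This is just an illustration - the placeholder contents written below must be replaced with the real README line, `.gitignore` rules and MIT license text:

```python
from pathlib import Path


# A sketch of the scaffolding steps above. The file contents written here
# are placeholders - paste the real README line, .gitignore rules and MIT
# license text yourself.
def scaffold(base: Path, package_name: str) -> None:
    base.mkdir(parents=True, exist_ok=True)
    (base / "README.md").write_text("A single line describing the project.\n")
    (base / "CHANGELOG.md").write_text("")
    (base / ".gitignore").write_text("# paste the standard Python .gitignore here\n")
    (base / "LICENSE").write_text("MIT License (replace year and name placeholders)\n")
    package = base / "src" / package_name
    package.mkdir(parents=True)
    (package / "__init__.py").touch()
    tests = base / "tests"
    tests.mkdir()
    (tests / "__init__.py").touch()
```

Running `scaffold(Path("ParseStuff"), "parse_stuff")` would recreate the layout described above.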
- Now that we have a couple of basic files in our hands, create a new git repository in that folder using VSCode’s “Initialize Repository” button.
- In the same git menu, stage all files (using the “+” button), write a commit message (“Initial commit” will do) and commit by clicking the check mark.
- Go to your GitHub account and create a new repository with a name matching the name of the folder on your system (i.e. `ParseStuff`). You don’t need a license or a `.gitignore` file; an empty repository will do just fine.
- Copy the link to the repo (the one ending with `.git`) and go back to VSCode. Press Ctrl+Shift+P and write “Git: Add Remote”. Paste the link, insert “origin” as the remote name and confirm the other dialogue boxes.
- If everything worked, you should be able to push by clicking the “…” icon and choosing the “Push” option. Make sure your code was indeed uploaded to the repo.
Environment and dependencies#
We have a folder, but the project isn’t really a project quite yet, and we also haven’t done anything new so far. Let’s do some Python-related work:
- Create a new environment and activate it: `conda create -n parsestuff python=3.8`. Once it’s done, write `conda activate parsestuff`.
- Install several key dependencies. You can add to this list any dependencies you know you’ll need for the actual code: `pip install build twine black flake8 pytest`
- Add configuration files. These files configure the behavior of the tools we just installed, so it’s OK if you don’t completely understand every word written here.
Generate a file called `pyproject.toml` and write inside the following:

```toml
[tool.black]
line-length = 88
target-version = ['py36', 'py37', 'py38']
include = '\.pyi?$'
exclude = '''
/(
    \.eggs
  | \.git
  | \.hg
  | \.mypy_cache
  | \.tox
  | \.venv
  | _build
  | buck-out
  | build
  | dist
)/
'''

[build-system]
requires = ["setuptools>=41.0", "wheel"]
build-backend = "setuptools.build_meta"
```
Generate a file called `.flake8` and write inside the following:

```ini
[flake8]
per-file-ignores =
    */__init__.py: F401
```
Generate a file called `MANIFEST.in` and write inside the following:

```
include LICENSE *.rst *.toml *.yml *.yaml

# Tests
recursive-include tests *.py

# Documentation
recursive-include docs *.png
recursive-include docs *.svg
recursive-include docs *.py
recursive-include docs *.rst
prune docs/_build
```
Generate a file called `requirements.txt` and write inside the following:

```
--index-url https://pypi.python.org/simple/
-e .
```
Generate a file called `.pypirc` in your home directory and write inside the following:

```ini
[distutils]
index-servers =
    pypi
    test

[test]
repository = https://test.pypi.org/legacy/
username = <your test user name goes here>

[pypi]
username = __token__
```
Write `pip install mkdocs` in your console, followed by `mkdocs new .`. This should generate an `mkdocs.yml` file and a `docs` folder with an `index.md` in it. At the end of the `mkdocs.yml` file you should add `theme: readthedocs`. We’ll leave everything empty right now, and we’ll come back to this later.
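After adding the theme line, a minimal `mkdocs.yml` might look roughly like this - a sketch, with the site name and navigation entries being whatever your project uses:

```yaml
site_name: Parse Stuff
nav:
  - Home: index.md
theme: readthedocs
```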
Add a `setup.cfg` file with the following content. Note that the metadata below is example content from a different project - replace the name, author, description and URLs with your own details (e.g. `parse_stuff`):

```ini
[metadata]
name = time_travel
version = 0.0.1
author = Zvi Baratz
author_email = z.baratz@gmail.com
description = Travel in time using Python!
long_description = file: README.md, CHANGELOG.md, LICENSE
long_description_content_type = text/markdown
keywords = future, timetravel, example
url = https://github.com/ZviBaratz/time_travel
project_urls =
    Bug Tracker = https://github.com/ZviBaratz/time_travel/issues
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.6
install_requires =
    numpy

[options.extras_require]
dev = black; flake8; pytest
dist = build; twine
docs = mkdocs

[options.packages.find]
where = src
```
Add a `setup.py` with the following content:

```python
import setuptools

if __name__ == "__main__":
    setuptools.setup()
```
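Since `setup.cfg` is plain INI, one quick sanity check (not part of the original workflow, just a sketch) is to parse it with the standard library’s `configparser` before building. The example below parses an in-memory string with placeholder metadata; point `parser.read()` at your real `setup.cfg` instead:

```python
import configparser

# setup.cfg is standard INI, so it can be parsed with configparser as a
# quick sanity check before building. The metadata values below are
# placeholders for illustration.
EXAMPLE_CFG = """
[metadata]
name = parse_stuff
version = 0.0.1

[options]
python_requires = >=3.6
"""


def read_metadata(cfg_text: str) -> dict:
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return dict(parser["metadata"])
```

If parsing fails or a key is missing, you’ll find out before `build` does.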
Before we go any further, let’s recap our latest steps. First, we defined configuration files for some of the tools we’ll be using, like `black` and `flake8`. These are definitely not a must, but they’re nice to have. Then we added an eerie-looking `MANIFEST.in` file which contains a bunch of incantations. This file defines what will be packaged with our code besides the actual code, but I won’t go into any further details. Lastly, we added `setup.cfg` and `setup.py` files, which are basically configuration/metadata files that define how our package will be packaged.
Build and publish!#
Before doing more advanced stuff, let’s see if it builds. Building a package means generating a file that can be more easily shared with others, since this file (or files) contains metadata regarding our dependencies, Python version, and more. Every time we build, we first have to remove the `dist` folder into which we’ll build our package. Since this is the first time we’re building, we don’t have this folder yet, but don’t forget to remove it next time. To build, write in the command line (when you’re at the top project directory):

```
python3 -m build
```
If everything was successful, you should see a new `dist` folder with two files in it - a `wheel` file and a `tar.gz` file. These files can be shared and installed with `pip` on other computers, but the package can’t be installed from the web yet. To do that we’ll have to upload them to PyPI. Before doing that, it’s advised that you make a new virtual environment and check that they can be installed.
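As an aside, a wheel is just a zip archive with a conventional layout, so you can peek inside one before uploading it. The sketch below lists the contents of a tiny in-memory stand-in archive, since the real filename under `dist` depends on your build:

```python
import io
import zipfile

# A wheel is just a zip archive with a conventional layout, so its contents
# can be listed before uploading. Here we build a tiny stand-in archive in
# memory; with a real build you would pass the path to dist/<name>.whl.


def list_wheel_contents(wheel_file) -> list:
    with zipfile.ZipFile(wheel_file) as archive:
        return sorted(archive.namelist())


def make_fake_wheel() -> io.BytesIO:
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("parse_stuff/__init__.py", "")
        archive.writestr("parse_stuff-0.0.1.dist-info/METADATA", "Name: parse_stuff")
    buffer.seek(0)
    return buffer
```

Seeing an unexpected file in the listing usually means `MANIFEST.in` or `setup.cfg` needs another look.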
Assuming installation was successful, we can try to upload the package to PyPI. We’ll first upload it to a test (‘staging’) server, and then to the real PyPI. To create a username on the testing server, go here and sign up. Fill in this username in the previously generated `.pypirc` file. Now we’ll use `twine` to upload the package to the staging area:

```
twine upload -r test dist/parse_stuff*
```
If the upload was reported as successful, try to `pip`-install it (again, in a new environment) with the following command:

```
pip install -i https://test.pypi.org/simple/ parse_stuff
```
If it can be imported, and your page online in the testing area looks fine, then we’re ready to upload it to the “real” PyPI. Write:

```
twine upload -r pypi dist/parse_stuff*
```

and witness the wonder. Easy, right? :)
Documentation and ReadTheDocs#
Documenting your project is extremely important. Besides having comments in your code and helpful docstrings, you sometimes also want more high-level material regarding the installation and usage of your package. This info should go in the `docs` folder that was generated earlier. The idea is to build a website from markdown (`.md`) files. Please consult the MkDocs website for more details.
If our project is on PyPI, we also want our documentation to have a website. ReadTheDocs is a website that happily hosts the documentation for your code. It will automatically fetch whatever’s in your `docs` folder and display it in a nice-looking website.
Create an account there and log into it.
Go to your dashboard and Import a repository from the “My Projects” pulldown.
Inside GitHub, select Settings -> Webhooks and turn on the ReadTheDocs hook.
Your docs will update online every time you update them and push the changes to GitHub. They’re probably quite dull at the moment, but you should add more documents to the `docs` folder sooner rather than later.
Continuous Integration#
CI is used to automatically test your code whenever you push something to GitHub. It works by pulling the latest commit from your GitHub into its servers. It then builds an environment identical to the one you have in your computer, installs your package and runs the tests for that package. Lastly it generates a detailed error report if one or more of your tests failed. This section is - again - based on a famous blog post.
The easiest CI tool available today is GitHub Actions. It’s integrated into GitHub and allows you to monitor execution easily. To set it up, go to the “Actions” tab in your GH repo and click “Skip this and set up a workflow yourself”. Delete the template and paste this instead:
```yaml
---
name: CI

on:
  push:
    branches: ["master"]
  pull_request:
    branches: ["master"]

jobs:
  tests:
    name: "Python ${{ matrix.python-version }}"
    runs-on: "ubuntu-latest"
    env:
      USING_COVERAGE: '3.6,3.8'
    strategy:
      matrix:
        python-version: ["3.6", "3.7", "3.8"]
    steps:
      - uses: "actions/checkout@v2"
      - uses: "actions/setup-python@v1"
        with:
          python-version: "${{ matrix.python-version }}"
      - name: "Install dependencies"
        run: |
          set -xe
          python -VV
          python -m site
          python -m pip install --upgrade pip setuptools wheel
          python -m pip install --upgrade pytest
      - name: "Run pytest"
        run: "python -m pytest"
```
To conclude this part, “…click the green ‘Start commit’ button in the top right and make sure you select the ‘Create a new branch for this commit and start a pull request.’ radio button. Give the branch a memorable name (e.g. github-actions) and subsequently click ‘Create pull request’.”
These changes only exist in our web repo for now. We’ll need to pull them in order to see them locally.
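One last note: the workflow runs `python -m pytest`, which has nothing to collect while the `tests` folder only contains an empty `__init__.py`. A minimal test module - say, `tests/test_parse_stuff.py` - could look like the sketch below, where the function under test is an inlined placeholder rather than part of the actual project:

```python
# A sketch of tests/test_parse_stuff.py. In a real project you would import
# from your package ("from parse_stuff import ..."); here a trivial inlined
# function stands in so the example is self-contained.
def add_numbers(a, b):
    return a + b


def test_add_numbers():
    assert add_numbers(2, 3) == 5
```

Once your package has real code, replace the placeholder with imports from `src/parse_stuff` and the CI run becomes meaningful.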