Creating and Publishing a Python Library#

Early in the course we showed how easy it is to download packages from PyPI (Python Package Index), the official repository for Python packages. This class discusses the process of creating and publishing a library to the outside world, which is a bit more complicated than downloading a package, but not by much.

Naive Solution#

The simplest way to share our code is to send a .py file to someone by email. If the recipient has a Python interpreter installed, they can run the file by writing python new_file.py in their command line. This has a few shortcomings:

  1. The script was written in a specific Python version, so you have to make sure the recipient has a compatible version installed.

  2. If it’s more than one script, you have to bundle scripts together and make sure they remain in the same directory when run.

  3. Nearly all scripts have dependencies. There’s no way to verify that the recipient truly has all the dependencies, at the same versions you’re using.

  4. If the original script uses code not written in Python, like Cython or other C/C++ modules, these have to be built on the target machine - you can’t rely on your local build working there.

These issues, along with the relative ease with which one can publish Python code online, support a more robust approach to code sharing.

Python Libraries from the Ground Up#

There are a few “layers” to Python projects and libraries. The basic layers one can encounter in any Python (or software) project are source-code organization and version control. This tutorial will guide you through the process of creating a full-blown project, including PyPI publishing and test-suite support. If you’re aiming at something lighter you can simply drop some of the components used here.

Features of this project include:

  1. GitHub integration

  2. Continuous Integration via GitHub Actions

  3. Semantic versioning support

  4. Easy PyPI uploading

  5. Automatic documentation generation

Many of these features can be generated using cookiecutter but we’ll avoid that for now. Once we understand what’s going on under the hood we might prefer using cookiecutter instead of going through these manual steps. This guide is heavily based on a famous blog post.

Scaffolding#

I’ll be assuming that we’re creating a new project, but these steps can easily be adapted if you wish to convert an existing project.

  1. Make a new directory with the project’s name. We’ll use “Parse Stuff” as our project name here, so this directory may be called ParseStuff, parseStuff, parse_stuff or really any other name.

  2. Open it in VSCode using “Open Folder…”.

  3. Generate several basic files that we’ll fill in later:

    • Create README.md and write inside a single line describing the project.

    • Create an empty CHANGELOG.md file.

    • Create .gitignore and copy-paste the content from here.

    • Create a src folder, and inside it create a folder with the name of your project in snake case. In our case, the folder is src/parse_stuff. Inside that folder create an __init__.py file.

    • Create a file named LICENSE (no suffix needed) and copy the content from here. This is the MIT license which gives other people permission to basically do whatever they want with your code. Make sure to replace the “year” and “name” placeholders.

    • Create a folder named tests and inside it create an empty __init__.py file.

  4. Now that we have a couple of basic files in our hands, create a new git repository in that folder using VSCode’s “Initialize Repository” button.

  5. In the same git menu, add all files (using the “+” button), write a commit message (“Initial commit” will do) and commit by clicking the “V”.

  6. Go to your GitHub account and create a new project with a name matching the name of the folder in your system (e.g. ParseStuff). You don’t need a license or a .gitignore file, an empty project will do just fine.

  7. Copy the link to the repo (the one ending with .git) and go back to VSCode. Press Ctrl+Shift+P and write “Git: Add Remote”. Paste the link, insert “origin” as the repo name and confirm other dialogue boxes.

  8. If everything worked then you should be able to push by clicking the “…” icon and choosing the “Push” option. Make sure your code was indeed uploaded to the repo.
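The scaffolding steps above can equivalently be done from the command line, if you prefer it over VSCode’s UI. A sketch, assuming ParseStuff as the top-level folder name:

```shell
# Create the directory layout described above.
mkdir -p ParseStuff/src/parse_stuff ParseStuff/tests
cd ParseStuff
echo "# Parse Stuff" > README.md            # single line describing the project
touch CHANGELOG.md .gitignore LICENSE       # fill these in as described above
touch src/parse_stuff/__init__.py tests/__init__.py
git init                                    # then stage, commit, add the remote and push
```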

Environment and dependencies#

We have a folder, but it isn’t really a project quite yet - and so far nothing we did was Python-specific. Let’s do some Python-related work:

  1. Create a new environment and activate it: conda create -n parsestuff python=3.8. Once it’s done write conda activate parsestuff.

  2. Install several key dependencies. You can add to this list any dependencies you know you’ll have for the actual code: pip install build twine black flake8 pytest

  3. Add configuration files. These files configure the behavior of the tools we just installed, so it’s OK if you don’t completely understand every word in them.

    • Generate a file called pyproject.toml and write inside the following:

    [tool.black]
    line-length = 88
    target-version = ['py36', 'py37', 'py38']
    include = '\.pyi?$'
    exclude = '''
    /(
        \.eggs
      | \.git
      | \.hg
      | \.mypy_cache
      | \.tox
      | \.venv
      | _build
      | buck-out
      | build
      | dist
    )/
    '''
    [build-system]
    requires = ["setuptools>=41.0", "wheel"]
    build-backend = "setuptools.build_meta"
    
    • Generate a file called .flake8 and write inside the following:

    [flake8]
    per-file-ignores =
        */__init__.py: F401
    
    • Generate a file called MANIFEST.in and write inside the following:

        include LICENSE *.rst *.toml *.yml *.yaml
        # Tests
        recursive-include tests *.py
        # Documentation
        recursive-include docs *.png
        recursive-include docs *.svg
        recursive-include docs *.py
        recursive-include docs *.rst
        prune docs/_build
    
    • Generate a file called requirements.txt and write inside the following:

        --index-url https://pypi.python.org/simple/
        -e .
    
    • Generate a file called .pypirc in your home directory and write inside the following:

        [distutils]
        index-servers=
            pypi
            test
        [test]
        repository = https://test.pypi.org/legacy/
        username = <your test user name goes here>
        [pypi]
        username = __token__
    
    • Write pip install mkdocs in your console, followed by mkdocs new . (note the trailing dot). This should generate an mkdocs.yml file and a docs folder with an index.md file inside. At the end of the mkdocs.yml file, add theme: readthedocs. We’ll leave everything else empty for now and come back to this later.

  4. Add a setup.cfg file with the following content:

    [metadata]
    name = parse_stuff
    version = 0.0.1
    author = Your Name
    author_email = your.email@example.com
    description = Parse stuff with Python!
    long_description = file: README.md, CHANGELOG.md, LICENSE
    long_description_content_type = text/markdown
    keywords = parsing, example
    url = https://github.com/<username>/parse_stuff
    project_urls =
        Bug Tracker = https://github.com/<username>/parse_stuff/issues
    classifiers =
        Programming Language :: Python :: 3
        License :: OSI Approved :: MIT License
        Operating System :: OS Independent

    [options]
    package_dir =
        = src
    packages = find:
    python_requires = >=3.6
    install_requires =
        numpy

    [options.extras_require]
    dev = black; flake8; pytest
    dist = build; twine
    docs = mkdocs

    [options.packages.find]
    where = src
  5. Add a setup.py file with the following content:

    import setuptools

    if __name__ == "__main__":
        setuptools.setup()

Before we go any further, let’s recap our latest steps: first, we defined configuration files for some of the tools we’ll be using, like black and flake8. These are definitely not a must, but they’re nice to have. Then we added an eerie-looking MANIFEST.in file containing a bunch of incantations. This file defines what will be packaged alongside the actual code, but I won’t go into further detail. Lastly, we added setup.cfg and setup.py, configuration/metadata files that define how our package will be built.

Build and publish!#

Before doing anything more advanced, let’s see if the package builds. Building a package means generating a file (or files) that can easily be shared with others, since it contains metadata regarding our dependencies, Python version, and more. Every time we build, we first have to remove the dist folder into which the package is built. Since this is our first build the folder doesn’t exist yet, but don’t forget to remove it next time. To build, write the following in the command line (from the top project directory):

python3 -m build

If everything was successful, you should see a new dist folder with two files in it - a wheel (.whl) file and a .tar.gz file. These files can be shared and installed with pip on other computers, but the package can’t be installed from the web yet. To do that we’ll have to upload them to PyPI. Before doing so, it’s advised to make a new virtual environment and check that they can be installed.
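As a quick sanity check, a small helper like this can list what the build produced (a sketch; the exact filenames depend on your package name and version):

```python
from pathlib import Path


def built_artifacts(dist="dist"):
    """Return the wheel and sdist filenames produced by `python -m build`."""
    return sorted(
        p.name
        for p in Path(dist).iterdir()
        if p.name.endswith((".whl", ".tar.gz"))
    )
```

After a clean build you should get exactly two entries: one .whl and one .tar.gz.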

Assuming installation was successful, we can try to upload the files to PyPI. We’ll first upload to a test (‘staging’) server, and then to the real PyPI. To create a username on the testing server, go here and create a new account. Fill in this username in the previously generated .pypirc file. Now we’ll use twine to upload to the staging area:

twine upload -r test dist/parse_stuff*

If the upload was reported as successful, try to pip-install it (again, in a new environment) with the following command: pip install -i https://test.pypi.org/simple/ parse_stuff

If it can be imported, and your page online in the testing area looks fine, then we’re ready to upload it to the “real” PyPI. Write:

twine upload -r pypi dist/parse_stuff*

and witness the wonder. Easy, right? :)

Documentation and ReadTheDocs#

Documenting your project is extremely important. Besides having comments in your code and helpful docstrings, you’ll sometimes also want higher-level material covering the installation and usage of your package. This info should go in the docs folder that was generated earlier. The idea is to build a website from markdown (.md) files. Please consult the MkDocs website for more details.
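For reference, a minimal mkdocs.yml along these lines might look as follows. The site name and the extra page are placeholders - right after mkdocs new . only index.md exists:

```yaml
site_name: Parse Stuff
nav:
  - Home: index.md
  - Usage: usage.md   # hypothetical extra page; add the file under docs/ first
theme: readthedocs
```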

If our project is on PyPI, we also want our documentation to have a website. ReadTheDocs is a website that happily hosts the documentation for your code. It will automatically fetch whatever’s in your docs folder and display it in a nice-looking website.

  1. Create an account there and log into it.

  2. Go to your dashboard and Import a repository from the “My Projects” pulldown.

  3. Inside GitHub, select Settings -> Webhooks and turn on the ReadTheDocs hook.

Your docs will update online every time you update them and push the changes to GitHub. They’re probably currently quite dull, but you should add more documents to the docs folder sooner rather than later.

Continuous Integration#

CI is used to automatically test your code whenever you push something to GitHub. It works by pulling the latest commit from your repository onto its servers, building an environment identical to the one on your computer, installing your package, and running its tests. Lastly, it generates a detailed error report if one or more of your tests fail. This section is - again - based on a famous blog post.

The easiest CI tool available today is GitHub Actions. It’s integrated into GitHub and allows you to monitor the execution easily. To set it up, go to the “Actions” tab in your GitHub repo and click “Skip this and set up a workflow yourself”. Delete the template and paste this instead:

---
name: CI

on:
  push:
    branches: ["master"]
  pull_request:
    branches: ["master"]

jobs:
  tests:
    name: "Python ${{ matrix.python-version }}"
    runs-on: "ubuntu-latest"
    env:
      USING_COVERAGE: '3.6,3.8'

    strategy:
      matrix:
        python-version: ["3.6", "3.7", "3.8"]

    steps:
      - uses: "actions/checkout@v2"
      - uses: "actions/setup-python@v1"
        with:
          python-version: "${{ matrix.python-version }}"
      - name: "Install dependencies"
        run: |
          set -xe
          python -VV
          python -m site
          python -m pip install --upgrade pip setuptools wheel
          python -m pip install --upgrade pytest

      - name: "Run pytest"
        run: "python -m pytest"

To conclude this part, click the green “Start commit” button in the top right and make sure you select the “Create a new branch for this commit and start a pull request” radio button. Give the branch a memorable name (e.g. github-actions) and then click “Create pull request”.

These changes only exist in the remote repository for now; we’ll need to pull them in order to see them locally.
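Finally, the CI run is only useful if the tests folder contains actual tests. Here’s a minimal sketch of a tests/test_parsing.py file; parse_ints is an inline stand-in defined here for illustration, since the real parse_stuff API is up to you:

```python
# tests/test_parsing.py -- minimal tests so the CI workflow has work to do.


def parse_ints(text, sep=","):
    # Inline stand-in for illustration; in a real project this would be
    # `from parse_stuff import parse_ints`.
    return [int(token) for token in text.split(sep) if token.strip()]


def test_parse_ints_basic():
    assert parse_ints("1, 2, 3") == [1, 2, 3]


def test_parse_ints_empty_string():
    assert parse_ints("") == []
```

Run them locally with python -m pytest before pushing, so the first CI run is green.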