List of Necessary Files for Packaging

There are several files that are commonly included in the root folder of a Python project. These are:

  • setup.py
  • readme.md
  • requirements.txt
  • __init__.py
  • License

Almost all projects benefit from these files in one way or another, and it can even be said that they are necessary for packaging a Python library.

Let’s take a closer look at each of these files so you can also construct them for your project:

Estimated Time

10 mins

Skill Level

Upper-Intermediate

Exercises

na

Course Provider

Provided by HolyPython.com

setup.py

Although each file has a function, setup.py is a file with very high priority in the setup chain.

You can include a number of features in your Python package’s setup.py file. At the top it usually includes:

from setuptools import setup

Whenever your package is being installed, this is the module that’s used to install it.

It’s also useful to include this file reading procedure. This allows the PyPI website to read your readme.md file and display it on your project’s page.

Make sure to match the exact file name and extension in your package’s root folder. The name is also case sensitive.

with open("README.md", 'r') as f:
    long_description = f.read()
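
A slightly more defensive variant of the same idea is sketched below. The helper name read_long_description is hypothetical, and the explicit UTF-8 encoding is an assumption about how the readme was saved. The path is built from a given folder so the result does not depend on the current working directory.

```python
from pathlib import Path


def read_long_description(folder, filename="README.md"):
    """Return the text of the readme found in `folder` (hypothetical helper)."""
    # Assumption: the readme is saved as UTF-8; an explicit encoding avoids
    # platform-dependent defaults.
    return (Path(folder) / filename).read_text(encoding="utf-8")
```

In a setup.py file this could be called as read_long_description(Path(__file__).parent), which points at the folder that contains setup.py itself.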

Next is constructing a setup function with all the parameters that can be handy for an installation. Most of the information you provide here will either be used in the installation criteria or be shown on the project’s page on PyPI.

Check out the example from the Watermarkd repository, shown after the parameter notes below.

It’s important to note that these values are sensitive. They can alter the installation process of your package or the name it is listed under. So be especially careful with:

  • name: Unique name your package is distributed under.
  • version: Package version, important to match with the release version on Github and download_url.
  • license: The license you chose for your package (for example, Apache-2.0).
  • download_url: Compressed file link under releases on Github.
  • packages: List of all Python import packages that should be included.

You can probably fix a mistake in author_email or url later, but you’d still want to avoid mistakes there too.

url is the link shown under the Homepage link on PyPI, while download_url is the link to your compressed source distribution file on Github that you get after making a release.
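
Since version and download_url have to agree, one way to keep them from drifting apart is to derive the link from the version string. This is only a sketch: build_download_url is a hypothetical helper, and the URL pattern is simply copied from the Watermarkd example in this article, so it may not match how other repositories name their release archives.

```python
VERSION = "0.7.1.2"
REPO_URL = "https://github.com/holypython/Watermarkd"


def build_download_url(repo_url, version):
    """Build the release archive link from the repository URL and version."""
    # Assumption: releases are tagged with the bare version string.
    return f"{repo_url}/archive/{version}.tar.gz"


DOWNLOAD_URL = build_download_url(REPO_URL, VERSION)
```

Passing VERSION and DOWNLOAD_URL to setup() means a version bump only has to be made in one place.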

long_description=long_description might seem confusing, but it just tells PyPI where to get the long description shown on the project’s page. Remember the earlier code where we opened the readme.md file and assigned its content to the long_description variable. It is common practice to have a package’s long_description come from the readme.md file, which keeps things standard and avoids repeated work.

It’s also a good idea to include long_description_content_type="text/markdown" so that setuptools and PyPI know the format of your long_description. If you skip this step you might get errors related to the format of your readme.md file.

Additionally, install_requires and python_requires ensure that the necessary libraries and a suitable Python version are present when your package is installed.
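
To make the meaning of python_requires='>=3.6' concrete, the sketch below performs the equivalent comparison by hand. It only illustrates the constraint and is not how pip evaluates it; python_is_supported and MIN_PYTHON are hypothetical names.

```python
import sys

MIN_PYTHON = (3, 6)  # mirrors python_requires='>=3.6'


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        # Default to the interpreter that is running this code.
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum
```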

If you’re curious about any other parameter, the official documentation is clear and thorough, so you can check out all the setup.py parameters there.

setup(
   name='Watermarkd',
   version='0.7.1.2',
   description='A friendly watermarking tool with optional GUI component.',
   license="Apache-2.0",
   long_description=long_description,
   long_description_content_type="text/markdown",
   author='holypython.com',
   author_email='watermarkd@holypython.com',
   url="https://holypython.com/",
   download_url = 'https://github.com/holypython/Watermarkd/archive/0.7.1.2.tar.gz',
   packages=['Watermarkd'],

   install_requires=[
       'pillow',
       'pysimplegui',
   ],

   python_requires='>=3.6'
)

Additionally, it’s a good idea to include two lists, namely keywords and classifiers. These will be shown on PyPI and help search engines, contributors and your audience in general to find, identify and understand your project better.

A word of caution must be mentioned here. There is a standard for classifiers; you can’t just make up new categories. You can check the classifier index on PyPI to decide on yours.

Here is a sample from Watermarkd:

   keywords = ['watermarking', 'image processing', 'watermark', 'photography', 'copyrights', 'holypython', 'batch watermark', 'holypython.com'],
   classifiers=[
       "Development Status :: 3 - Alpha",
       "Intended Audience :: Developers",
       "Intended Audience :: Education",
       "Intended Audience :: End Users/Desktop",
       "Programming Language :: Python :: 3",
       "License :: OSI Approved :: Apache Software License",
       "Operating System :: OS Independent",
   ],
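
Because classifiers have to come from PyPI’s list, a quick sanity check before uploading can catch obvious typos. The sketch below only checks the general "Category :: Value" shape. It does not consult the official list, and malformed_classifiers is a hypothetical helper.

```python
def malformed_classifiers(classifiers):
    """Return the entries that do not look like 'Category :: Value'."""
    bad = []
    for entry in classifiers:
        parts = entry.split(" :: ")
        # A well-formed entry has at least two parts, none of them empty
        # or padded with stray whitespace.
        if len(parts) < 2 or any(p == "" or p != p.strip() for p in parts):
            bad.append(entry)
    return bad
```

An entry that passes this check can still be rejected by PyPI if it is not on the official list, so treat it as a first filter only.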

There you go, we have a nice and full setup.py file. Check out this full code borrowed from the Watermarkd library on Github.

"""Simple Photo Watermarker"""

import setuptools
from setuptools import setup

with open("README.md", 'r') as f:
    long_description = f.read()

setup(
   name='Watermarkd',
   version='0.7.1.2',
   description='A friendly watermarking tool with optional GUI component.',
   license="Apache-2.0",
   long_description=long_description,
   long_description_content_type="text/markdown",
   author='holypython.com',
   author_email='watermarkd@holypython.com',
   url="https://holypython.com/",
   download_url = 'https://github.com/holypython/Watermarkd/archive/0.7.1.2.tar.gz',
   packages=['Watermarkd'],
   keywords = ['watermarking', 'image processing', 'watermark', 'photography', 'copyrights', 'holypython', 'batch watermark', 'holypython.com'],
   classifiers=[
       "Development Status :: 3 - Alpha",
       "Intended Audience :: Developers",
       "Intended Audience :: Education",
       "Intended Audience :: End Users/Desktop",
       "Programming Language :: Python :: 3",
       "License :: OSI Approved :: Apache Software License",
       "Operating System :: OS Independent",
   ],

   install_requires=[
       'pillow',
       'pysimplegui',
   ],

   python_requires='>=3.6'
)

readme.md

This is a sometimes overlooked but very important file.

Open source projects can lack the polished showcase proprietary software usually has. But it’s actually very important to have a decent readme.md file so that:

  • users know what the library is capable of
  • users don’t have to struggle to get started with your library
  • known issues are shared honestly with the audience
  • development plans are shared, so users know what to expect and you can build rapport with your audience
  • the people who helped, supported and inspired you are acknowledged. As the saying goes, there’s nothing new under the sun; we’re all learning from each other.
  • examples make your library much easier to understand
  • images and appropriate links can be included. Remember that images have to be hosted in the repository for PyPI to pick them up. Otherwise they’ll show in the Github readme but they’ll be broken on PyPI.

Finally, the .md extension stands for Markdown. It’s a bit different from HTML or plain text, but it’s very intuitive and easy to pick up. Just check out a few readme.md files on Github and you’ll get the idea.

Some of the common useful subtopics in a readme.md file are:

  • Installation
  • How to Use
  • Examples
  • Classes and Functions Explained
  • Known Issues
  • Versions
  • References
  • Acknowledgements
  • Future Plans
  • Dependencies
  • Contact Information
  • Instructions for Contributing

requirements.txt

The requirements file is simple: it is used to manage dependencies. It tells users and tools such as pip which other libraries your package depends on in order to work.

The important point here is that:

  1. You don’t have to list default Python libraries. They are already included in Python. So, if your library is using something like:
    1. random
    2. datetime
    3. getpass
    4. turtle
    5. calendar
    6. collections
    7. math
    8. sqlite3
    9. json
    10. socket
    11. unittest
    12. tkinter
    13. re
    14. zipfile
    15. gzip

and so on, you don’t have to include them because they are part of the Python standard library. You can see the complete standard library list in the official Python documentation.
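
A quick way to separate standard library modules from third party ones is sketched below. It relies on sys.stdlib_module_names, which is only available from Python 3.10 onwards. On older versions the sketch falls back to the short, incomplete list from this article, which is an assumption made purely for illustration. third_party_only is a hypothetical helper.

```python
import sys

# Fallback for Python < 3.10: the (incomplete) list given in this article.
_FALLBACK = frozenset({
    "random", "datetime", "getpass", "turtle", "calendar", "collections",
    "math", "sqlite3", "json", "socket", "unittest", "tkinter", "re",
    "zipfile", "gzip",
})
STDLIB_NAMES = getattr(sys, "stdlib_module_names", _FALLBACK)


def third_party_only(imported_names):
    """Keep only the import names that are not standard library modules."""
    return [name for name in imported_names if name not in STDLIB_NAMES]
```

Note that the names checked here are import names, which can differ from the name a library is published under (PIL is installed as pillow, for instance).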

However, if your package uses third party libraries such as pysimplegui, pillow, tensorflow or bs4, those are the ones you’ll want to include in your requirements.txt file, with the versions your package depends on. You can find the latest version on the library’s PyPI page or Github page.

Here is an example from the Watermarkd library’s requirements.txt file:

pillow==7.2.0
pysimplegui==4.29.0
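
If you want setup.py and requirements.txt to stay in sync, one common approach is to read the requirement lines and pass them to install_requires. The sketch below assumes a simple file with one requirement per line and optional # comments. parse_requirements is a hypothetical helper and does not handle pip options such as -r or -e, or URLs that contain a # character.

```python
def parse_requirements(text):
    """Return the requirement strings found in requirements.txt-style text."""
    requirements = []
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if line:
            requirements.append(line)
    return requirements
```

The returned list could then be passed as install_requires=parse_requirements(...) after reading the file’s contents.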

__init__.py

This file manages the import process when a user is using the library after installation.

So, when we type something like import pandas, import matplotlib or import PIL, what happens behind the scenes is that the __init__.py file manages what to actually import.

There can be many different needs and variations depending on the code, but there are usually two kinds of management that take place in an __init__.py file:

  • which libraries to actually import when import command is executed.
  • which classes from those libraries to import when a library is imported.

It can help to check out a couple of examples through Holypython’s Watermarkd library on Github.

Firstly, this library keeps its source code (Watermarkd.py) inside a folder (Watermarkd) under the root folder. It uses two __init__.py files:

  1. The first one, under the root folder, imports the source code from its folder (Watermarkd) when import is executed.
    • from .Watermarkd import Watermarkd
  2. The second one, under the Watermarkd folder where the source code lives, ensures the necessary classes are imported when the Watermarkd source code is imported.
    • from .Watermarkd import Spread

So, two __init__.py files are used to ensure the intended user experience.

You can check out the files in the repo for a clearer demonstration.
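
The re-export pattern described above can be tried out without touching a real project. The sketch below builds a throwaway package in a temporary folder. The names demo_pkg and core are hypothetical stand-ins; only the class name Spread is borrowed from the Watermarkd example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    pkg_dir = Path(root) / "demo_pkg"
    pkg_dir.mkdir()
    # The module that holds the actual source code.
    (pkg_dir / "core.py").write_text("class Spread:\n    pass\n", encoding="utf-8")
    # __init__.py re-exports Spread so users can write demo_pkg.Spread.
    (pkg_dir / "__init__.py").write_text("from .core import Spread\n", encoding="utf-8")

    sys.path.insert(0, root)
    try:
        demo_pkg = importlib.import_module("demo_pkg")
    finally:
        sys.path.remove(root)

# Thanks to __init__.py, the class is reachable directly on the package.
exported_name = demo_pkg.Spread.__name__
```

Without the line in __init__.py, users would have to reach into demo_pkg.core themselves to find the class.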

License

This file is simple from the file structure perspective. It’s just a text file that includes your license. However, it’s not so simple from the legal or philosophical perspective.

If you’re an entrepreneur, inventor, engineer, developer, coder or in a similarly innovative occupation, you may want to familiarize yourself with the different types of licenses and the legal obligations they create or prevent.

We have a nicely structured and extensive article about different open source licenses, so, you’re welcome to check that one out.

Generally, if you decide to publish your code as open source, you will likely choose between two categories of open source licenses: copyleft or permissive.

  1. Copyleft is more restrictive: derivative work (and even related work) has to continue under the same license and be shared as open source. It also comes with a subcategory, weak copyleft. Weak copyleft licenses usually have fewer restrictions than a strong copyleft license. Depending on perspective, copyleft can be favorable or unfavorable, but in general it can be said that it’s much less business friendly (or commercial friendly) in the traditional sense.
  2. Permissive, on the other hand, comes with fewer restrictions. There are many varieties that cater to different needs, but at their core they usually share this message: Here is my code, it’s free, it’s as is, it’s open source. You can do whatever you want with it. You can’t sue me though, I’m not responsible for any damage or commercial failure or anything like that.

This is the essence for people with non-legal backgrounds. Even permissive licenses usually have many nuances, so it makes sense to take a deeper look, especially if your work has intellectual weight and potential. Generally speaking, the more important the code, the more important license decisions become.

Some of the most common copyleft open source licenses are:

  • GNU GPL (strong copyleft)
  • Eclipse Public License 2.0 (weak copyleft)
  • LGPL (weak copyleft)
  • Common Development and Distribution License, CDDL (weak copyleft)

And some of the popular permissive open source licenses are:

  • Apache-2.0
  • MIT License
  • Berkeley Software Distribution, BSD

You can check out differences and explanations about each open source license here:

Open source licenses and best practices explained.

Legal disclaimer: This article is not intended to be legal advice. Please seek appropriate professional advice and don’t rely on information here. Holypython.com is not responsible for the correctness of any of the information although utmost effort is spent to ensure sharing valid information.
