Uploading to PyPI

I found myself within a forest dark, For the straightforward pathway had been lost…

Edit: I have been kindly informed that there is, in fact, an official guide for uploading packages to PyPI, located here.

I’m a research assistant in a Speech Informatics lab at the University of Minnesota. One of my tasks recently has been to take a Python script written by a former student, make it more flexible and turn it into a legit Python package. I spent a few weeks working on the code, making sure it worked as we needed it to, de-duplicating code, validating it, etc. It installed fine locally. Then two days ago I tried to register the package at pypi.python.org so we could install the package using pip, along with the rest of our analysis code.

I think I have it working now, but honestly, it’s been a terrible experience. I’ve had to pull resources from documentation, message boards (both stackexchange and google groups), blogs, and other assorted places on the internet. Why isn’t this all in one place somewhere? In the end, many of the things that worked ended up being the results of trial-and-error. Part of the problem is that my package is a little complicated: it requires including some datasets that should be installed in the same directory. It also requires compiling a stand-alone C file. But really, that shouldn’t be a big deal.

In order to document the process in case I have to repeat it in the future, and to help guide any poor soul who is embarking on this perilous journey with no prior experience, I’m writing up my notes in detail and sharing them with the world.

I’ve been using Python for various projects for a few years now. Still, I’m a scientist, not a software engineer, so many of the following specifics were not obvious to me. Apologies if something that follows is not the most efficient way to do something - I’m simply writing the documentation I wish I’d had two days ago in hopes that it might be useful to others.

Package structure

This has always been a little confusing to me, but this project has cleared it up a little bit. Your package should be structured something like this:

YOUR-PROJECT-FOLDER
├── CHANGES.txt (OPTIONAL)
├── LICENSE.txt
├── MANIFEST.in
├── README
├── docs (FOLDER)
├── setup.py
└── PACKAGENAME (FOLDER)
    ├── __init__.py
    ├── Makefile (OPTIONAL)
    ├── FILE1.py
    ├── FILE2.py
    ├── data (FOLDER, OPTIONAL)
    │   ├── included_data.dat
    └── example (FOLDER)
        └── EXAMPLE.txt

A few important things here are:

  • the name of the project folder on your computer doesn’t really matter. It doesn’t have to be the same as the package name. To me, it’s less confusing if the two names are different.

  • the PACKAGENAME folder is the name that you’ll use to import your package into Python. So if your package is my_awesome_package, the folder with your code needs to be my_awesome_package too. Note that Pythonistas recommend having one-name package names with no capital letters.

Now for the key files:

  • CHANGES.txt: (optional) keep notes on package versions, etc.

  • LICENSE.txt: The license that you’re using for releasing your package. If you want to share it and want people to use it, you should include a license. As far as I understand, not including one is the same thing as claiming “All Rights Reserved”, meaning people can’t legally re-share your code. Disclaimer: I am not a lawyer, and I don’t know a ton about the available licenses.

  • MANIFEST.in: When setup.py builds your package, it includes *.py files in your package folder by default. If you want any other files included in the .tar.gz file that gets created and uploaded to PyPI (more on this later), you need to include those filenames in MANIFEST.in. Here’s what the contents of mine looks like:

#documentation
recursive-include docs/_build/html *

#data used for clustering
recursive-include vfclust/data *

#phonetic representation
include vfclust/t2p/t2p.c
include vfclust/t2p/t2pin.tmp
include vfclust/Makefile

#example files
include vfclust/example/EXAMPLE.*
include vfclust/example/EXAMPLE_sem.*

#Misc
include CHANGES.txt
include LICENSE.txt

My package name is vfclust, so I’m including the contents of certain folders within the package folder that my package requires in order to run (data files, etc). I also include CHANGES.txt and LICENSE.txt, since I don’t think those are included by default (?). Finally, notice the recursive-include syntax. The way I wrote it, everything in the specified folders is included as well.

  • README: can also be README.md or README.rst. The format needs to be rst (reStructured Text) for it to display appropriately on PyPI. I go into this further down.

  • docs: (optional) this is where I put the formatted documentation. More on this later.

  • setup.py: This is where the magic happens - where your package is defined, how it’s setup, etc. Much more on this later.

  • PACKAGENAME: This is where your actual package code goes. The FILE1.py, FILE2.py, etc. are the package files that do the work of your package.

  • the __init__.py file tells the world that PACKAGENAME is a Python package that is importable using >> import PACKAGENAME. This is important: __init__.py turns a folder full of Python scripts into a package that can be imported. It can be an empty file, or not, but whatever is inside will get executed when you import the package. If you want to use the other modules (files) in the folder from within Python, and you probably do, put from PACKAGENAME import * inside the __init__.py file.

  • data and example: These folders could be named anything, or could be omitted. Any data files you want to distribute along with your package should be in a folder like this WITHIN the package folder.

.tar.gz vs site-packages

This was a serious source of confusion for me. When you build your package (more on this below), setup.py creates a dist/ directory in your project directory and puts everything on the specified packages (more on this below also) along with everything in the MANIFEST.in file in a single .tar.gz file in that directory. You can then upload the .tar.gz file to PyPI and make it available there. However, when you use pip install PACKAGENAME, only *.py files will be installed into site-packages (where pip puts packages you download). This means that any data files, C files, or anything else you want available when using your package must be explicitly included in BOTH MANIFESET.in AND setup.py. As far as I understand, there are things (like formatted documentation) that you may want to include with the full *.tar.gz package but might not want to bury within site-packages. This makes some sense, but it’s still somewhat obnoxious to have to specify included files in two places. The setuptools documentation says that you only have to specify files in setup.py, but that didn’t work for me.

setup.py

This file is like magic, and is key to creating and installing your package, as well as keeping track of the version and formatting for PyPI. There are two commonly-used tools for creating your setup.py file: distutils and setuptools. The internet tells me that setuptools is more modern and fixes some fo the problems with distutils, but I don’t really understand what. The syntax for both is nearly identical, so it’s easy to switch back and forth.

The setup.py file must, at a minimum, include something like the following. This is modified from this project, which I found to be a helpful resource.

from setuptools import setup, find_packages  # Always prefer setuptools over distutils
from codecs import open  # To use a consistent encoding
from os import path

here = path.abspath(path.dirname(__file__))

# Get the long description from the relevant file
with open(path.join(here, 'README'), encoding='utf-8') as f:
    long_description = f.read()

setup(
    name='sample',

    # Versions should comply with PEP440.  For a discussion on single-sourcing
    # the version across setup.py and the project code, see
    # http://packaging.python.org/en/latest/tutorial.html#version
    version='1.2.0',

    description='A sample Python project',
    long_description=long_description,  #this is the

    # The project's main homepage.
    url='https://github.com/whatever/whatever',

    # Author details
    author='yourname',
    author_email='your@address.com',

    # Choose your license
    license='MIT',

    # See https://PyPI.python.org/PyPI?%3Aaction=list_classifiers
    classifiers=[
        # How mature is this project? Common values are
        #   3 - Alpha
        #   4 - Beta
        #   5 - Production/Stable
        'Development Status :: 3 - Alpha',

        # Indicate who your project is intended for
        'Intended Audience :: Developers',
        'Topic :: Software Development :: Build Tools',

        # Pick your license as you wish (should match "license" above)
        'License :: OSI Approved :: MIT License',

        # Specify the Python versions you support here. In particular, ensure
        # that you indicate whether you support Python 2, Python 3 or both.
        'Programming Language :: Python :: 2.7',
    ],

    # What does your project relate to?
    keywords='sample setuptools development',

    packages=["MY-PACKAGE"],

)

setup() is just a function called when you run python setup.py install within the project directory. Install is only one of the things it can do - you can also python setup.py sdist to build your package into a .tar.gz file, python setup.py develop to tell Python to look in the project directory rather than site-packages when doing import, etc.

I’ll go through these setup() arguments and some others in turn.

  • name: This is what your package will be called, in big bold letters, on the new PyPI page for your package. Pick something you like. This will also be the root name for the .tar.gz files created. Those are formatted like PACKAGENAME-VERSION.tar.gz. Which brings us to…

  • version: This is the version number, obviously. Note that PyPI forces you to make a new version for each new upload. So you MUST change this for any new upload to PyPI (unlike git or other version control systems, where you can make a new commit without worrying about version numbers). I ended up with version numbers that look like 0.1.0.12 until I got finished debugging my package on PyPI. Luckily you can go on PyPI and remove unwanted versions.

  • description: Short description used on PyPI.

  • long_description: Written on main page of your PyPI package, meaning it should be formatted as reStructured text. I just used my README file.

  • author etc.: Used to list authors, contact information, etc on PyPI.

  • classifiers: Used to categorize your package on PyPI, so people can find it while browsing or searching.

  • packages: Your distribution may have more than one package. That is, when you install this package, you may want to be able to do import package1 and import package2. You can also packages and sub-packages. All of these should be listed here. You can also do something like:

packages = find_packages(exclude=['build', 'docs', 'templates'])

Those are the basics. The next few sections detail other arguments I ended up using.

Include data files

  • package_data: As mentioned earlier, if you want any non *.py files to be installed by pip and available to your package, specify them here. Messing around with this led me to the conclusion that all included data must in subfolders of the package folder (NOT the higher-level project folder). For example, if I had “data” and “my-package” as subfolders in “my-project-folder”, and then include “data” in package_data, then pip installs “data” and “my-package” both as separate folders in “site-packages”. Not really want I want. After much fiddling, my directory structure looks like this:
.
├── CHANGES.txt
├── LICENSE.txt
├── MANIFEST.in
├── README
├── docs (COLLAPSED)
├── setup.py
└── vfclust
    ├── Makefile
    ├── TextGridParser.py
    ├── __init__.py
    ├── data
    │   ├── EOWL
    │   │   ├── EOWL Version Notes.txt
    │   │   ├── The English Open Word List.pdf
    │   │   └── english_words.txt
    │   ├── animals_lemmas.dat
    │   ├── animals_names.dat
    │   ├── animals_names_raw.dat
    │   ├── animals_term_vector_dictionaries
    │   │   ├── term_vectors_dict91.dat
    │   │   └── term_vectors_dict91_cpickle.dat
    │   ├── cmudict.0.7a.tree
    │   ├── modified_cmudict.dat
    │   └── t2pin.tmp
    ├── example
    │   ├── EXAMPLE.TextGrid
    │   ├── EXAMPLE.csv
    │   ├── EXAMPLE_sem.TextGrid
    │   └── EXAMPLE_sem.csv
    ├── t2p
    │   ├── t2p.c
    │   └── t2pin.tmp
    └── vfclust.py

Here’s what I ended up with:

include_package_data=True,
    # relative to the vfclust directory
    package_data={
        'vfclust':[
             'Makefile'],
        'data':
             ['data/animals_lemmas.dat',
             'data/animals_names.dat',
             'data/animals_names_raw.dat',
             'data/cmudict.0.7a.tree',
             'data/modified_cmudict.dat',
             'data/animals_term_vector_dictionaries/term_vectors_dict91_cpickle.dat',
             ],
        'data/EOWL':
            ['data/EOWL/english_words.txt',
             'data/EOWL/EOWL Version Notes.txt',
             'data/EOWL/The English Open Word List.pdf'
            ],
        'example':
            ['example/EXAMPLE.csv',
             'example/EXAMPLE.TextGrid',
             'example/EXAMPLE_sem.csv',
             'example/EXAMPLE_sem.TextGrid'],
        't2p':
            ['t2p/t2p.c',
             't2p/t2pin.tmp'
             ],
    },

As far as I can tell, package_data is a dictionary where each key corresponds to a subfolder of the installed package, and each value is supposed to be a list of the files you want to go into that subfolder. In practice, however, I couldn’t get everything situated properly in the final installation until I made all my data files subfolders of the package folder. If you know a better and more flexible way to do this, please enlighten me (not being sarcastic here, really do e-mail me! I want to know!). With this setup, the following gets installed in site-packages when I do pip install vfclust:

vfclust
├── Makefile
├── TextGridParser.py
├── __init__.py
├── data
│   ├── EOWL
│   │   ├── EOWL Version Notes.txt
│   │   ├── The English Open Word List.pdf
│   │   └── english_words.txt
│   ├── animals_lemmas.dat
│   ├── animals_names.dat
│   ├── animals_names_raw.dat
│   ├── animals_term_vector_dictionaries
│   │   ├── term_vectors_dict91.dat
│   │   └── term_vectors_dict91_cpickle.dat
│   ├── cmudict.0.7a.tree
│   ├── modified_cmudict.dat
│   └── t2pin.tmp
├── example
│   ├── EXAMPLE.TextGrid
│   ├── EXAMPLE.csv
│   ├── EXAMPLE_sem.TextGrid
│   └── EXAMPLE_sem.csv
└── t2p
    ├── t2p
    ├── t2p.c
    └── t2pin.tmp

which is what I wanted. data_files is another argument used to include data, but this puts your data files in /Library/Frameworks/Python.framework/Versions/2.7/my-subfolder, at least on my system. I don’t know why you’d want to put data there. At any rate I didn’t.

Run your package as a script

If it makes sense for your package to run as a command-line script instead of (or as well as) an importable Python package, you can ask setup.py to add your script to the system path during installation. To do this, include an entry_points argument, similar to this:

entry_points={
       'console_scripts': [
           'vfclust = vfclust.vfclust:main',
       ],
    }

If I understand correctly, the thing on the left of the equals sign is what you call from the command line,

$ vfclust args-go-here

and the thing on the right is the corresponding python script that runs. In my case, I have a main() function in the vfclust.py file in the vfclust package folder (the one with the __init__.py in it as well).

Makefile

One of my package’s functions relies on a compiled C file to convert words from a textual to a phonetic representation. This isn’t a Python extension, but rather a completely separate C file that I want to be able to call from the command line. I wanted it to compile when the package was being installed by pip, so it would be compiled appropriately on the host system. There are ways for doing this for Python extensions, but as this file technically isn’t an extension, getting this to work was slightly complicated.

Essentially, I took the advice from this site (thanks!) and made a new class, that then asked subprocess to call make. When I ended up with was the following:

from os import path
import subprocess
from setuptools.command.install import install
here = path.abspath(path.dirname(__file__))

class MyInstall(install):
    def run(self):
        try:
            # note cwd - this makes the current directory
            # the one with the Makefile.
            subprocess.call(['make'],cwd=path.join(here,'vfclust'))
        except Exception as e:
            print e
            print "Error compiling t2p.c.   Try running 'make'."
            exit(1)
        else:
            install.run(self)

along with a corresponding entry in the setup() function:

cmdclass={'install': MyInstall},

A few things to note. First, cmdclass={'install': MyInstall} tells setup() to run the code. Not quite sure how it works. Second, subprocess.call(['make'],cwd=path.join(here,'vfclust')) is tricky: to run make properly, I had to have subprocess run it from the folder it’s in. It resides in a different folder from setup.py (I wanted to include it was a ‘data’ file in the pip installation), so I had to specify the subfolder (named vfclust). here = path.abspath(path.dirname(__file__)) gives the path of the setup.py folder on the system, and path.join(here,'vfclust') creates an absolute path to the subfolder.

Complete setup.py

My final, really long setup.py file looks like this, again using a file modified from the one here:

from setuptools import setup, find_packages  # Always prefer setuptools over distutils
from codecs import open  # To use a consistent encoding
from os import path
import subprocess
from setuptools.command.install import install

here = path.abspath(path.dirname(__file__))

# Get the long description from the relevant file
with open(path.join(here, 'README'), encoding='utf-8') as f:
    long_description = f.read()

#see http://www.niteoweb.com/blog/setuptools-run-custom-code-during-install
#This is so we can call "make" automatically during setup
class MyInstall(install):
    def run(self):
        try:
            subprocess.call(['make'],cwd=path.join(here,'vfclust'))
        except Exception as e:
            print e
            print "Error compiling t2p.c.   Try running 'make'."
            exit(1)
        else:
            install.run(self)

setup(
    name='VFClust',

    # Versions should comply with PEP440.  For a discussion on single-sourcing
    # the version across setup.py and the project code, see
    # http://packaging.python.org/en/latest/tutorial.html#version
    version='0.1.0',

    description='Clustering of Verbal Fluency responses.',
    long_description=long_description,

    # The project's main homepage.
    url='https://github.com/speechinformaticslab/vfclust',

    # Author details
    author='Thomas Christie, James Ryan, Kyle Marek-Spartz and Serguei Pakhomov',
    author_email='tchristie@umn.edu',

    # Choose your license
    license='Apache License, Version 2.0',

    # See https://PyPI.python.org/PyPI?%3Aaction=list_classifiers
    classifiers=[
        # How mature is this project? Common values are
        #   3 - Alpha
        #   4 - Beta
        #   5 - Production/Stable
        'Development Status :: 3 - Alpha',

        # Indicate who your project is intended for
        "Intended Audience :: Developers",
        "Intended Audience :: Healthcare Industry",
        "Intended Audience :: Science/Research",
        "Topic :: Multimedia :: Sound/Audio :: Speech",
        "Topic :: Scientific/Engineering :: Bio-Informatics",
        "Topic :: Scientific/Engineering :: Medical Science Apps.",
        "Topic :: Software Development :: Libraries",
        "Topic :: Text Processing :: Linguistic",

        # Pick your license as you wish (should match "license" above)
        "License :: OSI Approved :: Apache Software License",

        # Specify the Python versions you support here. In particular, ensure
        # that you indicate whether you support Python 2, Python 3 or both.
        'Programming Language :: Python :: 2.7',
        'Programming Language :: C'
    ],

    # What does your project relate to?
    keywords='bioinformatics speech linguistics',

    # You can just specify the packages manually here if your project is
    # simple. Or you can use find_packages().
    packages = find_packages(exclude=['build', '_docs', 'templates']),
    #package_dir={'vfclust': 'vfclust'},

    # List run-time dependencies here.  These will be installed by pip when your
    # project is installed. For an analysis of "install_requires" vs pip's
    # requirements files see:
    # https://packaging.python.org/en/latest/technical.html#install-requires-vs-requirements-files
    #install_requires=['NLTK'],

    # If there are data files included in your packages that need to be
    # installed in site-packages, specify them here.  If using Python 2.6 or less, then these
    # have to be included in MANIFEST.in as well.
    include_package_data=True,
    # relative to the vfclust directory
    package_data={
        'vfclust':[
             'Makefile'],
        'data':
             ['data/animals_lemmas.dat',
             'data/animals_names.dat',
             'data/animals_names_raw.dat',
             'data/cmudict.0.7a.tree',
             'data/modified_cmudict.dat',
             'data/animals_term_vector_dictionaries/term_vectors_dict91_cpickle.dat',
             ],
        'data/EOWL':
            ['data/EOWL/english_words.txt',
             'data/EOWL/EOWL Version Notes.txt',
             'data/EOWL/The English Open Word List.pdf'
            ],
        'example':
            ['example/EXAMPLE.csv',
             'example/EXAMPLE.TextGrid',
             'example/EXAMPLE_sem.csv',
             'example/EXAMPLE_sem.TextGrid'],
        't2p':
            ['t2p/t2p.c',
             't2p/t2pin.tmp'
             ],
    },

    #run custom code
    cmdclass={'install': MyInstall},

    # Although 'package_data' is the preferred approach, in some case you may
    # need to place data files outside of your packages.
    # see http://_docs.python.org/3.4/distutils/setupscript.html#installing-additional-files
    # In this case, 'data_file' will be installed into '<sys.prefix>/my_data'
    # data_files=[
    #     ('my_data',['data_file']),
    # ],

    # To provide executable scripts, use entry points in preference to the
    # "scripts" keyword. Entry points provide cross-platform support and allow
    # pip to create the appropriate form of executable for the target platform.
    entry_points={
       'console_scripts': [
           'vfclust = vfclust.vfclust:main',
       ],
    }

)

Documentation

Many sophisticated tools exist to help with code documentation. Unfortunately, their sophistication means that learning how to use them is a headache for the uninitiated. Here are some lessons I learned the hard way.

Readme

Your package’s PyPI page will only display your README file properly if it’s formatted as reStructuredText, not Markdown. I’d originally written it in Markdown so it would appear properly on Github. Bummer. Luckily, there’s a handy Python package that converts formats automatically. Do

pip install pypandoc

Then navigate in the terminal to the location of your README file, launch Python, and do:

import pypandoc

#converts markdown to reStructured
z = pypandoc.convert('README','rst',format='markdown')

#writes converted file
with open('README.rst','w') as outfile:
    outfile.write(z)

If your document was valid Markdown, you should now have a properly formatted reStructured version of your README in the project folder.

Docstrings

It’s common to write relatively verbose documentation for your functions and methods in Python code so other people can read and improve on your code. This documentation is written as “docstrings”, which are strings of one or more lines written directly after the function/method declaration:

def my_function():
    ''' This is my docstring.'''

This is convenient for those reading the code. However, if you write your docstrings in reStructured format, you can use a tool called Sphinx to automatically create HTML/PDF/epub versions of your documentation that includes both the content of the README file and the docstrings, formatted in pretty, easy-to-read format.

It’s possible to include test code and other complexities in the docstring, but the simple version looks like this:

def compute_similarity_score(self, word1, word2):
    """ Returns the similarity score between two words.

        :param str word1: First word to be compared.
        :param str word2: Second word to be compared.
        :return: Number indicating degree of similarity of the two input words.
            The maximum value is 1, and a higher value indicates that the words
            are more similar.
        :rtype : Float

        Longer description of the function, how it works, etc.
    """

This is probably more verbose than needed (PEP guidelines suggest that you don’t state the obvious in your docstrings), but I’m leaving it like this for demonstration purposes. A few things to note:

  • The docstring is in triple quotes, directly following the function declaration.

  • The last line with “”” doesn’t have any other text on it.

  • the :param TYPE VARIABLE: syntax is a reStructured way of declaring your input/output variables. You can read more about the options here.

reStructured Text can be very complicated and can interact with your code in lots of different ways. I just stuck to the minimum knowledge necessary to make reasonable documentation for a small project.

HTML Documentation

So far, we have README and *.py files. Since we wrote our README and docstrings in reStructured text, we can now create a ready-made HTML/PDF version of the documentation using software called Sphinx. I found Sphinx to be very complicated, but here are basic instructions to get you going (using OS X, at least)

Install Sphinx:

$ easy_install -U sphinx

Make a folder for documentation, and go there:

$ mkdir docs
$ cd docs

Run Sphinx:

$ sphinx-quickstart

You’re already in docs/, so leave the first question as the default. Answer the next few questions by filling in your name, project name, etc. I left all the rest at their default values, EXCEPT

autodoc: automatically insert docstrings from modules (y/n) [n]: y

Selecting y here lets Sphinx pull information from your docstrings, as promised above, and make it into a nice little easy-to-read version in HTML or whatever.

Sphinx made a Makefile in the docs/ directory that will generate your documentation for you. You can type things like make html or `make latexpdf to generate documentation. If you try that now (as I did), your documentation will be empty. Another bummer.

Tell Sphinx to generate your documentation by pulling from your docstrings.

$ sphinx-apidoc -o <dest-directory> <source-directory>

You should already be in your “docs” directory. My “docs” directory is at the same level as my package directory with my nice docstrings, so I did:

$ sphinx-apidoc -o . ../vfclust

Finally, edit the new index.rst file to change from:

Welcome to VFClust's documentation!
===================================

Contents:

.. toctree::
   :maxdepth: 2



Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

to

Welcome to VFClust's documentation!
===================================

.. include:: ../README

Contents:

.. toctree::
   :maxdepth: 2



Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

Your project name will be different, but the important part is the .. include:: ../README line, assuming your README is located in the root project directory (one up from “docs”). That’ll put the README as the landing page of your documentation. You can click on the “Index”, “Moduel Index” or “Search” links to see documentation for your functions/methods.

Generate documentation. Finally! Type:

$ make

to see the available options for output formatting. Create your HTML documentation using:

$ make html

This will generate a set of html files in “docs/_build/html/”, assuming you didn’t change the “_build” default during questioning. Open up “docs/_build/html/index.html” to see your nice new documentation.

PyPI

Alright, the package has the correct directory structure, the documentation is formatted, and we’re ready to share it with the world so that anyone can install it using pip install mypackage.

Make a test-build

Navigate to the directory of your setup.py file. First, make sure everything is configured properly using

$ python setup.py test

Hopefully you get no errors.

Then, create the distribution you’re going to send to PyPI using:

$ python setup.py sdist

This creates a .tar.gz package of your source files. It also creates a new subfolder in your project folder called dist/, and puts the .tar.gz file in there. You should extract the folder and examine its contents - this is what will be downloadable from PyPI.

Register with PyPI

You can do this via the command line, but you can also just go to https://pypi.python.org/pypi and make a username and password for yourself. While you’re at it, make an account at https://testpypi.python.org/pypi. We’re going to use the second site for testing purposes. This is optional, but given the trouble I had getting pip to do what I thought it should, I personally recommend it.

Use your favorite text editor to open up your new ~/.pypirc file, created by setup.py. Make it look something like this:

[distutils]
index-servers =
    pypi
    test

[pypi]
username:yourname
password:yourpassword

[test]
repository:https://testpypi.python.org/pypi
username:yourname
password:yourpassword

You’re going to need to add the test repository entry for testing. Now register your package with PyPI. Make sure you’re in your root project directory (the one with the setup.py file), and type:

$ python setup.py register
$ python setup.py register -r https://testpypi.python.org/pypi

This sends the metadata for your package to PyPI.

Install twine

When you upload your package to PyPI, your login information is sent plaintext. Not great. To solve this, install the twine package.

$ pip install twine

Create a test upload

This has two steps: first, package the file using sdist. Then, upload that file to the server.

$ python setup.py sdist #creates .tar.gz, puts it in dist/ folder
$ twine upload -r test dist/PACKAGENAME-VERSION.tar.gz

The -r test uploads it to the server named in the [test] part of the .pypirc file.

Create a test install

Python has a neat way of making sandboxed environments for you to work in, called virtual environments. I’d suggest creating a new one to test your installation, thereby guaranteeing a fresh install.

Navigate to some folder on your hard drive and type

$ virtualenv MY_VIRTUALENV

where you obviously use some other name than MY_VIRTUALENV. Then fire it up by typing

$ cd MY_VIRTUALENV
$ source bin/activate

You’re now in a little sandboxed Python installation. Install your package from the PyPI test servers:

$ pip install -i https://testpypi.python.org/pypi PACKAGENAME

You may need to use sudo, I can’t seem to figure out why. Anyway, your package should now be installed! Test it out:

>> import PACKAGENAME

If you included a way to run it as a script in setup.py, that should work now too.

Upload for real

If everything seems to be working (good for you if it is, it took me 2 days to get to this point), you can upload it to the PyPI servers:

$ twine upload dist/PACKAGENAME-VERSION.tar.gz

And voila, you’re live! Now you can go to https://pypi.python.org.pypi and search for your package. You should also be able to install it using

$ pip install PACKAGENAME

See, nothing to it! (Haha…ha…) I hope this was helpful for someone!