When working with Python, a programmer often encounters situations where she needs to install packages not contained in the Standard Library. In such situations, she must install modules from online repositories using package installers.
The goal of this article is to help beginners develop a working knowledge of pip (an acronym for “Pip Installs Packages”) as quickly as possible while defining all the prerequisite jargon along the way. In particular, this article aims to make the content of the pip documentation as accessible as possible for beginners by using simpler words and emphasizing practical examples.
What is pip?
PIP (“Pip Installs Packages”) is the official package managing software for Python, and it installs packages from PyPI (the “Python Package Index”). PyPI contained over 300,000 packages as of November 2021, far more than similar package repositories for Python. PIP allows users to install and uninstall packages, manage dependencies, and keep archives of wheel files, among many other things.
The purpose of this article is to develop a “working knowledge” of PIP that may be useful while working on Python projects at a basic to intermediate level. In particular, we will talk about the most useful parts of the PIP documentation and provide explanations so as to make the system more accessible to the beginner. The article assumes that the user is working on macOS, but the commands for Windows can be obtained through minor modifications.
Note on Pip vs. Conda
A popular alternative to PIP is Conda, a package manager aimed at data analysis. We will highlight three key differences to give you a sense of which you may prefer to use. For a more extensive discussion, see the official Anaconda blog page or StackOverflow.
1) Virtual Environments. Virtual environments are isolated Python environments used for Python projects. Because many Python projects depend on having specific versions of packages installed in the environment, projects may break when globally installed packages are updated. To prevent this, virtual environments are created so that a project runs in the same environment every time.
PIP can be used with several virtual environment builders such as venv. (See Chris’s article for a more detailed discussion.) In contrast, Conda has a built-in virtual environment manager. (This can be managed through a GUI if you install Anaconda Navigator.) In this respect, Conda may be easier to use for beginning coders.
2) Availability of Packages. As noted before, PyPI boasts over 300,000 packages, in contrast to around 7,000 packages in the Anaconda repository. Although PyPI packages can be installed through Conda, doing so often leads to complications, and mixing the two should generally be avoided. (For more details, see the official Anaconda blog page.) Many popular Python packages (pandas, for example) are available through Conda, but when working on Python projects, it is not uncommon for developers to come across packages that are only available through PyPI.
3) Languages. While PIP only deals with Python packages, Conda can install packages written in other languages such as R or C. This is because Conda is aimed toward data science tasks.
Part I: How to install packages using pip?
In this section, we will look at how to install packages and manage dependencies using pip.
To install a package from PyPI using pip, open your terminal and use the command:
pip install matplotlib
In the PIP documentation, pip is replaced with python -m pip. The -m flag searches sys.path for the pip module and runs it as a script. Some systems require that you use python -m pip. For this article, we will just use pip.
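The python -m pip form is also handy when pip is called from a script, because it ties pip to one specific interpreter. The following is a minimal sketch under that assumption; the helper name pip_command is hypothetical (it is not part of pip), and it only builds the argument list without running anything.

```python
import sys


def pip_command(*args):
    """Build an argument list that runs pip with the current interpreter."""
    # sys.executable is the path of the interpreter running this script,
    # so the result is the scripted equivalent of `python -m pip ...`.
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    # Only print the command here; hand the list to subprocess.run(...)
    # if you actually want to execute it.
    print(pip_command("install", "matplotlib"))
```

Building the command as a list (rather than one string) avoids quoting problems when a path contains spaces.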
The install command also installs all of the package’s dependencies, which is to say it installs all the packages that the desired package needs in order to work properly. For instance, matplotlib depends on cycler, amongst many other packages, whereas NumPy has none. Dependency resolution is a major topic in using pip.
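To see which dependencies an already installed package declares, one option is the standard-library importlib.metadata module (Python 3.8 or later). This is a rough sketch, not how pip resolves dependencies; the function name declared_dependencies is made up for illustration, and it returns an empty list both when a package has no dependencies and when it is not installed.

```python
from importlib import metadata


def declared_dependencies(package):
    """Return the requirement strings a package declares, or [] if none."""
    try:
        requirements = metadata.requires(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return []
    # requires() returns None when the package declares no dependencies.
    return list(requirements or [])


if __name__ == "__main__":
    for name in ("matplotlib", "numpy"):
        print(name, declared_dependencies(name))
```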
There are various other sources from which you can install packages.
Requirement Files. Requirement files are .txt files that allow users to install packages in bulk, possibly with specifications such as package versions. (See the “Example” in the PyPI documentation to get a sense of what the contents of the file should look like.) Many of the pip commands have options that make their output suitable for requirement files.
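As a small illustration of the simplest requirement-file layout (one name==version pin per line), the sketch below renders such lines from a dictionary. The function name render_requirements is hypothetical and the pinned versions are placeholders; real requirement files support more syntax than this.

```python
def render_requirements(pins):
    """Turn {name: version} into 'name==version' lines, sorted by name."""
    lines = [f"{name}=={version}" for name, version in sorted(pins.items())]
    return "\n".join(lines) + "\n"


# Placeholder pins, used only for illustration.
example_pins = {"sqlparse": "0.4.2", "six": "1.15.0"}
print(render_requirements(example_pins), end="")
```

The returned text can be written to a file named requirements.txt and passed to pip with the -r option.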
You can use the pip install command to install from requirement files. To do this, navigate to the appropriate directory in the terminal (using the terminal command cd). Then use the following PIP command:
pip install -r requirements.txt
Instead of navigating to the directory on terminal, you could use the absolute path of the file:
pip3 install -r /Users/username/Desktop/requirements.txt
VCS Projects. Many Python packages are available through VCS repositories (such as GitHub) as well. For example, to install Django from its GitHub repository, use:
pip install git+https://github.com/django/django.git#egg=django
Wheel and Tarball Files. The pip install command can also be used to install from local wheel (.whl) and tarball (.tar.gz) files. (Read this Medium article and StackOverflow post on their differences.)
The syntax is similar to before. Navigate to the directory where the files are located using the change directory (cd) command in the terminal. For example, to install the tea package from a .whl file, use:
pip install tea-0.1.6-py3-none-any.whl
To install the tea package from a tarball, use:
pip install tea-0.1.6.tar
The uninstall command is fairly self-explanatory: it allows users to uninstall packages. For instance, to uninstall the tea package using pip, use:
pip uninstall -y tea
You can (optionally) add -y as above to prevent the program from asking for confirmation.
To uninstall multiple packages at once, you can list the packages in a requirements.txt file (much like we did for pip install), and use the following command:
pip uninstall -r requirements.txt
The check command allows users to check for broken dependencies, i.e. whether any installed packages depend on other packages that are not installed in the environment. The syntax is as follows:
pip check
The show command lists all the relevant information for a particular package. For instance, if you want to know where Django is installed on your device or if you want to know its package dependencies, you can use:
pip show django
This produces output such as:
Name: Django
Version: 3.0
Summary: A high-level Python Web framework that encourages rapid development and clean, pragmatic design.
Home-page: https://www.djangoproject.com/
Author: Django Software Foundation
Author-email: firstname.lastname@example.org
License: BSD
Location: /Users/user_name/Library/Python/3.8/lib/python/site-packages
Requires: pytz, sqlparse
Required-by:
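Because the output of pip show consists of “Key: value” lines, it is straightforward to read from a script. The sketch below assumes that simple one-line-per-field layout (multi-line fields are not handled), and the helper name parse_show_output is hypothetical.

```python
def parse_show_output(text):
    """Parse 'Key: value' lines, as printed by `pip show`, into a dict."""
    info = {}
    for line in text.splitlines():
        # Split at the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            info[key.strip()] = value.strip()
    return info


# A shortened copy of the sample output shown in this article.
sample = """Name: Django
Version: 3.0
Home-page: https://www.djangoproject.com/
Requires: pytz, sqlparse
Required-by:"""

info = parse_show_output(sample)
requires = [name.strip() for name in info["Requires"].split(",") if name.strip()]
print(info["Name"], info["Version"], requires)
```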
To list all the packages installed in your environment, use the pip list command:
pip list
This may produce output such as:
Package        Version
-------------- -------
pip            19.2.3
setuptools     41.2.0
setuptools-scm 6.3.2
six            1.15.0
sqlparse       0.4.2
tea            0.1.6
tomli          1.2.2
tzlocal        3.0
wheel          0.33.1
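A similar listing can be produced from inside Python with the standard-library importlib.metadata module. This is only a sketch of an alternative, not a description of how pip list works internally; the function name installed_packages is hypothetical, and the result may differ slightly from pip's own output.

```python
from importlib import metadata


def installed_packages():
    """Return sorted (name, version) pairs for the visible distributions."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata is missing or broken
            pairs.add((name, dist.version))
    # Sort case-insensitively by package name.
    return sorted(pairs, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name:<30} {version}")
```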
What if the user wanted to uninstall all the packages except the bare essentials? You can obtain a list of packages that are not dependencies of installed packages using:
pip3 list --format freeze --not-required
The option “--format freeze” puts the list in a format compatible with a requirements.txt file:
pip==19.2.3
setuptools-scm==6.3.2
six==1.15.0
sqlparse==0.4.2
tea==0.1.6
wheel==0.33.1
Now the user can copy the above into a requirements.txt file, delete the names of the packages that the user wants to keep, and use
pip uninstall -r requirements.txt
to uninstall all the rest.
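The manual step of deleting the packages you want to keep can also be scripted. The following is a minimal sketch, assuming every line has the plain name==version form shown above; the function name packages_to_uninstall and the choice of packages to keep are purely illustrative.

```python
def packages_to_uninstall(freeze_lines, keep):
    """Drop the packages named in `keep` from freeze-format lines."""
    keep_lower = {name.lower() for name in keep}
    remaining = []
    for line in freeze_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        name = line.split("==", 1)[0]
        if name.lower() not in keep_lower:
            remaining.append(line)
    return remaining


# The sample list from this article; pip and wheel are kept as an example.
not_required = [
    "pip==19.2.3",
    "setuptools-scm==6.3.2",
    "six==1.15.0",
    "sqlparse==0.4.2",
    "tea==0.1.6",
    "wheel==0.33.1",
]
print("\n".join(packages_to_uninstall(not_required, keep={"pip", "wheel"})))
```

The printed lines are what would go into the requirements.txt file passed to pip uninstall -r.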
The freeze command outputs a list of the packages installed in the environment in a format suitable for requirement files. The syntax is as follows:
pip freeze
The freeze command is useful for copying all the packages from environment A to environment B. First run freeze in environment A, and store the output in a requirements.txt file:
pip freeze > requirements.txt
The file gets stored in the current directory (which you can check using the pwd command in the terminal). Then go to environment B. (If A and B are virtual environments, deactivate A and activate B in the terminal using commands from whichever virtual environment manager is being used.) Then install the packages in the requirements file using install:
pip install -r requirements.txt
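One way to check that the copy worked is to run pip freeze in both environments and compare the two outputs. The sketch below does that comparison on plain text; the function names are hypothetical, and lines that are not simple name==version pins are ignored.

```python
def parse_freeze(text):
    """Map lowercased package names to pinned versions from freeze output."""
    pins = {}
    for line in text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:  # ignore lines that are not simple name==version pins
            pins[name.lower()] = version
    return pins


def missing_or_different(freeze_a, freeze_b):
    """Return pins from A that are absent from B or pinned differently."""
    pins_a = parse_freeze(freeze_a)
    pins_b = parse_freeze(freeze_b)
    return {
        name: version
        for name, version in pins_a.items()
        if pins_b.get(name) != version
    }


if __name__ == "__main__":
    env_a = "six==1.15.0\ntea==0.1.6\n"
    env_b = "six==1.15.0\n"
    print(missing_or_different(env_a, env_b))
```

An empty result means every pin from environment A is present in environment B with the same version.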
Part II: Distribution Files
In this section, we will discuss how to download and manage distribution files for Python packages.
Distribution files are compressed files containing the various files necessary to install the Python library. See this Medium article for an extensive discussion of the different types of distribution files. We just need to know the following in order to understand the rest of this section:
Wheels (.whl). Wheel files are essentially zip files containing everything necessary to install packages in your local environment. They are generally faster to download and install compared to tarballs. A wheel is a “built” distribution, i.e. a file in a format that is ready to install, which makes the whole installation process faster. For more details, see this article from RealPython.org and this article from PythonWheels.com.
Tarballs (.tar.gz). Tarballs are source distributions that contain both the Python code and the code for any extension modules for the package.
Wheel files are the preferred format for installations using pip. See this StackOverflow post for a discussion of wheels versus tarballs.
Like the pip install command, the pip download command downloads the necessary distribution files from repositories (e.g. for offline installation), but it does not install the packages from the downloaded files. As such, the command supports many of the options that the install command does.
For instance, to download the distribution files for NumPy, use:
pip download numpy
The wheel command allows users to build wheel files. Since the command outputs wheel files, its behavior is very similar to that of the download command. The main difference between the two is that the wheel command is intended for building wheel files, whereas the download command is for downloading them from the web. See this StackOverflow discussion on their differences.
To build a wheel file for the standalone module, use:
pip wheel standalone
Much like the install command, wheel also supports requirement files:
pip wheel -r requirements.txt
pip has a built-in cache system for keeping distribution files downloaded from repositories. Whenever pip is used to install a package, the wheel files in the cache are preferred over downloading new distribution files from the repository. This makes the whole installation process faster and reduces traffic to repositories.
The pip cache command allows users to interact with pip’s wheel cache. There are several things you can do with it:
Show file path to the directory of all cache files:
pip cache dir
Show various information regarding the cache, such as the number of files and size of the cache:
pip cache info
List the file names in a pip cache:
pip cache list
To see a list of file paths for wheel files of specific packages, use:
pip cache list numpy --format=abspath
To remove specific packages from the cache, use:
pip cache remove numpy
Finally, to clear the whole cache:
pip cache purge
A hash value is a value computed from a file that changes if the file is altered. Since anyone can upload packages to PyPI, there may be tampered packages in the repository, at least in principle. Hash values allow users to check whether files have been tampered with or not.
To generate a hash value for a wheel file, use:
python -m pip hash tea-0.1.7-py3-none-any.whl
There are different algorithms for computing hash values. With pip, you can choose from sha256, sha384, and sha512 using the -a option:
python -m pip hash -a 'sha256' tea-0.1.7-py3-none-any.whl
Running this prints the file name followed by its hash value. We can compare this to the hash available on PyPI to confirm that it is indeed the correct file.
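For comparison, a SHA-256 digest of the same kind can be computed with Python's standard hashlib module. This is a minimal sketch rather than pip's implementation; the helper names file_sha256 and matches_expected are hypothetical, and the expected digest would be the hexadecimal value copied from the package's page on PyPI.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the hexadecimal SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Check a file against a published hexadecimal SHA-256 digest."""
    return file_sha256(path) == expected_hex.strip().lower()
```

For example, matches_expected("tea-0.1.7-py3-none-any.whl", published_digest) returns True only if the local file is byte-for-byte identical to the one the digest was computed from (published_digest being a placeholder for the value you copied).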
Here are some other commands listed in the pip documentation.
The config command allows users to interact with the configuration file (pip.conf) from the terminal. The configuration files are located in standardized locations depending on the platform (see “Location” in the documentation), and most of what can be done by the config command can be done by opening the configuration file in a text editor and editing its contents. An easy way to open the configuration file is to use the following terminal commands:
This will print out the locations of the pip.conf files on your system. If you want to open the global configuration file, you can use:
open /Library/Application\ Support/pip/pip.conf
(Notice that the space character has been escaped. Otherwise, the terminal will return an error.)
Alternatively, you can use the edit subcommand:
pip config --user edit
(For this to work, the $EDITOR environment variable needs to be set to the executable file of your favorite plain text editor. See this StackOverflow post for how to do this.)
Configuration File. The configuration files determine the default behavior of pip commands. There are three levels of configuration files. The global file determines pip’s behavior throughout the system, the user file determines the behavior for the user, and finally, the site file determines the behavior within a particular virtual environment.
Let’s look at what the contents of a configuration file should look like. If you wanted the output of the list command to be in freeze format, then you can put the following in the user configuration file:
[list]
format = freeze
There are several ways of viewing the content of config files using pip. If you want to see the contents of the user config file, use the following command:
pip config --user list
In the case of the configuration file we defined above, we will see the following output:
list.format = freeze
When using the config command, command behavior is assigned using variables given in the form “command.option”. (This is what is meant by “name” in the pip documentation.)
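The relationship between the sections of the configuration file and the “command.option” names can be illustrated with Python's standard configparser module. This is a sketch for illustration only (pip's own parsing may differ in details), and the function name config_names is hypothetical.

```python
import configparser


def config_names(text):
    """Flatten ini-style text into 'section.option' -> value pairs."""
    # RawConfigParser leaves '%' characters in values untouched.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    names = {}
    for section in parser.sections():
        for option, value in parser.items(section):
            names[f"{section}.{option}"] = value
    return names


# The user configuration file from the example above.
user_config = "[list]\nformat = freeze\n"
for name, value in config_names(user_config).items():
    print(f"{name} = {value}")
```

For this input the loop prints list.format = freeze: the section name supplies the command and the key supplies the option.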
If you want to see the contents of all of the configuration files at once (along with other information concerning the configuration files), you can use the debug subcommand:
pip config debug
You can display, set, and delete individual variables from the terminal as well. To display the value of a variable, use the get subcommand:
pip config --user get list.format
To delete the value of a variable (e.g. to reset list.format to its default value), use the unset subcommand:
pip config --user unset list.format
If you want to set the value of a variable (e.g. you want to set the format back to freeze), use the set subcommand:
pip config --user set list.format freeze
The debug command outputs information about the system that may be useful for debugging, such as the versions of pip and Python and where the executable is located:
pip debug
The pip search command allowed users to search for PyPI packages using a query. However, the command has been permanently disabled as of March 2021.
Finally, note that much of the content in the documentation and in this article is available through the pip help command. For instance, if you forget the syntax for config, use:
pip help config
This command provides the syntax for the config command as well as all the possible options associated with the command.