What is the best way to install Python packages?

What is the best way to install Python packages in Ubuntu 11? I am a recent convert to Ubuntu and want to learn best practices.

For context, I am looking to install the tweeststream package, but I did not see it in my Synaptic package manager. Also, I am very new to programming, but I usually can follow along with code samples.


Solution 1:

updated: 2019-05-11: This post mostly mentions virtualenv, but according to the Python doc about module installation, since Python 3.5 "the use of venv is now recommended for creating virtual environments", while virtualenv is an alternative for versions of Python prior to 3.4.

updated: 2018-08-17: since conda-4.4.0, use conda activate to activate Anaconda environments on all platforms

updated: 2017-03-27: PEP 513 - manylinux binaries for PyPI

updated: 2016-08-19: Continuum Anaconda Option

This is somewhat of a duplicate of the question "easy_install/pip or apt-get".

For global Python packages, use either the Ubuntu Software Center, apt, apt-get or synaptic

Ubuntu uses Python for many important functions, so interfering with the system Python can corrupt your OS. This is the main reason I never use pip on my Ubuntu system. Instead I use the Ubuntu Software Center, synaptic, apt-get, or the newer apt, all of which install packages from the Ubuntu repository by default. These packages are tested, usually pre-compiled so they install faster, and ultimately designed for Ubuntu. In addition, all required dependencies are installed, and a log of installs is maintained so they can be rolled back. I think most packages have corresponding Launchpad repos so you can file issues.

Another reason to use Ubuntu packages is that Python packages sometimes have different names depending on where you get them. python-chardet is an example of a package that at one time was named one thing on PyPI and another in the Ubuntu repository. Therefore something like pip install requests will not realize that chardet is already installed on your system, because the Ubuntu version has a different name, and it will install a second copy. That corrupts your system only in a minor, insignificant way, but still, why would you do that?

In general you only want to install trusted code into your OS. So you should be nervous about typing $ sudo pip <anything-could-be-very-bad>.
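As a rough illustration of that warning, here is a minimal sketch (assuming Python 3 on a Unix-like system) that reports whether the current interpreter is running as root outside a virtual environment. The helper names in_virtual_environment and is_risky_install are my own inventions for this example, not part of pip or Ubuntu.

```python
import os
import sys


def in_virtual_environment():
    """Return True when this interpreter runs inside a venv or virtualenv."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if getattr(sys, "real_prefix", None) is not None:
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def is_risky_install(euid, in_venv):
    """Root (euid 0) outside a virtual environment touches the system Python."""
    return euid == 0 and not in_venv


if __name__ == "__main__":
    # os.geteuid() only exists on Unix-like systems such as Ubuntu.
    euid = os.geteuid() if hasattr(os, "geteuid") else -1
    if is_risky_install(euid, in_virtual_environment()):
        print("Running as root outside a virtual environment: avoid pip here.")
    else:
        print("Not a root-level install into the system Python.")
```

This only checks who is running the interpreter and where; it says nothing about whether the package itself is trustworthy.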

Lastly, some things are just easier to install using Ubuntu packages. For example, if you try pip install numpy scipy without first installing gfortran, atlas-dev, blas-dev and lapack-dev, you will see an endless stream of compile errors. However, installing numpy & scipy through the Ubuntu repository is as easy as...

$ sudo apt-get install python-numpy python-scipy

You are in luck, because you are using Ubuntu, one of the most widely supported and frequently updated distributions in existence. Most likely every Python package you will need is in the Ubuntu repository, and it is probably already installed on your machine. Every 6 months, a new cycle of packages is released with the latest version of Ubuntu.

If you are 100% confident that the package will not interfere with your Ubuntu system in any way, then you can install it using pip. Ubuntu is nice enough to keep these packages separate from the distro packages by placing the distro packages in a folder called dist-packages/. The Ubuntu repository has pip, virtualenv and setuptools. However, I second Wojciech's suggestion to use virtualenv.
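To see that separation on your own machine, a small sketch like the following (assuming Python 3) labels each entry on sys.path and reports where a module would be imported from. The function names classify_path and module_location are hypothetical helpers for illustration only.

```python
import importlib.util
import os
import sys


def classify_path(entry):
    """Label a sys.path entry as dist-packages, site-packages, or other."""
    name = os.path.basename(os.path.normpath(entry)) if entry else ""
    if name in ("dist-packages", "site-packages"):
        return name
    return "other"


def module_location(module_name):
    """Return where a top-level module would load from, or None if absent."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    for entry in sys.path:
        print("%-14s %s" % (classify_path(entry), entry))
    # "chardet" is just the example from the text; it may not be installed.
    print("chardet:", module_location("chardet"))
```

If module_location points into a dist-packages directory, the module most likely came from the distro rather than from a pip install in a virtualenv.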

For personal Python projects use pip and wheel in a virtualenv

If you need the latest version, or the module is not in the Ubuntu repository, then start a virtualenv and use pip to install the package. Although setuptools and distribute have merged, IMO pip is preferred over easy_install or distutils, because it always waits until the package is completely downloaded and built before it copies it into your file system, and it makes upgrading or uninstalling a breeze. In a lot of ways it is similar to apt-get, in that it generally handles dependencies well. You may still have to handle some dependencies yourself, but since PEP 513 was adopted there are manylinux binaries on the Python Package Index (PyPI) for popular Linux distros like Ubuntu and Fedora. For example, both NumPy and SciPy are now distributed as manylinux wheels, by default using OpenBLAS instead of ATLAS. You can still build them from source with the pip option --no-binary <format control>, or with --no-use-wheel on older versions of pip (that flag was removed in pip 10). Building from source requires gfortran, atlas-dev, blas-dev and lapack-dev from the Ubuntu repository, as mentioned above.

~$ sudo apt-get install gfortran libblas-dev liblapack-dev libatlas-dev python-virtualenv
~$ mkdir ~/.venvs
~$ virtualenv ~/.venvs/my_py_proj
~$ source ~/.venvs/my_py_proj/bin/activate
~(my_py_proj)$ pip install --no-use-wheel numpy scipy  # pip >= 10: use --no-binary :all: instead
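On Python 3 the same kind of environment can also be created with the standard library venv module mentioned in the update notes at the top. The following is only a sketch under that assumption; create_env is a made-up helper name, and it creates the environment in a temporary directory rather than ~/.venvs so that running it leaves nothing behind.

```python
import os
import tempfile
import venv


def create_env(env_dir, with_pip=False):
    """Create a virtual environment and return the path to its interpreter."""
    # with_pip=True bootstraps pip through ensurepip; on Ubuntu that may
    # need the python3-venv package, so it is off by default in this sketch.
    venv.create(env_dir, with_pip=with_pip)
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return os.path.join(env_dir, bin_dir, exe)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        python = create_env(os.path.join(tmp, "my_py_proj"))
        print("interpreter:", python)
        print("exists:", os.path.lexists(python))
```

For real use you would pass a permanent directory such as ~/.venvs/my_py_proj and with_pip=True, then activate the environment from the shell as shown above.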

See the next section, "You're not in sudoers", below for installing updated versions of pip, setuptools, virtualenv or wheel to your personal profile using the --user installation scheme with pip. You can use this to update pip for your personal use, as J.F. Sebastian indicated in his comment to another answer. NOTE: the -m is really only necessary on MS Windows when updating pip.

python -m pip install --user pip setuptools wheel virtualenv

Newer versions of pip automatically cache wheels, so the following is only useful for older versions of pip. Since you may end up installing these packages many times, consider using wheel with pip to create a wheelhouse. Wheel has been included in virtualenv since v13.0.0, so you only need to install wheel first if your version of virtualenv is older than that.

~(my_py_proj)$ pip install wheel  # only for virtualenv < v13.0.0
~(my_py_proj)$ pip wheel --no-use-wheel numpy scipy

This will create binary wheel files in <cwd>/wheelhouse; use -w (--wheel-dir) to specify a different directory. Now if you start another virtualenv and you need the same packages you've already built, you can install them from your wheelhouse using pip install --find-links=<fullpath>/wheelhouse

Read Installing Python Modules in the Python documentation and Installing packages on the Python Package Index main page. Also see the documentation for pip, venv, virtualenv and wheel.
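To check what actually ended up in a wheelhouse, the wheel filenames can be read directly, since the wheel format (PEP 427) encodes name, version and platform tags in the filename. This is a minimal sketch: parse_wheel_filename and list_wheelhouse are hypothetical helpers, and the filename in the demo is a made-up example rather than output from a real build.

```python
import os
from collections import namedtuple

WheelInfo = namedtuple(
    "WheelInfo", "name version python_tag abi_tag platform_tag"
)


def parse_wheel_filename(filename):
    """Split name-version[-build]-python-abi-platform.whl into its parts."""
    stem, ext = os.path.splitext(os.path.basename(filename))
    if ext != ".whl":
        raise ValueError("not a wheel: %r" % filename)
    parts = stem.split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel name: %r" % filename)
    # An optional build tag may sit between the version and the Python tag,
    # so the last three fields are taken from the end of the list.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelInfo(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def list_wheelhouse(directory):
    """Return parsed info for every .whl file in a directory, sorted."""
    names = [f for f in os.listdir(directory) if f.endswith(".whl")]
    return sorted(parse_wheel_filename(f) for f in names)


if __name__ == "__main__":
    # Hypothetical filename, used only to show the fields.
    print(parse_wheel_filename("numpy-1.11.1-cp27-cp27mu-manylinux1_x86_64.whl"))
```

A platform tag starting with manylinux marks one of the pre-built PEP 513 binaries, while a wheel you built yourself will typically carry a linux_ platform tag instead.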

If you're not in sudoers and virtualenv isn't installed.

If you can't use a virtual environment, or if you are using a Linux share without root privileges, another option is the --user or --home=<wherever-you-want> installation scheme from Python's distutils, which installs packages under site.USER_BASE or wherever you want. Newer versions of pip also have a --user option. Do not use sudo!

pip install --user virtualenv

If your Linux version of pip is too old, then you can pass setup options using --install-option, which is useful for passing custom options to the setup.py scripts of some packages that build extensions, such as setting the PREFIX. You may need to just extract the distribution and use distutils to install the package the old-school way by typing python setup.py install [options]. Reading some of the install documentation and the distutils documentation may help.

Python is nice enough to add your user site-packages directory (site.USER_SITE, which lives under site.USER_BASE) to sys.path ahead of the system-wide package directories, so the changes will only affect you. A popular location for --home is ~/.local. See the Python module installation guide for the exact file structure and specifically where your site-packages are. Note: if you use the --home installation scheme, then you may need to add it to the PYTHONPATH environment variable, using export in your .bashrc, .bash_profile or in your shell, for your localized packages to be available in Python.
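If you are unsure where --user installs will land, the standard library site module can tell you. A minimal sketch, assuming Python 3 (user_install_locations is just a name I picked for the example):

```python
import site
import sys


def user_install_locations():
    """Collect the per-user install locations reported by the site module."""
    user_site = site.getusersitepackages()
    return {
        "user_base": site.getuserbase(),
        "user_site": user_site,
        # ENABLE_USER_SITE can be True, False or None (None when disabled
        # for security reasons), so it is normalized to a bool here.
        "enabled": bool(site.ENABLE_USER_SITE),
        # The directory is only put on sys.path if it already exists.
        "on_sys_path": user_site in sys.path,
    }


if __name__ == "__main__":
    for key, value in user_install_locations().items():
        print("%-12s %s" % (key, value))
```

Inside a virtual environment the user site is normally disabled, so expect enabled to be False there.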

Use Continuum Anaconda Python for Math, Science, Data, or Personal Projects

If you are using Python for math, science, or data, then IMO a really good option is the Anaconda-Python Distribution or the more basic miniconda distro released by Anaconda, Inc. (previously known as Continuum Analytics). Although anyone could benefit from using Anaconda for personal projects, the default installation includes over 500 math and science packages like NumPy, SciPy, Pandas, and Matplotlib, while miniconda only installs Anaconda-Python and the conda environment manager. Anaconda only installs into your personal profile, i.e. /home/<user>/, and recommends sourcing conda.sh in your ~/.bashrc, which lets you use conda activate <env> (the default env is base) to start Anaconda. This only affects you; your system path is unchanged. Therefore you do not need root access or sudo to use Anaconda! If you have already added Anaconda-Python, miniconda, or conda to your personal path, then you should remove the PATH export from your ~/.bashrc and update to the new recommendation, so your system Python will be first again.

This is somewhat similar to the --user option I explained in the last section except it applies to Python as a whole and not just packages. Therefore Anaconda is completely separate from your system Python, it won't interfere with your system Python, and only you can use or change it. Since it installs a new version of Python and all of its libraries you will need at least 200MB of room, but it is very clever about caching and managing libraries which is important for some of the cool things you can do with Anaconda.

Anaconda curates a collection of Python binaries and libraries required by dependencies in an online repository (formerly called binstar), and they also host user packages as different "channels". The package manager used by Anaconda, conda, by default installs packages from Anaconda, but you can signal a different "channel" using the -c option.

Install packages with conda just like pip:

$ conda install -c pvlib pvlib  # install pvlib pkg from pvlib channel

But conda can do so much more! It can also create and manage virtual environments just like virtualenv. Since Anaconda creates virtual environments, the pip package manager can be used to install packages from PyPI into an Anaconda environment without root or sudo. Do not use sudo with Anaconda! Warning: be careful when mixing pip and conda in an Anaconda environment, because you will have to manage package dependencies more carefully. An alternative to pip in a conda environment is the conda-forge channel, but it is best to use that in a fresh conda environment with conda-forge as the default channel. As a last resort, if you can't find a package anywhere but on PyPI, consider using pip with --no-deps, then install the remaining dependencies manually using conda.

Anaconda is also similar in some ways to Ruby RVM, if you're familiar with that tool. conda also lets you create virtual environments with different versions of Python, e.g. conda create -n py35sci python==3.5.2 numpy scipy matplotlib pandas statsmodels seaborn will create a scientific/data-science stack using Python-3.5 in a new environment called py35sci. You can switch environments using conda. Since conda-4.4.0 this differs from virtualenv, which uses source venv/bin/activate; before conda-4.4.0 the conda command also used source:

# AFTER conda-4.4 
~/Projects/myproj $ conda activate py35sci

# BEFORE conda-4.4 
~/Projects/myproj $ source activate py35sci
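To confirm from inside Python which conda environment is active, one option is to look at the environment variables that conda sets on activation. This sketch assumes the CONDA_PREFIX and CONDA_DEFAULT_ENV variables used by recent conda releases; very old releases may use different names, and active_conda_env is a hypothetical helper.

```python
import os


def active_conda_env(environ=None):
    """Return the active conda env's name and prefix, or None if none is set."""
    environ = os.environ if environ is None else environ
    prefix = environ.get("CONDA_PREFIX")
    if not prefix:
        return None
    return {"name": environ.get("CONDA_DEFAULT_ENV"), "prefix": prefix}


if __name__ == "__main__":
    env = active_conda_env()
    if env is None:
        print("No conda environment appears to be active.")
    else:
        print("Active conda environment: %(name)s at %(prefix)s" % env)
```

Passing a plain dict as environ makes the function easy to try out without activating anything.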

But wait there's more! Anaconda can also install different languages such as R for statistical programming from the Anaconda r channel. You can even set up your own channel to upload package distributions built for conda. As mentioned conda-forge maintains automated builds of many of the packages on PyPI at the conda-forge Anaconda channel.

Epilogue

There are many options for maintaining your Python projects on Linux, depending on your personal needs and access. However, if there's any one thing I hope you take away from this answer, it is that you should never use sudo pip to install Python packages. The use of sudo should be a warning to you to be extra cautious, because you will make system-wide changes that could have bad consequences. You have been warned.

Good luck and happy coding!

Solution 2:

I think the best way for you would be to install a Python packaging tool like "python-pip". You can install it with Synaptic or the Ubuntu Software Center.

Pip will allow you to easily install and uninstall Python packages, as simply as pip install package. In your case it would be something like this from the terminal:

sudo pip install tweeststream