Python Virtual Environment

What is a Virtual Environment?

In practice, you will often work on several projects at the same time. For example, say you are working on 2 projects.

Each Python project may require a different version of Python and of the 3rd party packages it uses. However, when you install Python, there is typically only one version readily available. For example, I have a 64-bit Python 3.7 installation on my system, installed for all users under C:\Program Files\Python37.

Whenever I type python on my console (command prompt), the python.exe from that directory gets called. Which Python version is called depends on the PATH variable: the first matching executable found in the listed directories wins. You can use the following command to print the full PATH value.

echo %path%  # on windows
echo $PATH  # on Linux or Mac

So, whenever I type python on my command prompt, only this version (Python 3.7, 64-bit) is called. What if I wanted the flexibility to call another version of Python, say 3.6, 32-bit?
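
The lookup order can also be inspected from Python itself. The following is a minimal sketch, not part of the original setup: the helper name find_on_path is hypothetical, and it assumes the shell simply picks the first match it finds while walking the PATH directories in order.

```python
import os
import sys


def find_on_path(program="python"):
    """Return every match for `program` on PATH, in lookup order."""
    # On Windows, executables carry an extension such as .exe or .bat.
    extensions = [""]
    if os.name == "nt":
        extensions = os.environ.get("PATHEXT", ".EXE;.BAT;.CMD").lower().split(";")

    matches = []
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for extension in extensions:
            candidate = os.path.join(directory, program + extension)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                if candidate not in matches:
                    matches.append(candidate)
    return matches


if __name__ == "__main__":
    print("Running interpreter:", sys.executable)
    # The first entry is the one a bare `python` command would start.
    for position, path in enumerate(find_on_path("python"), start=1):
        print(position, path)
```

If more than one line is printed, a bare python command uses the first one; the others are only reachable by full path.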

It can be done by specifying the full path to that interpreter, but that is a pain to do every time. Also, as we are going to see in the next section, keeping different versions of the 3rd party libraries (like NumPy, Pandas etc.) side by side is next to impossible if you don’t have a virtual environment.

To understand this better, let’s dive a bit deeper into the Python Installation Directory structure and the 3rd party packages directory structure.

Python Installation Directory Structure

Here is a brief overview of the Python installation directory structure. It shows where each of the following is located.

  • python.exe executable
  • pip script
  • built-in libraries
  • platform specific binaries
python installation directory structure

And where are the packages that you have installed using pip? The pip show command tells you where a package is stored. For example, here is where pandas is stored on my machine.

> pip show pandas
Name: pandas
Version: 0.25.0
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: http://pandas.pydata.org
Author: None
Author-email: None
License: BSD
Location: c:\users\ajaytech\appdata\roaming\python\python37\site-packages
Requires: numpy, pytz, python-dateutil
Required-by: mlxtend

Packages end up under my user directory (C:\users\ajaytech…) when they are installed with the --user switch. For example, here is how I installed NumPy.

> pip install numpy --user

However, I have installed another package called bokeh (a visualization package similar to matplotlib) without the --user switch. Guess where that package is installed.

> pip show bokeh
Name: bokeh
Version: 1.3.0
Summary: Interactive plots and applications in the browser from Python
Home-page: http://github.com/bokeh/bokeh
Author: Bokeh Team
Author-email: info@bokeh.org
License: New BSD
Location: c:\program files\python37\lib\site-packages
Requires: PyYAML, python-dateutil, six, Jinja2, tornado, numpy, pillow, packaging
Required-by:

It is installed right inside Python’s installation directory, under the lib\site-packages folder. You should be able to open the folder and see it in the file explorer.
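
The same locations can be looked up from inside Python. This is a small sketch rather than a replacement for pip show; the helper name package_location is made up for illustration, and the three package names are only the examples used in this post (any of them may be missing on your machine).

```python
import importlib.util
import site
import sys


def package_location(module_name):
    """Return the file a module would be imported from, or None if absent."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Per-user site-packages: the target of `pip install --user`.
    print("User site-packages:", site.getusersitepackages())
    # sys.prefix is the root of the Python installation in use.
    print("Installation prefix:", sys.prefix)
    for name in ("numpy", "pandas", "bokeh"):
        print(name, "->", package_location(name))
```

A package installed with --user should resolve to a path under the user site-packages directory, while one installed without it should resolve to a path under the installation prefix.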

Need for Virtual Environment

Why are we trying to understand the directory structure? The key point is that there is just one global installation of Python and of each 3rd party library (whether installed with the --user switch or without it). With what we have been doing so far, only one version of Python and one version of each 3rd party library is available on the system at any given time.

As we discussed at the beginning of the post, we cannot satisfy multiple project environments (needing different versions of Python or 3rd party libraries) with this approach.

What we need is a virtual environment (not to be confused with OS virtualization like VMware or VirtualBox) in which we can have multiple versions of Python (and 3rd party packages) side by side and seamlessly switch between them whenever needed.

In the following section, we are going to discuss the most used solutions for the problem described above.

Solutions

We will discuss 2 solutions in detail, venv and virtualenv, and then list a few other options.

Python venv

venv stands for Virtual Environment. Since version 3.3, Python has shipped the venv module in its standard library as a way to create lightweight virtual environments. For example, let’s create a virtual environment for project 1.

> python -m venv C:\Users\AjayTech\Documents\python-env\project-1

As soon as this command is executed, a new virtual environment is created in the directory you have specified. Here is what it looks like.

venv folder structure
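
The venv module can also be called from Python code, which is a handy way to look at what gets created. The sketch below is only an illustration: it uses a temporary directory instead of the python-env folder above, and it passes with_pip=False to stay fast, whereas python -m venv installs pip into the environment by default. The function name create_environment is hypothetical.

```python
import os
import tempfile
import venv


def create_environment(env_dir):
    """Create a virtual environment in env_dir and return its top-level entries."""
    # with_pip=False skips bootstrapping pip, so this runs in a second or two.
    venv.create(env_dir, with_pip=False)
    return sorted(os.listdir(env_dir))


if __name__ == "__main__":
    # A temporary directory stands in for ...\python-env\project-1.
    with tempfile.TemporaryDirectory() as base:
        for entry in create_environment(os.path.join(base, "project-1")):
            print(entry)
```

On Windows the listing should include a Scripts folder; on Linux or Mac the equivalent folder is called bin. In both cases there is a pyvenv.cfg file at the top level.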

In the command prompt, go into the Scripts directory inside that folder and run activate.bat.

run activate.bat

Once it is activated, the command prompt changes like this.

project-1 activated and you can see it pre-pended to the command prompt

Now, let’s install an older version of NumPy, specific to this environment.

install older version of NumPy in virtual environment

And you should be able to clearly see that the installation location is different from the global site-packages installation location.

virtual environment specific package version and installation location

When your Python program imports numpy while this virtual environment is active, it gets the numpy version that is installed in this virtual environment.

virtual environment specific package version
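
To double check which environment a program is really running in, a few lines of Python are enough. This is a minimal sketch; in_virtual_environment is a hypothetical helper name, and the numpy import is only attempted if numpy happens to be installed.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv (and recent virtualenv) point sys.prefix at the environment directory
    # and keep the original installation in sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return base_prefix != sys.prefix or hasattr(sys, "real_prefix")


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
    try:
        import numpy
        print("numpy", numpy.__version__, "loaded from", numpy.__file__)
    except ImportError:
        print("numpy is not installed in this environment")
```

Run it once with the environment activated and once without, and compare the interpreter path and the numpy location.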

So far so good. With venv, we have been able to keep multiple versions of site-packages (3rd party packages like NumPy or Pandas). How about having different versions of Python, like 3.x and 2.x? Is that possible with venv?

Before answering, let’s be clear about one thing: we have only created different virtual environments. That doesn’t mean the code has to live in this directory. You are free to navigate to any directory you like and start your Python executable from there.

venv is meant to create a virtual environment for Python libraries only, not for the source code.

Your source code remains where it is. It has nothing to do with the virtual environment directory that you have created. That directory is ONLY there to hold the environment-specific Python scripts and site-packages.

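Now, back to the question: can each virtual environment created with venv have its own Python version?

The answer is NO, not by itself.

venv has no option for choosing a Python version. Every environment uses the interpreter that ran python -m venv, and since venv only exists from Python 3.3 onwards, a Python 2.x environment is out of reach. To get another version, that version must already be installed and you have to run its python executable (by full path) with -m venv.

What venv does give you is a different set of 3rd party libraries for each virtual environment.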
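
Each environment created by venv records the interpreter it was built from in a small pyvenv.cfg file at its top level. The sketch below reads that file; read_pyvenv_cfg is a hypothetical helper, and the environment is created in a temporary directory (without pip) purely for the demonstration.

```python
import os
import tempfile
import venv


def read_pyvenv_cfg(env_dir):
    """Parse the `key = value` lines of an environment's pyvenv.cfg file."""
    settings = {}
    with open(os.path.join(env_dir, "pyvenv.cfg"), encoding="utf-8") as handle:
        for line in handle:
            key, separator, value = line.partition("=")
            if separator:
                settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        env_dir = os.path.join(base, "demo-env")
        venv.create(env_dir, with_pip=False)
        cfg = read_pyvenv_cfg(env_dir)
        # "home" is the directory of the interpreter that created the environment.
        print("home    =", cfg.get("home"))
        print("version =", cfg.get("version"))
```

The home entry points back to the Python installation that ran venv, which is why the environment’s Python version follows that installation.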

virtualenv package

virtualenv is another Python package that can manage virtual environments. Think of it as an enhanced version of venv. The basic way to use virtualenv is almost exactly the same as with venv.

  • Step 1 – Install virtualenv
> pip install virtualenv

Once installed, use virtualenv to create a new virtual environment directory. You can directly specify the target directory, or go to the project environment directory location that you have designated for virtual environments and type in the new virtual environment name. For example, say we want to create a new project – project-2,

> virtualenv project-2
Using base prefix 'c:\\program files (x86)\\python37-32'
New python executable in C:\Users\AjayTech\Documents\python-env\project-2\Scripts\python.exe
Installing setuptools, pip, wheel...
done.

The following directories will be created by virtualenv.

virtualenv virtual environment directory structure

As with venv, use the activate batch file inside the Scripts directory to activate this environment.

> activate
(project-2) C:\Users\AjayTech\Documents\python-env\project-2\Scripts>

Now that the new project environment is activated, you can quickly check which version of Python the current project-2 environment is using. On Windows, use the where command. On Linux or Mac, use the which command.

> (project-2) C:\Users\AjayTech\Documents\python-env\project-2\Scripts>where python
C:\Users\AjayTech\Documents\python-env\project-2\Scripts\python.exe
C:\Program Files (x86)\Python37-32\python.exe
C:\Program Files\Python37\python.exe

Previously, there were just 2 Python executables on the path (the 32-bit and 64-bit versions of Python 3.7). Now, the where command shows a third one, and the virtual environment’s python.exe is listed first, so it is the one that gets called. In case you are wondering which version of Python will be used in this virtual environment, it is the default Python version that was used when creating the virtual environment.

(project-2) C:\Users\AjayTech\Documents\python-env\project-2\Scripts>python --version
Python 3.7.4

You can find out more about the version and platform by going into the Python shell. We can see that we are using the default Python 3.7, 32-bit version (the one that was used when creating the virtual environment).

> (project-2) C:\Users\AjayTech\Documents\python-env\project-2\Scripts>python
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul  8 2019, 19:29:22) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

Choose your Python environment

However, you do have the option of picking the Python version when creating the virtual environment with the virtualenv package. Just use the -p option to specify the path of the Python interpreter to use.

>virtualenv -p "C:\Program Files\Python37\python.exe" project-3
Running virtualenv with interpreter C:\Program Files\Python37\python.exe
Using base prefix 'C:\\Program Files\\Python37'
New python executable in C:\Users\AjayTech\Documents\python-env\project-3\Scripts\python.exe
Installing setuptools, pip, wheel...
done.

Now, if you activate the new virtual environment (project-3), you will see that we are using a specific version of Python ( version 3.7 , 64-bit ). You can verify this after activating the virtual environment.

(project-3) C:\Users\AjayTech\Documents\python-env>python
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul  8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

More Options

There are so many more options to create virtual environments in Python that are beyond the scope of this article. Here are some of the other options.

  • pipenv
  • pyenv
  • conda
  • docker images

Manage Dependencies

Most Python projects need one or more 3rd party packages. For example, most data science projects use NumPy, Pandas, Matplotlib etc. as 3rd party packages or libraries. We typically use conda or pip to install these packages. These packages are dependencies of the project, and they have to be managed effectively.

The code in a project moves through different environments

  • Development (your desktop and your co-developers’ desktops)
  • Staging (typically, your end users test the system here)
  • Production (this is where your code finally resides)

Each of these environments might have multiple projects running simultaneously. So, deployment to production needs a deterministic build, that is, one that always produces the same, predictable project environment.

When you move the code to production, you typically don’t move the packages. You just move your code and a file called requirements.txt that lists all the packages and their versions. Once you are ready to move the code, write the dependencies (all the required packages and their versions) into that text file. You can do that using the pip freeze command.

pip freeze

> pip freeze > requirements.txt

You can quickly view the dependencies written to the file (requirements.txt) on the console or in a text editor.

> type requirements.txt
cycler==0.10.0
kiwisolver==1.1.0
matplotlib==3.1.1
numpy==1.17.0
pyparsing==2.4.1.1
python-dateutil==2.8.0
six==1.12.0

As you can see, the requirements.txt clearly states all the required packages for the project along with the respective version number.
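
Since the file is just plain text with one name==version entry per line, it is easy to read from a script. Here is a minimal sketch; parse_requirements is a hypothetical helper that only understands pinned lines of exactly that form and ignores everything else a real requirements file may contain (options, URLs, version ranges).

```python
def parse_requirements(text):
    """Map package name -> pinned version for lines of the form name==version."""
    pinned = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # this sketch ignores unpinned or option lines
        name, version = line.split("==", 1)
        pinned[name.strip().lower()] = version.strip()
    return pinned


# The requirements.txt content listed in this post.
SAMPLE = """\
cycler==0.10.0
kiwisolver==1.1.0
matplotlib==3.1.1
numpy==1.17.0
pyparsing==2.4.1.1
python-dateutil==2.8.0
six==1.12.0
"""

if __name__ == "__main__":
    for name, version in parse_requirements(SAMPLE).items():
        print(f"{name:<16} {version}")
```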

pip install requirements

Once the requirements have been frozen for a project, you can move the code to any environment. To recreate the dependencies there, just use the command pip install -r and provide the dependencies file (requirements.txt).

> pip install -r requirements.txt

That’s it. pip will install the specific versions of the packages listed in the file and recreate your dependencies exactly.
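
After the install, it can be reassuring to confirm that the target environment really matches the pinned versions. The sketch below is one possible check, not something pip requires: find_mismatches is a hypothetical helper, it relies on importlib.metadata (part of the standard library from Python 3.8 onwards), and the three pinned versions are copied from the requirements.txt in this post.

```python
from importlib import metadata  # standard library from Python 3.8 onwards


def find_mismatches(pinned):
    """Compare pinned versions with what is installed.

    Returns {name: (expected, installed)}; installed is None when missing.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    wanted = {"numpy": "1.17.0", "matplotlib": "3.1.1", "six": "1.12.0"}
    problems = find_mismatches(wanted)
    if not problems:
        print("Environment matches the pinned versions")
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed}")
```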

Change Virtual Environment in IDE

As long as you are using the command line, we know how to change the virtual environment (using the activate and deactivate scripts). Once you have changed the virtual environment, you start working on your code in your IDE (say Visual Studio Code or Jupyter). Now, how do you ensure that the IDE is using the new virtual environment and not the standard Python environment?

The way you specify the Python interpreter and the libraries to use varies from IDE to IDE.

Visual Studio Code

To work with a specific virtual environment inside Visual Studio Code, you have to select its interpreter.

Open the Command Palette using Ctrl + Shift + P and type in “Python: Select Interpreter”. You can type it in partially, and Visual Studio Code pulls up the rest.

visual studio code select python interpreter

Visual Studio Code will show you the list of Python interpreters. In my case, the current Python interpreter belongs to a virtual environment that was created using the venv command. And of course, you can pick a different one from the drop down.

visual studio code python interpreter select

However, Visual Studio Code will not be able to find the virtual environment by default, since it doesn’t know where you have created the virtual environment directory. To help it search, set the python.venvPath parameter in settings.json.

visual studio code settings.json to specify virtual environment path (python.venvpath)
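
As a rough idea of what that entry looks like, the sketch below prints a settings.json fragment. It is an assumption-laden illustration: the setting name python.venvPath is taken from the caption above, the directory is the python-env folder used for the projects in this post, and venv_settings is a made-up helper. Check the Python extension’s documentation for your version before relying on it.

```python
import json


def venv_settings(venv_root):
    """Build a settings.json fragment that points VS Code at venv_root."""
    return json.dumps({"python.venvPath": venv_root}, indent=4)


if __name__ == "__main__":
    # The folder that contains project-1, project-2 and project-3.
    print(venv_settings(r"C:\Users\AjayTech\Documents\python-env"))
```

Note that JSON requires the backslashes in a Windows path to be doubled, which json.dumps takes care of.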

After specifying the virtual environment path, you can see that the following options show up.

visual studio code, python interpreter for each virtual environment available

You can select any of the interpreters, based on the virtual environment you want.

Jupyter notebook

Jupyter notebook can run from any particular virtual environment. The default kernel uses the global Python installation along with its packages. However, we can create our own custom kernel that uses the Python available in a virtual environment. It takes just a couple of steps. For example, to create a kernel for project-2, here is what we do.

  • Activate the virtual environment

Go inside the Scripts directory of the virtual environment and run the activate script. Once the project-2 virtual environment has been activated, you can tell from the command line prefix: (project-2) is prepended to the prompt.

  • Install ipykernel within that virtual environment

Once inside the virtual environment, install ipykernel module using pip.

> pip install ipykernel
  • Run the following script to set up this virtual environment as a Jupyter Kernel profile

After ipykernel is installed within that virtual environment, run the following command from within that virtual environment.

> python -m ipykernel install --user --name project-2 --display-name "python project-2"
  • --name is the internal name of the kernel. Here we reuse the virtual environment name, project-2
  • --display-name is the label Jupyter notebook shows in the dropdown for selecting the kernel
  • --user is a safe option. It installs the new kernel (that corresponds to a specific virtual environment) only for the current user. The location where it is installed is shown as soon as the above command is run
Installed kernelspec project-2 in C:\Users\AjayTech\AppData\Roaming\jupyter\kernels\project-2

You can open the kernel.json file in that folder to see the location of the virtual environment’s Python executable.

Jupyter’s kernel.json file corresponding to the virtual environment.
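
A kernel.json file is a small JSON document whose argv entry holds the command Jupyter runs to start the kernel, so the interpreter path can be read with a few lines of Python. The sketch below is illustrative only: kernel_interpreter is a hypothetical helper, and SAMPLE_SPEC is a hand-written stand-in built from the paths in this post, not a copy of the real file.

```python
import json
import os
import tempfile


def kernel_interpreter(kernel_dir):
    """Return (display_name, interpreter_path) from kernel_dir/kernel.json."""
    with open(os.path.join(kernel_dir, "kernel.json"), encoding="utf-8") as handle:
        spec = json.load(handle)
    # argv is the command Jupyter runs; its first element is the interpreter.
    return spec["display_name"], spec["argv"][0]


# Illustrative content only: the paths follow the examples in this post.
SAMPLE_SPEC = {
    "argv": [
        r"C:\Users\AjayTech\Documents\python-env\project-2\Scripts\python.exe",
        "-m", "ipykernel_launcher", "-f", "{connection_file}",
    ],
    "display_name": "python project-2",
    "language": "python",
}

if __name__ == "__main__":
    # Write the sample to a temporary folder, then read it back.
    with tempfile.TemporaryDirectory() as kernel_dir:
        target = os.path.join(kernel_dir, "kernel.json")
        with open(target, "w", encoding="utf-8") as handle:
            json.dump(SAMPLE_SPEC, handle, indent=1)
        print(kernel_interpreter(kernel_dir))
```

To inspect a real kernel, pass the folder reported by the install command (the kernels\project-2 directory above) instead of the temporary one.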
  • Now, we are ready to switch the kernel in Jupyter. Open Jupyter notebook.
New jupyter notebook kernel corresponding to new virtual environment

Click on New and, in the drop down, select the display name of the new kernel we have created. For the virtual environment project-2, we gave it the display name python project-2. Select it and a new window opens up, hooked up to the new virtual environment behind the scenes. You can check that the packages and versions correspond to the actual packages/versions in that specific virtual environment.

new virtual environment kernel