Python Virtual Environment
- What is a Virtual Environment
- Python Installation Directory Structure
- Need for Virtual Environment
- Virtual Environment Solutions
- Manage Dependencies
- Change Virtual Environment in IDE
What is a Virtual Environment?
In practice, you will often work on multiple projects at the same time. For example, say you are working on two projects.
Each Python project may require a different version of Python and of its 3rd party packages. However, when you install Python, there is typically only one version readily available. For example, I have a 64-bit Python 3.7 installation on my system, installed for all users, and here is what the installation directory looks like.
Whenever I type python in my console (command prompt), this executable gets called. Which Python version gets called depends on the PATH variable. You can use the following command to see its full value.
echo %path% # on windows
echo $PATH # on Linux or Mac
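The same lookup can also be inspected from Python itself. The following is a minimal sketch using only the standard library; the helper name path_entries is made up for illustration.

```python
import os
import shutil
import sys


def path_entries():
    """Return the directories listed in the PATH variable, in search order."""
    raw = os.environ.get("PATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


# The interpreter that is running this script.
print("Running interpreter:", sys.executable)

# The first "python" found on PATH (None if there is none).
print("First python on PATH:", shutil.which("python"))

# Directories are searched in this order when you type a command.
for entry in path_entries():
    print(entry)
```

The first directory in the list that contains a python executable is the one that wins when you type python on the console.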
So, whenever I type python in my command prompt, only this version (Python 3.7, 64-bit) is called. What if I wanted the flexibility to call another version of Python, say 3.6, 32-bit?
It can be done by specifying the full path, but doing that every time is a pain. Also, as we are going to see in the next section, having different versions of 3rd party libraries (like NumPy, Pandas etc.) side by side is next to impossible if you don’t have a virtual environment.
To understand this better, let’s dive a bit deeper into the Python Installation Directory structure and the 3rd party packages directory structure.
Python Installation Directory Structure
Here is a brief overview of the Python installation directory structure. So, now you should know where the following are.
- Python.exe executable
- pip script
- built-in libraries
- platform specific binaries
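If you would rather look these locations up than browse for them, the standard library can report them. This is a rough sketch; the mapping of sysconfig path names to the items in the list above is my own approximation, not something the installer documents in these terms.

```python
import sys
import sysconfig

# sysconfig.get_paths() returns a dict of well-known installation directories.
paths = sysconfig.get_paths()

print("Installation prefix:", sys.prefix)
print("Python executable:", sys.executable)
print("Scripts directory (where pip lives):", paths["scripts"])
print("Standard library:", paths["stdlib"])
print("Platform specific standard library files:", paths["platstdlib"])
```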
And where are the packages that you have installed using pip? A quick command will show you where the packages are stored. For example, here is where pandas is stored on my machine.
> pip show pandas
Name: pandas
Version: 0.25.0
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: http://pandas.pydata.org
Author: None
Author-email: None
License: BSD
Location: c:\users\ajaytech\appdata\roaming\python\python37\site-packages
Requires: numpy, pytz, python-dateutil
Required-by: mlxtend
The reason it is stored under my user’s directory (C:\users\ajaytech…) is that I installed these packages with the --user switch, like so.
> pip install numpy --user
However, I have installed another package called bokeh (a visualization package similar to matplotlib) without the --user switch. Guess where that package is installed.
> pip show bokeh
Name: bokeh
Version: 1.3.0
Summary: Interactive plots and applications in the browser from Python
Home-page: http://github.com/bokeh/bokeh
Author: Bokeh Team
Author-email: firstname.lastname@example.org
License: New BSD
Location: c:\program files\python37\lib\site-packages
Requires: PyYAML, python-dateutil, six, Jinja2, tornado, numpy, pillow, packaging
Required-by:
It is installed right inside Python’s installation directory, under the lib\site-packages folder. You should be able to open the folder and see it in the file explorer.
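The two locations can also be listed from Python. The sketch below uses the standard site and importlib modules; package_location is a hypothetical helper name, and numpy is only used as an example that may or may not be installed on your machine.

```python
import importlib.util
import site


def package_location(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# Where packages go when installed without --user.
print("Global site-packages:", site.getsitepackages())

# Where packages go when installed with --user.
print("Per-user site-packages:", site.getusersitepackages())

# Where a given package would actually be imported from.
print("json is loaded from:", package_location("json"))
print("numpy is loaded from:", package_location("numpy"))
```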
Need for Virtual Environment
Why are we trying to understand the directory structure? The key here is to understand that there is just one global installation (either using the --user switch or otherwise) of Python and of each 3rd party library. What that means is that, with what we have been doing so far, only one version of Python and one version of each 3rd party library is available at any given point on a system.
As we discussed at the beginning of the post, we cannot satisfy multiple project environments (needing different versions of Python or 3rd party libraries) with this approach.
What we need is a virtual environment (not to be confused with OS virtualization like VMware or VirtualBox) in which we can have multiple versions of Python (and 3rd party packages) side by side and switch between them seamlessly whenever needed.
Virtual Environment Solutions
In this section, we are going to discuss the most commonly used solutions to the problems described above.
venv
venv stands for Virtual Environment. Since version 3.3, Python has shipped the venv module in its standard library as a way to create lightweight virtual environments. For example, let’s create a virtual environment for project 1.
> python -m venv C:\Users\AjayTech\Documents\python-env\project-1
As soon as this command is executed, a new virtual environment is created in the directory you have specified. Here is what it looks like.
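The same module can be driven from a script, which is handy when you want to create environments programmatically. This is a minimal sketch: create_env is a hypothetical helper, and the example path in the comment is the one used in this post.

```python
import os
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return its scripts directory."""
    env_dir = Path(env_dir)
    # with_pip=True asks venv to bootstrap pip into the new environment,
    # which is what "python -m venv" does by default.
    venv.create(env_dir, with_pip=with_pip)
    # The directory is called Scripts on Windows and bin elsewhere.
    return env_dir / ("Scripts" if os.name == "nt" else "bin")


# Example usage (same target as the command above):
# scripts = create_env(r"C:\Users\AjayTech\Documents\python-env\project-1")
# print("Activate scripts are in:", scripts)
```

Besides the scripts directory, the new environment contains a pyvenv.cfg file and its own site-packages folder.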
In the command prompt, go inside that folder, then into its Scripts directory, and run activate.bat.
Once it is activated, the command prompt changes like this.
Now, let’s install an older version of NumPy, specific to this environment.
And you should be able to clearly see that the installation location is different from the global site-packages installation location.
When you run a numpy import in your python program (when run from this project specific location), your numpy import will be specific to this virtual environment’s version.
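A quick way to confirm which environment a program is really running in is to ask the interpreter. The sketch below relies on the fact that, inside a venv, sys.prefix and sys.base_prefix differ; in_virtual_env is a name I made up, and the numpy import is optional.

```python
import sys


def in_virtual_env():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the installation it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_env())

try:
    import numpy
except ImportError:
    print("numpy is not installed in this environment")
else:
    # The file path shows which site-packages the import came from.
    print("numpy", numpy.__version__, "loaded from", numpy.__file__)
```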
Let’s be clear here: we have only created a virtual environment. That doesn’t mean the code has to live in this directory. You are free to navigate to any directory you like and start the Python executable from there.
venv is meant to create a virtual environment for Python libraries only, not for your source code. Your source code stays where it is and has nothing to do with the virtual environment directory you have created. That directory is ONLY there to hold the environment-specific Python scripts and site-packages.
So far so good. Up to now, we have only used venv to get multiple versions of site-packages (3rd party packages like NumPy or Pandas). How about having different versions of Python, like 3.x and 2.x? Is that possible with venv?
The answer is NO, not on its own.
venv cannot install or choose a Python version for you. Every environment it creates uses the interpreter that ran python -m venv, so only the versions of 3rd party libraries differ from one environment to the next. To get a different Python version, that version has to be installed separately and used to create the environment.
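You can see which interpreter a venv environment is tied to by reading the pyvenv.cfg file at the top of the environment directory. Here is a small sketch, assuming the usual "key = value" layout of that file; read_pyvenv_cfg is a hypothetical helper.

```python
from pathlib import Path


def read_pyvenv_cfg(env_dir):
    """Parse the 'key = value' lines of <env_dir>/pyvenv.cfg into a dict."""
    settings = {}
    text = (Path(env_dir) / "pyvenv.cfg").read_text(encoding="utf-8")
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Example usage, with the project-1 environment created earlier:
# cfg = read_pyvenv_cfg(r"C:\Users\AjayTech\Documents\python-env\project-1")
# print("Base installation:", cfg.get("home"))
# print("Python version:", cfg.get("version"))
```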
virtualenv
virtualenv is another Python package that can manage virtual environments. Think of it as an enhanced version of venv. The basic usage of virtualenv is almost exactly the same as that of venv.
> pip install virtualenv
Once installed, use virtualenv to create a new virtual environment directory. You can either specify the target directory directly, or go to the location that you have designated for virtual environments and type in the new virtual environment name. For example, say we want to create a new project, project-2:
> virtualenv project-2
Using base prefix 'c:\\program files (x86)\\python37-32'
New python executable in C:\Users\AjayTech\Documents\python-env\project-2\Scripts\python.exe
Installing setuptools, pip, wheel...
done.
The following directories will be created by virtualenv.
As with venv, use the activate batch file inside the Scripts directory to activate this environment.
> activate
(project-2) C:\Users\AjayTech\Documents\python-env\project-2\Scripts>
Now that the new project environment is activated, you can quickly check which version of Python the current project-2 environment is using. On Windows use the where command. On Mac use the which command.
> (project-2) C:\Users\AjayTech\Documents\python-env\project-2\Scripts>where python
C:\Users\AjayTech\Documents\python-env\project-2\Scripts\python.exe
C:\Program Files (x86)\Python37-32\python.exe
C:\Program Files\Python37\python.exe
Previously, there were just 2 Python versions (32-bit and 64-bit versions of Python 3.7). Now, the where command shows a third Python executable, the one inside the virtual environment. In case you are wondering which version of Python will be used in this virtual environment, it will be the default Python version that was used when creating the virtual environment.
(project-2) C:\Users\AjayTech\Documents\python-env\project-2\Scripts>python --version
You can find out more about the version and platform by going into the Python shell. We can see that we are using the default Python 3.7, 32-bit version (the one that was used when creating the virtual environment).
> (project-2) C:\Users\AjayTech\Documents\python-env\project-2\Scripts>python
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 19:29:22) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
Choose your Python environment
However, you do have the option of picking the Python version when creating the virtual environment with the virtualenv package. Just use the -p option to point to the Python interpreter you want.
>virtualenv -p "C:\Program Files\Python37\python.exe" project-3
Running virtualenv with interpreter C:\Program Files\Python37\python.exe
Using base prefix 'C:\\Program Files\\Python37'
New python executable in C:\Users\AjayTech\Documents\python-env\project-3\Scripts\python.exe
Installing setuptools, pip, wheel...
done.
Now, if you activate the new virtual environment (project-3), you will see that we are using a specific version of Python ( version 3.7 , 64-bit ). You can verify this after activating the virtual environment.
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
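Instead of reading the banner, you can ask the interpreter for the same details. A minimal sketch using the standard library follows; interpreter_bits is a hypothetical helper that derives 32 or 64 from the size of a pointer.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    # "P" is the struct format code for a C pointer; its size is in bytes.
    return struct.calcsize("P") * 8


print("Version:", platform.python_version())
print("Bits:", interpreter_bits())
print("Executable:", sys.executable)
```

Running this inside project-2 and project-3 should report different bit widths and different executable paths.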
There are so many more options to create virtual environments in Python that are beyond the scope of this article. Here are some of the other options.
Manage Dependencies
Most Python projects need one or more 3rd party packages. For example, most data science projects use NumPy, Pandas, Matplotlib etc. as 3rd party packages or libraries. We typically use conda or pip to install these packages. These packages are dependencies of the project and they have to be managed effectively.
The code in a project moves through different environments:
- Development (your and your co-developers’ desktops)
- Staging (typically, your end users test the system here)
- Production (this is where your code finally resides)
Each of these environments might have multiple projects running simultaneously, so deployment to production needs a deterministic build. A deterministic build is one that always produces the same, predictable project environment.
When you move the code to production, you typically don’t move the packages. You just move your code and a file called requirements.txt that contains the list of all the packages and their versions. Once you are ready to move the code, write the dependencies (all the required packages and their versions) into a text file, requirements.txt. You can do that using the pip freeze command.
> pip freeze > requirements.txt
You can quickly view the dependencies written to the file (requirements.txt) on the console or in a text editor.
> type requirements.txt
cycler==0.10.0
kiwisolver==1.1.0
matplotlib==3.1.1
numpy==1.17.0
pyparsing==188.8.131.52
python-dateutil==2.8.0
six==1.12.0
As you can see, the requirements.txt clearly states all the required packages for the project along with the respective version number.
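Because each line is a simple name==version pin, the file is easy to check against what is actually installed. The sketch below is an illustration only: parse_pins and find_mismatches are hypothetical helpers, they only understand exact == pins, and they use importlib.metadata (Python 3.8 or newer).

```python
from importlib import metadata


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (wanted, installed)} for packages that do not match."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # not installed in this environment
        if installed != wanted:
            problems[name] = (wanted, installed)
    return problems


# Example usage:
# with open("requirements.txt") as handle:
#     print(find_mismatches(parse_pins(handle.read())))
```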
pip install requirements
Once the requirements have been frozen for a project, you can move the code to any environment. To recreate the dependencies, just use the command pip install -r and provide the dependencies file (requirements.txt).
> pip install -r requirements.txt
That’s it – pip will take care of installing the specific versions of the packages and recreate all your dependencies to the dot.
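In a deployment script, the same step can be run from Python. This is a sketch under the assumption that you want the install to go into whichever environment is running the script, which is why it calls pip through sys.executable; both function names are made up for illustration.

```python
import subprocess
import sys


def pip_install_command(requirements_path="requirements.txt"):
    """Build the pip command for the interpreter that is running this script."""
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements_path)]


def install_requirements(requirements_path="requirements.txt"):
    """Install the pinned packages; raises CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(requirements_path), check=True)


# Example usage (run it with the virtual environment's python):
# install_requirements("requirements.txt")
```

Calling pip as a module of the current interpreter avoids accidentally using a pip that belongs to a different Python installation on the PATH.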
Change Virtual Environment in IDE
As long as you are using the command line, you know how to change the virtual environment (using the activate and deactivate scripts). Once you have changed the virtual environment, you start working on your code in your IDE (say Visual Studio Code or Jupyter). Now, how do you ensure that the IDE is using the new virtual environment and not the standard Python environment?
The way you specify the Python interpreter and the libraries to use varies from one IDE to another.
Visual Studio Code
To work with a specific virtual environment inside Visual Studio Code, you have to tell it which virtual environment to use.
Open the Command Palette using Ctrl + Shift + P. Type in “Python: Select Interpreter”. You can just type it in partially, and Visual Studio Code pulls up the rest.
Visual Studio Code will show you the list of Python interpreters. In my case, the current Python interpreter is a virtual environment that has been created using the venv command. And of course, you will be given an option to change it from the drop down.
However, Visual Studio Code will not be able to look up the virtual environment by default, since it doesn’t know where you have created the virtual environment directory. To help it search, there is a specific parameter that you have to set in settings.json.
After specifying the virtual environment path, you can see that the following options show up.
You can select any of the interpreters, based on the virtual environment you want.
Jupyter Notebook
Jupyter Notebook allows you to run from any particular virtual environment. The default kernel uses the global Python installation along with its packages. However, we can create our own custom kernel that uses the specific version of Python available in a virtual environment. It only takes a couple of steps. For example, if we want to create a kernel for project-2, here is what we do.
Go inside the scripts directory of the virtual environment and run the activate script. For example, in the screenshot below, project-2 virtual environment has been activated. You can see that it is activated from the command line prefix – (project-2) is prepended to the command line.
Once inside the virtual environment, install ipykernel module using pip.
> pip install ipykernel
After ipykernel is installed within that virtual environment, run the following command from within that virtual environment.
> python -m ipykernel install --user --name project-2 --display-name "python project-2"
- --name refers to the virtual environment name. In this case it is project-2
- --display-name refers to how Jupyter Notebook shows it in the dropdown to select the specific environment
- --user is a safe option. It installs the new kernel (that corresponds to a specific virtual environment) only for the current user. The location where it is installed is also shown as soon as the above command is run
Installed kernelspec project-2 in C:\Users\AjayTech\AppData\Roaming\jupyter\kernels\project-2
You can open the actual kernel.json file to see the actual location of the virtual environment’s Python file.
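If you prefer not to open the file by hand, a few lines of Python can pull the interpreter path out of it. This sketch assumes the usual kernel spec layout, where the "argv" list holds the command used to start the kernel; kernel_interpreter is a hypothetical helper.

```python
import json
from pathlib import Path


def kernel_interpreter(kernel_dir):
    """Return the Python executable that a Jupyter kernel spec will launch."""
    spec_file = Path(kernel_dir) / "kernel.json"
    spec = json.loads(spec_file.read_text(encoding="utf-8"))
    # "argv" is the command Jupyter runs to start the kernel; for a kernel
    # installed with ipykernel, its first element is the interpreter.
    return spec["argv"][0]


# Example usage, with the location reported by the install command:
# print(kernel_interpreter(r"C:\Users\AjayTech\AppData\Roaming\jupyter\kernels\project-2"))
```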
Click on New and in the drop down, select the description of the new kernel we have created. For the virtual environment project-2, we have given a description of python project-2 . Select it and a new window opens up hooked up to the new virtual environment behind the scenes. You can check that the packages and versions correspond to the actual packages/versions in that specific virtual environment.