Installing Python Packages & setting up libraries for Data Science - the right way

fastaicourse
terminal
Python
setup
fastai
Deep learning
Author

Kurian Benoy

Published

May 28, 2022

Introduction

Jeremy has been conducting official course walk-thrus, which started on May 27, 2022, for students of the fastaiv5 course. The idea of these walk-thrus is that he is “going to explain exactly how to do every step, and why we do things the way we do”.

Here, “we” means the fastai approach to development. The walk-thrus were very useful to me and answered some questions I had been trying to resolve for a long time. I will use this material again and again, so I thought of writing it down to refer to later. The first walk-thru was on the topic: Introduction to the terminal. How to install Python and libraries. I felt I got more value out of these walk-thrus than out of some of the lessons in the course.

Important

The right way here means the way Jeremy does things. Obviously there are thousands of ways to do things, and each of them has its own pros and cons.

🚖 What’s a terminal, and how do you work with it?

Terminal vs Shell

A terminal is a program that displays a console window in which you can run programs. The thing running inside that window is not strictly the terminal itself; it is called a shell. The black window you see hackers using in movies is a terminal, as shown in the image below.

image

A terminal can usually host multiple shells, each with its own colours, settings, etc. So terminal and shell mean different things. You can think of a shell as a ship that does the main work it is supposed to do, like running programs, while the terminal is like a group of ships controlled by a parent, like a corporal.

Windows Terminal can start a number of shells, such as PowerShell, Command Prompt, Ubuntu, etc. Most of the time we use the words terminal and shell interchangeably, which is OK.
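To see the distinction in practice, you can ask the shell running inside your terminal to identify itself. This is a minimal sketch that assumes a Unix-like environment (Linux, macOS, or Ubuntu under WSL); current_shell is just an illustrative variable name.

```shell
# The login shell configured for your user account
# (may be unset in minimal environments, hence the fallback).
echo "Login shell: ${SHELL:-unknown}"

# The name of the shell process that is actually running these commands.
# Falls back to "unknown" if the ps command is not available.
current_shell="$(ps -p $$ -o comm= 2>/dev/null || echo unknown)"
echo "Current shell: ${current_shell}"
```

Opening a different profile in the same terminal window and running this again should report a different shell, while the terminal program stays the same.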

Installing Terminal & Shell

On Windows, you can get a terminal by downloading Windows Terminal. On Linux or macOS, just search for “terminal”, since one comes pre-installed.

On Windows, Jeremy recommends using WSL, which runs a Linux distribution within Windows, and then installing Ubuntu from the Microsoft Store.

Handy Tips

It’s a good idea to change the default shell from PowerShell to Ubuntu for ease of use. Also learn some keyboard shortcuts.

Important

Learning keyboard shortcuts can be immensely valuable.

Keyboard shortcuts

Some of the useful keyboard shortcuts shared during the lesson and in the forums are as follows:

Shared by Jeremy

Ctrl+Shift+1 - Open shell set for default profile.

Ctrl+Shift+3 - Open the shell listed as number 3 in the terminal’s profile list.

Alt+Enter - Switch the terminal to full screen.

Ctrl + r - Reverse search through previously typed commands.

Ctrl-a - Move to the start of the current line.

Ctrl-e - Move to the end of the line.

Tab - Autocomplete.

Shared by miwojc

Ctrl-f - Move forward a character.

Ctrl-b - Move back a character.

Alt-f - Move forward to the end of the next word. Words are alphanumeric.

Alt-b - Move back to the start of the current or previous word. Words are alphanumeric.

Ctrl-l - Clear the screen.

Ctrl-p - Fetch the previous command from the history list (same as up but easier to reach).

Ctrl-n - Fetch the next command from the history list (same as down).

Ctrl-r - Search backward through history.

Ctrl-d - Delete the character under the cursor.

Ctrl-k - Kill (cut) forwards to the end of the line.

Ctrl-u - Kill (cut) backwards to the start of the line.

Alt-d - Kill (cut) forwards to the end of the current word.

Ctrl-w - Kill (cut) backwards to the start of the current word.

Shared by Kurian

Alt Shift + - Open a new vertical pane

Alt Shift - - Open a new horizontal pane

Ctrl Shift w - Close the current pane

🔰 How to install Python packages for data science

Important

Never use the Python that comes by default with your operating system. Always use a different Python for the work you want to do, or else things will become really messy… ⚠️.
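A quick way to check which interpreter you are about to use is to look at where python3 resolves on your PATH. The sketch below is only a heuristic: it assumes that system interpreters live under /usr/bin or /bin, which is typical on Linux but not guaranteed, and is_system_python is a hypothetical helper name.

```shell
# Return 0 if the given interpreter path is in a typical system location.
# Assumption: system Pythons live under /usr/bin or /bin.
is_system_python() {
  case "$1" in
    /usr/bin/*|/bin/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Find out where python3 resolves on the current PATH (empty if not found).
py_path="$(command -v python3 || true)"

if [ -z "$py_path" ]; then
  echo "python3 is not on PATH"
elif is_system_python "$py_path"; then
  echo "python3 resolves to the system interpreter: $py_path"
else
  echo "python3 resolves to a separate install: $py_path"
fi
```

After a separate install such as the one described in the next section, the path reported should point inside your home directory rather than a system location.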

Installation with mamba

To follow this advice, let’s go ahead and install Python separately with a package manager called mamba, a fast cross-platform package manager. It is essentially a faster version of conda.

Go to the Mambaforge installer page and, based on your operating system, download the shell script and run it:

wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
bash Mambaforge-Linux-x86_64.sh
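If you are not on 64-bit Linux, the file name changes with your operating system and CPU architecture. The sketch below builds the name from uname, assuming the release files follow the pattern Mambaforge-<OS>-<ARCH>.sh used in the command above. The installer and url variable names are illustrative, and the set of published files may change over time, so check the releases page before downloading.

```shell
# Build the installer file name from the current OS and CPU architecture.
# Assumption: release files are named Mambaforge-<OS>-<ARCH>.sh,
# matching the Linux x86_64 file name used above.
installer="Mambaforge-$(uname)-$(uname -m).sh"
url="https://github.com/conda-forge/miniforge/releases/latest/download/${installer}"

# Print what would be downloaded, so you can verify it first.
echo "Installer: ${installer}"
echo "URL: ${url}"

# Uncomment to download and run the installer:
# wget "${url}"
# bash "${installer}"
```

Printing the URL first is a deliberate choice: it lets you confirm the file exists on the releases page before running anything.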

After the download and installation are complete, refresh the terminal by either closing and reopening it or typing the command below:

. ~/.bashrc

Important

Jeremy recommends installing popular libraries that are available in conda with mamba. If a library is not available in conda, or if you need an editable install, use pip.

🐛 Uninstalling mamba/conda

Note

If you are a beginner, working with virtual environments can be intimidating. So Jeremy recommends that beginners know how to delete their setup, so they can start over whenever they run into issues.

These are the steps to make sure conda/mamba is properly uninstalled in a Linux environment:

  • Check whether ipython and jupyter are installed. If they are, uninstall them first.

Note: You can uninstall via:

pip uninstall jupyter
pip uninstall ipython

You can check the path of a program with the which command, to see whether it was installed by the system Python or by mamba:

(base) kurianbenoy@Lap-34:~/blog/ml-blog$ which jupyter
/home/kurianbenoy/mambaforge/bin/jupyter

  • Then remove the mambaforge folder. On Linux, mambaforge is usually installed in /home/username.

So to delete the mambaforge folder:

(base) kurianbenoy@Lap-34:~$ ls
blog  downloads  mambaforge
(base) kurianbenoy@Lap-34:~$ rm -rf mambaforge/

  • Then delete the conda package as well.
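To confirm that nothing still points at the old install, you can check where the relevant commands resolve. This is a minimal sketch, assuming the default install location of $HOME/mambaforge; installed_under is a hypothetical helper name, and the list of tools is only an example.

```shell
# Return 0 if the command named $1 resolves to a path inside directory $2.
installed_under() {
  cmd_path="$(command -v "$1" 2>/dev/null)" || return 1
  case "$cmd_path" in
    "$2"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: mambaforge was installed in the default location.
prefix="$HOME/mambaforge"

# Report, for each tool, whether it still resolves inside the old install.
for tool in python jupyter ipython conda mamba; do
  if installed_under "$tool" "$prefix"; then
    echo "$tool still resolves inside $prefix"
  else
    echo "$tool does not resolve inside $prefix"
  fi
done
```

Run this in a freshly opened terminal. The installer also adds a conda initialization block to ~/.bashrc, which you may want to remove by hand so that new shells stop looking for the deleted folder.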

Using fastsetup

Go to the fastsetup GitHub repository and download the setup-conda.sh file.

wget https://raw.githubusercontent.com/fastai/fastsetup/master/setup-conda.sh

This is a normal shell script, so just run it to install mamba as follows:

source setup-conda.sh
. ~/.bashrc
conda install -yq mamba

My question to Jeremy

❓ I asked Jeremy which version of Python this setup uses, and how to use a different Python version.

On checking the Mambaforge repository, Jeremy found that the Python version we are using now is Python 3.9. It’s always a good idea to know which Python versions are near their end of life; that information can be found here. At the time of writing, any version from Python 3.7 to Python 3.10 is recommended. Jeremy himself usually prefers to start using the latest Python only one year after its release, but fastai does support the latest version.

Radek, who is an engineer at Nvidia, said he usually prefers mamba because with just one line of code you can switch to any Python version you want. After the session, Radek shared in the forums how to do this, as shown in the screenshot below.

image

Installing data science packages - pytorch and jupyter

Since the fastai course uses pytorch, let’s install pytorch following the instructions on the official page:

image

Important

One of the advantages of installing libraries like pytorch with conda is that it also installs the GPU-related packages (such as the CUDA toolkit) needed for GPU setup. In a normal pip-based installation, a lot more steps are required to get things working correctly on a GPU.

Go ahead and install the packages in this manner, just replacing conda with mamba.

mamba install pytorch torchvision torchaudio cpuonly -c pytorch

Note

It’s always a good idea to search for the correct conda package before installing, as conda needs some details to be specified, such as the correct channel to install from.
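Once the install finishes, it is worth checking that pytorch can be imported by the Python you expect. The sketch below assumes that python on your PATH is the mamba-installed interpreter; check_torch is a hypothetical helper name.

```shell
# Report whether torch can be imported by the python found on PATH.
check_torch() {
  if python -c "import torch" >/dev/null 2>&1; then
    # Print the installed version and whether a CUDA GPU is usable.
    python -c "import torch; print('torch', torch.__version__, 'cuda available:', torch.cuda.is_available())"
  else
    echo "torch is not importable from: $(command -v python || echo 'no python on PATH')"
  fi
}

check_torch
```

With the cpuonly install shown above, the CUDA check is expected to report False. If the import fails, the printed path tells you which interpreter was actually used.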

Now let’s install jupyter lab to do our experiments quickly:

mamba install jupyterlab

Then run jupyter lab with the following command:

jupyter lab --no-browser

This starts jupyterlab at localhost:8888, which you can then open in your browser.

image

Installing packages with fastchan

Jeremy didn’t cover this topic during the walk-thrus. I came to know about fastchan when I wanted to install pytorch and huggingface transformers on a GPU-based system. What fastchan is, and the problem it solves, is covered in a detailed blog post by Aman Arora.

To install both pytorch and huggingface transformers on a GPU machine, I used the following command:

mamba install pytorch transformers cudatoolkit=11.4 -c fastchan

Conclusion

At the start of the lesson, one of the fastai students talked about the need for a quick setup that just works as expected. One of the biggest takeaways for me personally is a quick and fast setup for doing my data science experiments. Installing data science packages can sometimes take hours of effort, and that’s why I really loved this setup.

Thanks to Jeremy Howard for creating this quick setup and for starting the project fastchan.