Introduction
Jeremy has been conducting official course walk-thrus, which started on May 27, 2022, for students of the fastaiv5 course. The idea of these walk-thrus was “going to explain exactly how to do every step, and why we do things the way we do”.
“We” here means the fastai approach to development. It was very useful to me and answered some questions I had been searching on for a long time. I will use it again and again, so I thought of writing it down to refer to later. The first walk-thru was on the topic: Introduction to the terminal. How to install Python and libraries, and I felt I got more value out of these walk-thrus than out of some of the lessons in the course.
The right way here means the way Jeremy does things. Obviously there are thousands of ways to do something, and each way has its own pros and cons.
🚖 What’s a terminal, and how do you work with it?
Terminal vs Shell
A terminal is a program that displays a console window in which you run programs. Yet the thing running inside it is not strictly the terminal; it is called a shell. The black window that hackers use in movies is a terminal, as shown in the image below.
A terminal can usually host multiple shells, each of which can have different colours etc. So terminal and shell have quite different meanings. You can think of a shell as a ship that does the main work it’s supposed to do, like running programs, while the terminal is like a group of ships that is usually controlled by a parent, like a corporal.
Windows Terminal can start a number of shells, such as PowerShell, Command Prompt, Ubuntu, etc. Most of the time we use “terminal” and “shell” interchangeably, which is OK.
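To see the distinction on your own machine, a minimal sketch like the following prints the login shell configured for your user and the shell process that is actually running the commands. The helper name `running_shell` is made up for illustration, and the fallback to `$0` is an assumption for systems where `ps` is unavailable.

```shell
# Hypothetical helper: name of the shell process executing this code.
running_shell() {
    # `ps -p PID -o comm=` prints just the command name of that process;
    # fall back to $0 if ps is missing or does not support these options.
    ps -p "$$" -o comm= 2>/dev/null || printf '%s\n' "${0##*/}"
}

echo "Login shell:   ${SHELL:-unknown}"
echo "Running shell: $(running_shell)"
```

The terminal program itself (Windows Terminal, for example) does not show up here; only the shell running inside it does.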
Installing Terminal & Shell
On Windows, you can get a terminal by downloading Windows Terminal. On Linux/macOS, just search for “terminal”, as these systems come with one pre-installed.
On Windows, Jeremy recommends using WSL, which installs a Linux distribution within Windows; you then install Ubuntu from the Microsoft Store.
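Once Ubuntu is installed, you may want to confirm that the shell you opened is really running under WSL. The sketch below relies on the fact that WSL kernels mention "microsoft" in `/proc/version`; the helper name `is_wsl` is hypothetical, and its optional file argument exists only to make the check easy to try out.

```shell
# Hypothetical helper: succeeds when the kernel version string mentions Microsoft,
# which is how WSL kernels identify themselves.
is_wsl() {
    # $1: file to inspect (defaults to /proc/version)
    grep -qi microsoft "${1:-/proc/version}" 2>/dev/null
}

if is_wsl; then
    echo "Running inside WSL"
else
    echo "Not running inside WSL"
fi
```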
Handy Tips
It’s a good idea to change the default shell from PowerShell to Ubuntu for ease of use. Also learn some keyboard shortcuts; they can be immensely valuable.
Keyboard shortcuts
Some of the useful keyboard shortcuts shared during the lesson and in the forums are as follows:
Shared by Jeremy
Ctrl+Shift+1 - Open the shell set as the default profile.
Ctrl+Shift+3 - Open the shell listed as number 3 in Windows Terminal.
Alt+Enter - Toggle full screen for the terminal.
Ctrl-r - Reverse search through previously typed commands.
Ctrl-a - Move to the start of the current line.
Ctrl-e - Move to the end of the line.
Tab - Autocomplete.
Shared by miwojc
Ctrl-f - Move forward a character.
Ctrl-b - Move back a character.
Alt-f - Move forward to the end of the next word. Words are alphanumeric.
Alt-b - Move back to the start of the current or previous word. Words are alphanumeric.
Ctrl-l - Clear the screen.
Ctrl-p - Fetch the previous command from the history list (same as up but easier to reach).
Ctrl-n - Fetch the next command from the history list (same as down).
Ctrl-r - Search backward through history.
Ctrl-d - Delete the character under the cursor.
Ctrl-k - Kill (cut) forwards to the end of the line.
Ctrl-u - Kill (cut) backwards to the start of the line.
Alt-d - Kill (cut) forwards to the end of the current word.
Ctrl-w - Kill (cut) backwards to the start of the current word.
Shared by Kurian
Alt+Shift+Plus - Open a new vertical pane.
Alt+Shift+Minus - Open a new horizontal pane.
Ctrl+Shift+w - Close a pane.
🔰 How to install Python Packages for datascience
Never use the Python that comes by default with the operating system. Always use a different Python for the things you want to work on, else it will become really messy… ⚠️.
Installation with mamba
To follow this advice, let’s go ahead and install Python separately with a package manager called mamba, which is a fast cross-platform package manager. It’s also known as a faster version of conda.
So go to the mambaforge installer. Based on your operating system, download the shell script and install with:
wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
bash Mambaforge-Linux-x86_64.sh
After the download and installation are complete, refresh the terminal either by closing and reopening it or by typing the command below:
. ~/.bashrc
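After reloading the shell, it is worth confirming that `python` no longer resolves to the operating system's copy. The sketch below is one way to check; `python_origin` is a hypothetical helper, and treating `/usr/bin` and `/bin` as "system" locations is an assumption that holds on typical Linux distributions.

```shell
# Hypothetical helper: classify the path of a python executable.
python_origin() {
    case "$1" in
        "")                echo "missing"  ;;  # nothing found on PATH
        /usr/bin/*|/bin/*) echo "system"   ;;  # assumed OS-managed locations
        *)                 echo "separate" ;;  # e.g. ~/mambaforge/bin/python
    esac
}

python_origin "$(command -v python)"
```

With mambaforge installed in its default location and the shell reloaded, the answer you are hoping for is `separate`.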
Jeremy recommends installing popular libraries that are available in conda with mamba. If a library is not available in conda, or needs an editable install, use pip.
🐛 Uninstalling mamba/conda
If you are a beginner, working with virtual environments is intimidating. So Jeremy recommends that, as a beginner, you know how to delete your setup whenever you face any issues.
These are the steps to make sure conda/mamba is properly uninstalled in a Linux environment:
- Check whether ipython and jupyter are installed. If yes, uninstall them first.
Note: You can uninstall via:
pip uninstall jupyter
pip uninstall ipython
You can check the path of a program with the which command, to see whether it was installed by the system Python or by mamba:
(base) kurianbenoy@Lap-34:~/blog/ml-blog$ which jupyter
/home/kurianbenoy/mambaforge/bin/jupyter
- Then remove the mambaforge folder. On Linux, mambaforge is usually installed in /home/username.
So to delete the mambaforge folder:
(base) kurianbenoy@Lap-34:~$ ls
blog downloads mambaforge
(base) kurianbenoy@Lap-34:~$ rm -rf mambaforge/
- Then delete the conda package as well.
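As a quick check after following the steps above, a sketch along these lines lists anything conda-related still left in a home directory. `leftover_conda_paths` is a hypothetical helper, and the names it looks for are the usual default install folders and conda's per-user files, not an exhaustive list.

```shell
# Hypothetical helper: print conda/mamba leftovers found under a home directory.
leftover_conda_paths() {
    home="${1:-$HOME}"   # directory to inspect; defaults to your own home
    for name in mambaforge miniforge3 miniconda3 anaconda3 .conda .condarc; do
        if [ -e "$home/$name" ]; then
            echo "$home/$name"
        fi
    done
}

leftover_conda_paths
```

No output means none of those paths exist. If you let the installer run `conda init`, it also adds a `conda initialize` block to `~/.bashrc`, which you may want to remove by hand.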
Using fastsetup
Go to the fastsetup GitHub repository and download the setup-conda.sh file.
wget https://raw.githubusercontent.com/fastai/fastsetup/master/setup-conda.sh
This is a normal shell script, so just run it to install mamba as follows:
source setup-conda.sh
. ~/.bashrc
conda install -yq mamba
My question to Jeremy
❓I asked Jeremy which version of Python we are using and how to use a different Python version.
On checking the Mambaforge repository, Jeremy realised that the Python version we are using now is Python 3.9. It’s always a good idea to know which Python versions are nearing their end of life. At the time of writing, any version from Python 3.7 up to Python 3.10 is recommended. According to Jeremy, he usually prefers to start using the latest Python only a year after its release, but fastai does support the latest version.
Radek, who is an engineer at Nvidia, said he usually prefers mamba because with just one line of code you can switch to any Python version you want. After the session, Radek shared in the forums how to do this, as shown in the screenshot below.
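As a rough sketch of the idea (not necessarily the exact command Radek shared), one common way to get a specific Python version with mamba is to create an environment pinned to it. The helper `make_env`, the environment name `py310`, and the `DRY_RUN` switch are all made up for illustration; by default the command is only printed, not run.

```shell
# Hypothetical helper: create an environment pinned to a Python version.
make_env() {
    name="$1"      # environment name, e.g. py310
    version="$2"   # Python version, e.g. 3.10
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only show the command that would be run.
        echo "mamba create -y -n $name python=$version"
    else
        mamba create -y -n "$name" "python=$version"
    fi
}

make_env py310 3.10
```

Set `DRY_RUN=0` to actually create the environment, then switch to it with `conda activate py310`.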
Installing datascience packages - pytorch and jupyter
Since we are using pytorch for the fastai course, let’s install pytorch based on the instructions on its official page:
One of the advantages of installing libraries like pytorch with conda is that it also installs the CUDA libraries needed for GPU use. In a normal pip-based installation, a lot more steps are required to get a correct GPU install.
Go ahead and install Python packages in this manner, just replacing conda with mamba.
mamba install pytorch torchvision torchaudio cpuonly -c pytorch
It’s always a good idea to google and find the correct conda package before installing, as conda needs some details specified, such as the correct channel to install from.
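Once the install finishes, a short check like this one confirms that PyTorch imports and reports whether it can see a GPU. This is only a sketch: `check_torch` is a hypothetical helper and it assumes the interpreter is called `python` unless you pass another one.

```shell
# Hypothetical helper: report the installed PyTorch version and GPU visibility.
check_torch() {
    py="${1:-python}"   # interpreter to use
    if "$py" -c "import torch" 2>/dev/null; then
        "$py" -c "import torch; print('torch', torch.__version__, 'cuda:', torch.cuda.is_available())"
    else
        echo "torch is not importable from $py"
    fi
}

check_torch
```

With the `cpuonly` build, `torch.cuda.is_available()` reporting `False` is what you would expect.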
Now let’s install jupyter lab to do our experiments quickly:
mamba install jupyterlab
Then start jupyter lab with the following command:
jupyter lab --no-browser
This serves jupyterlab at localhost:8888, which you can then open in your browser.
Installing packages with fastchan
Even though Jeremy didn’t cover this topic during the walk-thrus, I came to know about fastchan when I wanted to install pytorch and huggingface transformers on a GPU-based system. What fastchan is and the problem it solves are covered in a detailed blog post by Aman Arora.
To install both pytorch and huggingface transformers on a GPU, I used the following command:
mamba install pytorch transformers cudatoolkit=11.4 -c fastchan
Conclusion
At the start of the lesson, one of the fastai students talked about the need for a quick setup that just works as expected. One of the biggest takeaways for me personally is a quick and fast setup for my data science experiments. Installing packages for data science can sometimes take hours of effort, and that’s why I really loved this setup.
Thanks to Jeremy Howard for creating this quick setup and for starting the project fastchan.