Practical Deep Learning for Coders Course - Lesson 2

This blog-post series captures my weekly notes while I attend the fastaiv5 course conducted by the University of Queensland with fast.ai. So off to week 2, where we learn about productionizing ML models and how to get good accuracy.
fastai
fastaicourse
Author

Kurian Benoy

Published

May 3, 2022

Lesson Setup

Jeremy took this session from his home, as the venue at the University of Queensland was already booked by someone else. Jeremy was really pumped for this lesson; it feels like going back to the early days of fast.ai, with a lot of super exciting work happening.

twitter: https://twitter.com/bhutanisanyam1/status/1521511103406043137

Jeremy mentioned some techniques for using Jupyter notebooks and asked us to take a look at Jupyter extensions. The navigation section and how to collapse headings were explained during class. [24:00]

Fastbook Chapter 2

This week we started by looking at how to put a model in production using fastai. This is the same material covered in chapter 2 of the Deep Learning book, which builds a classifier for grizzly bears and teddy bears.

A few things have changed in this version of the book:

  • using search_images_ddg instead of the Bing search APIs
  • using Hugging Face Spaces for deployment instead of Voila, even though Voila still works

RandomResizedCrop is a good way to show the model different variations of the same image.

Does RandomResizedCrop duplicate the image, i.e. do you get multiple copies so that all parts of the image are used in training? Or does it just make one crop?

Jeremy answered this in the video at [32:30]. His answer was that it doesn’t copy the image. In each epoch every image is seen once, and the in-memory image is warped by recropping and recolouring in real time during model training. It’s like having infinitely many slightly different copies of each image.
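
To make the "no copies" point concrete, here is a minimal pure-Python sketch of the idea. It is not fastai's implementation: the function name random_resized_crop, the toy 8x8 "image", and the choice to treat min_scale as a fraction of the shorter side (rather than of the area) are all assumptions made for illustration. The stored image is only read, and a fresh random crop is produced each time it is requested, once per epoch.

```python
import random


def random_resized_crop(image, size, min_scale=0.3, rng=random):
    """Return a random square crop of `image`, resized to size x size.

    `image` is a list of rows. It is only read, never modified, so the
    stored image stays the same while every call yields a new view.
    Hypothetical helper for illustration, not fastai's transform.
    """
    height, width = len(image), len(image[0])
    short_side = min(height, width)
    # Pick how much of the shorter side the crop should cover.
    side = max(1, int(short_side * rng.uniform(min_scale, 1.0)))
    # Pick where the crop starts so that it stays inside the image.
    top = rng.randint(0, height - side)
    left = rng.randint(0, width - side)
    # Nearest-neighbour resize of the cropped region to size x size.
    return [
        [image[top + (r * side) // size][left + (c * side) // size]
         for c in range(size)]
        for r in range(size)
    ]


# Toy 8x8 "image" whose pixel values encode their own position.
image = [[row * 8 + col for col in range(8)] for row in range(8)]
rng = random.Random(0)

# One crop per "epoch": the same stored image, a new random view each time.
crops = [random_resized_crop(image, size=4, rng=rng) for _ in range(3)]
for epoch, crop in enumerate(crops):
    print(f"epoch {epoch}: {crop}")
```

In fastai itself this is handled by passing RandomResizedCrop as an item transform when building the DataLoaders; see the book for the real usage.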

Check the book to learn more in detail about various augmentations.

Sanyam mentioned that RandomResizedCrop is very helpful as an augmentation:

Important

Actually this technique is SUPER helpful. In a recent interview, Chris Deotte (4x Grandmaster) shared how these resizing techniques helped them win a solo gold. This was in the PetFinder Kaggle competition (2nd run of the comp).

Note

Jeremy is running on a laptop with a 4GB GPU. He says to run just one thing at a time on the GPU, or else you will get a CUDA error.

How to do fast.ai course

Tips for people in Yellow bucket:

Note

If you are in yellow, stop trying to keep up in real time. First watch the video fully without touching your keyboard. Then watch it again, follow along with the course and write the code. This is an unusual approach, since it can’t be done in real college lectures, but it’s a very effective one.

After the lesson I asked Wayde Gilliam, a long-term fastai community member, about his process for watching lectures. He was gracious enough to share it with me:

Important

  1. Watch the livestream and jot down timestamps to go back to for anything I found interesting in journal A (or just a piece of paper)
  2. Go back through the video after 2-3 days, hit those spots I noted during the livestream. Will write detailed notes in another journal (we’ll call that journal B)

There’s too much info to digest in real-time so this approach works well and it’s what I’ve been doing for 4-5 yrs.

Huggingface spaces

Jeremy pointed to Tanishq’s tutorial on Gradio + Hugging Face Spaces.


Jeremy also mentioned some useful tools:

  • GitHub Desktop: even Hamel, who previously worked at GitHub, uses GitHub Desktop. Some complicated things in git can be solved using this tool. Knowing the terminal is still worthwhile too.
  • WSL: as a data scientist, you spend a lot of time in terminals. Just use Ubuntu with Windows Terminal. Any time Jeremy shows a terminal, he is using Windows Terminal.
  • tmux: inside the terminal he uses tmux, a terminal multiplexer, as pointed out on the fast.ai forums in reply to my question.

Jeremy likes Windows for its ease of streaming, good apps and recording capabilities. He also has a Linux environment with a good deep learning rig.

Note

Timing and debugging in Jupyter notebooks with the magic commands %time and %debug.

For inference, fastai returns the predictions as a tensor. One issue is that Gradio does not support tensors at the moment, so we need to convert the tensor values to floats before returning the prediction.
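
The conversion step can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact code from the lesson: to_label_confidences is a hypothetical helper name, the category names are made up, and Fraction values stand in for the elements of the probability tensor so that the sketch runs without fastai or PyTorch installed. Like tensor elements, they are not plain floats and cannot be serialised to JSON until they are converted.

```python
import json
from fractions import Fraction


def to_label_confidences(categories, probs):
    """Pair each category with its probability as a plain Python float.

    Hypothetical helper: `probs` can be any sequence whose elements
    support float(), which is also true of a 1-D tensor's elements.
    """
    return {label: float(p) for label, p in zip(categories, probs)}


categories = ("Dog", "Cat")
# Stand-in for the probability tensor returned at inference time (assumption).
probs = [Fraction(1, 4), Fraction(3, 4)]

result = to_label_confidences(categories, probs)
# Plain floats serialise cleanly, which is what a web UI needs.
payload = json.dumps(result)
print(payload)
```

With a real fastai learner, the probabilities would come from the third value returned by learn.predict(img) and could be passed to the same kind of helper.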

Jeremy created a cats vs dogs classifier using Spaces. When his daughter realised he was building such a classifier, she googled for a picture that looks like a mix of a cat and a dog. For that image, the prediction was roughly 50-50 between cat and dog.

This shows how important the support system around you is, and how much it matters that they acknowledge the work you do. This touched me personally, as my sister encouraged me to pull an all-nighter to complete the music genre classification Space.

TODO: Look through Jeremy’s setup and how he worked with Gradio locally [58:00 onwards 1:14:00]

fastsetup

Installing python and jupyter-notebooks with proper git and conda setup.

Fastai setup

Important

A big issue on laptops running Linux or macOS is that they ship with a default Python version. Don’t use that Python: it is there for your operating system to do its own work, so don’t build on top of it.

Use mamba based installation for fastai now:

mamba install fastai
mamba install -c fastchan jupyter nbdev

Trying gradio API with github Pages

An example of the Gradio API, as showcased by Jeremy.

With a live demo like this, we can easily use the API from any website. You can run this file with nothing but a browser, no other software needed. That’s the cool thing about JavaScript, and the page can be hosted on GitHub Pages.

Code
fetch("https://hf.space/embed/kurianbenoy/audioclassification/+/api/predict/", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  // Send the example file "000003.ogg" as the single input
  body: JSON.stringify({
    data: [{ data: null, is_example: true, name: "000003.ogg" }],
  }),
})
  .then(function (response) { return response.json(); })
  .then(function (json_response) { console.log(json_response); });

He used the alembic theme with a particular configuration. Any GitHub Pages file should start with a front-matter block delimited by lines of three dashes. With that in place, he built these cool apps in JavaScript.

Important

The magic of the Gradio API can be summarized as follows: it gives you a reliable way of sharing microservices. Any Hugging Face Space you create comes with an API, and you can call that API from any website, app and so on. As far as I can tell, there is no limitation on using the Gradio API at the moment.
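
As a rough illustration of calling the same endpoint from outside the browser, here is a Python sketch that builds the request made by the JavaScript snippet above, using only the standard library. The URL and payload are copied from that snippet; the helper names build_predict_request and send are hypothetical, and whether the endpoint still responds in this form is an assumption this sketch does not check, so the network call is left commented out.

```python
import json
import urllib.request

# Endpoint copied from the JavaScript example in this post.
API_URL = "https://hf.space/embed/kurianbenoy/audioclassification/+/api/predict/"


def build_predict_request(example_name, url=API_URL):
    """Build (but do not send) the POST request for one example input."""
    payload = {
        "data": [{"data": None, "is_example": True, "name": example_name}]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_predict_request("000003.ogg")
body = json.loads(request.data)
print(request.get_method(), request.full_url)
print(body)

# To actually call the Space, uncomment the next line:
# print(send(request))
```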