Log3- Learning FastBook along with Study Group

Deep learning

June 30, 2021


Recap of Week 3

  • Our captain Aman decided to cover only Chapter 2 during the session.
  • Aman asked all of us to work through the questionnaire at the end of each chapter.
  • DataLoaders is the fastai class that feeds data to the model a batch at a time. It holds a training data loader and a validation data loader.
  • I was immediately curious about what to do for the test dataset as well. The full discussion thread can be found in this link
  • For any model there are generally two terms: X, the independent variable (the input), and Y, the dependent variable (the label to predict).

Let’s dive deep into DataLoaders. To build them, fastai generally needs five things:

  1. Specify the types of the independent and dependent variables (blocks=(ImageBlock, CategoryBlock))
  2. Get the actual files for the dataset (get_items=get_image_files)
  3. Split the data randomly into training and validation sets. Fixing the random seed makes the split reproducible (splitter=RandomSplitter(valid_pct=0.2, seed=42))
  4. Get the dependent variable, i.e. the label (get_y=parent_label). Note that parent_label is a function provided by fastai that returns the name of the folder a file is in. I heard there is also a splitter called GrandparentSplitter
  5. Item transforms and batch transforms. For example, item_tfms=Resize(128) resizes every image to 128 px, so that all images are the same size for the neural network to process
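
To make the splitting and labelling steps above more concrete, here is a plain-Python sketch of roughly what a seeded random splitter and a parent-folder labeller do. This is my own illustration, not fastai's code: the names random_split and label_from_parent and the trees/ folder layout are assumptions made up for the example.

```python
import random
from pathlib import Path

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed and hold out valid_pct for validation."""
    idxs = list(range(len(items)))
    random.Random(seed).shuffle(idxs)   # same seed -> same shuffle every run
    cut = int(valid_pct * len(items))
    return idxs[cut:], idxs[:cut]       # (training indices, validation indices)

def label_from_parent(path):
    """Use the name of the folder containing the file as its label."""
    return Path(path).parent.name

# Hypothetical layout: one folder per tree species
files = [Path(f"trees/{name}/{i}.jpg") for name in ("mango", "coconut") for i in range(5)]

train_idx, valid_idx = random_split(files)
print(len(train_idx), len(valid_idx))
print(label_from_parent(files[0]))
```

Because the seed is fixed, running the split again returns the same training and validation indices, which is what makes results comparable between runs.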

Behind DataBlocks and DataLoaders, a lot of things are still happening 🤯

  • There are a lot of data augmentation techniques. In the book we discussed methods like padding, Random Resized Crop, Resize, etc. Padding adds black borders around the image, so it is not a good option. The plain resizing methods (crop, pad, squish) are all suboptimal in some way.
  • According to the book, Random Resized Crop works better than the other techniques because the model sees a different part of each image on every pass. Yet over multiple epochs, it still gets to see the entire image
  • ImageDataLoaders.from_name_func internally calls DataBlock, as in chapter 3.
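
The idea behind Random Resized Crop can be sketched in a few lines of plain Python. This is a simplified illustration under my own assumptions, not fastai's implementation: the function random_crop_box is hypothetical, and the 30% minimum area and the 3/4 to 4/3 aspect-ratio range are just illustrative values. It only picks the crop box; the actual transform would then resize that crop to the target size.

```python
import math
import random

def random_crop_box(img_w, img_h, min_scale=0.3, rng=random):
    """Return (left, top, width, height) of a random crop inside the image."""
    area = img_w * img_h * rng.uniform(min_scale, 1.0)  # keep 30%-100% of the area
    ratio = rng.uniform(3 / 4, 4 / 3)                   # mild aspect-ratio jitter
    # Clamp so the crop never exceeds the image
    w = min(img_w, max(1, round(math.sqrt(area * ratio))))
    h = min(img_h, max(1, round(math.sqrt(area / ratio))))
    left = rng.randint(0, img_w - w)                    # random position that still fits
    top = rng.randint(0, img_h - h)
    return left, top, w, h

rng = random.Random(0)  # seeded only to make the demo repeatable
boxes = [random_crop_box(640, 480, rng=rng) for _ in range(4)]
for epoch, box in enumerate(boxes):
    print(f"epoch {epoch}: crop box {box}")
```

Each call picks a different region, so the same picture looks slightly different in every epoch while the crops together cover most of the image.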

Highlights of the full session can be found by clicking the below video:



The homework for this week is to have a working online version of the app that can be demoed to people. Aman has said that those who do not complete the assignment need not even come to the sessions.

As mentioned in Wayde's summary of fastai Chapter 2, the DataBlock is always used in fastai to tell four things:

  1. What kind of data you are working with
  2. How to get the Data
  3. How to label the Data
  4. How to create a validation dataset

Now let me share some of my learnings this week:

Tree classifier webapp


  • I started working on a tree classifier, which identifies the species of a tree from its image. Creating the dataset was easy with the DuckDuckGo image search method. However, I realised that the method is buggy in some cases.
  • When I collected the dataset with just names like Mango, out-of-domain images such as mango juice and mangoes came up. The dataset needs cleaning for this, and I also changed my search queries to include the word tree with each fruit name. The dataset collection notebook can be found here.
  • The online webapp I created to classify trees can be accessed via the Binder link
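
The query fix described above is small enough to sketch. This is a hedged example, not the code from my notebook: the species list (apart from mango), the helper names build_query and target_folder, and the trees/ root folder are all made up for illustration.

```python
from pathlib import Path

# Hypothetical species list; only "mango" is mentioned in this post
species = ["mango", "coconut", "jackfruit"]

def build_query(name):
    """Append 'tree' so the search favours the plant over its fruit or juice."""
    return f"{name} tree"

def target_folder(root, name):
    """One folder per species, so the folder name can later serve as the label."""
    return Path(root) / name.replace(" ", "_")

for name in species:
    print(build_query(name), "->", target_folder("trees", name))
```

Keeping one folder per species means the folder name can be used directly as the label when the images are loaded for training.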

  • My original goal was to create a webapp for the same use case with Vue.js and Flask, as a full-stack web application. It is inspired by this blogpost
  • In the upcoming week, I will publish a blogpost on how to use a fast.ai model for inference with Vue.js and Flask. Do remember to check it out
  • I stumbled upon a shortcut in Jupyter: pressing Shift + Tab together shows the function's signature and documentation
  • I did the questionnaire after week 2. I was able to answer just 15 questions out of 27 in the first pass
  • After re-reading the chapter, I asked these questions to our fantastic group:
1. Where do text models currently have a major deficiency?
2. What are possible negative societal implications of text generation models?
3. What are the three steps in the deployment process?
4. What are the downsides of deploying your app to a server, instead of a device such as a phone or PC?

The questions were answered by Ravi Mashru:

1 & 2: From what I read in the book, text models are really good at generating context-appropriate content. However, this does not mean the generated content is correct. Take a look at this text generated by GPT-2 saying recycling is bad for the world: https://openai.com/blog/better-language-models/#sample8.

The real problem is when such models are used to generate incorrect content that might appear reasonable to laymen and drive disinformation.

3: I think the 3-step deployment process is shown in figure 2-5 and explained after the figure in the book. Did you find any particular step difficult to understand?

4: At the end, Alexis says you can go with the approach you find easiest.

The downsides of deploying to a server are that the client always needs to talk to the server over a network. This could also add latency (though with fast servers doing this over a network might actually become faster).

The biggest drawback I noticed is the need to send data over the network to an external server. However, this can be remedied by running on-premise servers at the cost of increased complexity of managing the infrastructure.

But we should keep in mind that deploying to edge devices has downsides as well - you need to adhere to the manufacturer specific steps to enable your model to run on different

Vinayakh Nayakh also added some of the major deficiencies of text models:

At my current organization, they are trying to build a text generation model which is a sort
of captioning model for a product given its image. The problem which they're facing is of
controlled text generation i.e. let's say I have trained a model which takes an image as
input and produces a string of words as output.

The issue which is encountered is that if certain sentence structures are more frequent,
the words present in those sentences are highly repeated at the inference 
time during generation of descriptions. Also, in tasks like keyword detection,
if the keywords are nuanced in nature, then they cannot be picked up very easily by 
these text models...

Thank you everyone for reading 🙏

Future Resource to ponder upon in the coming days: