Recap of Week 4
- The topic covered this week was Going Deeper Into Computer Vision with the MNIST Digit Classifier.
- Pixels are the foundation of images, and we are using the MNIST dataset to explore deep learning.
- We are using URLs.MNIST_SAMPLE, a sample containing images of only 3s and 7s. There is another dataset, MNIST, containing handwritten images of the digits 0-9.
- In a computer, everything is represented as numbers; an image is just an array of pixel values.
- Lots of wonderful ideas on how to classify images were discussed in the discussion thread, such as classifying a digit based on the "surface % of the square of background that is covered by the number".
- We decided to go with pixel similarity: find the average value of every pixel across the images of 3s and across the images of 7s. The two group averages then define the "ideal" 3 and the "ideal" 7.
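A rough sketch of that averaging step, using random stand-in tensors instead of the real MNIST_SAMPLE images (the variable names here are my own, not necessarily the notebook's):

```python
import torch

# Hypothetical stand-in data: in the notebook these would be the stacked
# MNIST_SAMPLE training images; here we fake batches of 28x28 "images".
three_tensors = torch.rand(100, 28, 28)  # pretend these are images of 3s
seven_tensors = torch.rand(100, 28, 28)  # pretend these are images of 7s

# The "ideal" digit is just the pixel-wise mean across the stack.
mean3 = three_tensors.mean(0)  # shape: (28, 28)
mean7 = seven_tensors.mean(0)

print(mean3.shape)  # torch.Size([28, 28])
```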
- What is the difference between a tensor and an array? "Array" is the NumPy terminology and "tensor" is the PyTorch one; conceptually they are the same kind of multidimensional container.
- Rank is the number of axes or dimensions in a tensor.
- Shape is the size of each axis of a tensor.
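For example, a stack of three 28x28 images has rank 3:

```python
import torch

# A rank-3 tensor: a stack of three 28x28 "images"
t = torch.zeros(3, 28, 28)

print(t.ndim)   # 3 -- the rank: number of axes
print(t.shape)  # torch.Size([3, 28, 28]) -- the size of each axis
```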
- Now for our ML model to distinguish between images of 3s and 7s: we obtain the mean images of 3 and 7 by stacking the training images of each digit and averaging the pixel values. Then, for a new image, we measure its difference from each mean image. As Aman pointed out, when simply subtracting two such quantities, positive and negative differences can cancel out.
- So we either take the absolute value of the differences before averaging them (L1 loss, the mean absolute difference),
- or square the differences, average them, and take the square root (L2 loss, i.e. root mean squared error, RMSE).
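A minimal sketch of the two losses on toy tensors (the tensors and names here are illustrative, not from the notebook):

```python
import torch

pred = torch.tensor([0.2, 0.8, 0.5])
target = torch.tensor([0.0, 1.0, 1.0])

# L1 loss: mean of the absolute differences
l1 = (pred - target).abs().mean()

# L2 loss (RMSE): square the differences, average, then take the square root
l2 = ((pred - target) ** 2).mean().sqrt()

print(l1)  # tensor(0.3000)
```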
- Major differences between PyTorch and NumPy are: a) PyTorch can run on the GPU while NumPy runs on the CPU; b) PyTorch can automatically calculate gradients.
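A tiny illustration of the automatic gradients:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)  # ask PyTorch to track operations on x
y = x ** 2
y.backward()   # autograd computes dy/dx for us

print(x.grad)  # tensor(6.) because dy/dx = 2x = 6 at x = 3
```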
- Computing metrics using broadcasting. Broadcasting automatically expands the tensor of smaller rank so that an operation behaves as if both tensors had the same shape, without actually copying the data.
- Check the mnist_distance function, which is used to do the pixel-similarity comparison.
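The function is a one-liner; here is a sketch of it with random stand-in tensors (the definition follows the fastbook notebook, but the demo data is fake):

```python
import torch

def mnist_distance(a, b):
    # L1 distance: mean absolute pixel difference over the last two axes
    # (height and width), leaving one distance per image in the batch.
    return (a - b).abs().mean((-1, -2))

valid3_tens = torch.rand(1010, 28, 28)  # stand-in for the validation 3s
mean3 = torch.rand(28, 28)              # stand-in for the "ideal" 3

dists = mnist_distance(valid3_tens, mean3)
print(dists.shape)  # torch.Size([1010]) -- one distance per image, via broadcasting
```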
- Why is mean((-1, -2)) used to find the mean across the width and height of the image?
- Detailed explanation by Srinivas Raman in the Week 4 discussion forum: since valid3_tens has a shape of (1010, 28, 28) and ideal3 has a shape of (1, 28, 28), broadcasting replicates ideal3 over the first dimension. We then want the average of the pixel values over the last two dimensions, the height and width of the image, which are indexed by -1 and -2. In Python, -1 refers to the last element, in this case the last dimension of a tensor, and -2 to the second-to-last, so together they refer to the 28x28 dimensions of valid3_tens.
- Then we computed the accuracy on the validation dataset.
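A hedged sketch of how that accuracy can be computed (the is_3 logic follows the fastbook approach of picking whichever ideal image is closer; the tensors here are synthetic, so the numbers are illustrative only):

```python
import torch

def mnist_distance(a, b):
    return (a - b).abs().mean((-1, -2))

# Stand-in tensors; in the notebook these come from MNIST_SAMPLE.
mean3, mean7 = torch.zeros(28, 28), torch.ones(28, 28)
valid3_tens = torch.full((5, 28, 28), 0.1)  # toy images "close" to mean3

def is_3(x):
    # An image counts as a 3 if it is closer to the ideal 3 than to the ideal 7
    return mnist_distance(x, mean3) < mnist_distance(x, mean7)

accuracy = is_3(valid3_tens).float().mean()
print(accuracy)  # tensor(1.) on this toy data
```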
Highlights of the full session can be found in the video below:
For the guest lecture, we had Parul Pandey. Parul came with an awesome article titled Building a Compelling Data Science Portfolio with Writing. I recommend watching Parul's guest lecture from 1:09:30 of the video.
Now let me share some of my learnings over the week:
Since I was interested in learning more about broadcasting, I watched Andrew Ng's lesson on it. He gives an interesting problem: use broadcasting to calculate the percentage of calories from carbs, protein, and fat in 100 g of each of the following foods.
General principle of broadcasting
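Here is that exercise sketched in NumPy (the calorie matrix follows Andrew Ng's example, with columns for apples, beef, eggs, and potatoes, and rows for carbs, protein, and fat; the values are from memory, so treat them as illustrative):

```python
import numpy as np

# Calories from carbs, protein, fat (rows) in 100 g of four foods (columns)
A = np.array([[56.0,   0.0,  4.4, 68.0],
              [ 1.2, 104.0, 52.0,  8.0],
              [ 1.8, 135.0, 99.0,  0.9]])

cal = A.sum(axis=0)         # total calories per food, shape (4,)
percentage = 100 * A / cal  # (3, 4) / (4,): cal is broadcast across the rows

print(percentage.shape)  # (3, 4) -- each column now sums to 100
```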
- I was also able to look into the video on Understanding the fastai DataBlock API.
- The Tree Classifier web app based on Vue.js + Flask is still a work in progress. You can check the progress here.

During Parul Pandey's session, I jotted down some personal notes on writing:
- Parul says writing is a personal thing; she doesn't care about applause. Some of her blog posts have just 83 claps.
- Work on small projects, and bring those small projects to life.
- Always remember that writing takes a lot of time and needs consistency, yet it has a lot of benefits.
- Write for yourself (growth mindset).
Before winding down: Sai Amrit Patnaik came up this week with an excellent initiative:
After interacting with a few members on this fastbook channel who are attending the fastbook lectures, there seems to be interest among many in starting an introductory paper-reading group to discuss some of the classical papers that laid the foundation of deep learning. The intention is to get started with reading papers, get acquainted with them, and develop a habit of reading papers while learning the basics more comprehensively, without worrying much about the technicalities of the latest papers.
Thank you everyone for reading 🙏. Please feel free to share your suggestions, questions, and queries in the comments below.