Session Highlights of Week 2:
TLDR: A lot of ground was covered, from fastai Lesson 1 through Lesson 2 up to how to get the data for the fastai Grizzly bear project. Tanishq Abraham joined us with his tips on how to do projects effectively.
- The Week 2 session started with a reminder to the whole class that sessions will run for 90 minutes from this week onwards.
- Aman explained in detail how the Cats vs Dogs classification example works under the hood, using his beautifully annotated notebooks.
- He used a diagram to explain the process from data loading through applying item transformations and batch transformations.
- Aman then explained the section on how image classifiers work under the hood and what the various layers of the network do.
- The architecture used for model training, i.e. ResNets, is no longer the state of the art for image classification, and the state of the art changes frequently over the years. Last year it was EfficientNet models, and now Transformers seem to have become the SOTA for image classification.
- Sanyam mentioned that since the state of the art keeps changing, the Papers with Code website is a good resource for keeping up with the latest models.
- The second chapter discusses what deep learning is currently good at. Deep learning is now really good with images and text data.
- For tabular data, Aman warned that it’s not always great, so it won’t be a good idea to do your initial projects in that domain.
- Aman explained the basic steps of collecting a dataset using the Bing API. He went through the steps and recommended that all of us get our datasets ready before the next session.
- Tanishq, who is indeed a wonder kid, explained how fastai transformed his learning journey and helped him become a better Kaggler.
- He explained why it’s very important to repeat what you learn with a different dataset.
- He also shared his experience of making projects more accessible and improving his Kaggle rankings over the years with the help of the fastai community and course.
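To make the item transformation vs batch transformation idea from the session concrete for myself, here is a minimal sketch in plain Python. It is not fastai code: the names `crop_to`, `scale_batch` and `load_batch` are hypothetical helpers I made up, and the "images" are tiny nested lists. The idea it illustrates is that item transforms run on one item at a time (for example to bring every image to the same size), while batch transforms run on the whole collated batch at once.

```python
from typing import Callable, List

# A tiny grayscale "image" represented as nested lists (illustration only).
Image = List[List[float]]


def crop_to(size: int) -> Callable[[Image], Image]:
    """Item transform (hypothetical): crop ONE image to size x size."""
    def _crop(img: Image) -> Image:
        return [row[:size] for row in img[:size]]
    return _crop


def scale_batch(batch: List[Image]) -> List[Image]:
    """Batch transform (hypothetical): rescale every pixel in the batch to 0..1."""
    return [[[px / 255 for px in row] for row in img] for img in batch]


def load_batch(items: List[Image],
               item_tfm: Callable[[Image], Image],
               batch_tfm: Callable[[List[Image]], List[Image]]) -> List[Image]:
    # Step 1: item transform, applied to each item separately,
    # so that all items end up with the same shape.
    same_size = [item_tfm(img) for img in items]
    # Step 2: batch transform, applied once to the whole batch.
    return batch_tfm(same_size)


images = [
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],  # a 3x3 image
    [[255, 255], [0, 0]],                     # a 2x2 image
]
batch = load_batch(images, crop_to(2), scale_batch)
```

After the item transform both images are 2x2, which is what allows them to be stacked into one batch before the batch transform runs.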
Before winding up, I am just sharing a reminder by Aman Arora on things to do this week:
- This week please make sure that you’re able to run code from chapter-1 and understand at a high level the 6 lines of code used to create the Pets classifier. Don’t forget to run doc() if you want to dig deeper into any fastai function. This week you should really start messing with code.
- Don’t forget to do the questionnaire at the end of Chapter-1! This will help you with your understanding!
- Please make sure that you have your dataset ready - I personally found setting up the Azure API key a little tricky because things have changed since the book was first written. Will post a discussion post with the exact steps and how you can get the API key too.
- Consider sharing your work here or creating a W&B forum post - it’s okay if your project isn’t ready and it’s half-baked. Remember we don’t want to be perfectionists, rather do things in multiple iterations and improve as we go.
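For reference, here is my sketch of the Pets classifier code from chapter 1, wrapped in a function so that nothing is downloaded or trained until you call it. The wrapper name `train_pets_classifier` is my own; the calls inside follow the book as I remember it, so treat the exact arguments as assumptions and check them against your installed fastai version (newer releases name `cnn_learner` as `vision_learner`).

```python
def is_cat(filename: str) -> bool:
    """Label rule used in the book: in the Pets dataset, cat breed
    filenames start with an uppercase letter, dog breeds with lowercase."""
    return filename[0].isupper()


def train_pets_classifier():
    """Hypothetical wrapper around the chapter 1 steps.

    Calling this downloads the Pets dataset and fine-tunes a ResNet34,
    so it needs fastai installed and ideally a GPU.
    """
    # Imported here so that defining the function does not require fastai.
    from fastai.vision.all import (
        ImageDataLoaders, Resize, URLs, cnn_learner, error_rate,
        get_image_files, resnet34, untar_data,
    )

    # 1. Download and extract the dataset, then point at the images folder.
    path = untar_data(URLs.PETS) / "images"

    # 2. Build DataLoaders: 20% validation split, labels from filenames,
    #    every image resized to 224x224 as an item transform.
    dls = ImageDataLoaders.from_name_func(
        path, get_image_files(path), valid_pct=0.2, seed=42,
        label_func=is_cat, item_tfms=Resize(224),
    )

    # 3. Create a learner from a pretrained ResNet34 and fine-tune for 1 epoch.
    learn = cnn_learner(dls, resnet34, metrics=error_rate)
    learn.fine_tune(1)
    return learn
```

In a notebook, calling `learn = train_pets_classifier()` runs the whole pipeline. The labelling rule can be tried on its own first, for example `is_cat("Birman_12.jpg")` versus `is_cat("beagle_3.jpg")`.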
Now let me share a list of a few things I learned over the past week. So here I am starting my own logs:
Read the article Designing Great Data Products, which introduces the Drivetrain Approach. A few key highlights:
We can make product recommendations that surprise and delight (using the optimized recommendation outlined in the previous section). We could offer tailored discounts or special offers on products the customer was not quite ready to buy or would have bought elsewhere. We can even make customer-care calls just to see how the user is enjoying our site and make them feel that their feedback is valued. The data collection and recommendation steps are not an add-on; they are Zafu’s entire business model: women’s jeans are now a data product. Zafu can tailor their recommendations to fit as well as their jeans because their system is asking the right questions.
Starting with the objective forces data scientists to consider what additional models they need to build for the Modeler. We can keep the “like” model that we have already built as well as the causality model for purchases with and without recommendations, and then take a staged approach to adding additional models that we think will improve the marketing effectiveness.
Experimented with building an image classifier on a different dataset: Mammootty vs Mohanlal.
Peeked into Chapter 8, the deep dive into collaborative filtering, as I am very interested in building a recommender system.
The Normalize proc passed to the TabularDataLoaders.from_csv method uses the z-score under the hood, i.e. (x - mean) / std. Thanks @girijesh for pointing it out.
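To check my understanding of the z-score formula, here is a small standalone sketch using only the Python standard library. It is not fastai's implementation: the function name `z_score` and the sample `ages` column are made up for illustration, and I am assuming the population standard deviation here.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize values with (x - mean) / std."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    # Note: a constant column would give sigma == 0 and fail here.
    return [(x - mu) / sigma for x in values]


ages = [20, 30, 40, 50, 60]  # made-up continuous column
normalized = z_score(ages)
```

After normalization the column has mean 0 and standard deviation 1, so continuous columns with very different ranges end up on a comparable scale.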
That’s all for this week. I am winding up by sharing something Rachel Thomas mentioned in her blog post on why you should blog:
You are best positioned to help people one step behind you. The material is still fresh in your mind. Many experts have forgotten what it was like to be a beginner (or an intermediate) and have forgotten why the topic is hard to understand when you hear it. The context of your particular background, your particular style, and your knowledge level will give a different twist to what you’re writing about.