Music genre classifier using fast.ai
A side-project done while attending the Practical Deep Learning for Coders course
- Downloading packages and importing libraries
- Collecting Data
- Quick EDA and Data Cleaning
- Loading Data using fastai DataLoaders
- Training fastai model
- Pushing models to hugging face
- Taking a look at results
- Inference function
- Conclusion
During the first lesson of the Practical Deep Learning for Coders course, Jeremy mentioned that a simple computer vision model can even be used to classify audio: an ordinary image classification model is enough.
Recently, Kaggle grandmaster Rob Mulla ran a challenge to classify music by genre, with an RTX 3080 Ti GPU at stake. Let's look at how we can classify music genres using the simple computer vision model taught in the first lesson of fast.ai.
! pip install -Uqq kaggle git+https://github.com/huggingface/huggingface_hub#egg=huggingface-hub["fastai"]
from fastai.data.all import *
from fastai.imports import *
from fastai.vision.all import *
from huggingface_hub import push_to_hub_fastai
Collecting Data
In this section, I will show how to download datasets from Kaggle in general, and the datasets I used for training the model in particular. To train models on audio, first convert the audio to a spectrogram and then throw an image model at it. Check this tweet from Dien Hoa Truong, who won an NVIDIA RTX 3080 Ti GPU in this competition.
This tweet makes me stick to the spectrogram approach for the Kaggle PogChamp music genre classification competition, and finish #1. Thanks @marktenenholtz :) https://t.co/xwtXQRfk51
— Dien Hoa Truong (@DienhoaT) April 28, 2022
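To make the spectrogram idea concrete, here is a minimal sketch using NumPy. It is not how the competition spectrograms were produced. The frame size, hop length, and synthetic 1 kHz tone are illustrative assumptions of mine, and a real pipeline would more likely use an audio library and a mel scale.

```python
import numpy as np


def spectrogram(signal, frame_size=256, hop=128):
    """Return a (frequency_bins, frames) array of log-magnitudes in dB."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    # Cut the signal into overlapping frames and taper each one with the window.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_size] * window for i in range(n_frames)]
    )
    # One FFT per frame gives the frequency content of that slice of time.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose so that rows are frequencies and columns are time steps.
    return 20 * np.log10(magnitudes.T + 1e-10)


# Illustrative input: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
print(spec.shape)
```

The resulting 2D array can be saved as an image, and from that point on the problem looks like ordinary image classification.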
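For this competition you need two datasets: the competition data (kaggle-pog-series-s01e02) and the spectrogram dataset dienhoa/music-genre-spectrogram-pogchamps. The competition describes its data as follows: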
The data provided here are over 20,000 royalty free song samples (30 second clips) and their musical genres. Your task is to create a machine learning algorithm capable of predicting the genres of unlabeled music files. Create features, design architectures, do whatever it takes to predict them the best.
The code for downloading data from Kaggle has been adapted from Jeremy's notebook.
creds = ""
from pathlib import Path
cred_path = Path("~/.kaggle/kaggle.json").expanduser()
if not cred_path.exists():
    cred_path.parent.mkdir(exist_ok=True)
    cred_path.write_text(creds)
    cred_path.chmod(0o600)
path = Path("../input/kaggle-pog-series-s01e02")
path.ls()
from zipfile import ZipFile
from kaggle import api
if not path.exists():
    api.competition_download_cli(str(path))
    ZipFile(f"{path}.zip").extractall(path)
! kaggle datasets download -d dienhoa/music-genre-spectrogram-pogchamps
df_train = pd.read_csv("../input/kaggle-pog-series-s01e02/train.csv")
df_train.head()
df_train["filepath"] = df_train["filepath"].str.replace("ogg", "png")
The genre counts show a highly imbalanced dataset:
df_train["genre"].value_counts()
df_train.head()
df_train = df_train.set_index("song_id")
df_train = df_train.drop(
    [
        23078,
        3137,
        4040,
        15980,
        11088,
        9963,
        24899,
        16312,
        22698,
        17940,
        22295,
        3071,
        13954,
    ]
)
df_train.shape
Loading Data using fastai DataLoaders
While creating this notebook, I spent a major portion of my time cleaning the data and sorting out appropriate DataBlocks/DataLoaders for training image models using fast.ai. This is something you experience as a practitioner, as opposed to learning all the theory and the backpropagation algorithm.
So let's see how we load this data using fast.ai. There are two approaches, which we will discuss below. Both approaches to loading the data work, but the first one has a disadvantage, which I will explain in a moment.
Approach 1. Using DataBlock and loading images
- Create a data frame temp_train and add a new column is_valid.
- is_valid is the default column name that ColSplitter uses to split the data.
- Set get_x, which specifies the path of each input file. It is the base path plus the filename: lambda o:f'{path}/'+o.path
- Set get_y, which specifies the variable to predict, i.e. the genre of the music.
temp_train = df_train
temp_train.loc[:15000, "is_valid"] = True
temp_train.loc[15000:, "is_valid"] = False
path = Path("../input/music-genre-spectrogram-pogchamps/spectograms/")
dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    splitter=ColSplitter(),
    get_x=lambda o: f"{path}/" + o.path,
    get_y=lambda o: o.genre,
    item_tfms=Resize(224),
    batch_tfms=aug_transforms(),
)
dls = dblock.dataloaders(temp_train)
dls.show_batch()
# dblock.summary(df_train)
This worked really well, and with this approach I was even able to train an ML model that reached 50% accuracy.
Saturday evening side-project: Trained a baseline ML model to classify audio files to identify their music genre using @fastdotai based on a kaggle dataset. Acheived only 50% accuracy, probably because problem is hard. Next job is to check what @DienhoaT has done to win a GPU. pic.twitter.com/EahvgtYBDL
— Kurian Benoy (@kurianbenoy2) April 30, 2022
Yet when it came to exporting the model, I got a pickling error: because of the lambda functions used in the DataBlock, the model could not be exported with the learn.export() method.
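The root cause can be reproduced with the standard library alone. The sketch below is only an illustration: can_pickle and get_genre are hypothetical helper names of mine, and it exercises pickle directly rather than fastai.

```python
import pickle


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def get_genre(row):
    """Named stand-in for `lambda o: o.genre` (hypothetical helper)."""
    return row.genre


# The same logic written as a lambda, as in the DataBlock above.
get_genre_lambda = lambda row: row.genre

print("named function picklable:", can_pickle(get_genre))
print("lambda picklable:", can_pickle(get_genre_lambda))
```

When run as a script, the named function pickles and the lambda does not. Pickle stores functions by their importable name, and a lambda has none. So one way to keep Approach 1 exportable would be to replace the lambdas with named functions. fastai's ColReader should serve the same purpose, though I have not tested that here.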
Approach 2. Using the ImageDataLoaders method for loading from a dataframe
This issue led me to ImageDataLoaders.from_df in fastai. Let's first take a look at our df_train dataframe:
df_train.head()
If you look at the dataframe, appending the filepath column to path gives the full location of each spectrogram.
- This is exactly what get_x needs. In fastai it is set with fn_col=1, which points to the filepath column at position 1.
- The label, or get_y, is set with label_col=3, which points to the genre column at position 3.
- valid_pct sets the fraction of the data held out for validation.
- y_block=CategoryBlock ensures this is treated as normal single-label classification and not multi-label.
dls = ImageDataLoaders.from_df(
    df_train,
    path,
    valid_pct=0.2,
    seed=34,
    y_block=CategoryBlock,
    item_tfms=Resize(460),
    batch_tfms=aug_transforms(size=224),
    fn_col=1,
    label_col=3,
)
dls.show_batch()
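The valid_pct=0.2 and seed=34 arguments ask for a reproducible random 80/20 split. The sketch below shows the idea with the standard library only. It is not fastai's implementation, and random_split is a hypothetical helper of mine, so the exact indices will differ from what fastai picks.

```python
import random


def random_split(n_items, valid_pct=0.2, seed=None):
    """Shuffle the indices 0..n_items-1 and return (train_idx, valid_idx)."""
    rng = random.Random(seed)  # a fixed seed makes the shuffle repeatable
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_valid = int(n_items * valid_pct)
    # The first n_valid shuffled indices become the validation set.
    return indices[n_valid:], indices[:n_valid]


train_idx, valid_idx = random_split(1000, valid_pct=0.2, seed=34)
print(len(train_idx), len(valid_idx))
```

Unlike the fixed is_valid column in Approach 1, a random split like this does not depend on the order of rows in the dataframe.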
learn = vision_learner(dls, resnet50, metrics=error_rate)
learn.lr_find()
learn.fine_tune(10, 0.0008317637839354575)
learn.export("model.pkl")
Pushing models to hugging face
huggingface_hub has released two new functions to easily push fastai models to the Hugging Face Hub.
- Using push_to_hub_fastai you can easily push a fastai Learner to the Hub.
- Additionally, you can load any fastai Learner from the Hub using from_pretrained_fastai.
Omar Espejel has shared a fantastic notebook on these new functionalities in huggingface_hub.
from huggingface_hub import push_to_hub_fastai
push_to_hub_fastai(
    learn,
    "kurianbenoy/music_genre_classification_baseline",
    commit_message="Resnet50 with 10 epochs of training",
)
If you want to load this model in fastai and use it directly for inference, pass the repository name to from_pretrained_fastai.
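A minimal sketch of loading the model and predicting one genre could look like the following. The helper names load_genre_classifier and predict_genre are my own, the image path in the usage comment is a placeholder, and I am assuming the Learner's predict returns the usual (label, index, probabilities) triple for single-label classification.

```python
def load_genre_classifier(repo_id="kurianbenoy/music_genre_classification_baseline"):
    """Download the exported fastai Learner from the Hugging Face Hub."""
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import from_pretrained_fastai

    return from_pretrained_fastai(repo_id)


def predict_genre(learn, spectrogram_path):
    """Return (genre, probability) for a single spectrogram image."""
    genre, class_idx, probs = learn.predict(spectrogram_path)
    return str(genre), float(probs[class_idx])


# Usage (needs network access; the image path is a placeholder):
# learn = load_genre_classifier()
# print(predict_genre(learn, "some_spectrogram.png"))
```

Keeping the download and the prediction in separate functions means the model is fetched once and can then be reused for many files.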
learn.show_results()
interp = Interpretation.from_learner(learn)
interp.plot_top_losses(9, figsize=(15, 10))