Releasing MalWhisper models

Two versions of MalWhisper have been released, obtained by fine-tuning the medium and small OpenAI Whisper checkpoints on the IMaSC dataset. The models are available here:

Malwhisper-v1-medium
Malwhisper-v1-small

About the IMaSC dataset

IMaSC is a Malayalam text and speech corpus made available by ICFOSS for the purpose of developing speech technology for Malayalam, particularly text-to-speech. The corpus contains 34,473 text-audio pairs of Malayalam sentences spoken by 8 speakers, totalling approximately 50 hours of audio.
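
To give a feel for the scale of the corpus, the figures above can be turned into rough per-utterance and per-speaker averages. The snippet below is a minimal sketch: the function name `corpus_averages` is made up for illustration, the 50-hour total is approximate, and the per-speaker values assume the audio is split evenly across the 8 speakers, which the corpus description does not state.

```python
# Figures quoted in the corpus description (the hour count is approximate).
NUM_PAIRS = 34_473
NUM_SPEAKERS = 8
TOTAL_HOURS = 50


def corpus_averages(num_pairs, num_speakers, total_hours):
    """Return rough averages derived from corpus-level totals.

    Per-speaker values assume an even split across speakers, which is
    an illustrative assumption and not a documented property of IMaSC.
    """
    total_seconds = total_hours * 3600
    return {
        "seconds_per_utterance": total_seconds / num_pairs,
        "hours_per_speaker": total_hours / num_speakers,
        "utterances_per_speaker": num_pairs / num_speakers,
    }


averages = corpus_averages(NUM_PAIRS, NUM_SPEAKERS, TOTAL_HOURS)
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

On these numbers the average clip is a little over five seconds long, so a typical utterance should sit comfortably inside the 30-second input window that Whisper works with, although individual clips may be longer or shorter than the average.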