Course Details

Course details of EE5171
Course No: EE5171
Course Title: Deep Learning based Speech Processing
Credit: 12
Course Content:
(1) Fundamentals of Speech Signals and Signal Processing (2 weeks): review of Fourier transforms, short-time Fourier transform and spectrograms, phonemes, source-filter models, pitch and formants, cepstral analysis
(2) Statistical Approaches to Speech Signal Processing (2 weeks): LPC, GMMs, HMMs, N-gram LMs
(3) Basics of Neural Networks (1 week): feedforward NNs, backpropagation, RNNs, LSTMs
(4) Sequence Modelling (1 week): RNNs, RNN-LM
(5) Transformers and Self-Attention (1 week)
(6) Self-Supervised Learning Theory (2 weeks): word2vec, BERT, GPT, wav2vec 2.0, HuBERT, WavLM, data2vec
(7) Building Automatic Speech Recognition (such as Gboard) from scratch in Indian languages and English (1 week): encoder-decoder models, non-autoregressive models, CTC loss
(8) Building Transformer Language Models (GPT) from scratch and instruction tuning (1 week)
(9) Building a Text-to-Speech system from scratch in Indian languages and English (1 week): FastSpeech 2, VITS
(10) Building a Speaker Verification system from scratch for voice biometrics (1 week): ECAPA-TDNN
Course Offered this semester: No
Faculty Name