Abstract: In recent years, deep learning-based audio signal processing has become a popular way to extract features from audio signals and to train systems to learn from those extracted features and patterns.