ISP-Lab

Intelligent Sound Processing Laboratory (ISP-Lab)

Address: Room 27, Research laboratories complex of CSE, Shahid Beheshti University, Evin, Tehran, Iran.

Research Topics (2018-current):

– Spoofing Speech Detection (Audio Deepfake)

– Automatic Speaker Identification (SID) & Automatic Speaker Verification (ASV)

– Automatic Spoken Language Identification (LID) & Automatic Spoken Gender Identification (GID)

– Automatic Spoken Keyword Spotting (KWS) & Spoken Term Detection (STD)

– Automatic Spoken Emotion Recognition (SER)

– Voice Activity Detection (VAD) & Speech Activity Detection (SAD) 

– Automatic Speaker Diarization (Speaker Segmentation) 

– Automatic Speech Recognition (ASR) & Speech-to-Text (STT)

– Persian/Farsi Text-to-Speech (TTS) & Voice Synthesizer

– Voice Pathology Detection and Classification From Spontaneous/Read Speech or Phones

– Automatic Audio Scene Recognition

– Audio Source Separation & Speech Enhancement

– Anomalous Sound Detection (ASD)

– English-to-Persian Voice Actor Recommender System

– Diagnosis of Depression from Speech Signals of Conversations

– Alzheimer’s Dementia Recognition From Spontaneous Speech

– Imagined Speech Classification using EEG signals 

– Electromagnetic Articulography (EMA) Signals for Measuring the Position of Parts of the Mouth

– Heart Sound Signal Classification using PCG signals (phonocardiogram) 

– Music Genre Classification

———————————————————————————————————-

PhD Candidates: 

Mr. Hossein Fayyazi (2021), PhD Thesis: Interpretable Spoofing Speech Detection, (Paper1C, Paper2J, Paper3J, Paper4J, Paper5, Paper6) 

 

PhD Students:

Mr. Siavosh Djazmi (2024)

Ms. Maryam Alizadeh (2023)

Mr. Farbod Haddaian (2022)

 

MSc Students:

Ms. Masoumeh Javidan (2024), MSc Thesis: Improving Text-to-Speech Systems Using Articulatory Features to Deceive Audio Deepfake Detection Systems.  

Ms. Parisa Ahmadzadeh-Raji (2023), MSc Thesis: Music genre classification based on representation learning of acoustic characteristics. 

Ms. Parnaz Latifi (2023), MSc Thesis (Data Science): Spoken language identification using embedding vectors extracted from a deep learning model with attention mechanism. 

Mr. Ali Hekmat (2023), MSc Thesis (Data Science): Speech emotion recognition in Persian using interpretable generative adversarial networks. 

Mr. Mohamadjavad Rahsepar-yeganeh (2022), MSc Thesis: A Hybrid Model based on Deep Learning using Spectral, Temporal, Acoustic, and Linguistic Information of the Speech for Diagnosis of Alzheimer’s Disease.

Ms. Mahya Rajaei (2021), MSc Thesis: Design of an Operational System for Depression Diagnosis from the Speech Signal of Conversations using the Refinement of Input data. 

Ms. Sepideh Saeidinia (2018), MSc Thesis: Feature Extraction based on Human Auditory Models for Acoustic Scene Classification.

MSc Alumni:

Ms. Helia Khoshroo (2022), MSc Thesis (Data Science): Recognition and Prediction of Chaotic Time Series using Machine Learning Methods in the Reconstructed Phase Space. (Paper1, Paper2) 

Ms. Sepideh Pirhayati (2022), MSc Thesis: Anomalous Sound Detection for Machine Condition Monitoring using Representation Learning of Acoustic Features. (Paper1)

Ms. Sahar Farazi (2021), MSc Thesis: Designing a Voice Disorder Detection System based on Spontaneous Speech using Side Information. (Paper1J, Paper2, Paper3)

Mr. Ali Yazdani (2020), MSc Thesis: Emotion Recognition in Persian Speech using Acoustic and Linguistic Information. (Paper1C, Paper2C, Paper3C, Paper4)

Ms. Sogol Alipour-esgandani (2020), MSc Thesis: Introducing an English-To-Persian Voice Actor Recommender System Using Acoustic Features. (Paper1, Paper2, Paper3)

Mr. Ashkan Moradi (2019), MSc Thesis: Performance Improvement of Spoken Language Identification System via Fusion of the Acoustic and Phonetic Information based on Learning Methods. (Paper1C, Paper2J)

Ms. Sheida Morakkabati (2019), MSc Thesis: Classification of Heart Sound Signals (PCG) by Modeling in Reconstructed Phase Space (RPS). (Paper1C, Paper2)

Mr. Saeed Zarei (2019), MSc Thesis: Performance Improvement of Keyword Spotting System using Post-processing of Keywords Hypotheses Based on Sub-words Unit Modeling. (Paper1C, Paper2C, Paper3)

Mr. Alireza Azadi (2018), Advisor of MSc Thesis: A Context-Aware Speech Enhancement FPGA-SoC-Based Architecture.  (Paper1C, Paper2) 

Ms. Mahboubeh Sheidaei (2018), MSc Thesis: Identification of Imagined Speech in BCI Application Using Nonlinear Dynamical Analysis of EEG Signal. (Paper1C, Paper2W)

Ms. Atefeh Kerdekari Khosroshahi (2018), MSc Thesis: Diagnosis and Identification of Speech Disorders Using Speech Attractors Modeling in the Reconstructed Phase Space (RPS). (PaperC)

Mr. Pouya Payandeh (2018), MSc Thesis: Separation of Speech Signal from Background using Deep Neural Networks For Speech Recognition Application.

Ms. Mahsa Hedayatipour (2017), MSc Thesis: Recognition of CV syllables from the Lip Images. (Paper1C, Paper2C, Paper3C, Paper4) 

—————————————————————————————-

Seminars & Webinars 

Speech technology in medical and healthcare (AICup-2023, Slides