Intelligent Sound Processing Laboratory (ISP-Lab)
Address: Room 27, Research laboratories complex of CSE, Shahid Beheshti University, Evin, Tehran, Iran.
Research Topics (2018-current):
– Spoofing Speech Detection (Audio Deepfake)
– Automatic Speaker Identification (SID) & Automatic Speaker Verification (ASV)
– Automatic Spoken Language Identification (LID) & Automatic Spoken Gender Identification (GID)
– Automatic Spoken Keyword Spotting (KWS) & Spoken Term Detection (STD)
– Automatic Spoken Emotion Recognition (SER)
– Voice Activity Detection (VAD) & Speech Activity Detection (SAD)
– Automatic Speaker Diarization (Speaker Segmentation)
– Automatic Speech Recognition (ASR) & Speech-to-Text (STT)
– Persian/Farsi Text-to-Speech (TTS) & Voice Synthesizer
– Voice Pathology Detection and Classification From Spontaneous/Read Speech or Phones
– Automatic Audio Scene Recognition
– Audio Source Separation & Speech Enhancement
– Anomalous Sound Detection (ASD)
– English-to-Persian Voice Actor Recommender System
– Diagnosis of Depression from Speech Signals of Conversations
– Alzheimer’s Dementia Recognition From Spontaneous Speech
– Imagined Speech Classification using EEG signals
– Electromagnetic Articulography (EMA signals) for measuring the position of parts of the mouth
– Heart Sound Signal Classification using PCG signals (phonocardiogram)
– Music Genre Classification
———————————————————————————————————-
PhD Candidates:
Mr. Hossein Fayyazi (2021), PhD Thesis: Interpretable Spoofing Speech Detection, (Paper1C, Paper2J, Paper3J, Paper4J, Paper5, Paper6)
PhD Students:
Mr. Siavosh Djazmi (2024)
Ms. Maryam Alizadeh (2023)
Mr. Farbod Haddaian (2022)
MSc Students:
Ms. Masoumeh Javidan (2024), MSc Thesis: Improving Text-to-Speech Systems Using Articulatory Features to Deceive Audio Deepfake Detection Systems.
Ms. Parisa Ahmadzadeh-Raji (2023), MSc Thesis: Music genre classification based on representation learning of acoustic characteristics.
Ms. Parnaz Latifi (2023), MSc Thesis (Data Science): Spoken language identification using embedding vectors extracted from a deep learning model with attention mechanism.
Mr. Ali Hekmat (2023), MSc Thesis (Data Science): Speech emotion recognition in Persian using interpretable generative adversarial networks.
Mr. Mohamadjavad Rahsepar-yeganeh (2022), MSc Thesis: A Hybrid Model based on Deep Learning using Spectral, Temporal, Acoustic, and Linguistic Information of the Speech for Diagnosis of Alzheimer’s Disease.
Ms. Mahya Rajaei (2021), MSc Thesis: Design of an Operational System for Depression Diagnosis from the Speech Signal of Conversations using the Refinement of Input data.
Ms. Sepideh Saeidinia (2018), MSc Thesis: Feature Extraction based on Human Auditory Models for Acoustic Scene Classification.
MSc Alumni:
Ms. Helia Khoshroo (2022), MSc Thesis (Data Science): Recognition and Prediction of Chaotic Time Series using Machine Learning Methods in the Reconstructed Phase Space. (Paper1)
Ms. Sepideh Pirhayati (2022), MSc Thesis: Anomalous Sound Detection for Machine Condition Monitoring using Representation Learning of Acoustic Features. (Paper1)
Ms. Sahar Farazi (2021), MSc Thesis: Designing a Voice Disorder Detection System based on Spontaneous Speech using Side Information. (Paper1J, Paper2)
Mr. Ali Yazdani (2020), MSc Thesis: Emotion Recognition in Persian Speech using Acoustic and Linguistic Information. (Paper1C, Paper2C, Paper3C, Paper4)
Ms. Sogol Alipour-esgandani (2020), MSc Thesis: Introducing an English-To-Persian Voice Actor Recommender System Using Acoustic Features. (Paper1, Paper2)
Mr. Ashkan Moradi (2019), MSc Thesis: Performance Improvement of Spoken Language Identification System via Fusion of the Acoustic and Phonetic Information based on Learning Methods. (Paper1C, Paper2J)
Ms. Sheida Morakkabati (2019), MSc Thesis: Classification of Heart Sound Signals (PCG) by Modeling in Reconstructed Phase Space (RPS). (Paper1C, Paper2)
Mr. Saeed Zarei (2019), MSc Thesis: Performance Improvement of Keyword Spotting System using Post-processing of Keywords Hypotheses Based on Sub-words Unit Modeling. (Paper1C, Paper2C, Paper3)
Mr. Alireza Azadi (2018), Advisor of MSc Thesis: A Context-Aware Speech Enhancement FPGA-SoC-Based Architecture. (Paper1C, Paper2)
Ms. Mahboubeh Sheidaei (2018), MSc Thesis: Identification of Imagined Speech in BCI Application Using Nonlinear Dynamical Analysis of EEG Signal. (Paper1C, Paper2W)
Ms. Atefeh Kerdekari Khosroshahi (2018), MSc Thesis: Diagnosis and Identification of Speech Disorders Using Speech Attractors Modeling in the Reconstructed Phase Space (RPS). (PaperC)
Mr. Pouya Payandeh (2018), MSc Thesis: Separation of Speech Signal from Background using Deep Neural Networks For Speech Recognition Application.
Ms. Mahsa Hedayatipour (2017), MSc Thesis: Recognition of CV Syllables from Lip Images. (Paper1C, Paper2C, Paper3C, Paper4)
—————————————————————————————-
Seminars & Webinars
– Speech technology in medicine and healthcare (AICup-2023, Slides)