RIKEN Center for Advanced Intelligence Project Sound Scene Understanding Team
Team Leader: Kazuyoshi Yoshii (Ph.D.)
Research Summary

The Sound Scene Understanding Team develops analysis techniques for various kinds of audio signals, including speech, music, and environmental sounds. Our approach is to formulate physically or theoretically reasonable probabilistic generative models that reflect the characteristics of the target signals, and then to solve the resulting inverse problems. We tackle real-world problems by integrating Bayesian learning with deep learning.
Research Subjects
- Statistical Audio Signal Processing (Source Separation/Localization, Speech Enhancement)
- Bayesian Learning (Hierarchical Bayes, Nonparametric Bayes)
- Music Information Processing (Source Separation, Automatic Music Transcription)
Main Research Fields
- Computer Science
Related Research Fields
- Engineering
- Mathematics
Selected Publications
- 1.
Yoshiaki Sumura, Diego Di Carlo, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii:
"Joint Audio Source Localization and Separation With Distributed Microphone Arrays Based on Spatially-Regularized Multichannel NMF."
IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), pp. 145–149, September 2024.
- 2.
Liam Kelley, Diego Di Carlo, Aditya Arie Nugraha, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii:
"RIR-in-a-Box: Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation."
Annual Conference of the International Speech Communication Association (Interspeech), pp. 3255–3259, September 2024.
- 3.
Diego Di Carlo, Aditya Arie Nugraha, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii:
"Neural Steerer: Novel Steering Vector Synthesis with a Causal Neural Field over Frequency and Direction."
IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp. 740–744, April 2024.
- 4.
Aditya Arie Nugraha, Diego Di Carlo, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii:
"Time-Domain Audio Source Separation Based on Gaussian Processes with Deep Kernel Learning."
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 1–5, October 2023.
- 5.
Yoshiaki Bando, Yoshiki Masuyama, Aditya Arie Nugraha, Kazuyoshi Yoshii:
"Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation."
European Signal Processing Conference (EUSIPCO), pp. 51–55, September 2023.
- 6.
Kouhei Sekiguchi, Aditya Arie Nugraha, Yicheng Du, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii:
"Direction-Aware Adaptive Online Neural Speech Enhancement with an Augmented Reality Headset in Real Noisy Conversational Environments."
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 9266–9273, October 2022.
- 7.
Yicheng Du, Aditya Arie Nugraha, Kouhei Sekiguchi, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii:
"Direction-Aware Joint Adaptation of Neural Speech Enhancement and Recognition in Real Multiparty Conversational Environments."
Annual Conference of the International Speech Communication Association (Interspeech), pp. 2918–2922, September 2022.
- 8.
Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii:
"DNN-Free Low-Latency Adaptive Speech Enhancement Based on Frame-Online Beamforming Powered by Block-Online FastMNMF."
IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), pp. 1–5, September 2022.
- 9.
Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Mathieu Fontaine, Kazuyoshi Yoshii, Tatsuya Kawahara:
"Autoregressive Moving Average Jointly-Diagonalizable Spatial Covariance Analysis for Joint Source Separation and Dereverberation."
IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 30, pp. 2368–2382, 2022.
- 10.
Mathieu Fontaine, Kouhei Sekiguchi, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii:
"Generalized Fast Multichannel Nonnegative Matrix Factorization Based on Gaussian Scale Mixtures for Blind Source Separation."
IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 30, pp. 1734–1748, 2022.
Lab Members
Principal investigator
- Kazuyoshi Yoshii (Team Leader)
Core members
- Aditya Arie Nugraha (Research Scientist)
- Diego Di Carlo (Postdoctoral Researcher)
- Yoshiaki Bando (Visiting Scientist)
- Hidetoshi Shimodaira (Visiting Scientist)
- Makoto Yamada (Visiting Scientist)
- Mathieu Francois Gustave Fontaine (Visiting Scientist)
- Momose Oyama (Research Part-time Worker I)
- Yoto Fujita (Research Part-time Worker I)
- Ryosuke Ono (Research Part-time Worker II)
- Ryunosuke Nihei (Research Part-time Worker II)
Contact Information
Yoshida-honmachi, Sakyo, Kyoto 606-8501
Email: kazuyoshi.yoshii@riken.jp