Apple is sponsoring the 25th annual Interspeech conference, taking place in Kos, Greece, from September 1 to 5. Interspeech focuses on research related to the science and technology of speech processing. Below is the schedule of Apple-sponsored workshops and events at Interspeech 2024.
Schedule
Visit the Apple booth at Kipriotis Hotels & Conference Center, Floor 1, Booth #4, during the following hours (all times are GMT+3): 10:30 a.m. to 7 p.m. on Monday, September 2; 9:30 a.m. to 6 p.m. on Tuesday, September 3, and Wednesday, September 4; and 10:30 a.m. to 4 p.m. on Thursday, September 5.
Saturday, August 31st
Wednesday, September 4th
Thursday, September 5th
- ESPnet-SPK: Full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
  - 11:00 – 11:20 GMT+3, Iasso
  - Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zak Aldeneh, Takuya Higuchi, Barry Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe
- Multimodal large language models with fusion low rank adaptation for device-directed speech detection
  - 13:30 – 15:30 GMT+3, Poster Area 2A
  - Shruti Palaskar, Oggi Rudovic, Sameer Dharur, Florian Pesce, Gautam Krishna, Aswin Sivaraman, Jack Berkowitz, Ahmed Hussen Abdelaziz, Saurabh Adya, Ahmed Tewfik
Accepted papers
Can you remove the downstream model for speaker recognition with self-supervised speech features?
Zak Aldeneh, Takuya Higuchi, Jee-weon Jung, Skyler Seto, Tatiana Likhomanenko, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe
Comparative analysis of personalized voice activity detection systems: Assessing real-world effectiveness
Satyam Kumar, Sai Srujana Buddi, Oggy Sarawgi, Vineet Garg, Shivesh Ranjan, Oggi Rudovic, Ahmed Hussen Abdelaziz, Saurabh Adya
Enhancing CTC-based speech recognition with diverse modeling units
Michael Han, Zhihong Lei, Mingbin Xu, Xingyu Na, Zhen Huang
ESPnet-SPK: Full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zak Aldeneh, Takuya Higuchi, Barry Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe
Multimodal large language models with fusion low rank adaptation for device-directed speech detection
Shruti Palaskar, Oggi Rudovic, Sameer Dharur, Florian Pesce, Gautam Krishna, Aswin Sivaraman, Jack Berkowitz, Ahmed Hussen Abdelaziz, Saurabh Adya, Ahmed Tewfik
Novel-view acoustic synthesis from 3D reconstructed rooms
Byeongjoo Ahn, Karren Yang, Brian Hamilton, Jonathan Sheaffer, Anurag Ranjan, Oncel Tuzel, Miguel Sarabia del Castillo, Rick Chang
Positional description for numerical normalization
Deepanshu Gupta, Javier Latorre Martinez
RepCNN: Micro-sized, mighty models for wakeword detection
Arnav Kundu, Prateeth Nayak, Priyanka Padmanabhan, Devang Naik
Transformer-based model for ASR N-best rescoring and rewriting
Edwin Kang, Christophe Van Gysel, Man-Hung Siu
Acknowledgements
Arnav Kundu, Ilya Oparin, Javier Latorre Martinez, Lyan Verwimp, Markus Nussbaum-Thom, Mirko Hannemann, Thiago Fraga da Silva, Tuomo Raitio and Tatiana Likhomanenko are Interspeech reviewers.