Apple is sponsoring the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), which will take place in person from June 4 to 10 on the island of Rhodes, Greece. ICASSP is the IEEE Signal Processing Society's flagship conference on signal processing and its applications. Below is the schedule of Apple-sponsored workshops and events at ICASSP 2023.
Schedule
Tuesday, June 6
- I See What You Hear: A Vision-Inspired Method to Localize Words
- 10:50 AM – 12:20 PM LT in Salon des Roses A
- Mohammad Samragh, Arnav Kundu, Ting-Yao Hu, Aman Chadha, Ashish Srivastava, Minsik Cho, Oncel Tuzel, Devang Naik
- Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
- 10:50 AM – 12:20 PM LT
- Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang
- Text Is All You Need: Personalizing ASR Models Using Controllable Speech Synthesis
- 2:00 – 3:30 PM LT in Poster Area 2 – Garden
- Karren Yang, Ting-Yao Hu, Jen-Hao Rick Chang, Hema Swetha Koppula, Oncel Tuzel
- Neural Transducer Training: Reduced Memory Consumption with Sample-Wise Computation
- 2:00 – 3:30 PM LT in Poster Area 3 – Garden
- Stefan Braun, Erik McDermott, Roger Hsiao
- More Speaking or More Speakers?
- 2:00 – 3:30 PM LT in Poster Area 3 – Garden
- Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko
- Audio-to-Intent Using End-to-End ASR Acoustic-Textual Subword Representations
- 2:00 – 3:30 PM LT in Poster Area 4 – Garden
- Pranay Dighe, Prateeth Nayak, Oggi Rudovic, Erik Marchi, Xiaochuan Niu, Ahmed Tewfik
Wednesday, June 7
- HEiMDaL: Highly Efficient Method for Detection and Localization of Keywords
- 8:15 – 9:45 AM LT in Poster Area 8 – Dome
- Arnav Kundu, Mohammad Samragh Razlighi, Minsik Cho, Priyanka Padmanabhan, Devang Naik
- Women in Signal Processing
- 12:20 – 2:20 PM LT at Ambrosia Restaurant
Thursday, June 8
- Naturalistic Generation of Head Movement from Speech
- 10:50 AM – 12:20 PM LT in Salon des Roses A
- Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald
- Student Job Fair and Lunch
- 12:00 – 3:00 PM LT at Ambrosia Restaurant
- Pretrained Model Representations and Their Robustness to Noise for Speech Emotion Analysis
- 2:00 – 3:30 PM LT in Poster Area 4 – Garden
- Vikramjit Mitra, Vasudha Kowtha, Hsiang-Yun Sherry Chien, Erdrin Azemi, Carlos Avendano
- On the Role of Lip Articulation in the Visual Perception of Speech
- 2:00 – 3:30 PM LT in Poster Area 10 – Dome
- Zakaria Aldeneh, Masha Fedzechkina, Skyler Seto, Katherine Metcalf, Miguel Sarabia, Nicholas Apostoloff, Barry-John Theobald
- Poster Presentation
- Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations
- 3:35 – 5:05 PM LT in Poster Area 2 – Garden
- Vasudha Kowtha, Miquel Espi, Jonathan J Huang, Yichi Zhang, Carlos Avendano
Friday, June 9
- Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings
- 8:15 – 9:45 AM LT in Poster Area 4 – Garden
- Hao Yen, Woojay Jeon
Accepted Papers
Audio-to-Intent Using End-to-End ASR Acoustic-Textual Subword Representations
Pranay Dighe, Prateeth Nayak, Oggi Rudovic, Erik Marchi, Xiaochuan Niu, Ahmed Tewfik
HEiMDaL: Highly Efficient Method for Detection and Localization of Keywords
Arnav Kundu, Mohammad Samragh Razlighi, Minsik Cho, Priyanka Padmanabhan, Devang Naik
I See What You Hear: A Vision-Inspired Method to Localize Words
Mohammad Samragh, Arnav Kundu, Ting-Yao Hu, Aman Chadha, Ashish Srivastava, Minsik Cho, Oncel Tuzel, Devang Naik
Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings
Hao Yen, Woojay Jeon
Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations
Vasudha Kowtha, Miquel Espi, Jonathan J Huang, Yichi Zhang, Carlos Avendano
More Speaking or More Speakers?
Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko
Naturalistic Generation of Head Movement from Speech
Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald
Neural Transducer Training: Reduced Memory Consumption with Sample-Wise Computation
Stefan Braun, Erik McDermott, Roger Hsiao
On the Role of Lip Articulation in the Visual Perception of Speech
Zakaria Aldeneh, Masha Fedzechkina, Skyler Seto, Katherine Metcalf, Miguel Sarabia, Nicholas Apostoloff, Barry-John Theobald
Pretrained Model Representations and Their Robustness to Noise for Speech Emotion Analysis
Vikramjit Mitra, Vasudha Kowtha, Hsiang-Yun Sherry Chien, Erdrin Azemi, Carlos Avendano
Text Is All You Need: Personalizing ASR Models Using Controllable Speech Synthesis
Karren Yang, Ting-Yao Hu, Jen-Hao Rick Chang, Hema Swetha Koppula, Oncel Tuzel
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang
Demo
Contextual understanding in Siri
This demo showcases the context-aware technology in Siri. Users can refer to a previously mentioned entity using anaphora or nominal ellipsis, refer to an on-screen entity, or correct a previous Siri or user error. Contextual understanding in Siri relies on several back-end ML components, such as query rewriting and reference resolution. This work is a step toward more natural conversations with Siri, and it shipped in iOS 16.
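To make the idea of query rewriting concrete, below is a minimal, hypothetical sketch. It is not Apple's implementation: the `DialogueState` class, the toy rule-based pronoun substitution, and every name in it are illustrative assumptions. It only shows how a follow-up query containing an anaphor could be rewritten into a self-contained request using a tracked dialogue entity.

```python
# Toy sketch of query rewriting for contextual understanding.
# NOT Apple's implementation; all names and the rule-based logic are hypothetical.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DialogueState:
    """Tracks the most recently mentioned entity so follow-up turns can refer to it."""
    last_entity: Optional[str] = None
    history: List[str] = field(default_factory=list)


# Simple anaphora this toy rewriter tries to resolve against the dialogue context.
PRONOUNS = {"it", "that", "there", "them"}


def observe_turn(state: DialogueState, utterance: str, entity: Optional[str] = None) -> None:
    """Record a turn and, if an upstream component extracted an entity, remember it."""
    state.history.append(utterance)
    if entity is not None:
        state.last_entity = entity


def rewrite_query(query: str, state: DialogueState) -> str:
    """Replace simple anaphora with the last mentioned entity, producing a
    self-contained query that downstream components can handle without context."""
    rewritten = []
    for token in query.split():
        core = token.rstrip("?.,!")
        trailing = token[len(core):]
        if core.lower() in PRONOUNS and state.last_entity:
            rewritten.append(state.last_entity + trailing)
        else:
            rewritten.append(token)
    return " ".join(rewritten)


if __name__ == "__main__":
    state = DialogueState()
    observe_turn(state, "When does the Acropolis Museum open?", entity="the Acropolis Museum")
    print(rewrite_query("How far is it?", state))
    # -> "How far is the Acropolis Museum?"
```

A production system would of course use learned models rather than a pronoun list, but the interface idea is the same: rewrite the contextual query into a standalone one before reference resolution and downstream processing.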
All ICASSP attendees are invited to stop by the Apple booth (booth number 16, located next to the main entrance of the Dome Bar at the Rodos Palace Luxury Convention Resort) to experience this demo in person.
Acknowledgments
Tatiana Likhomanenko, Arnav Kundu, Stefan Braun, Vikram Mitra, and Pawel Swietojanski are reviewers for ICASSP 2023.
Yannis Stylianou is the Short Courses and Seasonal Schools Chair for ICASSP 2023.
Let’s innovate together. Create amazing machine learning experiences with Apple. Discover opportunities for researchers, students and developers by visiting our Work with us page.