Apple is sponsoring the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), which will take place in person from June 4 to 10 on the island of Rhodes, Greece. ICASSP is the IEEE Signal Processing Society's flagship conference on signal processing and its applications. Below is the schedule of Apple-sponsored workshops and events at ICASSP 2023.
Schedule
Tuesday, June 6
- I See What You Hear: A Vision-Inspired Method to Localize Words
- 10:50 AM – 12:20 PM LT in Salon des Roses A
- Mohammad Samragh, Arnav Kundu, Ting-Yao Hu, Aman Chadha, Ashish Srivastava, Minsik Cho, Oncel Tuzel, Devang Naik
- Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
- 10:50 AM – 12:20 PM LT
- Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang
- Text Is All You Need: Personalizing ASR Models Using Controllable Speech Synthesis
- 2:00 – 3:30 PM LT in Poster Area 2 – Garden
- Karren Yang, Ting-Yao Hu, Jen-Hao Rick Chang, Hema Swetha Koppula, Oncel Tuzel
- Neural Transducer Training: Reduced Memory Consumption with Sample-Wise Computation
- 2:00 – 3:30 PM LT in Poster Area 3 – Garden
- Stefan Braun, Erik McDermott, Roger Hsiao
- More Speaking or More Speakers?
- 2:00 – 3:30 PM LT in Poster Area 3 – Garden
- Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko
- Audio-to-Intent Using End-to-End ASR Acoustic-Textual Subword Representations
- 2:00 – 3:30 PM LT in Poster Area 4 – Garden
- Pranay Dighe, Prateeth Nayak, Oggi Rudovic, Erik Marchi, Xiaochuan Niu, Ahmed Tewfik
Wednesday, June 7
- HEiMDaL: Highly Efficient Method for Detection and Localization of Keywords
- 8:15 – 9:45 AM LT in Poster Area 8 – Dome
- Arnav Kundu, Mohammad Samragh Razlighi, Minsik Cho, Priyanka Padmanabhan, Devang Naik
- Women in Signal Processing
- 12:20 – 2:20 PM LT at Ambrosia Restaurant
Thursday, June 8
- Naturalistic Generation of Head Movement from Speech
- 10:50 AM – 12:20 PM LT in Salon des Roses A
- Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald
- Student Job Fair and Lunch
- 12:00 – 3:00 PM LT at Ambrosia Restaurant
- Pretrained Model Representations and Their Robustness to Noise for Speech Emotion Analysis
- 2:00 – 3:30 PM LT in Poster Area 4 – Garden
- Vikramjit Mitra, Vasudha Kowtha, Hsiang-Yun Sherry Chien, Erdrin Azemi, Carlos Avendano
- On the Role of Lip Articulation in the Visual Perception of Speech
- 2:00 – 3:30 PM LT in Poster Area 10 – Dome
- Zakaria Aldeneh, Masha Fedzechkina, Skyler Seto, Katherine Metcalf, Miguel Sarabia, Nicholas Apostoloff, Barry-John Theobald
- Poster Presentation
- Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations
- 3:35 – 5:05 PM LT in Poster Area 2 – Garden
- Vasudha Kowtha, Miquel Espi, Jonathan J Huang, Yichi Zhang, Carlos Avendano
Friday, June 9
- Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings
- 8:15 – 9:45 AM LT in Poster Area 4 – Garden
- Hao Yen, Woojay Jeon
Accepted Papers
Audio-to-Intent Using End-to-End ASR Acoustic-Textual Subword Representations
Pranay Dighe, Prateeth Nayak, Oggi Rudovic, Erik Marchi, Xiaochuan Niu, Ahmed Tewfik
HEiMDaL: Highly Efficient Method for Detection and Localization of Keywords
Arnav Kundu, Mohammad Samragh Razlighi, Minsik Cho, Priyanka Padmanabhan, Devang Naik
I See What You Hear: A Vision-Inspired Method to Localize Words
Mohammad Samragh, Arnav Kundu, Ting-Yao Hu, Aman Chadha, Ashish Srivastava, Minsik Cho, Oncel Tuzel, Devang Naik
Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings
Hao Yen, Woojay Jeon
Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations
Vasudha Kowtha, Miquel Espi, Jonathan J Huang, Yichi Zhang, Carlos Avendano
More Speaking or More Speakers?
Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko
Naturalistic Generation of Head Movement from Speech
Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald
Neural Transducer Training: Reduced Memory Consumption with Sample-Wise Computation
Stefan Braun, Erik McDermott, Roger Hsiao
On the Role of Lip Articulation in the Visual Perception of Speech
Zakaria Aldeneh, Masha Fedzechkina, Skyler Seto, Katherine Metcalf, Miguel Sarabia, Nicholas Apostoloff, Barry-John Theobald
Pretrained Model Representations and Their Robustness to Noise for Speech Emotion Analysis
Vikramjit Mitra, Vasudha Kowtha, Hsiang-Yun Sherry Chien, Erdrin Azemi, Carlos Avendano
Text Is All You Need: Personalizing ASR Models Using Controllable Speech Synthesis
Karren Yang, Ting-Yao Hu, Jen-Hao Rick Chang, Hema Swetha Koppula, Oncel Tuzel
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang
Demo
Contextual understanding in Siri
This demo showcases the context-aware technology in Siri. Users can refer to a previously mentioned entity using anaphora or nominal ellipsis, refer to an on-screen entity, or correct a previous Siri or user error. Contextual understanding in Siri relies on several back-end ML components, such as query rewriting and reference resolution. This work is a step toward more natural conversations with Siri, and it shipped in iOS 16.
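To make the idea of query rewriting concrete, below is a minimal, hypothetical sketch. It is not Apple's implementation: the `DialogueState` class, the toy rule-based pronoun substitution, and every name in it are illustrative assumptions. It only shows how a follow-up query containing an anaphor could be rewritten into a self-contained request using a tracked dialogue entity.

```python
# Toy sketch of query rewriting for contextual understanding.
# NOT Apple's implementation; all names and the rule-based logic are hypothetical.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DialogueState:
    """Tracks the most recently mentioned entity so follow-up turns can refer to it."""
    last_entity: Optional[str] = None
    history: List[str] = field(default_factory=list)


# Simple anaphora this toy rewriter tries to resolve against the dialogue context.
PRONOUNS = {"it", "that", "there", "them"}


def observe_turn(state: DialogueState, utterance: str, entity: Optional[str] = None) -> None:
    """Record a turn and, if an upstream component extracted an entity, remember it."""
    state.history.append(utterance)
    if entity is not None:
        state.last_entity = entity


def rewrite_query(query: str, state: DialogueState) -> str:
    """Replace simple anaphora with the last mentioned entity, producing a
    self-contained query that downstream components can handle without context."""
    rewritten = []
    for token in query.split():
        core = token.rstrip("?.,!")
        trailing = token[len(core):]
        if core.lower() in PRONOUNS and state.last_entity:
            rewritten.append(state.last_entity + trailing)
        else:
            rewritten.append(token)
    return " ".join(rewritten)


if __name__ == "__main__":
    state = DialogueState()
    observe_turn(state, "When does the Acropolis Museum open?", entity="the Acropolis Museum")
    print(rewrite_query("How far is it?", state))
    # -> "How far is the Acropolis Museum?"
```

A production system would of course use learned models rather than a pronoun list, but the interface idea is the same: rewrite the contextual query into a standalone one before reference resolution and downstream processing.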
All ICASSP attendees are invited to stop by the Apple booth (booth number 16, located next to the main entrance of the Dome Bar at the Rodos Palace Luxury Convention Resort) to experience this demo in person.
Acknowledgments
Tatiana Likhomanenko, Arnav Kundu, Stefan Braun, Vikram Mitra, and Pawel Swietojanski are reviewers for ICASSP 2023.
Yannis Stylianou is the Short Courses and Seasonal Schools Chair for ICASSP 2023.
Let’s innovate together. Create amazing machine learning experiences with Apple. Discover opportunities for researchers, students and developers by visiting our Work with us page.