Large language models (LLMs) have demonstrated a strong ability to model multimodal signals, including audio and text, allowing a model to generate a spoken or textual response given a speech input. However, it remains challenging for such models to recognize personal named entities, such as contacts in a phone book, when the input modality is speech. In this work, we start with a speech recognition task and propose a retrieval-based solution to contextualize the LLM: we first let the LLM detect named entities in the speech without any context, then use the detected entities as queries to retrieve phonetically similar named entities from a personal database and feed them to the LLM, and finally perform context-aware LLM decoding. In a voice assistant task, our solution achieves a relative word error rate reduction of up to 30.2% and a relative named entity error rate reduction of up to 73.6% compared to a reference system without contextualization. Notably, our solution by design avoids prompting the LLM with the full named entity database, making it highly efficient and applicable to large named entity databases.
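To make the retrieval step concrete, the following is a minimal sketch of phonetic-similarity retrieval over a personal named entity database, assuming a crude string-based phonetic key in place of a real grapheme-to-phoneme front end; all function names, the contact list, and the value of k are illustrative assumptions rather than the system described above.

```python
# Hypothetical sketch: rank a personal named-entity database by phonetic
# similarity to the entity hypothesized by a context-free LLM pass, then
# return the top-k candidates to be fed back as decoding context.
from typing import List


def phonetic_key(name: str) -> str:
    """Very rough phonetic normalization (placeholder for a real G2P model)."""
    name = name.lower()
    for a, b in (("ph", "f"), ("ck", "k"), ("gh", "g"), ("qu", "kw")):
        name = name.replace(a, b)
    # Keep the leading character, drop later vowels and spaces (Soundex-like).
    return name[0] + "".join(c for c in name[1:] if c not in "aeiou ")


def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                   # deletion
                           cur[j - 1] + 1,                # insertion
                           prev[j - 1] + (ca != cb)))     # substitution
        prev = cur
    return prev[-1]


def retrieve_similar(query: str, database: List[str], k: int = 5) -> List[str]:
    """Return the k database entries phonetically closest to the query."""
    qk = phonetic_key(query)
    ranked = sorted(database, key=lambda name: edit_distance(qk, phonetic_key(name)))
    return ranked[:k]


if __name__ == "__main__":
    contacts = ["Katharine Meyer", "Catherine Maier", "John Smith", "Jon Schmidt"]
    # Entity hypothesized without context, possibly misrecognized.
    print(retrieve_similar("Catharine Mayer", contacts, k=2))
```

Because only the top-k retrieved candidates are passed to the LLM, the prompt length stays independent of the database size, which is what keeps the approach applicable to large contact lists.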