Contextualizing ASR with LLM using phonetic retrieval-based augmentation
Large language models (LLMs) have demonstrated excellent ability to model multimodal signals, including audio and text, allowing the model to ...
Large language models (LLMs) have demonstrated excellent ability to model multimodal signals, including audio and text, allowing the model to ...
End-to-end (E2E) neural networks have emerged as flexible and accurate models for multilingual automatic speech recognition (ASR). However, as the ...
This paper was accepted into the IEEE Spoken Language technology (SLT) 2024 Workshop. In this paper, we propose an algorithm ...
*Equal taxpayers Parameter efficient fine-tuning (PEFT) for customizing automatic speech recognition (ASR) has recently shown promise for adapting general population ...
While automatic speech recognition (ASR) systems are widely used in many real-world applications, they often do not generalize well to ...
Podcasting has become a popular and powerful medium for storytelling, news and entertainment. Without transcriptions, podcasts may be inaccessible to ...
The evolution of technology in speech recognition has been marked by significant advances, but challenges such as latency (the delay ...
With the help of creative engineering and learning in context, large language models (LLMs) are known to generalize well to ...
In this article, we begin by training end-to-end automatic speech recognition (ASR) models using federated learning (FL) and examining the ...
Amazon Transcribe is a fully managed automatic speech recognition (ASR) service that makes it easy for you to add speech-to-text ...