Contextualizing ASR with LLM using phonetic retrieval-based augmentation
Large language models (LLMs) have demonstrated excellent ability to model multimodal signals, including audio and text, allowing the model to ...