Contextual Biasing for LLM-Based ASR with Hotword Retrieval and Reinforcement Learning

YuXiang Kong, JunFeng Hou, Jian Tang, Bingqing Zhu, Jicheng Zhang, Shaofei Xue

Dec 26, 2025

eess.AS

Abstract

Large language model (LLM)-based automatic speech recognition (ASR) has recently achieved strong performance across diverse tasks, yet contextual biasing for named entities and hotwords under large vocabularies remains challenging. In this work, we propose a scalable two-stage framework that integrates hotword retrieval with LLM-ASR adaptation. First, we extend the Global-Local Contrastive Language-Audio pre-trained model (GLCLAP) to retrieve a compact top-k set of hotword candidates from a large vocabulary via robustness-aware data augmentation and fuzzy matching. Second, we inject the retrieved candidates as textual prompts into an LLM-ASR model and fine-tune it with Generative Rejection-Based Policy Optimization (GRPO), using a task-driven reward that jointly optimizes hotword recognition and overall transcription accuracy. Experiments on hotword-focused test sets show substantial keyword error rate (KER) reductions while maintaining sentence accuracy on general ASR benchmarks, demonstrating the effectiveness of the proposed framework for large-vocabulary contextual biasing.
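
As a rough illustration of the two stages described in the abstract, the Python sketch below ranks hotwords by cosine similarity to an utterance embedding, builds a textual prompt from the top-k candidates, and computes a reward that mixes hotword recall with word-level accuracy. Everything in it is an assumption made for illustration: the function names, the prompt wording, the weight `alpha`, and the form of the reward are hypothetical and are not taken from the paper, and toy vectors stand in for GLCLAP embeddings.

```python
import math


def top_k_hotwords(audio_emb, hotword_embs, k=3):
    """Rank hotwords by cosine similarity to the audio embedding; keep the top k."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    ranked = sorted(hotword_embs,
                    key=lambda w: cosine(audio_emb, hotword_embs[w]),
                    reverse=True)
    return ranked[:k]


def build_prompt(candidates):
    """Inject retrieved candidates as a textual prompt (wording is hypothetical)."""
    return "Hotwords: " + ", ".join(candidates) + "\nTranscribe the audio."


def word_error_rate(ref, hyp):
    """Word-level Levenshtein distance divided by the reference length."""
    r, h = ref.split(), hyp.split()
    prev = list(range(len(h) + 1))
    for i, rw in enumerate(r, 1):
        cur = [i]
        for j, hw in enumerate(h, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (rw != hw)))  # substitution or match
        prev = cur
    return prev[-1] / max(len(r), 1)


def hotword_recall(hotwords, hyp):
    """Fraction of reference hotwords that appear verbatim in the hypothesis."""
    if not hotwords:
        return 1.0
    padded = f" {hyp} "
    hits = sum(1 for w in hotwords if f" {w} " in padded)
    return hits / len(hotwords)


def reward(ref, hyp, hotwords, alpha=0.5):
    """Assumed reward: weighted mix of hotword recall and (1 - WER)."""
    accuracy = 1.0 - min(word_error_rate(ref, hyp), 1.0)
    return alpha * hotword_recall(hotwords, hyp) + (1.0 - alpha) * accuracy


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real audio/text embeddings.
    vocab = {"alpha": [1.0, 0.0], "beta": [0.0, 1.0], "gamma": [0.9, 0.1]}
    candidates = top_k_hotwords([1.0, 0.0], vocab, k=2)
    print(build_prompt(candidates))
    print(reward("call doctor kong now", "call doctor king now", ["kong"]))
```

In a policy-optimization setup of the kind the abstract mentions, a scalar like this would score each sampled transcription; `1 - hotword_recall(...)` is one simple reading of a keyword error rate, though the paper's exact definition may differ.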

Cite This Paper

arXiv:2512.21828

Year: 2025

Category: eess.AS

APA

Kong, Y., Hou, J., Tang, J., Zhu, B., Zhang, J., & Xue, S. (2025). Contextual Biasing for LLM-Based ASR with Hotword Retrieval and Reinforcement Learning. arXiv preprint arXiv:2512.21828.

MLA

YuXiang Kong, JunFeng Hou, Jian Tang, Bingqing Zhu, Jicheng Zhang, and Shaofei Xue. "Contextual Biasing for LLM-Based ASR with Hotword Retrieval and Reinforcement Learning." arXiv preprint arXiv:2512.21828 (2025).