Neural contextual biasing allows speech recognition models to leverage contextually relevant information, improving transcription accuracy. However, the biasing mechanism is typically based on a cross-attention module between the audio and a catalog of biasing entries, whose computational complexity can place severe practical limits on the size of the biasing catalog and, consequently, on the accuracy improvements. This work proposes an approximation of cross-attention scoring based on vector quantization that enables compute- and memory-efficient use of large biasing catalogs. We propose to use this technique jointly with a retrieval-based contextual biasing approach. First, an efficient quantized retrieval module shortlists biasing entries by grounding them in the audio. The retrieved entries are then used for biasing. Since the proposed approach is agnostic to the biasing method, we investigate using full cross-attention, LLM prompting, and a combination of the two. We show that retrieval-based shortlisting allows the system to efficiently leverage biasing catalogs of several thousand entries, resulting in up to a 71% relative error rate reduction in personal entity recognition. At the same time, the proposed approximation algorithm reduces compute time by 20% and memory usage by 85-95% for lists of up to one million entries, compared to standard dot-product cross-attention.
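To make the scoring idea concrete, below is a minimal sketch (not the paper's actual implementation) of vector-quantized approximation of dot-product cross-attention scores over a large biasing catalog: entry key vectors are clustered into a small codebook, a query is scored against the centroids only, and each entry inherits its centroid's score before shortlisting. All names, dimensions, and hyperparameters (`train_codebook`, `approx_scores`, 256 codes, a top-100 shortlist) are illustrative assumptions.

```python
import numpy as np

def _sq_dists(x, centroids):
    # ||x - c||^2 expanded as ||x||^2 - 2 x.c + ||c||^2, avoiding a
    # large (N, C, d) intermediate array.
    return ((x ** 2).sum(1, keepdims=True)
            - 2.0 * x @ centroids.T
            + (centroids ** 2).sum(1)[None, :])

def train_codebook(keys, num_codes=256, iters=10, seed=0):
    """Toy k-means: cluster the catalog's key vectors into a small codebook
    and record, for each entry, the index of its nearest centroid."""
    rng = np.random.default_rng(seed)
    codebook = keys[rng.choice(len(keys), num_codes, replace=False)].copy()
    for _ in range(iters):
        assign = _sq_dists(keys, codebook).argmin(axis=1)
        for c in range(num_codes):
            members = keys[assign == c]
            if len(members):
                codebook[c] = members.mean(axis=0)
    return codebook, _sq_dists(keys, codebook).argmin(axis=1)

def approx_scores(query, codebook, assign):
    """Approximate cross-attention scoring: take dot products against the
    C centroids only (O(C*d) instead of O(N*d) per query), then let every
    catalog entry inherit the score of its centroid."""
    code_scores = codebook @ query      # one score per centroid
    return code_scores[assign]          # broadcast back to all N entries

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    keys = rng.standard_normal((10_000, 64)).astype(np.float32)  # catalog keys
    query = rng.standard_normal(64).astype(np.float32)           # audio-side query
    codebook, assign = train_codebook(keys)
    shortlist = np.argsort(approx_scores(query, codebook, assign))[::-1][:100]
    # `shortlist` holds the entry indices handed to the downstream biasing
    # stage (full cross-attention, LLM prompting, or both).
```

In a sketch like this, only the small codebook and the integer assignments are needed at query time rather than the full key matrix, which is consistent with the kind of memory savings the abstract reports; a real system would presumably build the codebook offline and could exactly rescore the shortlisted entries downstream.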