Entity disambiguation (ED), which links mentions of ambiguous entities to their referent entities in a knowledge base, is a core component of entity linking (EL). Existing generative approaches achieve higher accuracy than classification approaches on the standardized ZELDA benchmark. However, generative approaches require large-scale pre-training and are inefficient at generation time. Most importantly, entity descriptions, which can contain crucial information for distinguishing similar entities, are often overlooked. We propose an encoder-decoder model that disambiguates entities using more detailed entity descriptions. Given the text and candidate entities, the encoder learns the interactions between the text and each candidate entity and produces a representation for each candidate. The decoder then fuses the candidate representations and selects the correct entity. Experiments on several entity disambiguation benchmarks demonstrate the strong and robust performance of this model, notably a +1.5% improvement on the ZELDA benchmark compared to GENRE. Additionally, we integrate this approach into the retrieval/reader framework and observe a +1.5% improvement in end-to-end entity linking on the GERBIL benchmark compared to EntQA.
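
The data flow described above (encode each text-candidate pair separately, then fuse all candidate representations and select one) can be illustrated with a toy sketch. This is not the proposed model: the function names `encode_pair`, `fuse_and_select`, and `disambiguate` are hypothetical, token overlap stands in for the learned encoder, and a softmax over summed features stands in for the learned decoder. The example entities and descriptions are invented for illustration.

```python
import math
from typing import Dict, List, Tuple


def encode_pair(text: str, description: str) -> List[float]:
    """Toy stand-in for the encoder (assumption: token overlap, not a neural model).

    Produces one small feature vector per (text, candidate description) pair.
    """
    text_tokens = set(text.lower().split())
    desc_tokens = set(description.lower().split())
    overlap = len(text_tokens & desc_tokens)
    # Features: raw overlap count and overlap normalized by description length.
    return [float(overlap), overlap / max(len(desc_tokens), 1)]


def fuse_and_select(reps: Dict[str, List[float]]) -> Tuple[str, Dict[str, float]]:
    """Toy stand-in for the decoder (assumption: softmax over summed features).

    Sees all candidate representations at once, so each candidate's
    probability depends on every other candidate.
    """
    names = list(reps)
    scores = [sum(reps[name]) for name in names]
    max_score = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - max_score) for score in scores]
    total = sum(exps)
    probs = {name: e / total for name, e in zip(names, exps)}
    best = max(names, key=lambda name: probs[name])
    return best, probs


def disambiguate(text: str, candidates: Dict[str, str]) -> Tuple[str, Dict[str, float]]:
    """Encode each candidate with the text, then fuse and pick one entity."""
    reps = {name: encode_pair(text, desc) for name, desc in candidates.items()}
    return fuse_and_select(reps)


# Hypothetical example: two candidates for the ambiguous mention "Paris".
candidates = {
    "Paris (France)": "capital city of france on the seine",
    "Paris (Texas)": "city in lamar county texas united states",
}
text = "Paris is the capital of France"
best_entity, probabilities = disambiguate(text, candidates)
```

In this sketch the per-pair encoding is independent of the other candidates, while the selection step is joint; that separation is the only property it is meant to convey.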