Reference resolution is an important problem, essential for understanding and successfully handling context of different types. This context includes both previous turns and context pertaining to non-conversational entities, such as entities on the user’s screen or those running in the background. While LLMs have been shown to be extremely powerful for a variety of tasks, they remain underutilized for reference resolution, particularly for non-conversational entities. This paper demonstrates how LLMs can be used to build an efficient system for resolving references of various types, showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities, such as those displayed on screen, that are traditionally not conducive to being reduced to a text-only modality. We demonstrate large improvements over an existing system with similar functionality across different reference types, with our smallest model achieving absolute gains of over 5% for on-screen references. We also benchmark against GPT-3.5 and GPT-4: our smallest model achieves performance comparable to GPT-4, and our larger models substantially outperform it.