Contrastive Learning from AI Revisions (CLAIR): A new approach to address underspecification in AI model alignment with Anchored Preference Optimization (APO)
The development of artificial intelligence (AI), particularly large language models (LLMs), is focused on aligning these models with human preferences ...