Recent technological advances have vastly improved the performance of artificial intelligence (AI) agents and models. One technique for creating AI models capable of solving a variety of problems is reinforcement learning (RL), a domain of machine learning in which agents take actions to maximize their accumulated reward. In other words, RL is driven by a reward function, and it is responsible for major advances in game-playing AI, such as AlphaGo, DeepMind’s Go-playing system. Despite their remarkable performance, RL agents rely on trial and error to find an effective strategy. This means that an algorithm may spend several years floundering around in the search space before it comes up with a winning formula. Such scenarios limit the application of reinforcement learning to real-world situations. Additionally, the performance improvements seen in AI agents often come at the cost of time, computational resources, and the large amounts of data required to train these models.
Current AI models are quite inefficient compared to humans, who can learn quickly through interaction, demonstration, and reading text documents such as instruction manuals. This observation gave a team of researchers at Carnegie Mellon University (CMU) an idea: drastically improve the learning speed of AI agents by having them read instruction manuals before attempting a challenge. Their approach consists of a Read and Reward framework that was used to train an AI agent to play an Atari video game. By reading the instructions, the agent was trained nearly 6,000 times faster than a leading-edge model developed by DeepMind.
Instruction manuals can be extremely helpful for understanding the valuable features and policies of a specific task environment and for informing the reader of any reward system. This served as the impetus for the CMU researchers to focus on teaching AI agents to learn policies for specific activities from human-written manuals, improving both their performance and their efficiency. Additionally, because Atari video games offer a controlled environment and a scoring system that can serve as the reward signal in reinforcement learning algorithms, they have long been a cherished benchmark for reinforcement learning research. Combining these observations, the CMU researchers introduced the Read and Reward framework, which speeds up RL algorithms in Atari games by reading the manuals published by Atari game developers.
The framework consists mainly of two modules. The first is a QA Extraction module, which extracts and summarizes important information from the game’s official instruction manual. The second, the Reasoning module, receives that information once the first module has extracted it. This module is a pre-trained language model with capabilities and size comparable to GPT-3, and it evaluates object-agent interactions by answering queries against the extracted manual data. The reinforcement learning algorithm then uses these answers to provide rewards beyond the game’s inherent scoring structure. These additional rewards enhance the reinforcement learning algorithm by helping it learn the game faster.
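The idea of adding manual-derived rewards on top of the game score can be illustrated with a minimal sketch. This is not the researchers' implementation: the Reasoning module is replaced here by a hard-coded lookup table instead of a language model, and every name in it (`MANUAL_ANSWERS`, `Interaction`, `auxiliary_reward`, `shaped_reward`), the example objects, and the bonus and penalty values are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the Reasoning module. In the article this is a
# large pre-trained language model queried against the extracted manual text;
# here it is a fixed table so the sketch runs without any model.
# True  = the (assumed) manual says this interaction helps the player.
# False = the (assumed) manual says this interaction hurts the player.
MANUAL_ANSWERS = {
    ("player", "flag"): True,
    ("player", "tree"): False,
}

# Assumed magnitudes for the extra reward; not taken from the article.
AUX_BONUS = 1.0
AUX_PENALTY = -1.0


@dataclass(frozen=True)
class Interaction:
    """An object-agent interaction detected in the current game frame."""
    agent: str
    obj: str


def auxiliary_reward(interaction: Interaction) -> float:
    """Map one interaction to a bonus, a penalty, or 0.0 if the manual is silent."""
    answer = MANUAL_ANSWERS.get((interaction.agent, interaction.obj))
    if answer is None:
        return 0.0
    return AUX_BONUS if answer else AUX_PENALTY


def shaped_reward(game_reward: float, interactions: list) -> float:
    """Combine the game's own score change with the manual-derived rewards."""
    return game_reward + sum(auxiliary_reward(i) for i in interactions)


if __name__ == "__main__":
    frame_interactions = [Interaction("player", "tree")]
    print(shaped_reward(0.0, frame_interactions))
```

In this sketch the RL algorithm would train on the value returned by `shaped_reward` rather than on the raw game score, so the agent receives feedback on an interaction even when the game itself awards no points for it.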
The researchers used Skiing, one of the most difficult Atari games for an AI to master, to test their strategy. In contrast to Agent57, the previous state-of-the-art agent, which needed 80 billion frames to perform as well as a human, the new method needed only 13 million frames to learn the game. However, it scored only about half as many points as the top method. Although the novel approach still falls short of the average person’s performance, it is far superior to a number of other reinforcement learning approaches that were completely unable to grasp the game’s concepts.
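The "nearly 6,000 times faster" figure quoted earlier can be checked against the frame counts above. The short calculation below assumes that "faster" refers to the number of game frames needed for training rather than to wall-clock time; the variable names are illustrative.

```python
# Frame counts as reported in the article.
AGENT57_FRAMES = 80_000_000_000       # 80 billion frames
READ_AND_REWARD_FRAMES = 13_000_000   # 13 million frames

# Ratio of training frames needed by the two approaches.
speedup = AGENT57_FRAMES / READ_AND_REWARD_FRAMES

print(f"{speedup:,.0f}x fewer frames")  # prints "6,154x fewer frames"
```

The ratio comes out at roughly 6,150, which is consistent with the rounded figure of about 6,000.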
The CMU researchers said their study is the first of its kind to show that a fully automated reinforcement learning framework can benefit from the instruction manuals of a widely known game. The team has already started testing 3D games such as Minecraft, where they have seen some encouraging results, and they hope that their approach can be extended to more complicated situations in future work. The research team also hopes that the AI community will see their work as a significant step forward in improving the efficiency of AI agents based on reinforcement learning.
Check out the Paper and Reference article. All credit for this research goes to the researchers of this project. Also, don’t forget to join our 15k+ ML SubReddit, Discord channel, and email newsletter, where we share the latest AI research news, exciting AI projects, and more.
Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Goa. She is passionate about the fields of machine learning, natural language processing, and web development. She likes to learn more about the technical field by participating in various challenges.