The phrase “practice makes perfect” is usually reserved for humans, but it’s also a great maxim for robots newly deployed in unfamiliar environments.
Imagine a robot arriving at a warehouse. It comes equipped with the skills it was trained on, such as placing an object, and now it needs to pick items from a shelf it has never seen. At first, the machine struggles, since it needs to get acquainted with its new surroundings. To improve, the robot will need to figure out which skill within the overall task needs work, then specialize (or parameterize) that action.
A human could program the robot to optimize its performance, but researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and The AI Institute have developed a more effective alternative. Presented at the Robotics: Science and Systems Conference last month, their “Estimate, Extrapolate, and Situate” (EES) algorithm enables these machines to practice on their own, which could help them get better at useful tasks in factories, homes, and hospitals.
Assessing the situation
To help robots get better at tasks like sweeping floors, EES works with a vision system that locates and tracks the machine’s surroundings. The algorithm estimates how reliably the robot executes an action (such as sweeping) and whether practicing it further would be worthwhile. EES predicts how well the robot could perform the overall task if it refines that particular skill, and then it practices. After each attempt, the vision system checks whether the skill was performed correctly.
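In rough pseudocode terms, the loop might look something like the sketch below. This is a minimal illustration of the estimate/extrapolate/practice cycle, not the authors’ implementation; the names (`Skill`, `vision_check`, the task weights) and the simple success-rate competence estimate are all assumptions made for the sake of the example.

```python
# A minimal, illustrative sketch of an EES-style practice loop.
# All names here (Skill, vision_check, task_weights) are hypothetical
# stand-ins, not the authors' actual implementation.

import random
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    attempts: int = 0
    successes: int = 0

    def competence(self) -> float:
        # Estimate: how reliably the robot performs this skill,
        # via a simple success-rate estimate with a uniform prior.
        return (self.successes + 1) / (self.attempts + 2)

def extrapolate(skill: Skill, task_weight: float) -> float:
    # Extrapolate: predicted gain in overall task performance if
    # this skill were perfected, weighted by how much the task
    # depends on it.
    return task_weight * (1.0 - skill.competence())

def vision_check(skill: Skill) -> bool:
    # Placeholder for the perception pipeline that judges whether
    # a practice attempt succeeded.
    return random.random() < 0.6

def practice(skill: Skill) -> None:
    # Situate: run one practice attempt and record the outcome
    # reported by the vision system.
    skill.attempts += 1
    if vision_check(skill):
        skill.successes += 1

skills = [Skill("sweep"), Skill("pick"), Skill("place")]
task_weights = {"sweep": 0.5, "pick": 0.3, "place": 0.2}

for _ in range(20):  # a small practice budget
    # Choose the skill whose improvement is predicted to help most.
    best = max(skills, key=lambda s: extrapolate(s, task_weights[s.name]))
    practice(best)
```

The key idea the sketch tries to capture is the selection rule: practice effort goes to whichever skill’s improvement is predicted to raise overall task success the most.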
EES could prove useful in places like hospitals, factories, homes, or coffee shops. For example, if you wanted a robot to clean your living room, it would need help practicing skills like sweeping. However, according to Nishanth Kumar SM '24 and his colleagues, EES could help that robot improve without human intervention, using just a few practice trials.
“When we started this project, we asked ourselves whether this specialization would be possible with a reasonable number of samples on a real robot,” says Kumar, co-lead author of a paper describing the work. “Now we have an algorithm that enables robots to get meaningfully better at specific skills in a reasonable amount of time, with tens or hundreds of data points, an upgrade from the thousands or millions of samples that a standard reinforcement learning algorithm requires.”
See Spot sweep
EES’s knack for efficient learning became apparent when it was deployed on Spot, Boston Dynamics’ quadruped, during research trials at The AI Institute. The robot, which has an arm attached to its back, completed manipulation tasks after practicing for a few hours. In one demonstration, the robot learned to safely place a ball and a hoop on a tilted table in about three hours. In another, the algorithm guided the machine to improve at sweeping toys into a bin within about two hours. Both results appear to be an improvement over previous frameworks, which would likely have needed more than 10 hours per task.
“We wanted the robot to gather its own experience so it could better choose which strategies would work well in its implementation,” says co-lead author Tom Silver SM ’20, PhD ’24, an electrical engineering and computer science (EECS) alumnus and CSAIL affiliate who is now an assistant professor at Princeton University. “By focusing on what the robot knows, we sought to answer a key question: In the library of skills the robot has, which one would be most useful to practice right now?”
EES could help speed up autonomous robot practice in new deployment environments, but for now it has some limitations. For starters, the researchers used tables that were low to the ground, which made it easier for the robot to see the objects. Kumar and Silver also 3D-printed an attachable handle that made the brush easier for Spot to grab. The robot sometimes failed to detect certain items and identified objects in the wrong places, so the researchers counted those errors as failures.
Giving tasks to robots
The researchers note that the pace of practice in physical experiments could be accelerated further with the help of a simulator: instead of physically working through each skill autonomously, the robot could combine real and virtual practice. They also hope to make the system faster, with less latency, engineering a future version of EES to overcome the imaging delays the team experienced. Down the line, they may investigate an algorithm that reasons over sequences of practice attempts rather than planning which skills to hone one at a time.
“Allowing robots to learn on their own is both incredibly useful and extremely challenging,” says Danfei Xu, an assistant professor in Georgia Tech’s School of Interactive Computing and a research scientist at NVIDIA AI, who was not involved in this work. “In the future, home robots will be sold into all types of households and will be expected to perform a wide range of tasks. We can’t program everything they need to know in advance, so it’s essential that they can learn on the fly. However, letting robots explore and learn without guidance can be very slow and could have unintended consequences. Silver and colleagues’ research presents an algorithm that allows robots to practice their skills in an autonomous, structured way. This is a big step toward creating home robots that can continually evolve and improve on their own.”
Silver and Kumar’s co-authors are AI Institute researchers Stephen Proulx and Jennifer Barry, as well as four CSAIL members: Northeastern University PhD student and visiting scholar Linfeng Zhao, MIT EECS PhD student Willie McClinton, and MIT EECS professors Leslie Pack Kaelbling and Tomás Lozano-Pérez. Their work was funded, in part, by The AI Institute, the U.S. National Science Foundation, the U.S. Air Force Office of Scientific Research, the U.S. Office of Naval Research, the U.S. Army Research Office, and the MIT Quest for Intelligence, with high-performance computing resources from the MIT SuperCloud and the Lincoln Laboratory Supercomputing Center.