Teaching mobile robots to navigate complex outdoor environments is essential for real-world applications such as delivery or search and rescue. However, this is also a challenging problem, as the robot needs to perceive its environment and then explore to identify feasible paths to the goal. Another common challenge is that the robot needs to traverse uneven terrain, such as stairs, curbs, or rocks on a trail, while avoiding obstacles and pedestrians. In our previous work, we investigated the second challenge by teaching a quadruped robot to tackle challenging uneven obstacles and various outdoor terrains.
In “IndoorSim-to-OutdoorReal: Learning to Navigate Outdoors without any Outdoor Experience”, we present our recent work to address the robotic challenge of reasoning about the perceived environment to identify a viable navigation route in outdoor environments. We present a learning-based indoor-to-outdoor transfer algorithm that uses deep reinforcement learning to train a navigation policy in simulated indoor environments, and successfully transfers that same policy to real outdoor environments. We also introduce Context-Maps (maps with user-created environmental observations), which are applied to our algorithm to enable efficient long-range navigation. We show that with this policy, robots can successfully navigate hundreds of meters in novel outdoor environments, around never-before-seen outdoor obstacles (trees, bushes, buildings, pedestrians, etc.) and in different weather conditions (sunny, cloudy, dusk).
PointGoal Navigation
User input can tell a robot where to go with commands like “go to the Android statue”, pictures showing a target location, or by simply selecting a point on a map. In this work, we specify the navigation goal (a selected point on a map) as a coordinate relative to the robot’s current position (i.e., “go to ∆x, ∆y”); this is also known as the PointGoal visual navigation (PointNav) task. PointNav is a general formulation for navigation tasks and is one of the standard choices for indoor navigation tasks. However, due to the diverse visuals, uneven terrain, and long-distance goals in outdoor environments, training PointNav policies for outdoor environments is a challenging task.
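To make the goal specification concrete, here is a minimal sketch (not the authors’ code) of how a world-frame goal point can be expressed as the relative (∆x, ∆y) observation a PointNav policy consumes, given the robot’s current pose; the function name and pose convention are assumptions for illustration.

```python
import math

def pointgoal(robot_x, robot_y, robot_yaw, goal_x, goal_y):
    """Express a world-frame goal as (dx, dy) in the robot's own frame.

    robot_yaw is the robot's heading in radians; the returned vector is
    what a PointNav-style policy would receive as its goal observation.
    """
    # Offset from robot to goal in the world frame.
    dx_w, dy_w = goal_x - robot_x, goal_y - robot_y
    # Rotate that offset by -yaw to express it in the robot's frame.
    dx = math.cos(-robot_yaw) * dx_w - math.sin(-robot_yaw) * dy_w
    dy = math.sin(-robot_yaw) * dx_w + math.cos(-robot_yaw) * dy_w
    return dx, dy
```

For example, a goal five meters north of a robot facing north becomes a goal five meters straight ahead in the robot’s frame.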
Indoor to outdoor transfer
Recent successes in training wheeled and legged robotic agents to navigate indoor environments were enabled by the development of fast, scalable simulators and the availability of large-scale datasets of photorealistic 3D scans of indoor environments. To build on these successes, we develop an indoor-to-outdoor transfer technique that enables our robots to learn from simulated indoor environments and to be deployed in real outdoor environments.
To overcome the differences between simulated indoor environments and real outdoor environments, we apply kinematic control and image augmentation techniques in our learning system. When using kinematic control, we assume the existence of a reliable low-level locomotion controller that can accurately move the robot to a new location. This assumption allows us to directly move the robot to the target location during simulation training through forward Euler integration, and frees us from having to explicitly model the underlying robot dynamics in simulation, which drastically improves the throughput of simulation data generation. Prior work has shown that kinematic control can lead to better sim-to-real transfer compared to a dynamic control approach, where the full dynamics of the robot are modeled and a low-level locomotion controller is required to move the robot.
Left: kinematic control; right: dynamic control. |
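The forward Euler integration mentioned above can be sketched as follows for a unicycle-style velocity command; this is a minimal illustration of the idea, not the authors’ simulator code, and the time step and command interface are assumptions.

```python
import math

def euler_step(x, y, yaw, v, w, dt):
    """One forward-Euler kinematic update.

    v: commanded linear velocity (m/s), w: commanded angular velocity (rad/s).
    The robot is simply teleported along the commanded motion, with no
    dynamics model: this is what makes kinematic simulation so fast.
    """
    x += v * math.cos(yaw) * dt
    y += v * math.sin(yaw) * dt
    yaw += w * dt
    return x, y, yaw
```

During training the policy's velocity outputs are applied with such an update, under the assumption that a low-level locomotion controller can realize them on the real robot.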
For initial experiments, we created an outdoor maze-like environment using objects found indoors, in which we used a Boston Dynamics Spot robot for test navigation. We found that the robot was able to navigate around novel obstacles in the new outdoor environment.
The Spot robot successfully navigates around obstacles in the outdoor environment, with a policy trained entirely in simulated indoor environments. |
However, when faced with unfamiliar outdoor obstacles that were not seen during training, such as a steep slope, the robot was unable to navigate it.
The robot is unable to go up the slope, as slopes are rare in indoor environments and the robot was not trained to tackle them. |
To enable the robot to walk up and down slopes, we apply an image augmentation technique during simulation training. Specifically, we randomly tilt the robot’s simulated camera up or down by up to 30 degrees during training. This augmentation effectively makes the robot perceive slopes even though the floor is level. Training on these perceived slopes enables the robot to navigate slopes in the real world.
By randomly tilting the camera angle during simulation training, the robot can now go up and down slopes. |
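The tilt augmentation described above can be sketched as a per-episode pitch perturbation of the simulated camera; the sampling distribution and function name here are illustrative assumptions, with only the ±30-degree range taken from the text.

```python
import random

MAX_TILT_DEG = 30.0  # range stated in the text: up or down within 30 degrees

def sample_camera_pitch(rng=random):
    """Sample a random pitch offset (in degrees) for the simulated camera.

    Applied at the start of a training episode, this makes a level indoor
    floor appear tilted in the depth images, so the policy learns to handle
    slope-like observations. (Uniform sampling is an assumption here.)
    """
    return rng.uniform(-MAX_TILT_DEG, MAX_TILT_DEG)
```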
Since the robots were trained only in simulated indoor environments, in which they typically need to walk to a goal just a few meters away, we find that the learned network fails to process longer-range inputs; for example, the policy fails to advance toward a goal 100 meters away, even in an empty space. To enable the policy network to handle the long-range inputs that are common in outdoor navigation, we normalize the goal vector by using the log of the goal distance.
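One plausible form of this log normalization is sketched below: keep the goal direction, but compress the magnitude with a logarithm so that distant goals stay within the numeric range the network saw during indoor training. The exact formula used in the work may differ; this is an illustration of the idea.

```python
import math

def normalize_goal(dx, dy, eps=1e-6):
    """Rescale a goal vector so its magnitude is log(1 + distance).

    The direction is preserved, while a 100 m goal is compressed to a
    magnitude of ~4.6, comparable to the few-meter goals seen indoors.
    (A sketch; the paper's exact normalization is an assumption here.)
    """
    d = math.hypot(dx, dy)
    if d < eps:
        return 0.0, 0.0
    scale = math.log1p(d) / d
    return dx * scale, dy * scale
```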
Context maps for long-range complex navigation
Putting all the components together, the robot can navigate outdoors toward the goal, while walking on uneven terrain and avoiding trees, pedestrians, and other outdoor obstacles. However, one key component is still missing: the robot’s ability to plan an efficient long-range route. At this scale of navigation, taking a wrong turn and backtracking can be costly. For example, we find that the local exploration strategy learned by standard PointNav policies is insufficient for finding a long-range goal and usually leads to a dead end (shown below). This is because the robot is navigating without context of its environment, and the optimal path may not be visible to the robot from the start.
Navigation policies without environmental context do not handle complex, long-range navigation goals. |
To enable the robot to take context into account and deliberately plan an efficient route, we provide a Context-Map (a binary image that represents a top-down occupancy map of the region the robot is within) as an additional observation for the robot. An example Context-Map is given below, where the black region denotes areas occupied by obstacles and the white region is traversable by the robot. The green and red circles indicate the start and goal locations of the navigation task. Through the Context-Map, we can provide hints to the robot (e.g., the narrow opening in the route ahead) to help it plan an efficient navigation route. In our experiments, we create the Context-Map for each route guided by Google Maps satellite images. We denote this variant of PointNav with environmental context as Context-Guided PointNav.
Example Context-Map (right) for a navigation task (left). |
It is important to note that the Context-Map does not need to be accurate, because it only serves as a rough outline for planning. During navigation, the robot still needs to rely on its onboard cameras to identify pedestrians, who are absent from the map, and to adapt its path accordingly. In our experiments, a human operator quickly sketches the Context-Map from the satellite image, masking out the regions to be avoided. This Context-Map, together with other onboard sensory inputs, including depth images and the relative position to the goal, are fed into a neural network with attention models (i.e., transformers), which are trained using DD-PPO, a distributed implementation of proximal policy optimization, in large-scale simulation.
The Context-Guided PointNav architecture consists of a 3-layer convolutional neural network (CNN) to process depth images from the robot’s camera, and a multilayer perceptron (MLP) to process the goal vector. The features are passed into a gated recurrent unit (GRU). We use an additional CNN encoder to process the context map (top-down map), compute the scaled dot-product attention between the map and the depth image, and use a second GRU to process the attended features (Context Attn., Depth Attn.). The output of the policy is the linear and angular velocities for the Spot robot to follow. |
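The scaled dot-product attention mentioned in the architecture can be illustrated with a minimal, dependency-free sketch: depth features act as queries against map features as keys/values, producing attended map features. This is an illustration of the standard attention operation, not the authors’ network code, and the toy feature vectors are assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def scaled_dot_product_attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors.

    For each query, scores against all keys are scaled by sqrt(dim),
    softmaxed into weights, and used to average the value vectors.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        attended = [sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))]
        out.append(attended)
    return out
```

In the architecture above, such attention lets the depth stream selectively read the parts of the encoded context map that are relevant to what the robot currently sees.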
Results
We tested our system on three long-range outdoor navigation tasks. The provided Context-Maps are rough, incomplete outlines of the environment that omit obstacles such as cars, trees, or chairs.
With the proposed algorithm, our robot can successfully reach the distant target location 100% of the time, without a single collision or human intervention. The robot was able to navigate between pedestrians and real-world clutter not present in the context map, and navigate various terrains, including dirt and grass slopes.
Route 1
Route 2
Route 3
Conclusion
This work opens up robotic navigation research to the less explored domain of diverse outdoor environments. Our indoor-to-outdoor transfer algorithm uses zero real-world experience and does not require the simulator to model predominantly outdoor phenomena (terrain, ditches, sidewalks, cars, etc.). The success of the approach comes from a combination of a robust locomotion controller, a low sim-to-real gap in depth and map sensors, and large-scale training in simulation. We demonstrate that providing robots with rough, high-level maps can enable long-range navigation in novel outdoor environments. Our results provide compelling evidence for challenging the (admittedly reasonable) hypothesis that a new simulator must be designed for every new scenario we wish to study. For more information, see our project page.
Acknowledgements
We would like to thank Sonia Chernova, Tingnan Zhang, April Zitkovich, Dhruv Batra, and Jie Tan for advising and contributing to the project. We would also like to thank Naoki Yokoyama, Nubby Lee, Diego Reyes, Ben Jyenis, and Gus Kouretas for their help with setting up the robot experiment.