(A version of this article first appeared in TechCrunch’s robotics newsletter, Actuator. Subscribe here.)
The topic of generative AI comes up frequently in my newsletter, Actuator. I admit that a few months ago I was a little hesitant to spend more time on the subject. Anyone who has been reporting on technology for as long as I have has lived through countless hype cycles and been burned before. Reporting on technology requires a healthy dose of skepticism, hopefully tempered by some enthusiasm about what can actually be done.
This time around, it seemed as though generative AI was waiting in the wings, biding its time until the inevitable cratering of crypto. As the blood drained from that category, projects like ChatGPT and DALL-E stood ready to become the focus of breathless reporting, hope, criticism, doomerism and all of the other Kübler-Rossian stages of the tech hype bubble.
Those who follow my writing know that I was never particularly bullish on crypto. Things are different with generative AI, however. For starters, there is near-universal agreement that AI/ML more broadly will play an increasingly central role in our lives going forward.
Smartphones offer some useful insight here. Computational photography is something I write about with some regularity. There have been great strides on that front in recent years, and I think many manufacturers have finally struck a good balance between hardware and software, both improving the final product and lowering the barrier to entry. Google, for example, pulls off some genuinely impressive tricks with editing features like Best Take and Magic Eraser.
Sure, they’re cool tricks, but they’re also useful, rather than being features for the sake of features. Going forward, though, the real trick will be integrating them seamlessly into the experience. In the ideal future workflow, most users will have little or no notion of what’s happening behind the scenes. They’ll just be happy that it works. It’s the classic Apple playbook.
Generative AI offers a similar “wow” effect right out of the gate, which is another way it differs from its hype-cycle predecessor. When your less tech-savvy relative can sit down at a computer, type a few words into a dialogue field and watch the black box spit out paintings and short stories, not much conceptualizing is required. That’s a big part of why all of this caught on as quickly as it did: most of the time, when everyday people are presented with cutting-edge technologies, they have to imagine what those will look like in five or ten years.
With ChatGPT, DALL-E and the rest, you can experience it firsthand right now. Of course, the flip side of that coin is how difficult it becomes to temper expectations. Much as people are inclined to imbue robots with human or animal intelligence, without a foundational understanding of AI it’s easy to project intentionality here. But that’s how things go now. We lead with the attention-grabbing headline and hope people stick around long enough to read about the machinations behind it.
Spoiler alert: nine times out of 10 they won’t, and suddenly we’re spending months or years trying to bring things back down to reality.
One of the great things about my job is getting to discuss these things with people much smarter than I am. They take the time to explain things, and I hope I do a decent job translating that for readers (some attempts are more successful than others).
Once it became clear that generative AI has an important role to play in the future of robotics, I started finding ways to work questions into conversations. I find that most people in the field agree with that statement, and it’s fascinating to see the breadth of impact they think it will have.
For example, in my recent conversation with Marc Raibert and Gill Pratt, the latter explained the role generative AI plays in their approach to robot learning:
We’ve figured out how to do one thing, which is to use modern generative AI techniques that allow human demonstration of both position and force to essentially teach a robot from just a handful of examples. The code isn’t changed at all. What this is based on is something called diffusion policy. It’s work we did in collaboration with Columbia and MIT. We’ve taught 60 different skills so far.
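For the curious, diffusion policy treats action prediction as a denoising problem: the model learns to turn random noise into a short sequence of robot actions, conditioned on what the robot observes. The sketch below shows roughly what one training step of that idea looks like; the network architecture, dimensions and hyperparameters are made up for illustration and are not TRI’s actual implementation.

```python
import torch
import torch.nn as nn

# Illustrative sketch of a diffusion-policy-style training step.
# Assumes a small dataset of (observation, action_sequence) pairs collected
# from human demonstrations; all shapes and the network are placeholders.

OBS_DIM, ACT_DIM, HORIZON, T = 32, 7, 16, 100  # hypothetical sizes

class NoisePredictor(nn.Module):
    """Given a noisy action sequence, the observation, and the diffusion
    timestep, predict the noise that was added."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HORIZON * ACT_DIM + OBS_DIM + 1, 256),
            nn.ReLU(),
            nn.Linear(256, HORIZON * ACT_DIM),
        )

    def forward(self, noisy_actions, obs, t):
        x = torch.cat([noisy_actions.flatten(1), obs,
                       t.float().unsqueeze(1) / T], dim=1)
        return self.net(x).view(-1, HORIZON, ACT_DIM)

model = NoisePredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def train_step(obs, actions):
    """One denoising-diffusion training step on demonstrated action sequences."""
    t = torch.randint(0, T, (obs.shape[0],))
    noise = torch.randn_like(actions)
    a_bar = alphas_bar[t].view(-1, 1, 1)
    noisy = a_bar.sqrt() * actions + (1 - a_bar).sqrt() * noise  # forward diffusion
    pred = model(noisy, obs, t)
    loss = ((pred - noise) ** 2).mean()      # learn to predict the added noise
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

At deployment, the trained model runs that process in reverse, denoising random samples into action sequences, which is part of why a handful of demonstrations can go a surprisingly long way.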
Last week, when I asked Nvidia’s VP and GM of Embedded and Edge Computing, Deepu Talla, why the company believes generative AI is more than a fad, he told me:
I think the results speak for themselves. You can already see the productivity improvement. It can compose an email for me. It’s not exactly right, but I don’t have to start from scratch. It’s giving me 70%. There are obvious things you can already see that are definitely a step-function improvement over how things were before. Summarizing something isn’t perfect. I’m not going to let it read and summarize for me. So, you can already see some signs of productivity improvements.
Meanwhile, in my most recent conversation with Daniela Rus, the MIT CSAIL director explained how researchers are using generative AI to actually design robots:
It turns out that generative AI can be quite powerful for solving even motion-planning problems. You can get much faster solutions, and much more fluid, human-like control, than with model-predictive solutions. I think that’s very powerful, because the robots of the future will be much less robotic. They will be much more fluid and human-like in their movements.
We have also used generative AI for design. This is very powerful. It’s also very interesting, because it’s not just pattern generation for robots. You have to do something else. You can’t just generate a pattern based on data. The machines have to make sense in the context of physics and the physical world. For that reason, we connect them to a physics-based simulation engine to make sure the designs meet the required constraints.
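The pattern Rus describes is essentially generate-then-verify: a generative model proposes candidate designs, and a physics check throws out anything that can’t exist in the real world. Here’s a toy sketch of that loop, where `propose_designs` and `simulate` are made-up stand-ins rather than any lab’s actual pipeline:

```python
import random
from dataclasses import dataclass

# Toy generate-then-verify loop: a "generative model" proposes candidate
# robot designs and a crude "physics check" filters out infeasible ones.

@dataclass
class Design:
    link_lengths: list[float]   # meters
    motor_torque: float         # N·m

def propose_designs(n: int) -> list[Design]:
    """Placeholder for a generative model sampling candidate designs."""
    return [Design([random.uniform(0.1, 0.5) for _ in range(4)],
                   random.uniform(5.0, 60.0)) for _ in range(n)]

def simulate(d: Design) -> dict:
    """Placeholder for a physics engine: return rough feasibility metrics."""
    mass = sum(d.link_lengths) * 2.0                      # toy mass model (kg)
    required_torque = mass * 9.81 * max(d.link_lengths)   # crude static estimate
    return {"mass": mass, "torque_ok": d.motor_torque >= required_torque}

def design_robots(n: int = 64) -> list[Design]:
    feasible = []
    for d in propose_designs(n):
        metrics = simulate(d)
        # Keep only candidates that "make sense in the context of physics."
        if metrics["torque_ok"] and metrics["mass"] < 5.0:
            feasible.append(d)
    return feasible

if __name__ == "__main__":
    kept = design_robots()
    print(f"{len(kept)} of 64 candidate designs passed the physics check")
```

The interesting engineering lives in the two placeholders, of course, but the shape of the loop is the point: the generator dreams, the simulator vetoes.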
This week, a team out of Northwestern University released its own research into AI-generated robot design. The researchers showed how their system designed a “successfully walking robot” in a matter of seconds. It’s not much to look at, as these things go, but it’s easy enough to see how, with additional research, the approach could be used to create more complex systems.
“We discovered a very fast AI-driven design algorithm that bypasses the bottlenecks of evolution, without falling back on the bias of human designers,” said research lead Sam Kriegman. “We told the AI that we wanted a robot that could walk across land. Then we simply pressed a button and that was it! It generated a blueprint for a robot in the blink of an eye that looks nothing like any animal that has ever walked the earth. I call this process ‘instant evolution.’”
It was the AI program’s decision to put legs on the little soft robot. “It’s interesting because we didn’t tell the AI that a robot should have legs,” Kriegman added. “It rediscovered that legs are a good way to move on land. Legged locomotion is, in fact, the most efficient form of terrestrial movement.”
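To make the contrast with evolution concrete: rather than mutating and selecting designs over many generations, you can optimize a design directly against a simulated objective. The sketch below is a toy illustration of that general idea only, with a made-up differentiable “simulator”; it is not the Northwestern team’s actual system.

```python
import torch

# Toy contrast with evolutionary search: optimize design parameters directly
# by gradient descent against a (made-up) differentiable simulation objective.

design = torch.rand(16, requires_grad=True)  # e.g. per-voxel material values

def simulated_distance_walked(d: torch.Tensor) -> torch.Tensor:
    """Placeholder differentiable objective: reward asymmetric, limb-like designs."""
    left, right = d[:8], d[8:]
    return (left - right).abs().sum() - 0.1 * d.pow(2).sum()

opt = torch.optim.Adam([design], lr=0.05)
for step in range(200):                          # seconds of compute, not generations
    opt.zero_grad()
    loss = -simulated_distance_walked(design)    # maximize distance walked
    loss.backward()
    opt.step()

print("optimized design:", design.detach())
```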
“From my perspective, generative AI and physical/robotic automation are what are going to change everything we know about life on Earth,” Formant founder and CEO Jeff Linnell told me this week. “I think we’re all aware of the fact that AI is here, and we expect that all of our jobs, every company and every student will be affected. I think it’s symbiotic with robotics. You won’t have to program a robot. You’re going to talk to the robot in English, request an action, and it will figure it out. It’s going to take a minute for that.”
Prior to Formant, Linnell founded and served as CEO of Bot & Dolly. The San Francisco-based company, best known for its work on the film Gravity, was absorbed into Google in 2013, when the software giant set out to accelerate the industry (best-laid plans and all that). The executive tells me his big takeaway from that experience is that it’s all about the software (given DeepMind’s subsequent takeover of Intrinsic and Everyday Robots, I’m inclined to say Google agrees).