What would it take for AI to operate robots? : Short Wave : NPR

REGINA BARBER: You’re listening to Short Wave from NPR. Hey, Short Wavers, Regina Barber here. It seems like artificial intelligence is everywhere in our virtual lives. It’s in our search results, our phones. It’s trying to read my emails. But NPR science correspondent Geoff Brumfiel has noticed that AI isn’t just showing up online anymore. It’s starting to creep into reality.
GEOFF BRUMFIEL: Yep. I don’t know if you tuned in for Tesla’s big marketing event last year, Regina.
BARBER: No.
BRUMFIEL: But AI was there.
[AUDIO PLAYBACK]
ELON MUSK: Speaking of robots.
[END PLAYBACK]
BRUMFIEL: Tesla is obviously a car company.
BARBER: Yep.
BRUMFIEL: But Elon Musk, Tesla’s CEO, made a big part of the event about a humanoid robot powered by AI and called Optimus.
[AUDIO PLAYBACK]
MUSK: The software, the AI inference computer, it all actually applies to a humanoid robot.
[END PLAYBACK]
BRUMFIEL: And Google just unveiled another humanoid robot that operates using AI.
[AUDIO PLAYBACK]
ROBOT VOICE: We’re bringing Gemini 2.0’s intelligence to general purpose robotic agents in the physical world.
[END PLAYBACK]
BARBER: OK, Geoff, but even before AI came along, people and companies have been making big claims about robots.
BRUMFIEL: They have. They have. And the robots, as I’m sure you know, Gina, have always disappointed compared to the vision.
BARBER: Yeah, that’s true.
BRUMFIEL: And that’s why I set out to understand the truth about AI and robotics.
BARBER: The truth.
BRUMFIEL: And I think I kind of found it in a bowl of trail mix.
[MUSIC PLAYING]
BARBER: Today on the show, what happens when artificial intelligence moves out of the chat and into the real world.
BRUMFIEL: We’re looking at how AI could maybe revolutionize robotics.
BARBER: You’re listening to Short Wave, the science podcast from NPR.
[MUSIC PLAYING]
BARBER: OK, so Geoff. You were interested in finding out more about how AI works in robots. Where did you start?
BRUMFIEL: Well, I didn’t go to Tesla or Google, but I did drive right by them on my way to Stanford University.
BARBER: OK.
BRUMFIEL: And specifically the IRIS Laboratory, which stands for Intelligence through Robotic Interaction at Scale. I got a tour from a graduate student named Moo Jin Kim. Moo Jin works on a new kind of robot powered by AI, similar to the AI used in chatbots.
[AUDIO PLAYBACK]
MOO JIN KIM: It’s one step in the direction of, like, ChatGPT for robotics, but still a lot of work to do.
BRUMFIEL: OK. All right, well, you want to show me how it– show me what it can do?
KIM: For sure.
[END PLAYBACK]
BARBER: So, Geoff, what did the robot look like?
BRUMFIEL: Well, this wasn’t some humanoid robot that the big tech companies are rolling out. It’s just a pair of mechanical arms with pinchers.
BARBER: OK.
BRUMFIEL: But what made it interesting was that it’s powered by an AI model called OpenVLA. So, first, we should probably just say quickly, you know, a regular robot must be very, very carefully programmed. An engineer has to write detailed instructions for every task you want it to perform.
BARBER: Yeah, and AI is supposed to change that.
BRUMFIEL: Exactly. And that’s what’s going on here. This robot is powered by a teachable AI neural network. The neural network operates kind of how scientists think the human brain might work. Basically, there are these mathematical nodes in the network that have billions of connections to each other in a way similar to how neurons in the brain are connected together. And so when you go to program this sort of thing, it’s simply about reinforcing the connections that matter between the nodes and weakening the other ones that don’t. So in practice, this means Moo Jin can just teach OpenVLA a task by showing it.
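The idea Geoff describes, strengthening the connections that matter and weakening the rest, is the heart of how a neural network trains. Here is a toy sketch of that process boiled down to a single connection weight nudged toward less error after each example. This is an illustration of the principle only, not the OpenVLA code.

```python
# Toy sketch of "reinforcing the connections that matter": one weight,
# nudged after each example to reduce prediction error.
# Illustration of the training principle, not the actual OpenVLA code.

def train_weight(examples, steps=1000, lr=0.1):
    w = 0.0  # the "connection strength," starting weak
    for _ in range(steps):
        for x, target in examples:
            pred = w * x            # the node's output for this input
            error = pred - target   # how wrong it was
            w -= lr * error * x     # strengthen or weaken the connection
    return w

# Teach the weight that the output should be twice the input.
w = train_weight([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(round(w, 3))  # converges near 2.0
```

A real robot model like OpenVLA does this same nudging across billions of weights at once, using many recorded demonstrations instead of three number pairs.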
[AUDIO PLAYBACK]
KIM: So basically, whatever task you want to do, you just keep doing it over and over, maybe like 50 times or 100 times.
[END PLAYBACK]
BRUMFIEL: The robot’s AI neural network becomes tuned to that task, and then it can do it by itself.
BARBER: Yeah, it makes me think of this, like, smiling robot story we did. And that robot just watched like a lot of videos of people smiling, then it learned how to do it.
BRUMFIEL: Yeah, it’s exactly the same thing, except instead of just smiling, this robot’s actually doing stuff.
BARBER: Right.
BRUMFIEL: So to show me, Moo Jin brought out a tray of different kinds of trail mix, and I typed in what I wanted it to do.
[AUDIO PLAYBACK]
BRUMFIEL: OK, so scoop some green ones with the nuts into the bowl.
BARBER: Oh my gosh.
BRUMFIEL: See what happens.
[END PLAYBACK]
BARBER: OK. So Geoff, personally, I’ve been waiting for something like AI in robotics because you can teach it to do something. You can ask it to do something, to, like, make me an ice cream sundae or something, without, like, any fancy programming or special knowledge.
BRUMFIEL: That’s exactly it, you know. And this really is the dream of the researcher who runs this laboratory. Her name is Chelsea Finn.
[AUDIO PLAYBACK]
CHELSEA FINN: So in the long term, we want to develop software that would allow the robots to operate intelligently in any situation.
[END PLAYBACK]
BRUMFIEL: And by intelligently, she means the robot could understand a simple command, like scoop some green ones into a bowl or make me a sundae, and then execute in the real world.
[AUDIO PLAYBACK]
FINN: Even just to do very basic things, like being able to make a sandwich or being able to clean a kitchen or being able to restock grocery store shelves.
[END PLAYBACK]
BRUMFIEL: These are simple tasks that could help humans do their jobs or do tasks at home. Now, Chelsea also has co-founded a startup called Physical Intelligence. It recently demonstrated a mobile robot that could take laundry out of a dryer and fold it. Again, this robot was taught by humans training its powerful AI program.
BARBER: OK, so ice cream sundaes, is that too advanced? Is folding an easier start?
BRUMFIEL: I mean, I’d actually argue, Gina, that folding is harder.
BARBER: OK.
BRUMFIEL: Let me show you a video.
BARBER: OK. It’s going to the dryer. It’s pulling stuff out, putting it in a basket. It has the concentration I have when I’m going to do laundry. It almost looks like annoyed with folding, like I do. Oh my god, it’s doing really well, actually.
BRUMFIEL: Yes, it is, right? And this is a complicated task. It’s got to pull these clothes out. It’s got to figure out what they are and–
BARBER: It doesn’t even have a head, but I’m like giving it personality. It looks like it’s, like, oh, I just got to fold another one.
BRUMFIEL: [LAUGHS]
BARBER: OK. So is it really as simple as like just teaching a robot like what to do? Because if it was, wouldn’t these robots be everywhere?
BRUMFIEL: Yeah, I mean, right? It looks cool on the video. The truth is that, you know, when you get out and these robots are trying to do these tasks over and over again, they get confused. They misunderstand. They make mistakes, and they just get stuck.
BARBER: OK.
BRUMFIEL: So it might be able to fold laundry 90% of the time or 75% of the time. But the rest of the time, it’s going to make a big mess that then a human has to get in there and clean up.
BARBER: Got it. OK.
BRUMFIEL: I spoke to Ken Goldberg, a professor at the University of California at Berkeley, and he is pretty emphatic that AI-powered robots aren’t here yet.
[AUDIO PLAYBACK]
KEN GOLDBERG: Robots are not going to suddenly become the science fiction dream overnight.
[END PLAYBACK]
BARBER: OK, so like, tell me why, because, like, AI chatbots have gotten, like, way better, super fast. So why are these robots getting stuck?
BRUMFIEL: OK. So it’s true that AI has improved massively over the past couple years, but that’s because chatbots have a huge amount of data to learn from. They’ve taken basically the entire internet to train themselves how to write sentences and draw pictures. But Ken says–
[AUDIO PLAYBACK]
GOLDBERG: For robotics, there’s nothing. We don’t have anything to start with, right? There’s no examples online of robot commands being generated in response to robot inputs.
[END PLAYBACK]
BRUMFIEL: And if robots really need as much training data as their virtual chatbot friends, then having humans teach them one task at a time is going to take a really long time.
[AUDIO PLAYBACK]
GOLDBERG: You know, at this current rate, we’re going to take 100,000 years to get that much data.
[END PLAYBACK]
BARBER: What? OK, that’s so long. Like, are there any alternatives? There must be.
BRUMFIEL: Yeah, well, scientists are exploring them right now. And one might be to let the AI brain of the robot learn in a simulation. A researcher who’s trying this is a guy named Pulkit Agrawal. He’s at the Massachusetts Institute of Technology.
[AUDIO PLAYBACK]
PULKIT AGRAWAL: The power of simulation is that we can collect, you know, very large amounts of data. For example, in three hours’ worth of simulation, we can collect 100 days’ worth of data.
[END PLAYBACK]
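Agrawal’s numbers imply just how much simulation compresses training time: 100 days of robot experience in three hours of wall-clock time is a speedup factor of about 800, which simulators achieve by running faster than real time and running many copies in parallel. A quick check of that arithmetic:

```python
# The speedup implied by "100 days' worth of data in three hours of simulation."
sim_hours = 3
experience_hours = 100 * 24  # 100 days of robot experience, in hours
speedup = experience_hours / sim_hours
print(speedup)  # 800.0
```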
BRUMFIEL: So this is a really promising approach for some things, but it’s much more of a challenge for others. So, for example, let’s talk about walking. When you’re just dealing with the earth and your body, the physics of walking around is actually kind of simple.
[AUDIO PLAYBACK]
AGRAWAL: When you’re doing locomotion, you know, you’re mostly on Earth. There’s no amount of force we can apply which will make the Earth move.
[END PLAYBACK]
BRUMFIEL: And so the simulation can do that reasonably well. But if you want your robot to, say, try and pick up a mug off a desk or something, that’s a lot more complicated.
BARBER: More forces.
[AUDIO PLAYBACK]
AGRAWAL: If you apply the wrong forces, these objects can fly away very quickly.
[END PLAYBACK]
BRUMFIEL: Basically, your robot will fling things across the room if it doesn’t understand the weight and the size of what it’s carrying. And there’s more. You know, if your robot encounters anything that you haven’t simulated 100% perfectly, then it won’t know what to do. It’ll just break.
BARBER: OK, so it sounds like these, like, simulations have limits, and real-world training is going to take, like, a while. I can begin to see why AI robots aren’t going to, like, be here tomorrow.
BRUMFIEL: Exactly. And some researchers think there are even deeper problems, actually, with trying to put AI into robotics. One of them is Matthew Johnson-Roberson at Carnegie Mellon University in Pittsburgh.
[AUDIO PLAYBACK]
MATTHEW JOHNSON-ROBERSON: In my mind, the question is not, do we have enough data. It is more, what is the framing of the problem?
[END PLAYBACK]
BRUMFIEL: So getting back to chatbots for a minute, Matt says for all their incredible skills, the task we’re asking them to do is actually relatively simple. You know, you look at what a human user types and then try to predict the next words that user wants to see. Robots have so much more that they’re going to have to do than just compose a sentence.
BARBER: Right.
[AUDIO PLAYBACK]
JOHNSON-ROBERSON: Next-best-word prediction works really well, and it’s a very simple problem because you’re just predicting the next word. And it is not clear right now I can take 20 hours of GoPro footage and then produce anything sensible with respect to how a robot moves around in the world.
[END PLAYBACK]
BRUMFIEL: So in other words, the sci-fi tasks that we want our robots to do are so complicated compared to sentence writing, no amount of data may be enough unless researchers can find the right way to teach the robots.
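The “next-best-word prediction” Johnson-Roberson calls a simple problem really can be sketched in a few lines: count which word follows which in training text, then predict the most common successor. This is a toy bigram model, nothing like a production chatbot, but the framing of the problem is the same.

```python
from collections import Counter, defaultdict

# Toy next-word prediction: count which word follows which in the
# training text, then predict the most frequent successor.
# A real chatbot does this over vast data with a neural network,
# but the task framing is the same.

def train_bigrams(text):
    words = text.split()
    successors = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        successors[current][following] += 1
    return successors

def predict_next(successors, word):
    if word not in successors:
        return None
    return successors[word].most_common(1)[0][0]

model = train_bigrams("the robot picks the scoop and the robot pours")
print(predict_next(model, "the"))  # → robot
```

There is no equivalent pile of “robot commands paired with robot inputs” to count over, which is Johnson-Roberson’s point: the data and the task framing that make this trick work for text don’t yet exist for robot motion.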
BARBER: Or have the robots teach the robots.
BRUMFIEL: Yes, that’s also an option. They can teach themselves.
BARBER: OK. So Geoff, you’ve taken me from, like, optimist to pessimist. It’s the road I take every day.
[LAUGHTER]
BARBER: I’m starting to think that AI is, like, never going to work that well in robots, or, like, it’s going to be a really long time.
BRUMFIEL: You know, I’m sorry if I’ve, like, turned you into a pessimist here, Gina, and then–
BARBER: It happens.
BRUMFIEL: And I’m going to have to sort of whipsaw you back, because AI is already finding its way into robotics in ways that are really interesting. So for example, Ken Goldberg has co-founded a package-sorting company. And just this year, they started using AI image recognition to pick the best points for their robots to grab the packages.
BARBER: Ooh, OK.
BRUMFIEL: Yeah, and it’s working really well, he told me. And I think we’re going to see a lot of that, AI being used for parts of the robotic problem, you know, walking or vision or whatever. It’s going to make big progress. It just may not arrive everywhere all at once. And to really end on a high note here, let’s get back to that Stanford lab. Remember, I asked it to grab some trail mix, right?
BARBER: Yeah, yeah.
BRUMFIEL: So the robot correctly identified the right bin, to Moo Jin Kim’s relief.
[AUDIO PLAYBACK]
KIM: Usually, that spot right there, where it identifies the object and goes to it, that’s the part where we hold our breath in.
[END PLAYBACK]
BRUMFIEL: And then very, very slowly and kind of hesitantly, it reached out with its claw and picked up the scoop.
[AUDIO PLAYBACK]
[ITEMS DROP INTO BOWL]
BARBER: [GASPS] It’s doing it.
BRUMFIEL: Moo Jin, did I just program a robot?
KIM: You did. It looks like it’s working.
[END PLAYBACK]
BRUMFIEL: And to my mind, it’s incredible. Like, remember, nobody really programmed the robot exactly.
BARBER: Right.
BRUMFIEL: This is all the neural network learning how to move the claws and respond to the commands on its own. And to me, it’s pretty wild that that works at all. And I think it’s going to lead to some very cool developments.
BARBER: I’m excited to hear more, Geoff. Thank you so much for bringing this reporting to us.
BRUMFIEL: Thank you very much.
BARBER: We’ll link Geoff’s full story, which has robot videos, in our episode notes. This episode was produced by Berly McCoy, edited by our showrunner Rebecca Ramirez, and fact-checked by Tyler Jones. Jimmy Keeley was the audio engineer. Beth Donovan is our senior director, and Collin Campbell is our senior vice president of podcasting strategy.
BRUMFIEL: I’m Geoff Brumfiel.
BARBER: I’m Regina Barber. Thank you for listening to Short Wave from NPR.
[MUSIC PLAYING]
Copyright © 2025 NPR. All rights reserved. Visit our website terms of use and permissions pages at www.npr.org for further information.
NPR transcripts are created on a rush deadline by an NPR contractor. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.