In April, a New York startup called Runway AI introduced technology that lets people generate videos, such as a cow at a birthday party or a dog chatting on a smartphone, simply by typing a sentence into a box on a computer screen.
The four-second videos were blurry, choppy, distorted and disturbing. But they were a clear sign that AI technologies would produce increasingly compelling videos in the months and years to come.
Just 10 months later, OpenAI, a San Francisco startup, unveiled a similar system that creates videos that look like something out of a Hollywood movie. One demonstration included short videos, created in minutes, of woolly mammoths trotting through a snowy meadow, a monster watching a melting candle and a Tokyo street scene apparently filmed by a camera panning the city.
OpenAI, the company behind the chatbot ChatGPT and the still image generator DALL-E, is among many companies racing to improve this type of instant video generator, including startups like Runway and tech giants like Google and Meta, which owns Facebook and Instagram. The technology could speed up the work of experienced filmmakers while replacing less experienced digital artists entirely.
It could also become a quick and cheap way to create misinformation online, making it even more difficult to know what is real on the Internet.
“I'm absolutely terrified that this kind of thing would influence a close election,” said Oren Etzioni, a professor at the University of Washington who specializes in artificial intelligence. He is also the founder of True Media, a nonprofit organization that works to identify online misinformation in political campaigns.
OpenAI calls its new system Sora, after the Japanese word for sky. The team behind the technology, including researchers Tim Brooks and Bill Peebles, chose the name because it “evokes the idea of unlimited creative potential.”
In an interview, they also said that the company was not yet releasing Sora to the public because it was still working to understand the dangers of the system. Instead, OpenAI is sharing the technology with a small group of academics and other outside researchers who will “red team” it, a term for looking for ways it can be misused.
“The intent here is to provide a preview of what's on the horizon, so people can see the capabilities of this technology and we can get feedback,” Dr. Brooks said.
OpenAI is already tagging videos produced by the system with watermarks that identify them as AI-generated, but the company acknowledges that these can be removed. They can also be difficult to detect. (The New York Times added “AI-Generated” watermarks to videos with this story.)
The system is an example of generative AI, which can instantly create text, images and sounds. Like other generative AI technologies, OpenAI's system learns by analyzing digital data; in this case, videos and the captions that describe what those videos contain.
OpenAI declined to say how many videos the system learned from or where they came from, except to say that the training included both publicly available videos and videos licensed from copyright holders. The company says little about the data used to train its technologies, probably because it wants to maintain an advantage over its competitors and has been sued several times for using copyrighted material.
(The New York Times sued OpenAI and its partner, Microsoft, in December, alleging copyright infringement of news content related to artificial intelligence systems.)
Sora generates videos in response to short descriptions, such as “a gorgeously rendered paper world of a coral reef, teeming with colorful fish and sea creatures.” Although videos can be impressive, they are not always perfect and can include strange and illogical images. The system, for example, recently generated a video of someone eating a cookie, but the cookie never got smaller.
DALL-E, Midjourney, and other still image generators have improved so rapidly in recent years that they now produce images almost indistinguishable from photographs. This has made it harder to identify misinformation online, and many digital artists complain that it has made it harder for them to find work.
“We all laughed in 2022 when Midjourney first came out and said, 'Oh, that's cute,'” said Reid Southen, a film concept artist in Michigan. “Now people are losing their jobs because of Midjourney.”