If you can't capture what you want to look for with just an image, Google Lens will now let you take a video and even use your voice to ask about what you're seeing. The feature will display an AI Overview and search results based on the video content and your question. It's rolling out today to Search Labs on Android and iOS.
Google first previewed using video for searching at I/O in May. As an example, Google says that someone who is curious about the fish they are seeing in an aquarium can bring their phone closer to the exhibit, open the Google Lens app, and then press and hold the shutter button. Once Lens starts recording, you can say your question out loud: “Why are they swimming together?” Google Lens then uses the Gemini AI model to provide a response, similar to the one you see in the GIF below.
Talking about the technology behind the feature, Rajan Patel, vice president of engineering at Google, told The Verge that Google is capturing the video “as a series of image frames and then applying the same computer vision techniques” previously used in Lens. But Google is going a step further by passing the information to a “custom” Gemini model developed to “understand multiple frames in sequence… and then provide a web-rooted response.”
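Google hasn't published the internals of this pipeline, but the “frames in sequence plus a question” approach Patel describes can be roughly approximated with Google's public Gemini SDK. The sketch below is illustrative only: the model name, frame count, video path, and API key are assumptions, not details from the announcement, and it skips the web grounding Patel mentions.

```python
# Minimal sketch: sample a video as a series of image frames and send
# them with a question to a multimodal Gemini model. Assumptions: model
# name, frame count, and file path are placeholders, not Lens internals.
import cv2  # pip install opencv-python
import google.generativeai as genai  # pip install google-generativeai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

def sample_frames(video_path: str, num_frames: int = 8) -> list[Image.Image]:
    """Pull evenly spaced frames from a video as PIL images."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    images = []
    for i in range(num_frames):
        cap.set(cv2.CAP_PROP_POS_FRAMES, i * total // num_frames)
        ok, frame = cap.read()
        if ok:
            # OpenCV decodes frames as BGR; convert to RGB for PIL.
            images.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
    cap.release()
    return images

# Send the frame sequence plus the spoken question as one multimodal prompt.
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
frames = sample_frames("aquarium.mp4")  # hypothetical recording
response = model.generate_content(frames + ["Why are they swimming together?"])
print(response.text)
```

Evenly spaced sampling here simply mirrors the “series of image frames” framing; the production system presumably selects frames more intelligently and, per Patel, grounds its answer in web results.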
There's still no support for identifying sounds in a video, such as a bird you can hear but not see, but Patel says that's something Google has been “experimenting with.”
Google Lens is also updating its photo search feature with the ability to ask a question using your voice. To try it out, point your camera at your subject, hold down the shutter button, and then ask your question. Before this change, you could only type your question in Lens after taking a photo. Voice questions are rolling out globally on Android and iOS, but are currently only available in English.