Gemini Live Gains Real-Time Visual AI

As part of the April Pixel Drop back on April 7th, Google announced a set of new features for their Gemini Assistant. Normally, when a user triggers Gemini they simply speak to their phone and receive a written response that can also be played back using Google’s new text-to-speech technology - which does sound very believable.

There’s also Gemini Live though. This option, hidden behind an icon on the bottom right, allows users to begin a very natural conversation with Gemini. You just talk to it like it’s a person. You can even interrupt its response to ask a follow-up question or divert the conversation somewhere else.

Multi-Modal

Now, much like Gemini proper, Gemini Live is multi-modal. That means it isn’t limited to responding to what you say. You can now give Gemini Live a live feed from your camera and it will recognize, in real time, what you’re showing it.

To test this, I pointed my S25 Ultra at a rose bush we had just transplanted that was not looking so hot. I asked it if it looked like it needed to be trimmed back and explained that we had transplanted it. It spotted the dead brown leaves and the greenish-brown dying leaves and told me what I needed to do. It instructed me to remove all the dead and dying leaves and to make sure I cut at a 45° angle. I asked how low I should cut it, and it explained that since this bush had been transplanted and looked rough, we should cut it back to half its current height or less so it could focus on growing out its roots.

I paused Live and went to work. When I was done, I showed it what we had, and it seemed happy with how it looked. After I went inside I did some googling, and it seems the advice it gave, which I blindly followed, was pretty spot on. My only regret is that I didn’t screen record that first bush, as it was in the most visible distress and Gemini did its best work there recognizing what was going on.

If what you need to show Gemini isn’t in the real world, but is instead on your screen, you can show it that too. With your permission, it can simply watch whatever is on your screen and use that as context just the same.

A Bit Buried

One small issue I see is that as Gemini adds new features, it runs the risk of becoming too complicated and confusing. After my wife saw me testing Gemini Live with the camera, she wanted to try it. She held the power button on her Pixel 9 to trigger Gemini and then had no idea what to do next. The icon that triggers Live does very little to let users know what it can do for them, and that’s something that could certainly hold back adoption.

Users shouldn’t have to dig around to discover cool things. The OS should draw them to the cool things and make them aware of what they are.

Glad It’s There

This strikes me as a feature I probably won’t use often, but in the right scenario, it might be very useful. I was pleasantly surprised at how well it handled guiding me through my pruning, and I can imagine other scenarios where it could genuinely help. I just wish I had screen recorded that first bush.

Keep in mind that this upgrade to Gemini Live is coming free to all Pixel 9 users, but all other Pixel devices will need a Gemini Advanced subscription. The feature will also be available in all Gemini-supported regions and languages, which you can view here.
