How AI powers the Pixel 4


Google’s Pixel 4 and Pixel 4 XL go on sale this week after debuting at a hardware event in New York last week. And as with previous versions of Google’s flagship smartphone, artificial intelligence powers many exclusive new features and improvements.

On-device machine learning is a main theme for the latest Made by Google hardware. The Pixel 4 uses the Neural Core, a TPU-based upgrade of the Pixel 3's Visual Core chip. Pixel 4 comes with lots of preexisting AI-enabled features, like Now Playing for song detection, but there are four major changes: speech recognition, the next-gen Google Assistant, a range of new camera features, and facial recognition to unlock the phone and make payments.

Camera

Smartphone makers no longer sell phones. They sell cameras. That’s why Google spent about as much time talking about the Pixel 4 camera at its unveiling last week as it did talking about the rest of the phone.

AI recommends things like Night Sight in low-light settings and powers depth prediction for Portrait Mode images.

Portrait Mode in Pixel 4 is as sharp as ever.

Above: Portrait Mode with Pixel 4 XL

Depth prediction in Portrait Mode shots with the Pixel 4 seems stronger than it was with previous Pixel phones.

Night Sight also gets improvements. If you’re taking a handheld shot, Night Sight can deliver impressive results that some have called “mad science.”

But if you prop the Pixel 4 up so it's still, or use some sort of tripod, a Night Sight exposure can last three minutes or longer, delivering not just crisper low-light imagery but actual photos of stars in the sky. Our initial tests found that this is no exaggeration.
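Google has described the astrophotography mode as splitting one long exposure into a burst of shorter frames that are aligned and averaged to beat down noise. As a rough illustration of that general idea, and emphatically not Google's actual pipeline, here is a minimal frame-averaging sketch; the file names and frame count are hypothetical.

```python
# Rough sketch of multi-frame averaging for low-light photography.
# This illustrates the general idea only, not Google's Night Sight pipeline.
# File names are hypothetical; a real pipeline also aligns frames and
# rejects hot pixels before merging.
import numpy as np
from PIL import Image

def average_frames(paths):
    """Average a burst of short exposures to reduce noise in a low-light shot."""
    frames = [np.asarray(Image.open(p), dtype=np.float32) for p in paths]
    merged = np.mean(frames, axis=0)  # noise shrinks roughly with the square root of the frame count
    return Image.fromarray(np.clip(merged, 0, 255).astype(np.uint8))

# Hypothetical usage: merge sixteen short exposures into one image.
night_shot = average_frames([f"exposure_{i:02d}.png" for i in range(16)])
night_shot.save("night_sky_merged.png")
```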

Photos of the night sky can be taken with Pixel 3, 3a, and 4.

Above: Night Sight with astrophotography on Pixel 4 XL

Another big difference compared to cameras on other Pixel phones is that you can shoot 4K video, and a tap and hold of the camera button now records video. (Previously, a tap and hold of the camera button took dozens of photos.) A swipe down opens extended controls that let you enable things like Frequent Faces.

Machine learning-based white balance was first introduced for Pixel 3 and continues with Pixel 4 to deliver pictures with accurate color temperature.
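Google hasn't detailed the learned white balance model, but a classical baseline like the gray-world algorithm gives a sense of what white balancing does to an image's color temperature. The sketch below shows only that baseline; it is not Google's approach.

```python
# Gray-world white balance: a classical baseline shown only to illustrate
# what white balancing does. Pixel phones use a learned model instead.
import numpy as np

def gray_world_white_balance(rgb):
    """Scale each color channel so the scene's average color comes out neutral gray."""
    rgb = rgb.astype(np.float32)
    channel_means = rgb.reshape(-1, 3).mean(axis=0)   # average R, G, B over the whole image
    gains = channel_means.mean() / channel_means      # boost channels that run low, tame ones that run high
    return np.clip(rgb * gains, 0, 255).astype(np.uint8)
```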

Above: Super Res Zoom shot of Oakland Tribune Tower from four blocks away in Oakland, California

Image Credit: Khari Johnson / VentureBeat

Super Res Zoom is another major feature of the Pixel 4 and combines a new telephoto lens with up to 8 times zoom for improved results compared to digital zoom shots on previous Pixels.

Frequent Faces records and stores data about people you photograph regularly in order to shape Top Shot photo recommendation results.

Facial recognition

Facial recognition powers a number of features in the Pixel 4, like Face Unlock to open the phone or make a payment, and Frequent Faces for recognizing the people you take pictures of the most.

With Motion Sense radar readying the face sensors when the phone detects movement, Google claims Face Unlock is faster than Face ID on Apple's iPhone. Facial recognition on the Pixel 4 launches with the ability to verify Google Pay transactions with a face scan, something that was not available at launch for Apple Pay users. But Google's first facial recognition system for smartphones is experiencing some major growing pains.

Sleuthing by the BBC found last week that Face Unlock works even when a person's eyes are closed, a concern for some users.

The fact is, even in a Touch ID world of fingerprint scans, a person with bad intent can force someone to open their phone, but it may be easier to point a phone at someone's face than to force a finger onto a scanner.

The most likely misuse of a design flaw like this is probably a spouse unlocking their partner's phone, but it's easy to imagine malicious uses when it's widely known that the Pixel 4 will unlock for the faces of people who are unconscious, asleep, or dead.

Google did not initially plan to make any changes to this feature, but on Sunday the company announced a fix that requires a user's eyes to be open; it will be released as part of a software update in the coming months, a company spokesperson told VentureBeat in an email.

Another potential area for improvement is the performance of Google’s facial recognition on people with dark skin.

Weeks before the release of the Pixel 4, the New York Daily News reported that contractors working for Google used questionable tactics to improve its facial recognition's ability to recognize people with dark skin, such as being less than upfront about how the face scans would be used or describing the collection as a game. The contractor Randstad reportedly collected scans at the BET Awards in Los Angeles and rounded up homeless people in Atlanta by handing out $5 Starbucks gift cards.

These revelations drew the attention of the Atlanta city attorney and raised questions about what constitutes a fair price for an image of a person’s face.

Amid ongoing investigation, the face scan collection program has been suspended.

Future updates may lead to performance improvements. On my dark skin, the Pixel 4 was very consistent in ideal conditions with balanced lighting, but there were moments in reasonable lighting when Face Unlock repeatedly failed to recognize me. After Face Unlock failed multiple times in a row, the phone suggested that I delete my face profile and create a new one.

Reenrollment helped somewhat, and no formal count of Face Unlock successes and failures was recorded as part of this Pixel 4 review, but opening my phone was a routinely hit-or-miss exercise when scanning my face in bed in the morning, in a vehicle at night, under overhead lighting, or in other common scenarios with less-than-ideal lighting.

Above: A selfie taken after a failed Face Unlock with Pixel 4 XL

Image Credit: Khari Johnson / VentureBeat

Setting up facial recognition on the phone takes about 30 seconds of slowly pointing your nose in different directions to complete a face scan. That's more extensive than the Face Match capture process on a Nest Hub Max smart display, likely because on the phone facial recognition replaces the fingerprint scanner that used to be the primary means of unlocking a Pixel, while on the smart display Face Match only needs to tell apart up to six people in a household.

Poor performance of facial recognition on people with dark skin is an industry-wide problem. More audits and analyses of how Google's facial recognition performs on people with light and dark skin tones will be conducted as the phone becomes publicly available.

Despite my own experience of regularly encountering failed Face Unlock attempts, it's far too early to call the feature a failure, as some journalists have chosen to do, because Google is just getting started with facial recognition.

If my own experience is any indication, replacing Touch ID fingerprint scans with Face Unlock comes with tradeoffs.

Next-gen Google Assistant

As Google showcased last week, the new Google Assistant can open apps, search the web, get directions, and send Google Assistant search query results to contacts.

Next-gen Google Assistant uses Continued Conversation to enable multi-turn dialogue. This means that after you say the initial “Okay, Google” wake word, Google Assistant will carry out the command and then continue listening for additional commands until you say “Stop” or “Thank you” to end the exchange.
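Conceptually, Continued Conversation turns a one-shot voice command into a loop that keeps handling requests until an end phrase arrives. The sketch below is purely illustrative control flow with made-up placeholder functions, not an Assistant API.

```python
# Purely illustrative control flow for a Continued Conversation-style exchange.
# listen() and handle() are made-up placeholders, not a real Assistant API.
END_PHRASES = {"stop", "thank you"}

def continued_conversation(listen, handle):
    """After the wake word, keep handling commands until the user ends the exchange."""
    while True:
        command = listen()                            # wait for the next utterance
        if command.strip().lower() in END_PHRASES:
            break                                     # "Stop" or "Thank you" closes the session
        handle(command)                               # carry out the command, then keep listening
```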

Continued Conversation has been available for some time now on smart displays and speakers, but on a smartphone it supplies a stream of cards and content. This makes for a different experience from a smart display, which changes the on-screen imagery after each question. A stream that lets you scroll back and forth and complete actions highlights your own stream of consciousness.

This means that you can very quickly go from asking Google a question about any given topic to diving into the topic and continuing to learn more to sharing with friends or acting upon that information. You can also interact with an app or website while Google Assistant runs in the background, a true multimodal experience.

There are still shortcomings. Tell Google Assistant to share something with a friend and it may only take a screenshot. On different occasions, I asked it to share an email and a podcast episode with a friend, and it just took a picture. That works when the assistant is sharing a weather report, for example, but not for things like a website or an email, where a URL link is almost always more helpful.

Also of note: The new Google Assistant uses an on-device language model and the Neural Core to make it faster than its counterpart on other smartphones, but that doesn't mean the end of latency. The new Google Assistant can still encounter delays due to a slow Wi-Fi or data connection.

And the new Google Assistant will not be available at launch for G Suite accounts. It seems odd that an assistant designed to make you as efficient as possible is unable to work with G Suite.

Finally, the new Google Assistant can interact with apps and does a better job of surfacing your Google Photos content, but it's still not contextually aware. So if you open Google Maps and then say "Find the nearest flower shop," the assistant will exit Google Maps and return web search results instead.

The new assistant also gets a slightly different look in Pixel 4, appearing only as a glowing set of Google primary colors at the bottom of the screen.

Google Assistant with Pixel 4 makes room for real-time transcription of your words. This is helpful for confirming that the assistant correctly heard your request, and the movement of words on the screen lets you know the assistant is listening and establishes a kind of rhythm to follow for each voice command.

Speech recognition

For years, it's been true that you can use conversational AI to turn speech into text faster than you can type with your thumbs on a smartphone. Speech-to-text transcription can be found in an increasing number of places, and with the Pixel 4 you get automated transcription of people speaking in videos.

Live Caption provides text transcription of audio in everything from podcasts and audio messages to videos in your camera roll and on YouTube. There are occasional misses here, but this is a helpful feature when you can't listen to audio but still want to enjoy videos and other content.

A simple tap and hold of the caption text that appears on screen lets you move it around, and a double tap expands it to show more text.

The new Recorder app can also transcribe your voice recordings, a feature that allows you to search audio files for words and export text transcripts. The Recorder app uses real-time speech-to-text transcription, and it sometimes makes mistakes, which is in line with other speech transcription services. Recorder can also automatically identify keywords in a transcript, recommend recording titles based on those keywords, and label audio as music, applause, or speech.
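To give a sense of what searching a transcript for words involves, here is a hypothetical sketch that indexes transcript segments by word. The segment format is invented for illustration; Recorder's internals are not public.

```python
# Hypothetical sketch of keyword search over a transcript, in the spirit of
# what Recorder offers. The segment format is made up for illustration.
from collections import defaultdict

def build_word_index(segments):
    """Map each lowercased word to the start times of the segments that contain it."""
    index = defaultdict(list)
    for start_seconds, text in segments:
        for word in text.lower().split():
            index[word.strip(".,!?")].append(start_seconds)
    return index

# Invented example segments: (start time in seconds, transcribed text).
segments = [(0.0, "Welcome to the meeting"), (4.2, "Let's review the launch schedule")]
index = build_word_index(segments)
print(index["launch"])  # -> [4.2]
```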

Downsides: The Recorder app does not do a great job of breaking up or labeling speakers in a conversation, so transcribed words can blend together from time to time. Software updates to Recorder should probably also address the fact that exported transcripts carry no timestamps.

Each of these new features uses speech recognition and natural language understanding technology that's been available for years in Gboard for writing in a Google Doc or sending a message.




