Facebook has revealed additional details about the development of Portal, its first in-house video calling device, including the tidbit that early prototypes included a motor that let Portal swivel to face video subjects.
If the plan was to sell lots of video calling devices, it’s probably best that a swiveling Portal was never released. Following Facebook’s persistent parade of controversy, privacy concerns raised by Portal were bad enough without the device turning to face you when it senses your presence.
A motorized camera was also seen as impractical because it did not improve Portal’s reliability, Facebook engineers said in a blog post today.
Work to build Portal began two years ago as part of the secretive Building 8 project for exploring hardware products at Facebook’s headquarters in Menlo Park, California, Portal team lead and Facebook VP Rafa Camargo told VentureBeat ahead of the device’s debut last fall.
The Portal team is now part of Facebook’s AR/VR division.
Portal video calls with Facebook Messenger rely on Smart Camera computer vision, which zooms and pans to frame shots and account for each person in a room, even people up to 20 feet from the camera. The fisheye wide-angle lens has a 140-degree field of view, which means Portal doesn’t need to move in order to see what’s happening.
Smart Volume is also used to amplify, reduce, and modulate volume to optimize call sound.
Smart Camera runs on-device machine learning with Mask R-CNN2Go, a computer vision system derived from Mask R-CNN, which won the Best Paper award at the International Conference on Computer Vision (ICCV) in 2017.
The system uses pose recognition to analyze 30 frames per second, looking for human subjects so it can properly frame each shot. Beyond artificial intelligence, the camera’s behavior also takes into account the way people respond to camera movement and advice from filmmakers about how to frame shots.
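To make the framing idea concrete, here is a minimal sketch of how a fixed wide-angle camera could pick a crop that keeps every detected person in the shot. It is not Facebook's implementation: the function names `frame_subjects` and `smooth`, the margin, the 16:9 aspect ratio, and the smoothing factor are all assumptions for illustration, and the person boxes are assumed to come from a separate detector.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels of the wide-angle frame
Box = Tuple[float, float, float, float]


def frame_subjects(people: List[Box], frame_w: float, frame_h: float,
                   margin: float = 0.1, aspect: float = 16 / 9) -> Box:
    """Return a crop that covers all person boxes (hypothetical helper)."""
    if not people:
        return (0.0, 0.0, frame_w, frame_h)  # nobody detected: show everything

    # Union of all detections.
    left = min(b[0] for b in people)
    top = min(b[1] for b in people)
    right = max(b[2] for b in people)
    bottom = max(b[3] for b in people)
    if right <= left or bottom <= top:
        return (0.0, 0.0, frame_w, frame_h)  # degenerate input

    # Pad the union so subjects are not pressed against the edge.
    w = (right - left) * (1 + 2 * margin)
    h = (bottom - top) * (1 + 2 * margin)

    # Grow the shorter side to reach the output aspect ratio.
    if w / h < aspect:
        w = h * aspect
    else:
        h = w / aspect

    # Shrink uniformly if the crop is larger than the sensor frame.
    scale = min(1.0, frame_w / w, frame_h / h)
    w, h = w * scale, h * scale

    # Center on the subjects, then clamp so the crop stays inside the frame.
    cx, cy = (left + right) / 2, (top + bottom) / 2
    x0 = min(max(cx - w / 2, 0.0), frame_w - w)
    y0 = min(max(cy - h / 2, 0.0), frame_h - h)
    return (x0, y0, x0 + w, y0 + h)


def smooth(prev: Box, target: Box, alpha: float = 0.2) -> Box:
    """Move a fraction of the way toward the target crop each frame."""
    return tuple(p + alpha * (t - p) for p, t in zip(prev, target))
```

Called once per frame, `frame_subjects` yields a target crop and `smooth` eases the current crop toward it, which stands in for the gradual, film-like camera movement the engineers describe.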
The Portal AI and Mobile Vision teams at Facebook had to shrink a system designed to run on GPUs down to a few megabytes so it could work on-device with Qualcomm’s Snapdragon Neural Processing Engine.
“To compensate, we developed several strategies, including improving low-light performance by applying data augmentation on low-light examples in the training data set and balancing multiple pose-detection approaches (such as detecting a subject’s head, trunk, and entire body). And we used additional preprocessing to differentiate between multiple people in proximity to one another,” said the blog post written by software engineers Rahul Nallamothu and Eric Hwang, research scientist Peter Vajda, and engineering director Matt Uyttendaele.
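The blog post does not say which transforms the team applied, so the following is only a sketch of one common way to do low-light data augmentation: synthesizing darker, noisier variants of training images. The function name `simulate_low_light` and the gain, gamma, and noise values are assumptions, and the image is a plain grayscale grid to keep the example dependency-free.

```python
import random
from typing import List, Optional

Image = List[List[float]]  # grayscale pixels, each in [0.0, 1.0]


def simulate_low_light(image: Image, gain: float = 0.3, gamma: float = 2.0,
                       noise_std: float = 0.02,
                       seed: Optional[int] = None) -> Image:
    """Return a darker, noisier copy of the image (hypothetical helper)."""
    rng = random.Random(seed)  # local generator keeps runs reproducible
    out: Image = []
    for row in image:
        new_row = []
        for p in row:
            # Gamma > 1 crushes shadows; gain < 1 lowers overall exposure.
            dark = gain * (p ** gamma)
            # Sensor noise is more visible in dim scenes, so add a little.
            noisy = dark + rng.gauss(0.0, noise_std)
            new_row.append(min(1.0, max(0.0, noisy)))  # keep in valid range
        out.append(new_row)
    return out
```

In a training pipeline, a function like this would be applied to a random subset of images so the pose detector sees more dim scenes than the raw data set contains.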
In the future, Facebook plans to make Smart Camera intelligent about the context of video calls so it can better determine how to frame a shot.
“We may want to frame a shot differently and use different types of camera movements when a person is cooking in the kitchen, compared with when he or she is watching TV on the couch,” the post reads.
Portal competes with smart displays such as Amazon’s Echo Show and Google’s Home Hub, as well as displays from manufacturers like JBL, Sony, and Lenovo. Smart displays interact with AI assistants like Google Assistant and Amazon’s Alexa. Portal is also able to speak with Alexa.
In updates announced late last year, Portal gained a series of new features, including a web browser, Instant Games, and custom video call control.