Humans interact with each other every day, and as such, are able to recognize a wide range of complex facial expressions. We constantly analyse and communicate through body language, even if subconsciously. What if we could give a computer that same capability? What if technology could understand an image in the same intrinsic way humans can?
At Cubic Motion, we use computer vision technology to quickly translate human expressions into data and apply the results to almost any medium – from video games to film, VR and holograms. Our work can be seen in God of War, Hellblade: Senua’s Sacrifice, Spider-Man, and other blockbusters. Put simply, computers can be taught to read and record complex human expressions, ready to instantly reproduce this spectrum of emotion in a digital human.
But what makes for a believable, truly human re-creation? Here’s what we’ve learned.
Capturing every nuance of performance
Translating computer vision data into a truly photoreal character involves a myriad of technologies and animation techniques – from hair shading to skin textures, lighting, and rendering. Every facet requires strict attention to detail and must be brought into a cohesive whole. If any one element drops out of sync, the entire illusion will disappear.
Eye movement, for example, needs special attention. Accurate modeling of the sclera and cornea is key to making a character’s eyes as lifelike as possible. We can scan an actor’s eyeball for greater accuracy, making sure light reacts with the pupil correctly. Eyeball shape also affects the skin around the eyes as a character looks around – this “soft eye” effect is important for realism. Finally, pupil dilation is needed for close-up shots, and it can be captured by recording detailed eyeball movements with high-resolution head-mounted cameras. Taken together, these nuances let the eyes convey a significant amount of emotion in the tiniest of twitches. They’re known as the “window to the soul,” after all, and it’s important to get them right.
The process of replicating live performance in a digital character has become more and more nuanced. Our creation, Doc Ock from Insomniac Games’ Spider-Man, is a prime example. Based on the iconic Marvel villain, Dr. Otto Octavius, this character is actually a true-to-life digital double of the actor, William Salyers. He is well matched to Doc Ock’s age and body type, so you get all the same skin creases and appropriate shapes. Why is this so important? Because when somebody smiles and their face starts to crease up, those creases are the markers for a smile we are used to seeing in reality. Those markers can make or break a digital human, and our success with Doc Ock was recognized.
Rules of thumb for high-quality characters
Nowadays, we’re seeing a convergence in broadcast and game production technology, especially as the gap in visual quality starts to narrow between these two mediums. Many feature films and TV series use real-time renderers like Unreal Engine to turn over fast iterations of any given shot. Meanwhile, game developers are starting to provide the same feature-length, realistic narratives usually found on the big screen.
Players are beginning to expect high-quality animation across the entirety of a game – not just in cinematic cutscenes. Developers are racing to deliver on audience demand, and that’s why believable, true-to-life characters have become so sought after. For most studios, it’s standard to create 60 minutes or so of beautifully rendered cinematics. But step out of a cutscene, talk to an NPC, and there will be a significant drop-off in quality. The illusion is ruined. The future will come down to animation techniques and technologies that can deliver realism at large scale, capable of creating over 100,000 lines of performance to keep gamers immersed.
When producing a massive volume of characters at high quality, there are certain rules of thumb. At Cubic Motion, we’ve built up enough experience to know that consistency is key. Make sure to use the same scanning process across all talent and build a universal rig across your cast of characters. Well-designed, consistent mesh topology is also a must.
We also abide by the mantra “data, data, data” … and incredibly high resolution. Capture as much as possible from your subject at the very start of production, ensuring that scanned FACS poses are accurate, even if this means capturing more data than you think you need. Don’t rely on a post-production clean-up pass – some of the performance will inevitably be lost.
Your process must be modular, meaning that tools, skills and people should be deployed to very specific parts of the character pipeline. If you create a team of experts in specific areas, such as processing high-density photogrammetry data, it will lead to consistency and efficiency. If you want a massive volume of characters at high quality, try not to have generalists working on multiple parts of the pipeline.
Maintaining the cinematic illusion
The volume of shots can be very high when capturing performances for video games. Inevitably, these performances are broken into individual lines of dialogue. It’s vital to blend between these lines to make a character’s attitude and expressions appear continuous. We recommend building a varied library of idle animations that can be triggered in between the specific performance animations, then trialing different blending algorithms to make each transition as smooth as possible. It’s a real collaboration between art and technology teams.
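To make the blending idea concrete, here is a minimal sketch in Python of crossfading from a performance clip into an idle clip. It is an illustration only, not a description of Cubic Motion's pipeline. It assumes each frame is a flat list of rig control weights (the `Pose` type), and the names `crossfade`, `lerp_pose`, and `smoothstep` are hypothetical. The easing function is passed in as a parameter so that different blending curves can be trialed against each other.

```python
from typing import Callable, List

# Assumption: one frame of facial animation is a flat list of rig control
# weights (for example blendshape values). Real rigs are usually richer.
Pose = List[float]


def smoothstep(t: float) -> float:
    """Ease-in/ease-out curve on [0, 1]; one of many possible blend curves."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def lerp_pose(a: Pose, b: Pose, w: float) -> Pose:
    """Linear blend of two poses: w=0 returns a, w=1 returns b."""
    return [(1.0 - w) * x + w * y for x, y in zip(a, b)]


def crossfade(
    outgoing: List[Pose],
    incoming: List[Pose],
    overlap: int,
    ease: Callable[[float], float] = smoothstep,
) -> List[Pose]:
    """Join two clips, blending the last `overlap` frames of `outgoing`
    with the first `overlap` frames of `incoming`."""
    if overlap < 1 or overlap > min(len(outgoing), len(incoming)):
        raise ValueError("overlap must fit inside both clips")
    start = len(outgoing) - overlap
    blended = [
        lerp_pose(outgoing[start + i], incoming[i], ease((i + 1) / (overlap + 1)))
        for i in range(overlap)
    ]
    return outgoing[:start] + blended + incoming[overlap:]


# Hypothetical data: a held expression fading into a neutral idle pose.
line_a = [[1.0, 0.0] for _ in range(4)]
idle = [[0.0, 0.0] for _ in range(4)]
result = crossfade(line_a, idle, overlap=2)

# Swapping the curve is a one-argument change, e.g. a plain linear blend.
linear = crossfade(line_a, idle, overlap=2, ease=lambda t: t)
```

Chaining two calls (performance line into idle, then idle into the next line) gives the line-to-idle-to-line pattern described above. A linear blend on raw weights is the simplest option; a production system would likely need to handle rotations, timing, and eye behavior separately.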
For those looking to invest in expensive, high-quality scanning techniques to make these game characters look more realistic, you’ll have to give the same level of attention to performance and movement within the cinematics. There’s no point in creating a photoreal avatar whose movements then fall apart under a procedural animation system – because, once again, that would kill the fantasy.
Putting the pipeline in place
In some ways, we’ve been waiting for studios to realize that digital humans are achievable now. The pipeline and technology are already in place; it’s simply a matter of letting people know this level of fidelity is possible.
We can create immersion across hundreds of NPCs and thousands of lines of performance. We can put digital humans to use in gaming, VR, customer service, teaching, film, TV and even holograms – just look at the recreation of His Late Highness Sheikh Zayed, for which we provided facial animation for our client New Dimension Productions. Digital humans can make for a truly engaging experience.
David Barton is an executive producer at Cubic Motion. He has extensive experience developing studios in the US and once opened a location in Santa Monica with over 40 staff.