Not long ago, Andy Wood, the president of a British computer graphics company called Cubic Motion, was in a meeting helping pitch its technology to a major games studio in Los Angeles. To showcase what they were capable of, Cubic Motion’s team had created an animated cutscene using one of the game studio’s trademark properties: a stylized female half-human, half-animal character.
The demo had gone well, the pre-rendered cutscene convincingly delivering the studio’s script. Then the Cubic Motion team executed its agreed-upon master stroke: the stunt they hoped would earn them the contract.
Having delivered its pre-written lines, the character in the cutscene broke the fourth wall. It turned to the camera, seeming to peer out at the business executives in the room, and began to address them individually. This wasn’t a pre-programmed trick, either. The pre-rendered cutscene, it transpired, was not pre-rendered at all. It was acted out live by a performer in Manchester, on the other side of the world, who was being transformed in real time into the game studio’s character. When the character went “off-script” and started interacting with the people in the room, the effect was staggering.
“I’ve never seen a reaction like that,” Wood told Digital Trends. “There was an audible gasp from everyone in the room. Someone grabbed my arm because they just couldn’t believe what was happening.” Cubic Motion won the contract.
To appreciate why this is so significant, you need to know a little bit about the way that motion capture characters are typically, well, captured. Motion capture actors are normally recorded far in advance. They wear form-fitting body suits that are covered in dozens of tiny dots, either LEDs or things that resemble miniature ping-pong balls. These are called markers, and are used to provide animators with reference points for tracking a body’s motion through three-dimensional space over time. Once this process has been carried out, animators can set to work fleshing out the finished graphics: quite literally joining the dots until they have something that resembles whatever character they are hoping to create.
Cubic Motion’s approach is different. The company has developed marker-less technology for real-time tracking of models, designed to instantly transform them into robust completed renders running at a silky-smooth 60 frames per second. This involves some impressive machine learning algorithms, which are able to take an image of a face and separate its component parts, digitally mark up the different elements, and allow it to be captured in a near-unprecedented amount of detail.
“As far as we’re aware, there’s nobody in the world able to achieve facial capture through computer vision in the way that we do it to the level of real-time fidelity that Cubic Motion do,” Wood continued.
While there’s a good chance you haven’t heard the name Cubic Motion before, you’ll almost certainly know its work. Its world-beating facial animation technology has been used in games such as God of War, Insomniac’s Spider-Man, Anthem, and Blood & Truth. And there’s a whole lot more to come.
The trouble with Gollum
For many people reading this, Gollum from Peter Jackson’s Academy Award-winning Lord of the Rings movie trilogy remains a high point for motion-captured performances. It was one of the first times that audiences saw a CGI character who delivered a performance every bit as emotionally charged as any of the more recognizably human characters in the cinematic trilogy.
“I often use Gollum [from the Lord of the Rings trilogy] as a benchmark of a credible motion capture-delivered character,” Wood said. “But when Andy Serkis played Gollum, he was tracked using motion capture that required him to wear multiple markers on his face. The team did a stunning job, but there are limitations. You can’t put dots for markers on people’s eyes, for instance. You can’t track tongues, and so on. To achieve the finished effect, a huge team of artists were needed to do things like hand animate his eye movements because these couldn’t be tracked. With our marker-less tracking, every expression from an eye blink to a pupil dilation can all be recorded — and, furthermore, in real-time.”
Cubic Motion doesn’t do all of this in isolation, of course. It works with other industry-leading companies such as Epic Games, Tencent, 3Lateral, and Vicon, who have developed everything from high-definition scanners to the Unreal game engine. But Cubic Motion’s contribution is profound. Its technology provides the enabling underpinnings that bring together all the other parts. No amount of high-definition dimples, perfectly shaded skin or beautifully rendered peach fuzz on a human face could make up for movement that looks rigid and unnatural.
“If you can capture the essence of a human being, and digitally transfer that onto a digital character or double, it’s like a transference of the soul,” Wood continued. “It allows you to take all the characteristics of a human and transfer it onto something non-human in a way that is compelling and utterly recognizable.”
Soul is a concept that comes up a surprising amount when it comes to technological reproduction. There are reports from anthropologists about indigenous tribes, first encountering western explorers, who did not want to be photographed due to their belief that the photographic process would somehow steal their soul. They needn’t have worried: well over a century later, we’re still struggling to build technology that can do this. Cubic Motion believes that it is on the right path to help. The ability to instantly track the movement of a human face and immediately transfer it to a CGI character, photo-realistic or otherwise, is a game-changer. It makes it possible to capture the incredibly nuanced performances of an actor in a way that was unimaginable just a few years ago.
In 2018, the company showed off its technology by debuting a photorealistic reproduction of a young Chinese woman, called Siren, whose real facial features were tracked in more than 200 locations and reproduced digitally without any perceptible delay. You might stop short of calling it soul transference, but there’s no doubt that the results have a lifelike energy to them. No eyeball-drawing animators required.
How will this be used?
This isn’t just about the cost benefits of saving animators time, however. It hints at a future in which all of us can be scanned and reproduced in astonishing fidelity and placed into digital worlds at unparalleled levels of detail. This could be used for game features allowing players to be scanned and inserted into a game at hitherto-unimaginable levels of photographic accuracy.
But there are plenty of other scenarios in which this technology could find a home. Consider, for instance, being able to interact with a virtual avatar of your doctor, perhaps via augmented or virtual reality, for a virtual appointment. Or imagine that you have an 8-year-old kid who struggles with math. What if they could be given extra tutoring courtesy of an on-screen avatar resembling a photo-realistic version of themselves at 12, having buckled down and gotten to grips with the subject? Heck, we adults could access a fitness app presented by a realistic avatar resembling us in 18 months, if only we get serious about eating healthily and hitting the gym three days a week. “You could create a self-fulfilling prophecy to give you self-belief,” Wood said.
This sounds like the stuff of science fiction right now, but it’s not as unfeasibly futuristic as it might seem. For instance, a Japanese company called Gatebox markets devices not dissimilar to an Amazon Echo or Google Home, only featuring tiny animated assistants designed to interact with their owners. As it happens, the effects of having inspirational avatars modeled on users have been investigated as well. At New York University, a study into long-term decision-making found that users who interacted with a personalized avatar, artificially aged to look older, thought more carefully about decisions that would impact their future. They were more likely to save money for retirement than those who saw an avatar the same age as themselves.
Of course, such avatars needn’t remind us of the creeping hand of death on our shoulder. Technology such as this could just as easily transform us into a younger, better-looking version of ourselves. Or into a person of another gender, or even another species, real or imagined. The sky’s the limit.
Cubic Motion’s real-time facial capture technology is, at present, far out of the hands of everyday consumers. Wood himself has a full 3D facial scan, captured using the company’s state-of-the-art 360-degree scanning technology, which he shows off at product demos. (Somewhat inconveniently, this means that he must re-shave his beard prior to each demo to ensure his human self perfectly matches the fidelity of his digital double. Such are the sacrifices that pioneers must make.) But for most of us this is not an option. The camera rig Cubic Motion uses, which is just one part of the overall package, would cost you or me hundreds of thousands of dollars.
However, self-scanning technology, at a much lower fidelity, is found on today’s smartphones. The iPhone X-series models, for instance, boast something not dissimilar, which makes possible Animojis and Memojis based on the movement of users’ faces. Cubic Motion has also recently launched Persona, its proprietary helmet-mounted capture system, which can now be licensed by other companies. Persona features all the technology, such as in-built infrared lighting, to let others do this work for themselves.
As sci-fi writer William Gibson has pointed out: the future’s here, it’s just not very evenly distributed yet. But as components get cheaper and technology continues to advance, we should expect that to change. Probably faster than we expect.
Spanning the uncanny valley
Back in 1970, the Japanese robotics professor Masahiro Mori put forth the observation that we now refer to as the “uncanny valley.” It’s the idea that, as we build robots or develop 3D computer animations that closely resemble humans, a slight divergence from the real thing can cause feelings of eeriness or even revulsion in viewers. There is good reason for this. A large part of our brain is dedicated to encoding faces. Kids as young as three months learn to categorize faces, long before they can distinguish between other categories of object. As a result, a face that looks not quite like a face unnerves us more than, say, a computer-generated leaf that doesn’t look entirely accurate.
Does Cubic Motion’s impressively lifelike work mean that it has bridged this uncanny valley effect? Wood is optimistic, but he’s not ready to proclaim victory just yet. “Whether we’ve actually made it to the other side of the uncanny valley just yet is for the public to decide,” he said.
To paraphrase a (possibly apocryphal) line from the American essayist Ralph Waldo Emerson, crossing the uncanny valley is more of a journey than a destination. Like the goal of artificial intelligence researchers of replicating “intelligence” in a computer, creating a totally lifelike digital human is a challenging, non-objective benchmark to achieve. “Anything we think of as digitally real now, we will probably look back on with different eyes in a few years’ time and be able to spot things we can improve on,” Wood explained. “If you go back to movies like Jason and the Argonauts or King Kong, when people first experienced them they looked pretty credible. People walked away thinking the special effects were remarkable.”
For all the charm of stop-motion animation, the effects in neither of these films look wholly convincing today. Nor, we could argue, do many of the photographic methods that we have for recording human likenesses. The softened black-and-white portrait of an actor in a 1940s movie, with all the characterful flaws of celluloid, looks positively abstract next to the unforgiving HD of a modern television broadcast. And modern HD will, as crazy as it sounds, one day look dated as well, most likely in the face of fully three-dimensional entertainment.
“But the journey is so advanced now that it’s very obvious that it will get there,” Wood said. And when it does, Cubic Motion very much hopes to be at the finish line. Presumably with our digitized human doppelgängers cheering it on.