Facial performance capture, a key component of visual effects for movies and computer games, can be achieved using just a single camera with a new methodology developed by Disney Research.
By creating a model that takes the underlying facial anatomy and skin thickness into account, the single-camera method is able to capture facial expressions with the robustness of traditional multi-view methods.
What's more, the method has demonstrated the unprecedented ability for a single camera to capture extreme deformations caused by external forces, such as a jet of compressed air forcing a man's cheek to ripple and flutter.
"There has been a trend in facial performance capture toward methods that use fewer cameras and less hardware, giving actors more freedom to perform," said Markus Gross, vice president at Disney Research. "But usually that means a trade-off in the level of detail and accuracy. No hardware setup could be simpler than our new one-camera method, yet we've shown that it can obtain results that rival, if not exceed, more traditional methods."
The researchers will present their anatomically constrained deformation model July 24 at the ACM International Conference on Computer Graphics & Interactive Techniques (SIGGRAPH) in Anaheim, Calif.
Thabo Beeler, senior research scientist at Disney Research, said computer reconstructions of facial performances based on a single camera usually require building a highly detailed model of the actor's face and acquiring and encoding perhaps a hundred expressions. Despite the substantial time and effort necessary to develop such a model, it is still unlikely to encode all of the actor's expressions, so additional workarounds are required.
The new Disney model, by contrast, doesn't rely on pre-computed facial motions. Instead, it considers the face's underlying bone structure and skin thickness.
"These anatomical factors constrain the face to physically valid expressions and help counteract depth ambiguities that plague single-camera tracking," Beeler said.
The researchers were able to build their models with a minimal number of facial scans and expressions.
The power of these anatomical constraints is demonstrated in the example of the compressed-air jet causing ripples to propagate up a man's cheek. Capturing such an event with a single camera would be virtually impossible using a model with pre-computed expressions because the deformations caused by the jet are far beyond anything that would have been previously recorded. The Disney method was able to reproduce the rippling, even if it wasn't totally accurate all of the time.
The researchers demonstrated their single-camera technique using a variety of cameras, including a GoPro camera and an iPhone camera. "We believe that our anatomically constrained local deformation model will have a substantial impact on different areas of facial performance capture and animation," Beeler said.
Derek Bradley, research scientist at Disney Research, added, "Since our method requires substantially fewer expressions to be acquired, processed and integrated, our approach greatly reduces the effort required from actors and artists alike."