Neural Relighting and Expression Transfer On Video Portraits
Photo-realistic video portrait reenactment benefits virtual production and numerous VR/AR experiences. The task remains challenging because the reenacted expression must match the source while the lighting must remain adjustable to new environments. We present a neural relighting and expression transfer technique that transfers the facial expressions of a source performer to a portrait video of a target performer while enabling dynamic relighting. Our approach combines 4D reflectance field learning, model-based facial performance capture, and target-aware neural rendering. Specifically, given a short one-light-at-a-time (OLAT) sequence of the target performer, we first apply a rendering-to-video translation network to synthesize the OLAT result of new sequences with unseen expressions. We then design a semantic-aware facial normalization scheme along with a multi-frame multi-task learning strategy to encode the content, segmentation, and motion flows for reliably inferring the reflectance field. This allows us to simultaneously control facial expression and apply virtual relighting. Extensive experiments demonstrate that our technique can robustly handle challenging expressions and lighting environments and produce results of cinematographic quality.
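To illustrate why OLAT images make relighting possible, the following minimal sketch shows the classic image-based relighting step: because light transport is linear, a relit frame can be formed as a weighted sum of the OLAT images, with one RGB weight per light taken from the target environment. This is not the paper's neural pipeline, which predicts the OLAT images for unseen expressions; the function name `relight`, the nested-list image layout, and the toy weights are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[float, float, float]
Image = List[List[RGB]]  # rows of RGB pixels (hypothetical layout)


def relight(olat_frames: Sequence[Image], light_colors: Sequence[RGB]) -> Image:
    """Relight one frame as a weighted sum of its OLAT images.

    olat_frames[i] is the face lit only by light i; light_colors[i] is the
    RGB intensity the target environment assigns to that light direction.
    """
    if len(olat_frames) != len(light_colors):
        raise ValueError("need one light color per OLAT image")
    height, width = len(olat_frames[0]), len(olat_frames[0][0])
    out = [[(0.0, 0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for frame, color in zip(olat_frames, light_colors):
        for y in range(height):
            for x in range(width):
                # Accumulate this light's contribution per color channel.
                out[y][x] = tuple(
                    acc + c * w for acc, c, w in zip(out[y][x], frame[y][x], color)
                )
    return out


# Toy example: a 1x2 "image" captured under two lights.
olat = [
    [[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]],  # light 0 illuminates the left pixel
    [[(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]],  # light 1 illuminates the right pixel
]
env = [(0.2, 0.4, 0.6), (1.0, 0.0, 0.0)]  # hypothetical environment weights
relit = relight(olat, env)
```

In practice the weights would be obtained by sampling an environment map at each light direction, and an array library would replace the explicit loops; the plain-Python form is used here only to keep the sketch dependency-free.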