Removing the Quality Tax in Controllable Face Generation

A simple model for StyleGAN-level quality with full 3DMM control

WACV 2024 & AI for Content Creation Workshop @ CVPR 2023

Yiwen Huang

Zhiqiu Yu

Xinjie Yi

Yue Wang

James Tompkin



Conditional 2D generation results controlled via 3DMM parameters, including both subtle expression changes and large pose and illumination changes.


Abstract

3DMM-conditioned face generation has gained traction due to its well-defined controllability; however, the trade-off is lower sample quality. Previous works such as DiscoFaceGAN and 3D-FM GAN show a significant FID gap compared to the unconditional StyleGAN, suggesting that there is a quality tax to pay for controllability. In this paper, we challenge the assumption that quality and controllability cannot coexist. To pinpoint the issues in previous work, we mathematically formalize the problem of 3DMM-conditioned face generation. Then, we devise simple solutions to the problem under our proposed framework. This results in a new model that effectively removes the quality tax between 3DMM-conditioned face GANs and the unconditional StyleGAN.



Our simple approach uses a differentiable renderer RDR and a 3DMM parameter estimator FR with only two objectives.
Our overall model architecture.






Bibtex Reference

@InProceedings{Huang_2024_WACV,
    author    = {Huang, Yiwen and Yu, Zhiqiu and Yi, Xinjie and Wang, Yue and Tompkin, James},
    title     = {Removing the Quality Tax in Controllable Face Generation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {5364-5373}
}

Acknowledgements

We would like to thank Yu Deng for generous correspondence and assistance in correctly reimplementing Disentanglement Score metrics. Funding was provided by a Brown Office of the Vice President for Research Seed award.