3DMM-conditioned face generation has gained traction due to its well-defined controllability; however, the trade-off is lower sample quality: previous works such as DiscoFaceGAN and 3D-FM GAN show a significant FID gap compared to the unconditional StyleGAN, suggesting that there is a quality tax to pay for controllability. In this paper, we challenge the assumption that quality and controllability cannot coexist. To pinpoint the issues in previous work, we mathematically formalize the problem of 3DMM-conditioned face generation. Then, we devise simple solutions to the problem under our proposed framework. This results in a new model that effectively removes the quality tax between 3DMM-conditioned face GANs and the unconditional StyleGAN.
@InProceedings{Huang_2024_WACV,
author = {Huang, Yiwen and Yu, Zhiqiu and Yi, Xinjie and Wang, Yue and Tompkin, James},
title = {Removing the Quality Tax in Controllable Face Generation},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2024},
pages = {5364--5373}
}
We would like to thank Yu Deng for generous correspondence and assistance in correctly reimplementing Disentanglement Score metrics. Funding was provided by a Brown Office of the Vice President for Research Seed award.