Facial Expression Manipulation via Latent Space Generative Adversarial Network

Wafaa Razzaq

Wafaa Razzaq, College of Nursing, University of Thi-Qar, Nasiriyah, Iraq

Keywords: StyleGAN2, Latent Space, Facial Expressions

Abstract

The Style Generative Adversarial Network (StyleGAN) is a state-of-the-art architecture for generating highly realistic synthetic faces. An image can be projected into StyleGAN's latent space, where moving the projection along directional curves modifies features of the original image. However, the high dimensionality of this space makes a manual search for the direction that produces a given feature or gesture impractical. This work proposes a pseudo-autoencoder neural architecture that manipulates the latent projection to alter the appearance of the face, with the facial gesture encoded as a vector of Action Units. The resulting expression dynamics allow a direct transition from one gesture to another without passing through the neutral expression, which improves the naturalness of the gestural dynamics.
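The two ideas in the abstract, moving a latent projection along a direction and going from one gesture straight to another, can be illustrated with a small NumPy sketch. It is not the architecture proposed in this work: the latent size (512), the number of Action Units (17), the AU indices, and the function names `move_along_direction` and `au_transition` are all assumptions chosen for illustration, and simple linear interpolation stands in for the learned pseudo-autoencoder.

```python
import numpy as np

LATENT_DIM = 512  # assumed size of a StyleGAN latent vector
NUM_AUS = 17      # assumed length of the Action Unit vector


def move_along_direction(w, direction, alpha):
    """Shift latent code w by a distance alpha along a normalised direction."""
    unit = direction / np.linalg.norm(direction)
    return w + alpha * unit


def au_transition(au_start, au_end, steps):
    """Interpolate linearly from one AU vector to another.

    The path goes directly from au_start to au_end, so it does not
    have to pass through the neutral (all-zero) AU vector.
    """
    t = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - t) * au_start + t * au_end


rng = np.random.default_rng(0)

# Random stand-ins for a projected image and an editing direction.
w = rng.standard_normal(LATENT_DIM)
direction = rng.standard_normal(LATENT_DIM)
w_edited = move_along_direction(w, direction, alpha=2.0)

# Two hypothetical gestures, each activating two Action Units.
au_a = np.zeros(NUM_AUS)
au_a[[5, 11]] = 1.0
au_b = np.zeros(NUM_AUS)
au_b[[0, 3]] = 1.0

path = au_transition(au_a, au_b, steps=5)
# Smallest AU-vector norm along the path; a value above 0 means
# the neutral expression is never reached.
min_activation = np.linalg.norm(path, axis=1).min()
```

In a full pipeline, each AU vector along `path` would condition the network that edits the latent projection, and the generator would render one frame per step.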
