Academic Journal of Computing & Information Science, 2025, 8(4); doi: 10.25236/AJCIS.2025.080403.

High-Resolution Locally Controllable Portrait Style Transfer

Author(s)

Guoquan Jiang1, Huan Xu2, Zhanqiang Huo1

Corresponding Author:
Huan Xu
Affiliation(s)

1School of Software, Henan Polytechnic University, Jiaozuo, 454000, China

2School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, 454000, China

Abstract

Most existing portrait style transfer algorithms perform whole-image stylization on low-resolution images. In practice, however, users need fine-grained control over specific regions of images at different resolutions, and the stylized images produced by existing methods suffer from structure loss, local contour deformation, and color rendering errors. This paper therefore proposes a high-resolution, locally controllable portrait style transfer model. By introducing a novel U-block, the method handles not only local portrait style transfer but also the whole-portrait stylization task. Two U-shaped encoders with different structures form a distinctive generator that learns the structural features of the content and style domains more fully, reducing structure loss in the stylized image. In addition, we propose a local portrait style transfer module that lets users perform accurate local style transfer guided by segmentation masks of different regions. To further improve local feature fusion and reduce contour distortion, a local feature fusion module (LFFM) is designed; it replaces conventional feature concatenation with a style attention mechanism, improving the quality of locally stylized images. Finally, to reduce artifacts and color rendering errors, a local portrait style loss is introduced as a constraint, ensuring that the stylized region accurately learns the target style while histogram matching keeps the original structural features of the remaining regions unchanged. Comparative and ablation experiments on four style datasets show that the proposed method performs excellently in both global and local portrait style transfer, verifying its effectiveness for portrait stylization.
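The abstract's color-preservation constraint rests on two standard image operations: channel-wise histogram matching and mask-guided compositing. The sketch below illustrates both in NumPy. It is a minimal illustration of these generic operations, not the authors' implementation; the function names `match_histogram` and `blend_local_style` are invented here for clarity.

```python
import numpy as np

def match_histogram(source, reference):
    """Channel-wise histogram matching: remap `source` so each channel's
    value distribution follows `reference` (both HxWxC float arrays)."""
    matched = np.empty_like(source)
    for c in range(source.shape[-1]):
        src = source[..., c].ravel()
        ref = reference[..., c].ravel()
        # Rank of each source pixel within its channel (empirical CDF).
        order = np.argsort(src)
        ranks = np.empty_like(order)
        ranks[order] = np.arange(src.size)
        quantiles = ranks / (src.size - 1)
        # Map each quantile onto the reference channel's sorted values.
        ref_sorted = np.sort(ref)
        matched[..., c] = np.interp(
            quantiles, np.linspace(0.0, 1.0, ref.size), ref_sorted
        ).reshape(source.shape[:-1])
    return matched

def blend_local_style(content, stylized, mask):
    """Mask-guided composite: stylized pixels inside the segmentation
    mask (mask == 1), original content pixels everywhere else."""
    m = mask[..., None].astype(content.dtype)
    return m * stylized + (1.0 - m) * content
```

In a pipeline of this kind, the composite confines stylization to the masked region, while histogram matching can pull a region's color statistics back toward the original image, which is one common way to suppress color rendering errors.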

Keywords

portrait style transfer; U-shaped encoder; local feature fusion module; local portrait style loss

Cite This Paper

Guoquan Jiang, Huan Xu, Zhanqiang Huo. High-Resolution Locally Controllable Portrait Style Transfer. Academic Journal of Computing & Information Science (2025), Vol. 8, Issue 4: 19-32. https://doi.org/10.25236/AJCIS.2025.080403.
