Academic Journal of Computing & Information Science, 2025, 8(3); doi: 10.25236/AJCIS.2025.080302.

Robust Audio Watermarking Based on Invertible Neural Network

Author(s)

Jiji Zhu

Corresponding Author

Jiji Zhu

Affiliation(s)

School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, China

Abstract

Audio watermarking exploits the characteristics of the human auditory system and of the original audio carrier to embed watermark information imperceptibly into the audio. Traditional watermarking algorithms rely on hand-designed signal processing techniques and are therefore limited by the experience of the designer. In contrast, deep-learning-based audio watermarking algorithms offer greater adaptability and versatility, and their robustness can be enhanced by training against simulated attacks, which makes them an important direction for the future development of audio watermarking. Related research focuses primarily on balancing the imperceptibility, robustness, and embedding capacity of the watermark. The audio watermarking model designed in this paper emphasizes imperceptibility and robustness. Imperceptibility is enhanced by a discriminator that is trained so that the human ear cannot distinguish between the original and the watermarked audio. Robustness is improved in two ways: a simulated attack block provides strong resistance against multiple types of attacks, and the invertible design of the neural network, assisted by a balancing block, mitigates the damage caused by the attack layer. Built on an invertible neural network, the model achieves high imperceptibility and strong robustness. The experimental results demonstrate that the model performs well in watermark embedding accuracy, extraction accuracy, and resistance to attacks.
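The invertibility that the abstract relies on can be illustrated with a small sketch. The code below is not the model from the paper: it is a generic additive coupling step in the style of NICE [9], with fixed random matrices standing in for trained sub-networks. The names `f`, `g`, `forward`, `inverse`, `W_F`, `W_G`, and `DIM` are assumptions made for illustration, and the extractor is given the auxiliary branch `aux`, which a real watermark extractor would not have. The sketch shows only that the inverse pass recovers the mark exactly when the watermarked signal is untouched, and that a perturbation standing in for an attack breaks that exactness.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # hypothetical toy signal length

# Fixed random weights standing in for trained sub-networks (assumption).
W_F = rng.standard_normal((DIM, DIM)) * 0.1
W_G = rng.standard_normal((DIM, DIM)) * 0.1


def f(x):
    """Toy sub-network that maps the mark to an additive shift."""
    return np.tanh(x @ W_F)


def g(x):
    """Toy sub-network that maps the watermarked signal to a shift."""
    return np.tanh(x @ W_G)


def forward(audio, mark):
    """Embedding pass of one additive coupling step."""
    stego = audio + f(mark)   # watermarked signal
    aux = mark + g(stego)     # auxiliary branch that keeps the map invertible
    return stego, aux


def inverse(stego, aux):
    """Extraction pass: undo the two additions in reverse order."""
    mark = aux - g(stego)
    audio = stego - f(mark)
    return audio, mark


audio = rng.standard_normal(DIM)
mark = rng.choice([-1.0, 1.0], size=DIM)

stego, aux = forward(audio, mark)
rec_audio, rec_mark = inverse(stego, aux)
clean_error = float(np.max(np.abs(rec_mark - mark)))

# Additive noise as a stand-in for an attack on the watermarked signal.
attacked = stego + 0.05 * rng.standard_normal(DIM)
_, noisy_mark = inverse(attacked, aux)
attack_error = float(np.max(np.abs(noisy_mark - mark)))

print("recovery error without attack:", clean_error)
print("recovery error with attack:", attack_error)
```

In this toy setting the recovery error without an attack is at the level of floating-point rounding, while the perturbed signal yields a visibly larger error. That gap is the kind of damage the abstract attributes to the attack layer and sets out to mitigate.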

Keywords

Audio Watermarking, Invertible Neural Network, Imperceptibility, Balance Block

Cite This Paper

Jiji Zhu. Robust Audio Watermarking Based on Invertible Neural Network. Academic Journal of Computing & Information Science (2025), Vol. 8, Issue 3: 10-17. https://doi.org/10.25236/AJCIS.2025.080302.
