Academic Journal of Engineering and Technology Science, 2022, 5(12); doi: 10.25236/AJETS.2022.051209.

Application of region of interest extraction method based on deep learning in UAV high performance image compression

Author(s)

Jianguo Chen, Xiaoxing Guo, Yuhan Qian

Corresponding Author

Yuhan Qian

Affiliation(s)

Aerospace Times Feihong Technology Co., Ltd, Beijing, 100094, China

Abstract

Unmanned aerial vehicles (UAVs) are widely used for target detection, but UAV image transmission still suffers from distortion and frame loss. Most deep-learning-based image compression methods are lossy: they trade image quality for a higher compression ratio. To improve the quality of the region of interest (ROI) in the reconstructed image at a given bit rate, an importance map extraction module is embedded in the encoder. It generates the importance map from the output features of the encoder's last layer, and a mask derived from this map then guides efficient bit-rate allocation during drop coding. In addition, a decoder enhancement module is attached to the decoder output to predict the high-frequency components of the reconstructed image, improving its quality by enhancing detail. Experimental results show that the proposed method outperforms the comparison method when multi-scale structural similarity (MS-SSIM) is used as the evaluation metric, and that it achieves better visual perceptual quality.
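
The mask-guided bit allocation described in the abstract can be illustrated with a minimal NumPy sketch. It assumes a common content-weighted scheme in which a per-position importance value in [0, 1] decides how many latent channels are kept at that position; the abstract does not specify the authors' actual rule, so the function names `importance_mask` and `apply_mask`, the `levels` parameter, and the ceiling-based quantization are all hypothetical.

```python
import numpy as np


def importance_mask(importance, num_channels, levels):
    """Build a binary (C, H, W) mask from an (H, W) importance map in [0, 1].

    Assumption (not taken from the paper): importance is quantized to
    `levels` steps, and each step keeps num_channels / levels more channels.
    """
    importance = np.clip(importance, 0.0, 1.0)
    # Quantize importance to an integer level in 0..levels.
    quantized = np.ceil(importance * levels).astype(int)
    # Number of latent channels kept at each spatial position.
    keep = quantized * num_channels // levels
    # Channel k is kept at (i, j) only if k < keep[i, j].
    channel_idx = np.arange(num_channels)[:, None, None]
    return (channel_idx < keep[None, :, :]).astype(np.float32)


def apply_mask(latent, mask):
    """Zero out masked latent entries so they cost few bits to code."""
    return latent * mask


# Toy example: a 2x2 importance map and an 8-channel latent.
imp = np.array([[0.0, 0.5], [1.0, 0.25]])
mask = importance_mask(imp, num_channels=8, levels=4)
latent = np.ones((8, 2, 2), dtype=np.float32)
masked = apply_mask(latent, mask)
# Channels kept per position; higher importance keeps more channels.
print(mask.sum(axis=0))
```

In this sketch, positions with higher importance (for example, inside the ROI) retain more latent channels and therefore receive more bits, while background positions retain fewer.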

Keywords

UAV; Image compression; Deep learning; Convolutional neural network; Region of interest; Decoder enhancement

Cite This Paper

Jianguo Chen, Xiaoxing Guo, Yuhan Qian. Application of region of interest extraction method based on deep learning in UAV high performance image compression. Academic Journal of Engineering and Technology Science (2022) Vol. 5, Issue 12: 62-73. https://doi.org/10.25236/AJETS.2022.051209.
