Academic Journal of Computing & Information Science, 2021, 4(1); doi: 10.25236/AJCIS.2021.040112.

Object tracking in siamese network with attention mechanism and Mish function

Author(s)

Fangbin Zhang1, *, Xiaofeng Wang2

Corresponding Author

Fangbin Zhang

Affiliation(s)

1College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China

2Shanghai Maritime University, 201306, China

Abstract

To improve the recognition and tracking ability of fully convolutional Siamese networks for object tracking in complex scenes, this paper proposes an improved tracking algorithm that combines a channel attention mechanism with the Mish activation function. First, a channel attention mechanism is introduced into the model, assigning a different weight to each channel to improve the network's representation ability. Second, the Mish function replaces the ReLU activation function in the network; because Mish is smooth, it lets information propagate through the network more effectively, which yields better accuracy and generalization. Finally, gradient centralization is embedded in the stochastic gradient descent optimizer to improve the generalization performance of the network and to make training more efficient and stable. Experiments on the OTB50 and VOT2018 datasets show that the improved algorithm outperforms the original algorithm.
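
The three components named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the abstract does not specify the attention module, so a squeeze-and-excitation style block is assumed here, and the function names, weight shapes, and learning rate are hypothetical choices made for illustration only.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    # Stable softplus: log(1 + e^x) = max(x, 0) + log1p(e^(-|x|)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def channel_attention(feature_map, w_reduce, w_expand):
    """Reweight a [C][H][W] feature map channel by channel.

    Assumed squeeze-and-excitation layout (hypothetical weights):
    w_reduce has shape [C_hidden][C], w_expand has shape [C][C_hidden].
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    gates = [
        sigmoid(sum(w * h for w, h in zip(row, hidden)))
        for row in w_expand
    ]
    # Scale: every value in a channel is multiplied by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for g, ch in zip(gates, feature_map)
    ]
    return scaled, gates


def centralize_gradient(grad):
    """Gradient centralization: subtract each row's mean from that row.

    Each row is assumed to hold the gradient of one output unit's weights.
    """
    return [[g - sum(row) / len(row) for g in row] for row in grad]


def sgd_step_with_gc(weights, grad, lr=0.01):
    """One plain SGD update that uses the centralized gradient."""
    centered = centralize_gradient(grad)
    return [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, centered)
    ]


# Tiny example: two 2x2 channels and a one-unit bottleneck.
features = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
scaled, gates = channel_attention(features, [[0.5, 0.5]], [[1.0], [-1.0]])
```

In this sketch the gates lie strictly between 0 and 1, so channel attention only rescales channels relative to one another, while the centralized gradient of each row sums to zero before the SGD update is applied.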

Keywords

Object tracking, channel attention mechanism, Mish function, gradient centralization

Cite This Paper

Fangbin Zhang, Xiaofeng Wang. Object tracking in siamese network with attention mechanism and Mish function. Academic Journal of Computing & Information Science (2021), Vol. 4, Issue 1: 75-81. https://doi.org/10.25236/AJCIS.2021.040112.
