Academic Journal of Computing & Information Science, 2024, 7(2); doi: 10.25236/AJCIS.2024.070205.

An Improved Philips Algorithm

Author(s)

Wenze Qiu, Yiqing Xia, Qiwei He, Xu Zhao

Corresponding Author

Wenze Qiu

Affiliation(s)

East China University of Science and Technology, Shanghai, China

Abstract

The Philips algorithm is a classical and widely used audio fingerprinting algorithm. However, its feature extraction module is inefficient and time-consuming, and its feature binarization step is susceptible to noise, which reduces the accuracy of audio fingerprint matching. To address these problems, this paper proposes an improved Philips audio fingerprint algorithm. A Gammatone filter bank analyzes the frequency spectrum of the audio signal, simulating the frequency-selective characteristic of the basilar membrane of the human ear, and a graph convolutional network (GCN) is introduced to extract global features across frequency bands. The distance correlation coefficient is used to measure the distance between audio feature matrices and thereby match audio fingerprints. Experimental results show that, compared with the original Philips algorithm, the proposed algorithm takes less time and is more robust to noise.
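
The matching step described in the abstract relies on the distance correlation coefficient. The abstract does not give the paper's exact formulation, so the following is only a minimal sketch of the standard sample distance correlation, under the assumption that each row of a feature matrix is one observation (for example, one audio frame). The function names `pairwise_distances`, `double_center`, and `distance_correlation` are illustrative and do not come from the paper.

```python
import math


def pairwise_distances(rows):
    """Euclidean distance between every pair of rows (observations)."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def double_center(d):
    """Subtract row and column means and add back the grand mean.

    The distance matrix is symmetric, so row means equal column means.
    """
    n = len(d)
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two matrices with equal row counts.

    Assumption (not from the paper): rows are paired observations, e.g.
    frame i of the query features and frame i of the reference features.
    Returns a value in [0, 1]; 0.0 is returned when either input is constant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same number of rows (at least 2)")
    n = len(x)
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)    # squared distance covariance
    dvar_x = mean_product(a, a)   # squared distance variance of x
    dvar_y = mean_product(b, b)   # squared distance variance of y
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(dcov2, 0.0) / denom)
```

In a matching setting, a query feature matrix would be compared against each candidate reference matrix and the candidate with the highest coefficient (or one above a chosen threshold) would be accepted. The threshold and the pairing of frames are design choices that this sketch leaves open.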

Keywords

Audio matching; Gammatone filter; Graph Convolutional Networks (GCN); Distance correlation

Cite This Paper

Wenze Qiu, Yiqing Xia, Qiwei He, Xu Zhao. An Improved Philips Algorithm. Academic Journal of Computing & Information Science (2024), Vol. 7, Issue 2: 33-42. https://doi.org/10.25236/AJCIS.2024.070205.
