Academic Journal of Computing & Information Science, 2025, 8(4); doi: 10.25236/AJCIS.2025.080406.
Cheng Long
Microsoft, Bellevue, WA 98004, USA
This paper examines deployment strategies for deep learning models that enable scalable and robust cloud services. It first describes the fundamentals of deep learning models and cloud-based services and highlights the challenges of cloud deployment. It then systematically presents deployment strategies, including model compression and optimisation, containerisation and orchestration, serverless deployment, and distributed deployment. The paper introduces performance evaluation metrics and demonstrates the practical application of these strategies through real-world case studies (e.g., deployment of an image classification service and of a natural language processing-based chatbot). It concludes with a summary of lessons learnt and future research directions, with the aim of supporting deployments of deep learning models in cloud-based services that are scalable, robust, and cost-effective.
Deep Learning Models; Cloud Services; Deployment Strategies; Scalability; Robustness
Cheng Long. Deep Learning Model Deployment Strategies for Scalable and Robust Cloud-Based Services. Academic Journal of Computing & Information Science (2025), Vol. 8, Issue 4: 49-55. https://doi.org/10.25236/AJCIS.2025.080406.