Research on Key Technologies for Cross-Cloud Federated Training of Large Language Models

Haowei Yang¹, Mingxiu Sui², Shaobo Liu³, Xinyue Qian⁴, Zhaoyang Zhang⁵, Bingying Liu⁶

Academic Journal of Computing & Information Science, 2024, 7(11); doi: 10.25236/AJCIS.2024.071106.

Corresponding Author:

Haowei Yang

Affiliation(s)

¹University of Houston, Cullen College of Engineering, Industrial Engineering, Houston, USA

²University of Iowa, Department of Mathematics, Iowa City, USA

³Independent Researcher, Broomfield, USA

⁴Independent Researcher, New York, USA

⁵University of California San Diego, Computational Science, San Diego, USA

⁶Duke University, Interdisciplinary Data Science, McLean, USA

Abstract

With the rapid development of natural language processing, large language models have demonstrated exceptional performance across a wide range of applications. Training these models, however, requires substantial computational resources and data processing capacity. Cross-cloud federated training addresses the resource bottleneck of a single cloud platform by letting the computational resources of multiple clouds collaborate on the training of a large model. This study analyzes the key technologies of cross-cloud federated training: data partitioning and distribution, communication optimization, model aggregation algorithms, and compatibility across heterogeneous cloud platforms. It also examines data security and privacy protection strategies for cross-cloud training, in particular data encryption and differential privacy. Experimental validation indicates that the proposed technical framework improves training efficiency, preserves data security, and reduces training costs, which points to broad application prospects for cross-cloud federated training.
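The abstract names model aggregation and differential privacy without giving the algorithms, so the following is only a minimal sketch of one common combination: sample-weighted averaging of per-cloud updates, with norm clipping and optional Gaussian noise. It is not the authors' method. The function names, the flat-list representation of an update, and the parameters `max_norm` and `noise_std` are assumptions made for illustration, and `noise_std` is not calibrated to any formal privacy budget.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def federated_average(updates, sample_counts, max_norm=1.0, noise_std=0.0, seed=None):
    """Sample-weighted average of clipped per-cloud updates plus Gaussian noise.

    updates: one list of floats per cloud (hypothetical flattened parameter deltas).
    sample_counts: number of local training samples per cloud, used as weights.
    noise_std: standard deviation of the added noise; 0.0 disables it.
    """
    rng = random.Random(seed)
    total = sum(sample_counts)
    aggregated = [0.0] * len(updates[0])
    for update, count in zip(updates, sample_counts):
        clipped = clip_update(update, max_norm)  # bound each cloud's influence
        for i, value in enumerate(clipped):
            aggregated[i] += value * count / total
    # Noise is added once, on the aggregate (an assumed central-noise setup).
    return [a + rng.gauss(0.0, noise_std) for a in aggregated]


# Three hypothetical clouds with different amounts of local data.
cloud_updates = [[0.3, -0.4], [3.0, 4.0], [0.0, 0.5]]
cloud_samples = [100, 300, 100]
print(federated_average(cloud_updates, cloud_samples, max_norm=1.0, noise_std=0.0))
```

In this example the second cloud's update has norm 5 and is scaled down to norm 1 before averaging, so one site cannot dominate the aggregate. Setting `noise_std` above zero perturbs the result; choosing that value for a real privacy guarantee would need an analysis that the abstract does not provide.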

Keywords

Large language models, cross-cloud federated training, federated learning, data security

Cite This Paper

Haowei Yang, Mingxiu Sui, Shaobo Liu, Xinyue Qian, Zhaoyang Zhang, Bingying Liu. Research on Key Technologies for Cross-Cloud Federated Training of Large Language Models. Academic Journal of Computing & Information Science (2024), Vol. 7, Issue 11: 42-49. https://doi.org/10.25236/AJCIS.2024.071106.
