[1] LI W, ZHANG B S, SUN S L, et al. Computing power internet architecture: a next-generation network architecture supporting cross-domain interconnection of computing power resources based on entropy balance[J]. Journal on Communications, 2025, 46(9):1-16.
[2] ZHANG H M. How was DeepSeek-R1 made?[J]. Journal of Shenzhen University (Science and Engineering), 2025, 42(2):226-232.
[3] HU E J, SHEN Y, WALLIS P, et al. LoRA: low-rank adaptation of large language models[J]. arXiv Preprint, arXiv:2106.09685, 2021.
[4] DETTMERS T, PAGNONI A, HOLTZMAN A, et al. QLoRA: efficient finetuning of quantized LLMs[J]. arXiv Preprint, arXiv:2305.14314v1, 2023.
[5] XIA M, ZHONG Z, CHEN D. Sheared LLaMA: accelerating language model pre-training via structured pruning[J]. arXiv Preprint, arXiv:2310.06694, 2023.
[6] CHEN L, LI S, ZOU X. AlpaGasus: training a better alpaca with fewer data[J]. arXiv Preprint, arXiv:2307.08701v5, 2024.
[7] SHOEYBI M, PATWARY M, PURI R, et al. Megatron-LM: training multi-billion parameter language models using model parallelism[C]//Proceedings of SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE Press, 2020:1-15.
[8] RAJBHANDARI S, RASLEY J, RUWASE O, et al. ZeRO: memory optimizations toward training trillion parameter models[C]//Proceedings of SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE Press, 2020:1-15.
[9] MICIKEVICIUS P, NARANG S, ALBEN J, et al. Mixed precision training[J]. arXiv Preprint, arXiv:1710.03740, 2018.
[10] HOULSBY N, GIURGIU A, JASTRZEBSKI S, et al. Parameter-efficient transfer learning for NLP[C]//Proceedings of the 36th International Conference on Machine Learning. Long Beach: PMLR, 2019:2790-2799.
[11] LESTER B, AL-RFOU R, CONSTANT N. The power of scale for parameter-efficient prompt tuning[C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Punta Cana: Association for Computational Linguistics, 2021:3045-3059.
[12] BLALOCK D, ORTIZ J G M, FRANKLE J, et al. What is the state of neural network pruning?[J]. arXiv Preprint, arXiv:2003.03033, 2020.
[13] FRANTAR E, ASHKBOOS S, HOEFLER T, et al. GPTQ: accurate post-training quantization for generative pre-trained transformers[J]. arXiv Preprint, arXiv:2210.17323v2, 2023.
[14] SHAO R R, LIU Y A, ZHANG W, et al. A survey of knowledge distillation in deep learning[J]. Chinese Journal of Computers, 2022, 45(8):1638-1673.
[15] LUO J, WU B, LUO X, et al. A survey on efficient large language model training: from data-centric perspectives[C]//Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics. Bangkok: Association for Computational Linguistics, 2025:30904-30920.
[16] REIMERS N, GUREVYCH I. Sentence-BERT: sentence embeddings using Siamese BERT-networks[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Hong Kong: Association for Computational Linguistics, 2019:3982-3992.
[17] BRODER A Z. On the resemblance and containment of documents[C]//Proceedings of the Conference on Compression and Complexity of Sequences. Salerno: IEEE Computer Society, 1997:21-29.
[18] WANG Y, KORDI Y, MISHRA S, et al. Self-Instruct: aligning language models with self-generated instructions[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. Toronto: Association for Computational Linguistics, 2023:8658-8668.
[19] DAI D, DENG C Q, ZHAO C G, et al. DeepSeekMoE: towards ultimate expert specialization in mixture-of-experts language models[J]. arXiv Preprint, arXiv:2401.06066v1, 2024.
[20] EVCI U, GALE T, MENICK J, et al. Rigging the lottery: making all tickets winners[C]//Proceedings of the 37th International Conference on Machine Learning. Vienna: PMLR, 2020:2943-2952.