Information and Communications Technology and Policy

Information and Communications Technology and Policy ›› 2025, Vol. 51 ›› Issue (2): 30-39. doi: 10.12267/j.issn.2096-5931.2025.02.005

Special Topic: Computing Power Network

Research on cloud service architecture and key technologies of computing power centers*

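ZHAO Xiping1,2, DING Fei1, WANG Shiyi1, WANG Rui1, WU Di1, LIU Zhishuai2

  1. National Local Joint Engineering Research Center for Communication and Network Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  2. China Mobile (Suzhou) Software Technology Co., Ltd., Suzhou 215163, China
  • Received: 2025-01-14 Online: 2025-02-25 Published: 2025-03-04
  • Corresponding author: DING Fei, professor, vice dean of the School of Modern Posts and vice dean of the Research Institute of Smart Internet of Things Application Technology, Nanjing University of Posts and Telecommunications; long engaged in research on crowd sensing, intelligent computing, and networks.
  • About the authors:
    ZHAO Xiping, innovation and entrepreneurship mentor at the Research Institute of Smart Internet of Things Application Technology, Nanjing University of Posts and Telecommunications, and solution manager at China Mobile (Suzhou) Software Technology Co., Ltd.; works on Mobile Cloud industry solutions and business operations.
    WANG Shiyi, master's student in Electronic Information at Nanjing University of Posts and Telecommunications; works mainly on multi-agent systems and intelligent computing.
    WANG Rui, master's student in Electronic Information at Nanjing University of Posts and Telecommunications; works mainly on intelligent computing and networks.
    WU Di, master's student in Electronic Information at Nanjing University of Posts and Telecommunications; works mainly on computing power networks and intelligent computing.
    LIU Zhishuai, solution manager at China Mobile (Suzhou) Software Technology Co., Ltd.; long engaged in Mobile Cloud business operations and solution innovation for the education sector.
  • Funding:
    *National Natural Science Foundation of China (No.61871446); Jiangsu Provincial Key Research and Development Program (No.BE2020084-1); Ministry of Education Industry-University Cooperative Education Project (No.231107173103040); Key Project of Graduate Education and Teaching Reform, Nanjing University of Posts and Telecommunications (No.JGKT24-C018); Key Project of Teaching Reform Research, Nanjing University of Posts and Telecommunications (No.JG10424JX09)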

Abstract:

A computing power center comprises large-scale hardware such as servers and storage devices and provides a unified platform for processing complex computing tasks. Designing and implementing a computing power center cloud architecture that accommodates general-purpose, supercomputing, and intelligent computing power is an active research topic. This paper analyzes the development and evolution of research paradigms driven by the education cloud and the computing power demands of representative disciplines, and reviews industry progress in computing power networks, supercomputing centers, intelligent computing platforms, and data governance. On this basis, it proposes an overall network architecture for a computing power center cloud, describes the architecture's basic structure, security protection, and open service design, and gives a typical interconnection scheme for integrating computing power across domains. The architecture is intended to meet the needs of scientific computing in scenarios including heterogeneous computing power management, concurrent training of data models, distributed inference, high-performance computer simulation, and scientific research application services. Built on a flat, two-tier Spine-Leaf network design, it integrates the platform capabilities of general-purpose, supercomputing, and intelligent computing power to form a new cloud service network architecture featuring heterogeneous integration, high-performance computing and storage, and open capabilities.

Key words: computing power center cloud, cloud-native, high-performance computing, computing power grid integration, computing power scheduling

CLC number: