信息通信技术与政策 (Information and Communications Technology and Policy)

信息通信技术与政策 ›› 2025, Vol. 51 ›› Issue (10): 2-6. doi: 10.12267/j.issn.2096-5931.2025.10.001

Special Topic: Innovative Applications of Advanced Computing and Ecosystem Development

Research on the Development of Compute-Storage Collaboration Driven by Large Model Inference

周兰, 陈磊   

  • About the authors:
    ZHOU Lan, Deputy Chief Engineer of the Informatization and Industrialization Integration Research Institute, China Academy of Information and Communications Technology, and senior engineer, mainly engaged in research and support work in advanced computing, integrated circuits, intelligent terminals, and related fields
    CHEN Lei, Director of the Immersive Technology Department, Informatization and Industrialization Integration Research Institute, China Academy of Information and Communications Technology, mainly engaged in research and support work in artificial intelligence terminals, software, and related fields

ZHOU Lan, CHEN Lei   

  1. Informatization and Industrialization Integration Research Institute, China Academy of Information and Communications Technology, Beijing 100191, China
  • Received:2025-09-10 Online:2025-10-25 Published:2025-11-06


Abstract:

With the continuous enhancement of large model capabilities and the deepening of inference applications, the scale of data processing has expanded drastically and data processing requirements have become increasingly diversified, imposing higher demands on the collaboration between storage and computing power. In response to the new requirements that larger data volumes, larger model sizes, and longer context windows place on storage systems in current large model inference scenarios, this study first conducts an in-depth analysis of the implementation mechanisms, key technologies, and practical applications of the “computing-in-place-of-storage” and “storage-in-place-of-computing” modes. Subsequently, drawing on the current technological and industrial foundation as well as application scenario requirements, this paper argues that a hierarchical, systematic collaborative storage model organized by access latency and bandwidth demands is important for the future development of compute-storage collaboration. This paper aims to explore the specific implementation mechanisms and evolutionary pathways of compute-storage collaboration, providing a useful reference for improving the utilization efficiency of intelligent computing clusters and better supporting the development of large model inference.

Key words: large model inference, AI storage, KV Cache, computing-storage collaboration
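To make the trade-off named in the abstract and keywords concrete, the toy sketch below contrasts the two modes in miniature. It is an illustration only, not the paper's method: the `encode` function is a hypothetical stand-in for the per-token key/value projection in attention, and the call counter stands in for compute cost. Without a cache, every decoding step re-encodes the entire prefix (spending compute to avoid storing results); with a KV cache, each token is encoded once and its key/value entry is stored and reused (spending storage to avoid recomputation), which is the "storage-in-place-of-computing" pattern.

```python
def encode(token):
    """Stand-in for the per-token key/value projection (the costly step)."""
    encode.calls += 1
    return (hash(("k", token)), hash(("v", token)))

encode.calls = 0

def generate_no_cache(prompt, steps):
    """Recompute K/V for the full prefix at every decoding step."""
    prefix = list(prompt)
    for step in range(steps):
        _kv = [encode(t) for t in prefix]  # all prior tokens re-encoded
        prefix.append(f"tok{step}")        # pretend we decoded a new token
    return prefix

def generate_with_cache(prompt, steps):
    """Encode each token once; later steps reuse the stored K/V entries."""
    cache, prefix = [], []
    for t in prompt:
        cache.append(encode(t))            # prompt encoded exactly once
        prefix.append(t)
    for step in range(steps):
        new = f"tok{step}"
        cache.append(encode(new))          # only the new token is encoded
        prefix.append(new)
    return prefix

encode.calls = 0
generate_no_cache(["a", "b", "c", "d"], steps=8)
no_cache_calls = encode.calls   # grows with steps * prefix length

encode.calls = 0
generate_with_cache(["a", "b", "c", "d"], steps=8)
cached_calls = encode.calls     # one call per token: prompt + steps

print(no_cache_calls, cached_calls)  # → 60 12
```

The cached variant's storage footprint (the `cache` list) is exactly what grows with the longer context windows the abstract highlights, which is why real systems tier that cache across memory levels by access latency and bandwidth.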

CLC number: