Research on development trend of key technologies and industry of intelligent computing

doi:10.12267/j.issn.2096-5931.2024.06.002

Abstract

With the explosive growth of generative Artificial Intelligence (AI) and large language models, intelligent computing is becoming increasingly important. Training large language models with hundreds of billions or even trillions of parameters on massive datasets places demanding requirements on chip computing power, memory capacity, and interconnection speed. Starting from the demands placed on the key technologies of intelligent computing, this study analyzes development trends and the current state of the industry in dimensions such as computing chip technology, software technology, and interconnection technology. In light of the opportunities and challenges facing the industry, it proposes strategies for the future development of intelligent computing.

Key words: intelligent computing, AI chip, high-speed interconnection, AI

CLC Number: TN38

ZHANG Qian, HUANG Huang, MIAO Zicong. Research on development trend of key technologies and industry of intelligent computing[J]. Information and Communications Technology and Policy, 2024, 50(6): 10-16.

URL:

http://ictp.caict.ac.cn/EN/10.12267/j.issn.2096-5931.2024.06.002

http://ictp.caict.ac.cn/EN/Y2024/V50/I6/10

Figures/Tables 4

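Category	Instruction set	Core units	Ecosystem	Representative products
GPGPU	More than 1,000 instructions	General-purpose compute units; graphics compute units; dedicated execution units	Complex ecosystem; broad industry adoption; mature industry standards	NVIDIA B200; Advanced Micro Devices (AMD) MI300
DSA	Fewer than 100 instructions	Dedicated execution units	Long customer adaptation period; heavy software investment; products arrive "fast" but applications arrive "slowly"	Google TPU v5 series; Groq LPU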

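Operator category	Operator functions
Basic math operations	Tensor transformation, linear algebra, signal processing
Neural network operations	Convolution, pooling, normalization, activation functions
Machine learning operations	Support vector machines, decision trees
Other AI operations	Graph operations, brain-inspired operations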

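NVLink	2nd generation	3rd generation	4th generation	5th generation
Release year	2017	2020	2022	2024
Total bandwidth (GB/s)	300	600	900	1,800
Number of links	6	12	18	18
Bidirectional bandwidth per link (GB/s)	50	50	50	100
Supported chip architecture	Volta	Ampere	Hopper	Blackwell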
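
The total bandwidth and link count rows of the NVLink table are related by simple arithmetic: assuming the total is the number of links multiplied by the bidirectional bandwidth of each link, the per-link figure can be derived from the other two rows. The following is a minimal sketch of that check. The dictionary layout and the function names `per_link_bandwidth` and `derive_per_link` are illustrative assumptions, not part of the original article.

```python
# Total bandwidth (GB/s) and link counts as listed in the NVLink table.
# The dictionary layout and key names are illustrative assumptions.
NVLINK = {
    "2nd generation (Volta, 2017)": {"total_gb_s": 300, "links": 6},
    "3rd generation (Ampere, 2020)": {"total_gb_s": 600, "links": 12},
    "4th generation (Hopper, 2022)": {"total_gb_s": 900, "links": 18},
    "5th generation (Blackwell, 2024)": {"total_gb_s": 1800, "links": 18},
}


def per_link_bandwidth(total_gb_s: float, links: int) -> float:
    """Bidirectional bandwidth per link, assuming total = links * per-link."""
    if links <= 0:
        raise ValueError("links must be positive")
    return total_gb_s / links


def derive_per_link(table: dict) -> dict:
    """Derive the per-link bandwidth for every generation in the table."""
    return {
        generation: per_link_bandwidth(row["total_gb_s"], row["links"])
        for generation, row in table.items()
    }


if __name__ == "__main__":
    for generation, bandwidth in derive_per_link(NVLINK).items():
        print(f"{generation}: {bandwidth:.0f} GB/s per link")
```

Under this assumption, the doubling of total bandwidth from the fourth to the fifth generation comes entirely from faster links, since the link count stays at 18.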