[1] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. arXiv Preprint. arXiv:1706.03762, 2023. DOI:10.48550/arXiv.1706.03762.
[2] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[J]. arXiv Preprint. arXiv:1810.04805, 2018. DOI:10.48550/arXiv.1810.04805.
[3] YEKTA M M J. The general intelligence of GPT-4, its knowledge diffusive and societal influences, and its governance[J]. Meta-Radiology, 2024, 2(2):20-37.
[4] SUN C, MYERS A, VONDRICK C, et al. VideoBERT: a joint model for video and language representation learning[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Seoul: IEEE, 2019:7463-7472. DOI:10.1109/ICCV.2019.00756.
[5] KORTHIKANTI V, CASPER J, LYM S, et al. Reducing activation recomputation in large transformer models[J]. arXiv Preprint. arXiv:2205.05198, 2022. DOI:10.48550/arXiv.2205.05198.
[6] FEDUS W, ZOPH B, SHAZEER N. Switch transformers: scaling to trillion parameter models with simple and efficient sparsity[J]. arXiv Preprint. arXiv:2101.03961, 2021. DOI:10.48550/arXiv.2101.03961.
[7] TOUVRON H, LAVRIL T, IZACARD G, et al. LLaMA: open and efficient foundation language models[J]. arXiv Preprint. arXiv:2302.13971, 2023. DOI:10.48550/arXiv.2302.13971.
[8] LIEBER O, LENZ B, BATA H, et al. Jamba: a hybrid transformer-mamba language model[J]. arXiv Preprint. arXiv:2403.19887, 2024. DOI:10.48550/arXiv.2403.19887.
[9] LEPIKHIN D, LEE H J, XU Y, et al. GShard: scaling giant models with conditional computation and automatic sharding[J]. arXiv Preprint. arXiv:2006.16668, 2020. DOI:10.48550/arXiv.2006.16668.
[10] PATEL D, WONG G. GPT-4 architecture, infrastructure, training dataset, costs, vision, MoE[EB/OL]. (2023-07-10)[2024-04-20]. https://www.semianalysis.com/p/gpt-4-architecture-infrastructure.
[11] GHOLAMI A, YAO Z, KIM S, et al. AI and memory wall[J/OL]. arXiv Preprint. arXiv:2403.14123v1, 2024. https://arxiv.org/html/2403.14123v1.
[12] 国家信息中心 (State Information Center). 智能计算中心创新发展指南 (Guidelines for the innovative development of intelligent computing centers)[Z], 2023.