Information and Communications Technology and Policy

Information and Communications Technology and Policy, 2024, Vol. 50, Issue (12): 13-20. doi: 10.12267/j.issn.2096-5931.2024.12.003

Analysis of large language model architecture evolution

WANG Yuntao   

  1. Artificial Intelligence Research Institute, China Academy of Information and Communications Technology, Beijing 100191, China
  • Received: 2024-11-04; Online: 2024-12-25; Published: 2025-01-02

Abstract:

This paper systematically reviews the main directions of innovation built on the Transformer architecture. It examines the evolution of large language model architectures along three dimensions: innovation within the Transformer architecture itself, hybrid designs that fuse the Transformer with other architectures, and non-Transformer architectures. It also offers an outlook on future development directions for foundation models.

Key words: large model architecture, Transformer, attention mechanism, architectural innovation
