Information and Communications Technology and Policy

Information and Communications Technology and Policy, 2026, Vol. 52, Issue 5: 22-31. doi: 10.12267/j.issn.2096-5931.2026.05.004

A methodological framework for high-quality dataset construction based on multimodal fusion and AI assistance

WANG Dong, YANG Huafeng, LIU Weichen, LI Kang, LIU Jingqian, LIU Shiwei   

  1. China Telecom Corporation Limited, Beijing 100033, China
  • Received: 2026-03-20; Online: 2026-05-25; Published: 2026-05-28

Abstract:

Building high-quality datasets for AI applications often faces four practical challenges: unclear alignment with business goals, fragmented implementation, limited technical infrastructure, and excessive annotation costs. This paper presents a methodology that addresses these issues through a three-layer framework (demand mapping, intelligent governance, and value realization) implemented on China Telecom’s Knowledge Service Platform. The methodology has been validated in the high-end equipment manufacturing and consumer goods industries, where it reduced dataset construction time, and it offers a practical pathway for enterprise data asset development in the era of data marketization.

Key words: high-quality dataset, multimodal fusion, AI-assisted annotation, data asset management
