Information and Communications Technology and Policy

Information and Communications Technology and Policy, 2026, Vol. 52, Issue (5): 58-68. doi: 10.12267/j.issn.2096-5931.2026.05.008

A review of research on Chinese datasets for ethical evaluation of large language models

TIAN Xiaoyu1, LI Wenyu2, BI Chunli2, FU Na2, ZHANG Leilei2   

  1 China Academy of Telecommunication Technology, Beijing 100191, China
  2 Intellectual Property and Innovation Development Center, China Academy of Information and Communications Technology, Beijing 100191, China
  • Received: 2026-01-25 Online: 2026-05-25 Published: 2026-05-28

Abstract:

Large language models are among the most advanced achievements in artificial intelligence, and the ethical issues they raise have drawn significant attention from both academia and industry. As a result, the number of Chinese-language datasets for the ethical evaluation of large language models has grown steadily, and these datasets merit in-depth study. However, they have not yet been systematically reviewed or analyzed, which makes it difficult for researchers to select suitable datasets and to identify shortcomings in existing resources. This paper examines 50 Chinese ethical evaluation datasets for large language models released between August 2021 and March 2025 and compares them across release date, creation details, content information, open-source status, domains covered, and ethical scenarios. The aim is to provide direction for optimizing and constructing future datasets.

Key words: large language models, ethical evaluation, Chinese datasets
