The following document is a Chinese government policy designed to encourage the growth of the data labeling industry, an important enabler for the AI sector. The policy endorses government intervention in the market to create data labeling “bases” and to identify and support the most promising data labeling companies. The language of this document implies that the Chinese government envisions data labeling as a high-tech industry requiring highly educated talent, rather than as a cottage industry involving hordes of often poorly educated, poorly paid human annotators. But the policy does not mention specific data labeling techniques and technologies to prioritize or invest in.
An archived version of the Chinese source text is available online at: https://perma.cc/E33V-9F7M
Implementation Opinions of the National Development and Reform Commission and Other Ministries on Promoting the High-Quality Development of the Data Labeling Industry
NDRC-National Data Administration (发改数据) [2024] Document No. 1822
To all development and reform commissions, data management departments, finance departments (bureaus), and human resources and social security departments (bureaus) of all provinces, autonomous regions, province-level municipalities, cities with independent planning status under the national economic and social development plan, and the Xinjiang Production and Construction Corps:
The data labeling[1] industry is the emerging industry of data processing and handling, including data filtering, cleaning, categorization, annotation, tagging, and quality checking. The incubation and enlargement of the data labeling industry plays an important supportive role in raising the quality of data supply and promoting artificial intelligence (AI) innovation and development. The following opinions are put forward in order to promote the high-quality development of the data labeling industry.
I. Overall Requirements
Guided by Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era, fully implement the spirit of the 20th Party Congress and the Second and Third Plenums of the 20th Chinese Communist Party (CCP) Central Committee, completely, accurately, and comprehensively implement the new concept of development (新发展理念), coordinate development and security, make promoting the development and utilization of data to empower economic and social development the main line, and strive to cultivate new business formats (新业态) for data labeling, thereby laying out new lanes in the digital science and technology (S&T) race, and creating new international competitive advantages for the industry. In developing the data labeling industry, we shall: Adhere to the working principles of combining effective markets and assertive government (有为政府), combining systematic planning and key breakthroughs, and combining open collaboration and secure development; and give full play to China’s advantages in terms of massive data and abundant application scenarios, thereby strengthening our posture of being demand-led and innovation-driven, and accelerating ecosystem incubation. 
By 2027, the data labeling industry’s specialization, intelligentization (智能化), and S&T innovation ability will have been significantly improved, the scale of the industry will have increased greatly, its compound annual growth rate will exceed 20%, a group of influential S&T data labeling enterprises will have been incubated, a batch of innovation vehicles linking industry, academia, research institutes, and users (产学研用) will have been created, a number of data labeling bases with notable achievements and distinctive features will have been built, and a relatively complete data labeling industrial ecosystem will have been formed, thereby creating a new pattern in which innovation factors of production (要素) are agglomerated, the upstream and downstream of the production chain are linked, and regional development is coordinated.
The rest of this translation is available in the full PDF.
[1] Translator’s note: The Chinese term 数据标注 can be translated as “data labeling” or “data annotation.” This translation opts for the former.