Because direct assessment metrics are lacking, existing studies of agricultural policy intensity often rely on proxies such as agricultural Gross Domestic Product (GDP) or the number of agricultural policies issued. Better methods for analyzing policy intensity would significantly affect parameter selection in agricultural policy research and the evaluation of policy effectiveness. In this study, we constructed a Chinese Agricultural Policy Corpus from agricultural policies released by national-level governmental agencies in China from 1982 to April 2023. We quantified the values of agricultural domain terms in the corpus and evaluated the intensity of each agricultural policy document. Validation shows a strong correlation between agricultural policy intensity and agricultural GDP: the trend in agricultural GDP lags policy intensity by 2.5 years (at a 95% confidence level), which supports the soundness of our corpus, agricultural policy scoring dataset, and methodology.
In recent decades, agriculture has undergone continuous progress and innovation to meet the increasing global demand for food, and maintaining a stable and sustainable food supply has become one of the significant challenges facing global society [1]. Following rapid socio-economic transformation and the implementation of family planning policies in China, a trend of population aging has emerged. Despite ongoing changes in Chinese agriculture, small-scale farming remains a major component [2]. According to official data from the China State Council Census Office in 2022, only 19% of the agricultural labor force in China was below 40 years old in 2020. Many rural areas rely heavily on individuals aged 60 and above for agricultural labor. China is the most populous developing country in the world, and approximately 35% of its population is engaged in agricultural labor (a proportion significantly higher than the 2.5% in the United States), so the importance of agriculture to China is self-evident [3]. In recent years, China has made significant strides in agriculture, thanks to a series of agricultural policies that play a crucial role in the country’s economic and social development.
Quantitative analysis of the scale of agricultural policies directly influences the assessment of policy outcomes and provides a means of monitoring policy effects and improving the agricultural policy system. Since 2015, the Chinese government has pursued the Targeted Poverty Alleviation (TPA) plan, which aimed to raise the living standards of the entire population above the national poverty line by 2020 [4]. Given the substantial number of agricultural laborers in China, future agricultural policy formulation will face increasingly diverse demands, making policy development more complex [5,6]. With the widespread adoption of big data and artificial intelligence, effective processes and methods for quantifying agricultural policies are urgently needed.
Numerous scholars have shown considerable interest in policy documents, prompted by policy changes [7,8], the outcomes of policy announcements [9,10], and discussions of governmental involvement in policy formulation [11]. In recent years, a growing body of research has advocated using policy outcome data to assess policy impacts through more intuitive methods and to measure policy effects and outcomes more accurately [12]. Chen Mei et al. [13] found that, with digitization, the volume of policy texts is increasing rapidly, which makes it critical to evaluate the rationality of their design and their potential for optimization. Quantitative assessment of policies has become an indispensable scientific tool, allowing a comprehensive and impartial examination of existing policies and guiding future improvements. Sun Yan et al. [14] used a decade of Chinese modern agricultural policy texts, applying a classification method to construct an analytical framework and conducting a quantitative analysis that deepened the understanding of the nature of these policies. Quantitative analysis of agricultural policy texts is expected to help refine policies and make them more efficient, driving the modernization of agriculture. However, research on the overall quantitative assessment of policy data remains insufficient. There is therefore an urgent need to broaden the research horizon and to strengthen the study and application of quantitative policy evaluation methods to meet the growing demands of data governance. Furthermore, there is no unified measurement standard for systematically evaluating the intensity of agricultural policies across different spatial and temporal contexts [15]. To our knowledge, no comprehensive, systematic study has yet fully collected and publicly released Chinese agricultural policy data.
To address the information and methodological gaps in this field, this study comprehensively collects documents on important national-level agricultural policies in China and constructs an agricultural policy corpus. For the first time, we add a temporal dimension and the DBSCAN clustering algorithm to the LDA (Latent Dirichlet Allocation) model to partition topics. Using the new machine learning model EvoLDA-DB (Evolutionary LDA with DBSCAN), we construct a comprehensive dataset from the agricultural policy corpus to quantitatively assess the intensity of agricultural policies. The dataset comprises the national-level agricultural policy corpus in China over the past 40 years. We propose a novel method for obtaining quantified datasets of corpus words and scores for each document. This approach to computing policy intensity can be applied not only to inter-country comparisons but also across different domains. The multi-cluster partition of corpus words provided by this study offers more practical support for further analysis of different aspects or dimensions within this field. Furthermore, the dataset provides the detailed content of Chinese agricultural policy documents categorized by government department, reducing the time and effort researchers need to access information on Chinese agricultural policies.
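To make the idea of partitioning topics with a density-based clustering step more concrete, the following minimal sketch groups topic-word distributions from different time periods using a plain-Python DBSCAN. It is an illustration under stated assumptions, not the EvoLDA-DB implementation: the topic vectors, the period labels, the use of cosine distance, and the parameter values `eps` and `min_samples` are all hypothetical. In practice the vectors would come from an LDA model fitted on each time slice of the corpus.

```python
import math

def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbours(i):
        # All points within eps of point i (the point itself included).
        return [j for j in range(len(points))
                if cosine_distance(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = noise          # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster    # border point: joins but does not expand
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # core point: keep expanding the cluster
    return labels

# Hypothetical topic-word distributions over a toy four-word vocabulary,
# one row per (time period, topic). These numbers are invented for illustration.
periods = ["1982-1999", "2000-2012", "2013-2023",
           "2000-2012", "2013-2023", "2013-2023"]
vectors = [
    [0.70, 0.20, 0.05, 0.05],
    [0.65, 0.25, 0.05, 0.05],
    [0.60, 0.30, 0.05, 0.05],
    [0.05, 0.05, 0.50, 0.40],
    [0.05, 0.05, 0.45, 0.45],
    [0.25, 0.25, 0.25, 0.25],
]

labels = dbscan(vectors, eps=0.05, min_samples=2)
for period, label in zip(periods, labels):
    print(period, "cluster" if label >= 0 else "noise", label)
```

With these toy values, the first three topics fall into one cluster that spans all three periods, the next two form a second cluster, and the uniform last topic is left as noise. Unlike k-means, DBSCAN does not require the number of clusters to be fixed in advance, which is one plausible reason to pair it with topic models whose themes appear and disappear over time.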