Key Associated Characteristics and Industry Heterogeneity of Corporate Data Asset Potential in China-Evidence from Generative Models and Explainable Machine Learning

Junhong Luo, Huijuan Zheng, Junxi Huang

Abstract


This study takes as its initial sample the A-share listed firms that recognized data resources as assets in their 2024 annual reports. Because this sample is small, a generative model is used for data augmentation. On the augmented data, an interpretable machine learning framework is developed to identify the main firm characteristics associated with data asset potential and the nonlinear patterns underlying these associations. The results show that innovation investment and firm size are the two most relevant characteristics, and that their marginal contributions to data asset potential exhibit clear nonlinear threshold effects. In contrast, conventional profitability metrics have weak explanatory power, and their influence is contingent on the level of innovation input. A subsequent market-wide analysis reveals that the information technology sector and the scientific research and technical services sector tend to have substantially higher data asset potential. The study offers new empirical evidence on the correlates of data asset potential and provides practical insights for the design of data factor market institutions, asset valuation practices, investment decisions, and corporate strategic planning.



DOI: https://doi.org/10.22158/rem.v11n1p69


Copyright (c) 2026 Junhong Luo, Huijuan Zheng, Junxi Huang

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © SCHOLINK INC.  ISSN 2470-4407 (Print)  ISSN 2470-4393 (Online)