Big data analytics may be fueling corporate greenwashing
Big data has been widely hailed as a breakthrough tool for environmental accountability, promising cleaner production systems, sharper regulatory oversight and stronger ESG transparency. Corporations across major economies have invested heavily in analytics platforms, positioning digital transformation as proof of their sustainability credentials. However, new evidence suggests that the same technology designed to improve environmental performance may also be helping firms polish their green image without improving their actual impact.
In the study Making It Look Green: Big Data Analytics, External Pressure, and Corporate Greenwashing, published in the journal Sustainability, researchers find that firms with stronger big data capabilities are significantly more likely to engage in greenwashing, particularly when they face market pressure from institutional investors and financial analysts.
Big data as a tool of impression management
The study's argument is based on two theoretical foundations: institutional theory and impression management theory. Institutional theory explains how companies seek legitimacy in the eyes of regulators, investors and the public. Impression management theory focuses on how organizations strategically shape the information they present to appear favorable.
Rather than treating big data analytics as inherently virtuous, the authors argue that it can reinforce three classic modes of impression management: self-serving bias, symbolic management and accounting rhetoric.
Self-serving bias occurs when firms selectively highlight favorable outcomes while downplaying or externalizing negative results. Big data analytics makes this easier and cheaper. Through advanced filtering, sentiment analysis and automated text generation, companies can emphasize positive environmental metrics while minimizing discussion of violations or inefficiencies. The result is selective transparency that stays within formal disclosure boundaries while shaping public perception.
Symbolic management involves ceremonial actions that signal responsibility without substantive operational change. Big data tools enable firms to monitor social media trends, regulatory signals and public sentiment in real time. Companies can then rapidly adjust environmental messaging, publish tailored sustainability narratives and align their reports with current policy themes. This responsiveness creates an image of alignment with environmental goals even when actual practices remain unchanged.
Accounting rhetoric represents a more technical strategy. With modeling software and advanced visualization, firms can generate complex environmental forecasts, dashboards and scenario simulations. These outputs convey scientific rigor and objectivity. Yet the same models can omit hidden environmental trade-offs or rely on optimistic assumptions about future performance. By framing data as neutral truth, companies reinforce legitimacy while potentially masking gaps between promises and practice.
The study measures big data analytics capability through textual analysis of annual reports. Using machine learning techniques and a curated list of more than 200 digital-related keywords, the authors assess how frequently firms reference big data technologies and analytics in their Management Discussion and Analysis sections. This proxy captures firms’ digital orientation and declared analytics capability.
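A keyword-based proxy of this kind can be illustrated with a short sketch. It is not the authors' pipeline: the four keywords below are hypothetical stand-ins for the study's list of more than 200 terms, and scaling the count per 1,000 words is an assumption made here for readability.

```python
import re

# Hypothetical keyword list; the study's actual dictionary has more than 200 terms.
KEYWORDS = ["big data", "machine learning", "cloud computing", "artificial intelligence"]


def keyword_frequency(text, keywords=KEYWORDS):
    """Count keyword mentions and scale them per 1,000 words of text."""
    lowered = text.lower()
    total_words = len(re.findall(r"[a-z0-9']+", lowered))
    hits = sum(
        len(re.findall(r"\b" + re.escape(keyword) + r"\b", lowered))
        for keyword in keywords
    )
    # Scaling by report length is an assumption, not a detail given in the article.
    per_thousand = 1000 * hits / total_words if total_words else 0.0
    return hits, per_thousand


sample = "We use big data and machine learning. Big data analytics guides our planning."
hits, rate = keyword_frequency(sample)
print(hits, round(rate, 1))
```

A measure like this counts what a firm says about analytics, not what it has built, which is the limitation the authors themselves acknowledge.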
Greenwashing is measured as the standardized difference between symbolic environmental efforts and substantive environmental actions. Substantive efforts include measurable governance activities such as pollution control systems, waste management, environmental emergency mechanisms and clean production initiatives. Symbolic efforts include environmental concepts, goals, awards and public statements, adjusted downward for negative signals such as environmental violations and regulatory penalties. A larger gap between symbolic and substantive actions indicates greater greenwashing.
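The gap measure can be sketched as a difference of z-scores. This is a minimal illustration under stated assumptions: the article does not give the exact standardization, so the sketch standardizes each score across firms using the population standard deviation, and the input numbers are invented for the example.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return [0.0 for _ in values]
    return [(value - center) / spread for value in values]


def greenwashing_scores(symbolic, substantive):
    """Standardized symbolic score minus standardized substantive score."""
    return [
        sym - sub
        for sym, sub in zip(z_scores(symbolic), z_scores(substantive))
    ]


# Invented scores for three firms (not data from the study).
symbolic = [8, 5, 2]      # environmental concepts, goals, awards, statements
substantive = [2, 5, 8]   # pollution control, waste management, clean production
scores = greenwashing_scores(symbolic, substantive)
print([round(score, 2) for score in scores])
```

In this toy example the first firm talks the most and does the least, so it receives the highest score, while the third firm receives the lowest.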
The baseline regression results are consistent and statistically significant. Across multiple specifications with firm and year fixed effects, the coefficient linking big data analytics to greenwashing remains positive and significant at the highest conventional levels. The magnitude of the effect remains stable after controlling for firm age, size, profitability, leverage, governance structure and growth opportunities.
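A baseline model of this kind is conventionally written as follows. The notation is an assumption chosen here for illustration, not the authors' own: GW is the greenwashing score of firm i in year t, BDA is the big data analytics measure, X collects the control variables, and the last terms are firm fixed effects, year fixed effects and the error.

```latex
GW_{i,t} = \beta_0 + \beta_1 \, BDA_{i,t} + \gamma' X_{i,t} + \mu_i + \lambda_t + \varepsilon_{i,t}
```

In this notation, the reported finding corresponds to a positive and statistically significant estimate of \beta_1.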
When regulation constrains and markets amplify
The study does not stop at identifying the relationship. It explores how different types of external pressure shape firms’ use of digital tools.
The authors distinguish between constraint-based non-market pressures and opportunity-based market pressures. Non-market pressures include government environmental regulation and media coverage. These forces operate through a logic of punishment. Firms that are exposed for misleading environmental claims face fines, reputational damage and regulatory scrutiny.
The data show that stronger government environmental regulation weakens the positive effect of big data analytics on greenwashing. Similarly, higher levels of media coverage significantly dampen the relationship. In cities where government reports emphasize environmental keywords more frequently, or where firms receive greater mainstream media attention, the space for opportunistic use of analytics narrows.
This suggests that regulatory oversight and media scrutiny increase the detection risk associated with data-driven greenwashing. Under such conditions, companies are more likely to pursue substantive environmental improvements or adopt more cautious disclosure strategies.
Market pressures, by contrast, operate through a logic of reward. Institutional investors and financial analysts evaluate companies through disclosed ESG metrics and forward-looking narratives. In these environments, legitimacy is often granted through persuasive disclosure rather than direct observation of environmental performance.
The study finds that institutional ownership significantly strengthens the positive association between big data analytics and greenwashing. Analyst attention has a similar amplifying effect. When firms are closely followed by capital market actors, they have stronger incentives to craft compelling sustainability narratives that resonate with investor expectations.
Big data analytics enhances firms’ ability to meet these expectations. Through selective emphasis on metrics favored by ESG rating agencies and investors, companies can align their disclosures with dominant templates and benchmarks. The reward logic of capital markets therefore increases the expected benefit of symbolic environmental performance.
The opposite moderating effects reveal a critical insight. Digital technologies do not operate in a vacuum. Their ethical impact depends on the surrounding institutional logic. Constraint-based pressures push firms toward substantive compliance. Opportunity-based pressures can encourage symbolic conformity.
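Moderating effects like these are typically tested with an interaction term. The specification below is a sketch in assumed notation, not the authors' published model: GW is the greenwashing score, BDA is the big data analytics measure, P is one external pressure variable at a time, X collects the controls, and the remaining terms are firm fixed effects, year fixed effects and the error.

```latex
GW_{i,t} = \beta_0 + \beta_1 \, BDA_{i,t} + \beta_2 \, (BDA_{i,t} \times P_{i,t}) + \beta_3 \, P_{i,t} + \gamma' X_{i,t} + \mu_i + \lambda_t + \varepsilon_{i,t}
```

Under this reading, the reported results imply a negative \beta_2 when P is government environmental regulation or media coverage, and a positive \beta_2 when P is institutional ownership or analyst attention.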
Robustness tests and broader implications
To ensure that the results are not driven by measurement choices or reverse causality, the authors conduct extensive robustness checks.
Alternative greenwashing measures are employed, including a dummy variable based on high environmental rhetoric combined with regulatory penalties, divergence between different ESG rating agencies and a tone-based metric comparing positive language to substantive environmental actions. In all cases, big data analytics retains a positive and statistically significant association with greenwashing.
Alternative proxies for digital capability are also tested. These include the proportion of big data keywords in reports, the number of digital-related patent applications and the share of digital intangible assets such as software and data platforms. The findings remain consistent across these measures.
To address potential endogeneity, the study uses the designation of National Big Data Pilot Zones as an instrumental variable. Cities selected by the Chinese government for digital development initiatives provide firms with greater infrastructure and policy support for analytics adoption. Instrumental variable regressions confirm that the positive link between big data analytics and greenwashing persists even when accounting for potential reverse causality.
Dynamic panel estimation using system generalized method of moments further accounts for persistence in greenwashing behavior. A Heckman correction addresses sample selection bias arising from voluntary environmental disclosure. Across all models, the central relationship holds.
The findings carry significant theoretical implications.
- They challenge the widely held assumption that digitalization necessarily enhances transparency and accountability. More data does not automatically produce more truth. When embedded in legitimacy-driven environments, digital tools can facilitate deception.
- The study integrates impression management theory with institutional theory to explain how digital capabilities reshape the cost-benefit structure of greenwashing. Big data reduces the marginal cost of producing tailored disclosures while increasing their persuasive power. Whether firms deploy these capabilities for genuine improvement or symbolic compliance depends on external incentives.
- The research advances understanding of digital sustainability by distinguishing between different institutional pressures. It demonstrates that regulatory and media oversight can redirect digital tools toward substantive accountability, while market pressures may inadvertently amplify symbolic behavior.
From a policy perspective, the study suggests that regulators must strengthen standards for environmental disclosure and impose meaningful penalties for misleading digital narratives. Media organizations play a crucial role in scrutinizing inconsistencies between data-driven claims and actual environmental outcomes. Investors and analysts should deepen due diligence and seek third-party verification rather than relying solely on polished ESG dashboards.
For corporate leaders, the findings serve as a warning. Big data analytics is a double-edged sword. Used responsibly, it can improve environmental monitoring and operational efficiency. Used strategically for impression management, it can erode trust and expose firms to long-term reputational risk.
The authors acknowledge that the study focuses on Chinese listed firms and that institutional dynamics may vary across countries. They also note that textual measures of digital capability may not fully capture internal implementation depth. Future research could extend the analysis across different regulatory systems and incorporate internal operational data.
First published in: Devdiscourse

