Generative AI must adopt healthcare-style consent rules


CO-EDP, VisionRI | Updated: 09-01-2026 19:51 IST | Created: 09-01-2026 19:51 IST

A new academic analysis argues that today’s dominant generative AI systems are built on large-scale data extraction that ignores consent, ownership, and compensation, creating legal, social, and economic risks that threaten the technology’s long-term viability.

The study, titled Consentful-by-design: a perspective on safeguarding data ownership from generative AI leveraging lessons from the healthcare domain, was published in the journal AI & Society. The paper frames generative AI as an ethical turning point and calls for a structural shift in how data is sourced, governed, and rewarded.

The authors state that generative AI must adopt governance principles already enforced in high-risk fields such as healthcare.

Generative AI built on data extraction without consent

The paper identifies the core ethical fault line in modern generative AI as the treatment of training data as a free resource rather than owned intellectual labor. Large language and diffusion models are typically trained on massive datasets scraped from the internet, including copyrighted books, articles, artwork, photographs, and music. In most cases, creators are neither informed nor compensated, and they have no technical means to opt out once their work has been absorbed into a model.

The authors show how this model of data extraction has already produced real-world consequences. Creative workers across writing, visual arts, and music have raised concerns that generative systems replicate distinctive styles, flood markets with imitative content, and undermine livelihoods built on intellectual property. Lawsuits, labor actions, and public protests have emerged as creators challenge the assumption that training data can be used without permission.

From a technical perspective, the study explains that even when generative models do not reproduce exact copies of training data, they can still generate outputs that are recognizably derived from specific creators or works. This creates reputational harm and economic displacement regardless of whether the process qualifies as direct copying under narrow legal definitions. The authors argue that focusing solely on technical distinctions between memorization and generalization misses the broader ethical issue of uncompensated labor and loss of control.

The paper places these conflicts in a wider societal context. As generative AI reaches lay users at scale, misuse and low-quality outputs have amplified misinformation, fraud, and erosion of trust across industries. The authors argue that these outcomes are not accidental side effects but predictable consequences of deploying powerful systems without enforceable ethical constraints.

Lessons from healthcare ethics reshape the AI debate

To address these challenges, the study turns to a domain that has faced similar tensions between innovation and risk: healthcare. Medical AI systems operate under strict ethical and legal frameworks because errors can cause direct harm. Over decades, healthcare has developed enforceable standards around informed consent, data protection, accountability, and transparency.

The authors argue that generative AI now occupies a comparable position of societal impact and should be governed accordingly. They frame their recommendations around three ethical principles adapted from biomedical ethics and AI governance frameworks: autonomy, transparency, and beneficence.

Autonomy centers on the right of individuals to control how their data is used. In healthcare, informed consent is mandatory before patient data can be used for treatment or research. By contrast, generative AI systems rarely offer opt-in or opt-out mechanisms, and they lack the ability to selectively remove data once models are trained. The study argues that this imbalance violates a basic ethical standard and fuels public backlash.

Transparency addresses the ability to trace how data flows through AI systems and how outputs are generated. In medical AI, documentation of data sources, model development, and decision pathways is required to ensure accountability. Generative AI systems, however, often operate as opaque pipelines with minimal disclosure about training sources or output provenance. This opacity undermines trust and makes enforcement of rights nearly impossible.

Beneficence requires that technologies actively contribute to social good rather than merely avoiding harm. The authors argue that generative AI currently concentrates value among developers and platforms while externalizing costs onto creators and the public. A sustainable system, they contend, must redistribute value by recognizing data contributors as stakeholders rather than raw material.

By grounding their argument in healthcare ethics, the authors position consentful design not as a moral luxury but as a practical necessity for maintaining legitimacy in high-impact technologies.

Technical pathways toward consentful generative AI

The study goes beyond ethical critique to outline concrete technical pathways that could make consent enforceable rather than symbolic. One proposal is content-owner-in-the-loop systems, where creators retain oversight over how their data contributes to outputs. In such systems, content owners could approve, reject, or correct generated material derived from their work and receive compensation proportional to their contribution.
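To make the idea concrete, here is a minimal Python sketch of what such a workflow could look like, assuming per-output attribution scores are available. The paper describes the concept rather than an implementation, so all class and field names below are hypothetical:

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Contribution:
    """One content owner's stake in a generated output (hypothetical model)."""
    owner_id: str
    attribution_score: float  # share of the output traced to this owner, in [0, 1]
    decision: Decision = Decision.PENDING


@dataclass
class GeneratedOutput:
    output_id: str
    contributions: list[Contribution] = field(default_factory=list)

    def review(self, owner_id: str, approve: bool) -> None:
        """Let a content owner approve or reject use of their material."""
        for c in self.contributions:
            if c.owner_id == owner_id:
                c.decision = Decision.APPROVED if approve else Decision.REJECTED

    def releasable(self) -> bool:
        """The output may only be released once every contributor approves."""
        return all(c.decision is Decision.APPROVED for c in self.contributions)

    def payouts(self, revenue: float) -> dict[str, float]:
        """Split revenue in proportion to each owner's attribution score."""
        total = sum(c.attribution_score for c in self.contributions)
        return {c.owner_id: revenue * c.attribution_score / total
                for c in self.contributions} if total else {}


# Example: two creators review an output before it can be published.
out = GeneratedOutput("img-001", [Contribution("alice", 0.7),
                                  Contribution("bob", 0.3)])
out.review("alice", approve=True)
out.review("bob", approve=True)
assert out.releasable()
print(out.payouts(revenue=100.0))  # {'alice': 70.0, 'bob': 30.0}
```

The key design choice in this sketch is that release is gated on explicit approval from every contributor, and compensation follows the same attribution signal used for review.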

Federated learning is highlighted as another mechanism to preserve autonomy. Instead of centralizing training data, federated approaches allow models to be trained locally, with only parameter updates shared. This keeps data under the control of its owners and reduces the risk of misuse. While full federated training of large foundation models remains impractical, federated fine-tuning using parameter-efficient techniques offers a near-term compromise.
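As a rough illustration of the mechanism, the toy loop below performs federated averaging on a simple linear model with NumPy: each simulated client trains on data that never leaves it, and the server aggregates only the resulting parameters. Real federated fine-tuning would exchange parameter-efficient adapter weights rather than full model weights; the data, shapes, and hyperparameters here are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)


def local_update(weights: np.ndarray, X: np.ndarray, y: np.ndarray,
                 lr: float = 0.1, steps: int = 10) -> np.ndarray:
    """One client's training pass on private data (linear regression).
    Only the resulting weights leave the device; X and y never do."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


# Three clients, each holding private data that stays local.
true_w = np.array([1.0, -2.0, 0.5])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    clients.append((X, y))

global_w = np.zeros(3)
for _ in range(5):
    # Each client trains locally; the server sees only parameter updates.
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(local_ws, axis=0)  # federated averaging

print(global_w)  # approaches true_w without pooling any raw data
```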

The authors also point to retrieval-augmented generation architectures as a promising route to real opt-out mechanisms. In these systems, models access external data repositories at inference time rather than absorbing all data into their parameters. This makes it technically feasible to remove or restrict specific data contributions, restoring a degree of control that current models lack.
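A small sketch of why opt-out becomes tractable in such an architecture: because documents live in an external index rather than in model weights, revoking consent is an index deletion that takes effect immediately at inference time. The keyword retriever below is a deliberately naive placeholder for a real relevance model, and all names are illustrative:

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    owner_id: str
    text: str
    consent_granted: bool = True


class ConsentAwareIndex:
    """External knowledge store queried at inference time.
    Because data lives here, not in model weights, opting out is a delete."""

    def __init__(self) -> None:
        self._docs: dict[str, Document] = {}

    def add(self, doc: Document) -> None:
        self._docs[doc.doc_id] = doc

    def revoke_consent(self, owner_id: str) -> None:
        """Honor an opt-out: drop every document from this owner."""
        self._docs = {k: d for k, d in self._docs.items()
                      if d.owner_id != owner_id}

    def retrieve(self, query: str, k: int = 3) -> list[Document]:
        """Naive keyword-overlap ranking, standing in for a real retriever."""
        terms = set(query.lower().split())
        scored = sorted(self._docs.values(),
                        key=lambda d: -len(terms & set(d.text.lower().split())))
        return [d for d in scored if d.consent_granted][:k]


index = ConsentAwareIndex()
index.add(Document("d1", "alice", "sea turtles nest on sandy beaches"))
index.add(Document("d2", "bob", "turtles are long-lived reptiles"))

print([d.doc_id for d in index.retrieve("where do turtles nest")])  # ['d1', 'd2']
index.revoke_consent("alice")  # opt-out takes effect immediately
print([d.doc_id for d in index.retrieve("where do turtles nest")])  # ['d2']
```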

Transparency and traceability are addressed through data provenance systems, including cryptographic and blockchain-based registries that log consent, ownership, and usage conditions. These systems could allow auditors, regulators, and creators to verify that outputs comply with licensing and consent requirements throughout a model’s lifecycle.
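Below is a minimal sketch of such a tamper-evident consent log, assuming a simple hash chain. A production registry would add digital signatures, identity verification, and replicated storage; all identifiers and event names here are illustrative:

```python
import hashlib
import json
import time


class ProvenanceRegistry:
    """Append-only log of consent and usage events.
    Each entry hashes the previous one, so any later edit is detectable."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def record(self, owner_id: str, asset_id: str, event: str,
               terms: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "genesis"
        body = {"owner": owner_id, "asset": asset_id, "event": event,
                "terms": terms, "time": time.time(), "prev": prev_hash}
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.entries.append(body)
        return body

    def verify(self) -> bool:
        """Recompute the whole chain; False means the log was altered."""
        prev = "genesis"
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if body["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True


reg = ProvenanceRegistry()
reg.record("alice", "img-001", "consent_granted", "non-commercial training")
reg.record("alice", "img-001", "used_in_training", "model v2 fine-tune")
print(reg.verify())  # True while the log is untampered
```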

The authors acknowledge that none of these solutions are fully mature. Challenges around scalability, computational cost, and standardization remain significant. They note, however, that similar obstacles were overcome in healthcare precisely because regulation and ethics created incentives for innovation rather than blocking it.

First published in: Devdiscourse