Structural flaws make generative AI systems hard to secure


CO-EDP, VisionRICO-EDP, VisionRI | Updated: 10-02-2026 12:46 IST | Created: 10-02-2026 12:46 IST
Structural flaws make generative AI systems hard to secure
Representative Image. Credit: ChatGPT

Efforts to secure generative AI systems are increasingly clashing with a key limitation: many of the most serious risks cannot be regulated or filtered away. New research suggests that existing cybersecurity frameworks are ill-suited to the way these systems operate. According to the study published in the Journal of Cybersecurity and Privacy, generative AI introduces structural security risks that persist regardless of governance approach.

The study Securing Generative AI Systems: Threat-Centric Architectures and the Impact of Divergent EU–US Governance Regimes examines how architectural design choices interact with EU and US regulatory models to shape real-world security outcomes.

Why generative AI breaks traditional security models

Generative AI systems behave fundamentally differently from traditional software. Conventional applications operate deterministically, executing predefined logic in response to structured inputs. Security failures in such systems are often binary and can usually be traced to specific bugs or misconfigurations that can be patched. Generative AI systems, by contrast, operate probabilistically. Their outputs depend on context, prior interactions, retrieved data, and emergent behavior that may not be predictable even to their designers.

In generative AI systems, natural language functions not just as data but as a control channel. User prompts, retrieved documents, and external content can influence system behavior in ways that blur the boundary between instructions and information. As a result, attackers no longer need to exploit traditional vulnerabilities such as buffer overflows or missing authentication checks. Instead, they can manipulate the system semantically, steering it toward unsafe actions using language that appears legitimate.

The study maps these risks across a five-layer reference architecture that reflects how modern generative AI systems are deployed in practice. These layers include the training and adaptation pipeline, the model behavior layer, retrieval and data interfaces, orchestration and tool use, and runtime interaction with users. By aligning threats with architectural layers, the authors show that many failures occur not inside the model itself but at integration boundaries, where untrusted inputs are coupled with privileged capabilities.

Structural risks arise from architectural design choices that persist regardless of operational controls. Examples include the coupling of untrusted natural language inputs with retrieval mechanisms, the ability of models to invoke tools or trigger actions, and the persistence of sensitive information learned during training. These risks cannot be reliably mitigated through policies, filters, or monitoring alone. Configurable risks, by contrast, stem from deployment choices such as permission scoping, logging, and access control, and can be meaningfully reduced through established cybersecurity practices.

The authors argue that many organizations overestimate the effectiveness of configurable controls while underestimating the impact of structural exposure. This imbalance creates a false sense of security, particularly when generative AI systems are integrated into workflows that touch sensitive data or critical infrastructure.

How retrieval and tool use expand the attack surface

The study focuses on retrieval-augmented generation and agentic systems, which represent the dominant direction of enterprise generative AI deployment. Retrieval allows models to access live, mutable data sources such as internal documents, emails, databases, and external feeds. Tool use enables models to perform actions, from querying systems and modifying records to executing code or sending messages.

While these capabilities unlock powerful use cases, they also dramatically expand the attack surface. In a text-only chatbot, successful manipulation may result in misleading or harmful output. In a system with retrieval and tool access, the same manipulation can escalate into data exfiltration, unauthorized actions, or system compromise. The study shows that this escalation is not accidental but structural, arising from architectural decisions to grant models authority over data and actions.

A key vulnerability pattern highlighted in the paper is the modern equivalent of the confused deputy problem. In this scenario, a generative AI system with elevated privileges is tricked into misusing its authority on behalf of an attacker. Because the model processes untrusted content alongside trusted instructions within the same reasoning loop, it can be induced to act against the interests of its operator while appearing to behave normally. This risk persists even when output filters, prompt screening, or rate limiting are in place.

The authors review empirical evidence showing that indirect prompt injection and tool misuse remain highly effective across a range of models and benchmarks. These attacks exploit the fact that generative AI systems cannot reliably distinguish between benign context and adversarial instructions when both are expressed in natural language. As long as this instruction–data coupling exists, semantic manipulation remains possible.

The study argues that effective mitigation requires architectural containment rather than reactive filtering. This includes strict separation between instructions and retrieved content, least-privilege design for tools, sandboxed execution environments, explicit approval workflows for high-risk actions, and comprehensive logging that captures the full causal chain from input to action. Without these measures, organizations risk deploying systems where a single semantic failure can have outsized consequences.

Regulation drives architecture as much as security

Next up, the paper examines how governance and regulation shape generative AI security outcomes. The authors compare the regulatory approaches of the European Union (EU) and the United States, highlighting how divergent frameworks influence system architecture, evidence requirements, and operational practices.

The EU has adopted a prescriptive, risk-based approach through instruments such as the AI Act, NIS2, and related cybersecurity and data protection laws. These frameworks impose binding obligations on organizations deploying high-risk or systemically significant AI systems, including requirements for risk management, human oversight, logging, incident reporting, and documentation. Compliance is enforced through audits and penalties, making security and governance inseparable from system design.

On the other hand, the United States relies more heavily on voluntary frameworks, sector-specific regulation, and ex post enforcement through liability and consumer protection. The NIST AI Risk Management Framework provides guidance but does not carry the force of law. While this approach offers flexibility and encourages innovation, it also leads to uneven adoption of controls and inconsistent assurance across organizations.

The study finds that, in practice, EU requirements are increasingly becoming the de facto global baseline for multinational deployments. Organizations operating across jurisdictions often adopt EU-aligned controls globally to avoid maintaining separate architectures. This convergence drives more rigorous logging, oversight, and containment practices but also introduces operational friction, particularly in US-centric environments where such measures are not always mandated.

Importantly, the authors caution that compliance alone does not guarantee security. Regulatory frameworks can shape minimum standards, but they cannot eliminate structural risks inherent in current generative AI architectures. Treating compliance as a substitute for architectural rigor risks creating systems that satisfy documentation requirements while remaining vulnerable to real-world attacks.

The paper argues for a dual-track approach that combines a consistent, architecture-led technical baseline with jurisdiction-specific governance processes. In this model, organizations design systems to withstand structural threats first, then layer compliance and reporting mechanisms on top. This approach aligns security investment with actual risk rather than regulatory optics.

A shift toward architecture-led defense in depth

The study asserts that securing generative AI requires a shift in mindset. Rather than treating security as a set of controls applied around a model, organizations must treat it as a property of the entire system, shaped by architectural decisions made early in the design process. Defense in depth remains essential, but it must be applied at the right boundaries, particularly where language interfaces meet data access and action execution.

Many generative AI risks cannot be eliminated entirely. Probabilistic behavior, emergent capabilities, and the complexity of natural language ensure that residual risk will remain. Managing that risk requires continuous monitoring, evidence-based assurance, and explicit governance decisions about acceptable exposure. Pre-deployment testing and policy documentation are necessary but insufficient without ongoing operational oversight.

Human factors also play a key role. Developers, security teams, architects, and leadership must understand how generative AI differs from traditional systems and how those differences translate into new threat models. Training, secure development practices, and clear ownership of risk are as important as technical controls. Without organizational maturity, even well-designed architectures can fail in practice.

  • FIRST PUBLISHED IN:
  • Devdiscourse
Give Feedback