The Breach That Is Already Happening
Security teams have spent decades building perimeters around their most sensitive data. Firewalls, VPNs, DLP policies, access controls — layers upon layers designed to keep confidential information inside the organisation. Then their own employees started using ChatGPT, and a significant portion of those carefully constructed perimeters became largely irrelevant.
The data point that should be on every CTO's desk right now comes from the LayerX 2025 Enterprise AI Security Report: 77% of enterprise employees who use AI tools have pasted company data into a public chatbot query. Not sometimes. Routinely. And 22% of those incidents involved confidential personal or financial information — the exact categories that trigger GDPR, CCPA, and sector-specific regulatory obligations.
This is not a future threat to model. It is a current breach pattern at scale, happening today in most large organisations whether or not it appears in any incident log. The reason it rarely appears in incident logs is that it is nearly invisible: a copy-paste into a browser tab leaves no DLP alert, no access log, no anomaly signal. It looks like a Tuesday.
Samsung discovered the scale of the problem in a particularly costly way in 2023: engineers at the semiconductor division pasted proprietary source code and internal meeting notes into ChatGPT while debugging. The data was, by design, used to improve the model. Samsung's response was a company-wide ban on generative AI tools — a reaction that illustrates both the severity of the exposure and the bluntness of the tools most organisations reach for when they finally respond to it.
Shadow AI: The Governance Gap Beneath the Surface
Shadow AI — the use of AI tools outside of official IT approval and governance processes — has become the enterprise security story of 2026 in a way that shadow IT never quite managed. The adoption velocity is simply different. Employees adopted personal cloud storage and messaging apps gradually. They adopted AI tools immediately, because the productivity gains were obvious and the friction was zero.
The scale is not marginal. Research across enterprise populations consistently shows that the number of employees actively using AI tools exceeds the number of officially sanctioned AI deployments by a factor of five to ten. In practical terms: your security team has probably reviewed and approved three to five AI tools. Your employees are using forty to sixty.
The risk profile of shadow AI is not just about data exfiltration. It is also about organisational liability. Air Canada provides the clearest legal precedent: its AI chatbot fabricated a bereavement travel policy that did not exist, a passenger relied on it and was denied the fare reduction, and the dispute went to tribunal. The court found Air Canada liable for its AI's statements and ordered it to honour the fabricated policy. The organisation that deploys an AI — even an unauthorised one used by a single employee in a customer-facing context — can inherit liability for that AI's outputs.
The governance response most organisations reach for first — blanket bans — consistently fails. It drives usage further underground, deprives the organisation of visibility, and does not address the underlying productivity need that drove adoption. The more durable approach treats shadow AI as a demand signal: employees are using these tools because they are useful, and the organisation's job is to channel that demand into managed, visible, compliant tooling rather than trying to suppress it.
Key Takeaways
- 77% of enterprise AI users have pasted company data into public chatbots; 22% involved confidential information
- Shadow AI deployments outnumber officially sanctioned ones by 5–10x in most large enterprises
- Blanket bans consistently fail — they drive usage underground and destroy visibility
- Air Canada was held legally liable for its AI chatbot's fabricated policy, setting a significant precedent
Prompt Injection: The Attack Vector Built Into the Architecture
While shadow AI is largely an insider behaviour problem, prompt injection is a genuine external attack vector — and it is now ranked number one in OWASP's Top 10 for Large Language Model Applications.
The mechanism is deceptively simple. An LLM application is instructed to behave in a certain way via a system prompt. An attacker submits a user-controlled input designed to override, ignore, or manipulate that system prompt — causing the model to take actions or reveal information outside its intended scope. In a simple chatbot, this might mean extracting the system prompt or generating content the application was supposed to refuse. In an agentic system with tool access, it can mean triggering real-world actions: sending emails, querying databases, modifying files, or making API calls.
The threat is scaling alongside agent deployment. Between December 2025 and January 2026, security researchers documented 35,000 attack sessions specifically targeting exposed LLM and MCP endpoints — the first large-scale, systematic commercial campaign of this type. The attackers were not looking for traditional software vulnerabilities. They were looking for AI systems with tool access and insufficient input validation, treating prompt injection not as an academic curiosity but as a viable attack primitive for credential theft, data extraction, and lateral movement.
Second-order prompt injection is the more sophisticated variant that agentic systems are particularly vulnerable to. In this pattern, the malicious instruction is not submitted directly by the attacker but is embedded in content the AI agent retrieves during its normal operation — a webpage it scrapes, a document it reads, an email it processes. A ServiceNow security research disclosure in late 2025 demonstrated a variant where a low-privilege AI agent could, via carefully crafted content in a shared document, cause a higher-privilege peer agent to perform restricted actions on its behalf. Traditional access controls built for human actors offer limited protection against this vector because the model, not the attacker, is the actor performing the privileged operation.
Key Takeaways
- Prompt injection is #1 in OWASP's Top 10 for LLM Applications in 2025/2026
- 35,000 attack sessions targeted exposed LLM and MCP endpoints in a two-month window (Dec 2025–Jan 2026)
- Second-order injection embeds malicious instructions in content AI agents retrieve — bypassing input validation entirely
- Agentic systems with tool access convert prompt injection from information disclosure to real-world action
RAG Poisoning: When Your Knowledge Base Becomes the Attack Surface
Retrieval-Augmented Generation (RAG) has become the dominant architecture for deploying LLMs in enterprise contexts: instead of fine-tuning a model on proprietary data, you maintain a vector database of documents and retrieve relevant chunks at inference time. It is more cost-effective, more updatable, and more auditable than fine-tuning. It also creates an attack surface that most teams have not adequately threat-modelled.
The attack pattern, documented in research known as PoisonedRAG, is straightforward in principle: inject a small number of malicious documents into the retrieval index. When the AI system retrieves those documents as context for a user query, the embedded instructions — formatted to look like authoritative content — influence the model's output. In controlled research conditions, injecting just five malicious documents into a corpus of millions caused the targeted AI to return the attacker's desired false answer 90% of the time for specifically crafted queries.
The practical implications for enterprise RAG deployments are significant. Most organisations populate their retrieval indices from sources with imperfect provenance controls: internal wikis edited by many contributors, SharePoint libraries with broad write permissions, document management systems where access governance lags document creation. An attacker — internal or external — who can write a single document to any of these sources potentially has a vector to influence AI-assisted decisions made from that knowledge base.
Vector embeddings themselves present an underappreciated risk. The widespread assumption that stored embeddings are non-sensitive — that they are opaque numerical representations of the original text — was challenged by 2025 research demonstrating embedding inversion attacks: techniques that can reconstruct the original sentence from its vector representation with sufficient accuracy to expose confidential information. Embedding stores that were treated as safe to replicate or share widely may contain latent privacy exposure that has not been accounted for.
Key Takeaways
- PoisonedRAG research: 5 malicious documents injected into millions caused targeted AI to return attacker-controlled answers 90% of the time
- Enterprise RAG indices typically draw from sources with inadequate write governance — wikis, SharePoint, document libraries
- Embedding inversion attacks can reconstruct original text from vector representations, making embedding stores a latent privacy risk
- RAG architecture's updatability advantage — its strength — is also its security liability
Reasoning Models Expand the Attack Surface Further
The deployment of reasoning models — o3, DeepSeek-R1, and their successors — adds another dimension to the LLM security picture that is only beginning to be understood in production contexts. Reasoning models produce extended internal thought chains before generating their final response. This extended reasoning improves accuracy on complex tasks. It also creates a larger and more sophisticated prompt injection attack surface.
A standard LLM processes a prompt and generates a response in a single forward pass. A reasoning model works through a multi-step chain, and each intermediate step is itself an inference surface. A well-crafted injection embedded in retrieved context can be designed to influence reasoning mid-chain — after input validation checks have run but before the final output is generated. Several researchers in early 2026 documented cases where reasoning models were more susceptible to certain injection variants than their non-reasoning counterparts, precisely because the extended reasoning process gives injected instructions more surface area to interact with.
The practical security implication is not that reasoning models should be avoided — their accuracy improvements on high-judgment tasks are real and valuable. It is that the security architecture for reasoning model deployments needs to account for this expanded attack surface explicitly: output monitoring of reasoning chains, not just final responses; stricter input sanitisation for retrieved context; and human-in-the-loop review for high-consequence actions that reasoning agents recommend.
There is also the economic attack surface to consider. Reasoning models are significantly more expensive to run than standard models — often by an order of magnitude per token. An adversary who can cause a reasoning model to trigger unnecessarily on low-value queries, or who can force extended reasoning chains through prompt manipulation, can mount a practical denial-of-service attack via cost inflation. This is not a theoretical concern; it is an active exploitation pattern documented in the 2026 LLMjacking campaign data.
A Practical LLM Security Framework for Engineering Teams
The security picture for LLM deployments is genuinely more complex than for traditional software systems, but it is not intractable. The organisations that are managing it well in 2026 are working from a consistent framework: visibility first, then policy, then technical controls — in that order.
Visibility means knowing what AI tooling is actually in use across your organisation, not just what is officially approved. This requires passive network monitoring for known AI endpoints, software inventory tooling that surfaces AI applications, and an honest survey of actual employee behaviour. Most organisations that have done this work have discovered that actual usage is three to five times higher than assumed. You cannot govern what you cannot see.
Policy means establishing clear rules for what data can interact with which AI systems, under what conditions. This does not have to be restrictive to be effective. A tiered data classification mapped to approved tool categories — public data can go anywhere, internal data goes to approved tools, confidential data stays in your controlled deployment — is more useful and more enforceable than a blanket prohibition. It also gives employees a path to compliance rather than a choice between productivity and policy.
Technical controls for LLM security fall into several categories: input validation and sanitisation for user-supplied and retrieved content; output monitoring for sensitive data patterns; isolation of agentic systems with tool access behind principle-of-least-privilege architectures; audit logging of tool invocations and data retrievals; and red-teaming exercises specifically designed for prompt injection and RAG poisoning. For agentic systems, the principle of least privilege deserves particular emphasis: an agent that only has access to the tools and data it needs for its specific task has a dramatically smaller blast radius when compromised.
Key Takeaways
- Build visibility first: actual AI tool usage is typically 3–5x what IT has officially sanctioned
- Tiered data classification mapped to approved tool categories outperforms blanket bans on both compliance and adoption
- Principle of least privilege is the single most effective architectural control for agentic systems
- Red-teaming for prompt injection and RAG poisoning should be a standard part of AI deployment review, not an afterthought
What This Means for Engineering Teams and Outsourcing
LLM security is not a specialisation that can be cleanly separated from AI development. It needs to be embedded in how teams build, not bolted on after deployment. This has direct implications for how engineering leaders think about team composition and vendor selection.
In-house teams need engineers who understand the LLM threat model at a component level: what prompt injection looks like in their specific application architecture, how their RAG pipeline's write governance compares to its read exposure, what the blast radius of their agentic tools would be under a successful injection attack. This is not a security team problem to be handed off — it is a software engineering problem that requires the engineers building the system to understand the attack surface.
For organisations using outsourced or nearshore development teams to build AI-integrated systems, vendor evaluation needs to explicitly include LLM security capability. Questions to ask: Has this team conducted prompt injection testing on prior AI projects? How do they approach data governance for RAG deployments? What is their process for threat modelling agentic systems? The answers will quickly separate teams that have thought carefully about these problems from teams that are treating AI integration as a standard feature build.
The nearshore advantage in this specific domain is real and currently underutilised. Senior engineers in mature Eastern European tech ecosystems — Serbia, Poland, Romania, Bulgaria — are working on AI integrations at a high level of sophistication, and the intersection of AI development expertise with security engineering is a genuine differentiator. For European enterprises in particular, working with EU-adjacent partners on AI systems that handle GDPR-sensitive data reduces legal surface area significantly compared to offshore alternatives outside the EU's legal framework.
Key Takeaways
- LLM security is a software engineering problem, not a security team handoff — build teams need to own the threat model
- Vendor evaluation for AI development work should explicitly test for prompt injection testing methodology and RAG governance practices
- EU-adjacent nearshore partners offer a structural advantage for GDPR-sensitive AI deployments that offshore alternatives cannot match
- The intersection of AI development expertise and security engineering is currently a genuine differentiator in the senior nearshore talent market
The Bottom Line
The 77% problem is not going away by telling employees to stop. It is going away when organisations provide AI tooling that is both useful and governed, when engineers build AI systems with the threat model embedded from the start rather than patched in at the end, and when leadership treats LLM security as a first-class engineering discipline rather than a compliance checkbox. The external threat landscape — prompt injection campaigns, RAG poisoning, LLMjacking, agentic lateral movement — is real and escalating. But the internal behaviour gap remains the larger near-term risk for most enterprises, because it is invisible, it is widespread, and it is currently happening without triggering any of the monitoring systems that security teams have spent years building. Closing that gap requires visibility before policy, policy before prohibition, and technical controls designed by engineers who understand the AI attack surface as well as they understand the rest of their stack.
Building a team in Eastern Europe?
StepTo helps European and US companies build senior-led nearshore engineering teams in Serbia. Let's talk about what your next engagement could look like.
Start a conversation