An enterprise AI system handles data from across the organisation, responds to natural language queries that can be constructed in ways no UI designer anticipated, and generates outputs that can carry sensitive information in forms that are hard to filter. Securing it requires a threat model that did not exist before generative AI, and that most organisations have not finished developing.
The security failures in enterprise AI systems tend not to be the exotic ones: not model theft or adversarial attacks on the underlying neural network. They are more often access control failures, where users receive information they were not supposed to access, or data crosses tenant boundaries that were meant to be impermeable.
These failures are architectural. They come from AI systems designed for functionality first and security second. For AI, that means the security review often happens after the data access patterns are already embedded in the system.
The retrieval boundary is a security boundary
In a RAG system, the data that ends up in a response is determined by the retrieval layer: which documents are returned for a given query, and which parts of those documents appear in the context window. If the retrieval layer does not enforce access control, the generation layer cannot enforce it. The model produces outputs from whatever is in its context, and context can contain anything the retrieval returned.
This is the first and most fundamental security decision in an enterprise AI system: access control must be enforced at the retrieval layer, not at the presentation layer. Filtering sensitive content from responses after retrieval is unreliable; the model may surface the information in paraphrased form, as part of a summary, or as part of a comparison between documents the user should and shouldn't see.
Access control at retrieval means that every query to the vector store is scoped to the documents the querying user is authorised to access. The implementation varies: metadata filters, per user indices, or reranking that understands access rules. The principle is consistent: the security boundary is at retrieval, not at generation.
Prompt injection as a data exfiltration vector
Prompt injection is the AI attack where malicious content in the system's data sources instructs the model to behave differently than intended. In a RAG system, a document in the knowledge base that contains text designed to override the system prompt can, if retrieved, cause the model to ignore its instructions and expose information it was told to protect.
The attack is relevant in any system where the knowledge base contains content from external or unvetted sources. A document submitted by a user, a webpage scraped into the knowledge base, or a third party data feed can all be vectors for injected instructions. The model has no reliable mechanism to distinguish between legitimate content and injected instructions when both appear in its context.
Mitigation requires a combination of input validation during ingestion, structural separation between system instructions and retrieved content in the prompt template, and output monitoring for responses that contain patterns associated with injection attempts. No single measure eliminates the risk, which is why prompt injection should be in the threat model from the design phase rather than discovered in a security review.
Tenant isolation: the decisions that are hard to reverse
AI systems that serve multiple client organisations have data isolation as a non negotiable requirement. The failure mode is severe: one tenant's queries returning content from another tenant's data. This is both a security failure and a regulatory compliance failure in most jurisdictions.
The isolation architecture must match the risk profile. Metadata filtering, where all tenants share a single index and access control is enforced through filters at query time, is simpler to operate and cheaper to run. It also depends on the filter being applied correctly on every query with no exception path. A bug in the filter logic, a misconfigured query, or an edge case in the retrieval pipeline can violate isolation.
A separate index for each tenant is harder to operate at scale, but it eliminates an entire class of isolation failures because there is no shared index to contaminate. The choice between these architectures should be made based on the risk tolerance of the most sensitive tenant, not the average case. Changing the architecture later requires ingesting all content again, which makes this one of the decisions that has to be made correctly from the start.
Audit logging for AI interactions
AI interactions need the same audit logging as any other privileged operation in an enterprise system. Who queried what, when, with what context, and what response was produced. This is the evidence that a security or compliance investigation requires, and it is much harder to reconstruct after the fact than to capture at the time.
The audit record for an AI interaction is more complex than for a conventional API call. The query is important; so is the retrieved context that informed the response, and the generated response itself. Logging only the query doesn't capture what data the model actually had access to when generating its response.
Retention requirements for AI interaction logs may be longer than for operational logs if the AI is making decisions with regulatory implications. The audit log should be treated as a separate concern from the operational log, with its own retention policy and access controls. Access to the audit log itself should also be controlled and logged.
Data minimisation in context windows
Every piece of data in the context window when a model generates a response is data the model was exposed to. In practice, RAG systems often retrieve more content than the query strictly needs. They may return five documents when one would have answered the question, because more context generally improves answer quality up to a point.
The security implication is that the model has access to all retrieved content, including content it never cites. If the retrieval returns a document containing personal data of a third party alongside the document that actually answers the query, the model was exposed to that personal data during generation. Whether it surfaces it in the response is not fully controllable.
Data minimisation at the retrieval layer reduces exposure by retrieving only what is needed and filtering unnecessary content aggressively. It also reduces cost and often improves response quality by reducing noise in the context window. The security and quality motivations align, which makes this design choice one of the straightforward ones.
AI security requires AI specific thinking
The access control patterns and threat models that enterprises have developed for conventional applications are a starting point for AI systems, not a complete answer. The new attack surfaces include prompt injection, retrieval boundary failures, and tenant contamination. They require AI specific mitigations that have to be designed in from the start.
The organisations that get AI security right are the ones that include it in the architectural design process, not the ones that add a security review layer after the system is already built. The structural decisions that determine the security posture of an AI system are made early, and changing them later is expensive.
More in this series
- Why Most AI Pilots Never Reach Production
- Designing Explainable AI: When 'Trust the Model' Isn't Good Enough
- AI Hallucinations in Production: Mitigation Strategies That Work
- Monitoring AI Systems: What to Measure Beyond Uptime
- Human in the Loop Design for AI Powered Workflows
- AI Security: Access Control and Data Isolation for Enterprise Systems
- Controlling AI Costs in Production Without Degrading QualityComing soon
- When Not to Use AI: A Practical Decision FrameworkComing soon
- AI SLAs and Error Budgets: How to Set Expectations for AI FeaturesComing soon