Most enterprise AI security conversations still focus on training — how models are built, what data goes in, how to prevent poisoning. But the greater operational exposure sits elsewhere: in inference, the moment a trained model processes a live query and produces an output. That is where proprietary logic, sensitive prompts, and business strategy become exposed to anyone positioned to observe or probe the endpoint.
A recent panel hosted by The Quantum Insider, featuring leaders from BMO, CGI, and 01Quantum, put the point bluntly: inference is AI working, and AI working is where risk accumulates. Nearly half of the audience polled during the session admitted they lack confidence that their AI systems will meet anticipated 2026 security standards. That number is consistent with broader industry data: a Cloud Security Alliance survey found that only 27 per cent of organisations feel confident they can secure AI used in core business operations.
This is not an abstract concern. It is the practical, operational end of the governance conversation maddaisy has been tracking for weeks.
Why inference, not training, is the exposure point
Training happens once (or periodically). Inference happens continuously — every API call, every chatbot interaction, every agentic workflow execution. As Tyson Macaulay of 01Quantum explained during the panel, inference models often contain the distilled intellectual property of an organisation. In expert systems, the model itself reflects proprietary training data, domain knowledge, and internal logic. Reverse engineering an inference endpoint can reveal insights about what the organisation knows and how it thinks.
But the exposure runs in both directions. Prompts themselves reveal information — about individuals, strategy, and operational priorities. A medical query reveals personal health data. A corporate query may signal product development direction. The question, in other words, can be as sensitive as the model.
When maddaisy examined CIOs’ non-AI priorities in February, cybersecurity topped the list — precisely because AI adoption was expanding the attack surface. Dmitry Nazarevich, CTO at Innowise, described security spending increases as “directly related to the increase in exposure and risk to data associated with the increased attack surface resulting from the introduction of generative AI.” Inference security is where that expanding surface is most exposed — and most neglected.
The shadow AI dimension
The problem is compounded by what organisations cannot see. Research suggests that roughly 70 per cent of organisations have shadow AI in use — employees running unauthorised tools outside IT oversight. Every unsanctioned ChatGPT or Claude query involving company data is an unmonitored inference event, pushing proprietary information through systems the organisation does not control.
JetStream Security, a startup founded by veterans of CrowdStrike and SentinelOne, raised $34 million in seed funding last week to address precisely this gap. The company’s product, AI Blueprints, maps AI activity in real time — which agents are running, which models they use, what data they access. The premise is straightforward: you cannot secure what you cannot see.
When maddaisy covered shadow AI in February, the focus was on governance and policy. Inference security adds a harder technical dimension. It is not enough to write policies about acceptable AI use if the organisation has no visibility into what models are being queried, by whom, and with what data.
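Visibility can start simply. JetStream's internals are not public, so the sketch below is only an illustration of the principle: scan egress or web-proxy logs for traffic to known AI API hosts that did not originate from a sanctioned gateway. Every host, column name, and IP address here is a placeholder.

```python
import csv
from collections import Counter

# Known AI API hosts to flag; extend for your environment. The
# sanctioned-source IPs are placeholders for approved AI gateways.
KNOWN_AI_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}
SANCTIONED_SOURCES = {"10.0.5.21"}

def find_shadow_ai(proxy_log_path: str) -> Counter:
    """Count unsanctioned requests per (source IP, AI host) pair.

    Assumes a CSV proxy log with 'src_ip' and 'dest_host' columns;
    real proxy logs need their own parsers.
    """
    hits = Counter()
    with open(proxy_log_path, newline="") as f:
        for row in csv.DictReader(f):
            if (row["dest_host"] in KNOWN_AI_HOSTS
                    and row["src_ip"] not in SANCTIONED_SOURCES):
                hits[(row["src_ip"], row["dest_host"])] += 1
    return hits

# Top ten shadow-AI talkers, ready for a governance conversation:
# for (src, host), n in find_shadow_ai("proxy.csv").most_common(10):
#     print(f"{src} -> {host}: {n} requests")
```

Even this crude inventory turns "we suspect shadow AI" into a ranked list of sources to investigate.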
Real-world vulnerabilities are already surfacing
The risks are not hypothetical. In February, LayerX Security published a report describing a critical vulnerability in Anthropic’s Claude Desktop Extensions — a malicious calendar invite could silently execute arbitrary code with full system privileges. The issue stemmed from an architectural choice: extensions ran unsandboxed with direct file system access, enabling tools to chain actions autonomously without user consent.
The debate that followed was instructive. Anthropic argued the onus was on users to configure permissions properly. Security researchers countered that competitors like OpenAI and Microsoft restricted similar capabilities through sandboxing and permission gates. The real lesson for enterprises is that inference-layer vulnerabilities are architectural, not incidental — and they require controls before deployment, not after.
As Rock Lambros of RockCyber put it: “Every enterprise deploying agents right now needs to answer — did we restrict tool chaining privileges before activation, or did we hand the intern the master key and go to lunch?”
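What "restricting tool chaining privileges before activation" can mean in practice is easiest to see in code. The sketch below is a minimal illustration under stated assumptions, not any vendor's API: a runtime permission gate that enforces a per-agent privilege allowlist, caps autonomous chain depth, and requires human approval for sensitive actions. All names (Tool, AgentPolicy, PermissionGate) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Tool:
    name: str
    privilege: str                 # e.g. "read_files", "send_email", "exec"
    fn: Callable[..., object]

@dataclass
class AgentPolicy:
    allowed_privileges: set[str]   # least-privilege allowlist per agent
    max_chain_depth: int = 2       # cap on autonomous tool chaining
    gated_privileges: set[str] = field(default_factory=set)  # need human approval

class PermissionGate:
    """Runtime gate every tool invocation must pass through."""

    def __init__(self, policy: AgentPolicy, approver: Callable[[str], bool]):
        self.policy = policy
        self.approver = approver   # human-in-the-loop callback
        self.depth = 0

    def invoke(self, tool: Tool, *args, **kwargs):
        # 1. Deny anything outside the agent's allowlist outright.
        if tool.privilege not in self.policy.allowed_privileges:
            raise PermissionError(f"{tool.name}: privilege '{tool.privilege}' not granted")
        # 2. Stop runaway chains: nested invocations increment depth.
        if self.depth >= self.policy.max_chain_depth:
            raise PermissionError(f"{tool.name}: tool-chain depth limit reached")
        # 3. Sensitive cross-system actions need explicit approval.
        if tool.privilege in self.policy.gated_privileges and not self.approver(tool.name):
            raise PermissionError(f"{tool.name}: human approval denied")
        self.depth += 1
        try:
            return tool.fn(*args, **kwargs)
        finally:
            self.depth -= 1

# A read-only research agent cannot send email, whatever its prompt says:
policy = AgentPolicy(allowed_privileges={"read_files"})
gate = PermissionGate(policy, approver=lambda tool_name: False)
reader = Tool("read_report", "read_files", lambda path: f"contents of {path}")
gate.invoke(reader, "q3_report.txt")            # allowed
mailer = Tool("send_email", "send_email", lambda to, body: None)
# gate.invoke(mailer, "x@y.com", "hi")          # raises PermissionError
```

The design point is that the denial lives in the runtime rather than the model, so a manipulated prompt cannot argue its way past it.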
The governance gap has a security-shaped hole
Maddaisy has covered the emerging agentic AI governance playbook extensively — the frameworks from regulators, the principles converging around least-privilege access and real-time monitoring. But frameworks are policy instruments. Inference security is the engineering layer that makes those policies enforceable.
The numbers illustrate the disconnect. According to the latest governance statistics compiled from major 2025-26 surveys, 75 per cent of organisations report having a dedicated AI governance process — but only 26 per cent have comprehensive AI security policies. Fewer than one in 10 UK enterprises integrate AI risk reviews directly into development pipelines. Governance without security controls is aspiration without implementation.
The financial services sector offers a partial model. Kristin Milchanowski, Chief AI and Data Officer at BMO, described her bank’s approach during the Quantum Insider panel: bringing large language models in-house where possible, ensuring that additional training on proprietary data remains contained, and treating responsible AI as a board-level cultural priority rather than a compliance exercise. But BMO operates under some of the strictest regulatory regimes globally. Most enterprises do not face equivalent pressure — yet.
What practitioners should be doing now
The practical agenda emerging from the panel, the survey data, and the incident reports is specific and actionable:
Audit inference endpoints. Map every production AI system, including shadow deployments. The JetStream model — real-time visibility into which models are running, what data they touch, and who is responsible — is becoming table stakes. (A sketch of a minimal audit record follows this list.)
Apply least-privilege to AI agents. The agentic governance frameworks maddaisy covered last week prescribe this. At the inference layer, it means restricting tool chaining, sandboxing execution environments, and requiring explicit permission gates for cross-system actions; the permission-gate sketch above shows the shape such a control can take.
Build cryptographic agility into procurement. The Quantum Insider panel raised a forward-looking point: “harvest now, decrypt later” attacks — where encrypted inference traffic is collected today for decryption once quantum computing matures — are overtaking model drift as the top digital trust concern among infrastructure leaders. Embedding post-quantum cryptography expectations into vendor contracts now is practical and low-cost.
Treat inference security as infrastructure. Not as a feature, not as an add-on. As the panel concluded: critical infrastructure must be secured before it is tested by failure.
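To ground the first item on that list: the sketch below shows one minimal form an inference audit record might take. JetStream's AI Blueprints product is proprietary, so this is an assumption-laden illustration of the principle (every inference event gets an identity, a model, an endpoint, and a data classification), not a reconstruction of any product. All names are hypothetical.

```python
import json
import time
import uuid
from dataclasses import dataclass, asdict

@dataclass
class InferenceEvent:
    event_id: str
    timestamp: float
    caller: str             # service account or user identity
    model: str              # model name/version actually queried
    endpoint: str           # where the request went, including external SaaS
    prompt_chars: int       # size only; never log raw prompt content
    data_labels: list[str]  # classifications attached to the input

def audit_inference(caller, model, endpoint, prompt, data_labels, log=print):
    """Call at every model call site, before dispatching the request."""
    event = InferenceEvent(
        event_id=str(uuid.uuid4()),
        timestamp=time.time(),
        caller=caller,
        model=model,
        endpoint=endpoint,
        prompt_chars=len(prompt),
        data_labels=list(data_labels),
    )
    log(json.dumps(asdict(event)))  # in production, ship to a SIEM instead
    return event

# Even an unsanctioned call to an external endpoint now leaves a trace:
audit_inference(
    caller="svc-marketing-bot",
    model="gpt-4o",
    endpoint="https://api.openai.com/v1/chat/completions",
    prompt="Summarise our unreleased Q3 roadmap ...",
    data_labels=["internal", "strategy"],
)
```

Logging prompt length rather than prompt content keeps the audit trail itself from becoming a data-leak vector.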
The operational layer matters most
The governance conversation has matured rapidly. Frameworks exist. Principles are converging. Regulation is arriving. But between the policy layer and the production environment sits inference — the operational layer where AI actually works, where data flows through models, where prompts reveal strategy, and where the absence of controls creates the exposure that governance documents are supposed to prevent.
Gartner projects spending on AI governance platforms will reach $492 million this year and surpass $1 billion by 2030. That money will be wasted if it funds policies without the engineering to enforce them. The organisations pulling ahead will be those that treat inference security not as a technical detail for the security team, but as the operational foundation on which their entire AI strategy depends.