Tag: agentic-ai

  • Digital Experiences Now Have Two Audiences. Most Enterprises Are Only Designing for One.

    For as long as digital products have existed, experience design has asked a single question: what does the user want? The user browses, clicks, hesitates, backtracks, and eventually converts — or does not. Every interface decision, from navigation hierarchy to button placement, has been optimised around that human journey.

    In 2026, a second audience has arrived. AI agents now browse websites, interpret content, summarise product pages, compare services, and make purchasing recommendations — often before a human ever sees the interface. Search engines have done this quietly for years. But the new generation of autonomous agents does it actively, making decisions and taking actions on behalf of the people they serve.

    The implication for enterprises is straightforward and largely unaddressed: digital experiences must now be designed for two interpreters simultaneously, and they do not read the same way.

    The dual-interpreter problem

    Humans and machines process digital experiences through fundamentally different lenses. A human visitor might scan a page loosely, drawn by visual hierarchy, tone of voice, and emotional cues. They browse without a fixed goal, explore without urgency, and change their minds mid-session. That inconsistency is not a flaw — it is how people navigate complex decisions.

    Machines, by contrast, prefer structure. They infer meaning from hierarchy, repetition, semantic markup, and patterns. They classify, compress, and summarise. When an AI agent visits a product page, it does not feel reassured by a warm brand photograph. It parses structured data, identifies key claims, and decides — in milliseconds — what that page is about, what matters, and what to report back to the user who sent it.

    As Composite Global noted in a recent analysis, experience design has shifted from being about flow to being about interpretation. The question is no longer just “how will a person navigate this?” but “how will an agent read this — and will it get the right answer?”

    Where the gap shows up

    The consequences of ignoring machine intent are already visible. When AI agents summarise a company’s offerings inaccurately, the problem is rarely that the agent is broken. More often, the page was never designed to be machine-readable in any meaningful way. The content was written for humans — rich in nuance, light on structure — and the agent did its best with what it found.

    Research from TBlocks found that 71 per cent of users now expect digital experiences to adapt to their intent, while 76 per cent notice and feel frustrated when that adaptation fails. Those expectations increasingly extend to agent-mediated experiences. If a user asks an AI assistant to compare three consulting firms’ service offerings, and the agent returns a garbled summary because one firm’s website relies on unstructured prose and JavaScript-rendered content, the brand loses — not the agent.

    The practical failures tend to cluster around a few recurring problems: content hierarchies that make sense visually but not semantically; messaging that requires context an agent cannot infer; calls to action that depend on emotional persuasion rather than clear structure; and pages that load dynamically in ways that agents cannot reliably parse.

    This is not SEO by another name

    It would be tempting to treat this as an extension of search engine optimisation. After all, making content machine-readable has been a concern since the early days of Google. But the agent-readability challenge goes further than search ranking.

    Search engines index pages and rank them. AI agents interpret pages and act on them. An agent does not return a list of blue links — it makes a recommendation, completes a task, or rules out an option entirely. The stakes are different. A page that ranks poorly in search results is still findable. A page that an AI agent misinterprets may never surface at all, or worse, may surface with the wrong message attached.

    This distinction matters for how enterprises invest. SEO focuses on keywords, metadata, and backlinks. Agent-readability requires structured data, semantic clarity, explicit labelling, and content architectures that hold meaning when stripped of their visual presentation. The overlap exists, but the disciplines are not the same.
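
    To make the difference concrete, the sketch below shows one form structured data can take: a schema.org description expressed as plain data and emitted as a JSON-LD block. The vocabulary is real; the company and service are invented for illustration.

    ```python
    import json

    # Illustrative only: a schema.org "Service" record for a hypothetical
    # consulting offering. Embedded in a page as a
    # <script type="application/ld+json"> block, it gives an agent an
    # explicit, parseable statement of what the page offers, independent
    # of the visual design around it.
    service = {
        "@context": "https://schema.org",
        "@type": "Service",
        "name": "Enterprise AI Governance Assessment",
        "provider": {"@type": "Organization", "name": "Example Consulting Ltd"},
        "serviceType": "AI governance consulting",
        "description": "A four-week assessment of agentic AI controls, "
                       "covering decision boundaries, monitoring, and "
                       "regulatory exposure.",
        "areaServed": "GB",
    }

    print(f'<script type="application/ld+json">\n{json.dumps(service, indent=2)}\n</script>')
    ```

    A human never sees this block. An agent parsing the page gets an unambiguous answer to what this is and who provides it, without having to infer either from prose.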

    What maddaisy’s coverage has been pointing toward

    Readers of maddaisy’s recent coverage will recognise the broader pattern here. When this publication examined the governance challenges of AI agents, the focus was on how enterprises monitor and control autonomous systems. When it covered OpenAI’s Frontier Alliance, the story was about agents disrupting enterprise software by sitting above it. And when it explored vibe coding’s enterprise arrival, the thread was about how AI is reshaping how software gets built.

    The digital experience question is downstream of all three. If agents are going to interact with enterprise digital products — browsing service pages, interpreting pricing structures, summarising capabilities for prospective clients — then those products need to be designed with agents in mind. Not instead of humans. Alongside them.

    Designing for clarity across interpreters

    The emerging discipline — sometimes called “dual-intent design” — requires thinking in layers. Composite Global’s framework identifies three dimensions of intent that designers must now map simultaneously: explicit intent (what a user directly communicates), behavioural intent (what systems infer from interaction patterns), and emotional context (the confidence, uncertainty, or curiosity a human brings to the interaction).

    The first two are measurable. The third is where human judgment lives — and where machines consistently fall short. Strong experience design ensures that machine interpretation reinforces human meaning rather than distorting it. In practice, that means clear content hierarchies so agents classify correctly, structured data so machines parse quickly, explicit labelling so summaries remain accurate, and focused messaging so automated recommendations do not flatten a brand’s positioning.

    CoreMedia’s analysis of 2026 customer experience trends puts it bluntly: AI has become “a powerful new intermediary stepping between brand and customer.” The brands that treat that intermediary as an afterthought will find their message distorted in transit.

    The practical question for enterprises

    For most organisations, the immediate question is not whether to redesign everything. It is whether their existing digital properties communicate clearly to both audiences. A simple audit reveals the answer quickly: take a key product or service page, strip away the visual design, and read only the structured content. Does it still make sense? Would an agent, parsing that structure, draw the right conclusions?
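
    For readers who want to run that audit literally, a rough sketch follows. It fetches a page and prints only what a structure-reading machine would see: the heading hierarchy and any JSON-LD blocks. It assumes the third-party packages `requests` and `beautifulsoup4`, and the URL is a placeholder.

    ```python
    import json

    import requests
    from bs4 import BeautifulSoup

    def machine_view(url: str) -> None:
        """Print the page as a structure-reading agent might see it."""
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")

        print("Heading hierarchy:")
        for tag in soup.find_all(["h1", "h2", "h3"]):
            depth = int(tag.name[1])  # h1 -> 1, h2 -> 2, ...
            print("  " * depth + tag.get_text(strip=True))

        print("Structured data blocks:")
        for block in soup.find_all("script", type="application/ld+json"):
            try:
                data = json.loads(block.string or "")
            except json.JSONDecodeError:
                print("  (unparseable JSON-LD: itself a finding)")
                continue
            if isinstance(data, dict):
                print(f"  {data.get('@type', '?')}: {data.get('name', '')}")

    machine_view("https://example.com/services")  # placeholder URL
    ```

    A near-empty printout from a visually rich page usually means the content is rendered client-side, in ways many agents never see.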

    If the answer is no — and for most enterprise websites built in the pre-agent era, it will be — the remediation is less about redesign than about augmentation: adding structured data, clarifying semantic hierarchy, making content modular rather than monolithic, and ensuring that key claims do not depend on visual context for meaning.

    None of this requires abandoning human-centred design. The point is not to optimise for machines at the expense of people. It is to build clarity that holds up under both interpretations — a standard that, arguably, should have been the goal all along.

    The enterprises that get this right will not just rank well or convert well. They will be accurately represented by the AI systems that increasingly mediate how their customers discover, evaluate, and choose them. In a market where agents are becoming the first point of contact, being misunderstood by a machine may prove more costly than being overlooked by a human.

  • The Governance Frameworks for AI Agents Exist. The Hard Part Is Making Them Work.

    The governance playbook for autonomous AI agents is no longer a blank page. Regulatory bodies have published frameworks. Law firms have issued guidance. Industry coalitions have identified priorities. The principles – least privilege, human checkpoints, real-time monitoring, value-chain accountability – are converging across jurisdictions. And yet, Gartner predicts that 40 per cent of agentic AI projects will be cancelled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls.

    The problem is not that enterprises lack governance policies. It is that they lack governance infrastructure – the operational machinery to translate principles into practice across live, autonomous systems operating at scale.

    When maddaisy examined the emerging governance playbook last week, the direction of travel was clear: regulators and advisors were converging on what good governance should look like. The question that follows is more difficult. What does it take to actually run that governance, day after day, across agents that plan, execute, and adapt autonomously?

    The Gap Between Policy and Operations

    The most revealing data point in recent weeks comes not from a governance report but from Logicalis’s 2026 CIO Report. Among 1,000 chief information officers surveyed globally, 89 per cent described their AI governance approach as “learning as we go.” That is not experimentation. That is the absence of operational governance.

    The skills gap compounds the problem. Nearly nine in 10 organisations cite a lack of internal technical capability as their primary constraint on AI deployment. For governance specifically, the deficit is acute. Monitoring agent behaviour in production, auditing multi-step reasoning chains, and interpreting regulatory requirements across jurisdictions all demand expertise that most enterprises have not yet hired for – and in many cases, cannot find.

    A PwC survey found that 79 per cent of companies have adopted agents in some capacity. But when enterprise search firm Lucidworks assessed over 1,100 organisations, only 6 per cent had deployed more than one agentic solution. The implication is significant: most enterprises are governing a single, contained pilot. The governance challenge changes materially when agents multiply, interact, and share data across business functions.

    Regulations Are Arriving – Unevenly

    The regulatory landscape is not waiting for enterprises to catch up. The EU AI Act’s obligations on high-risk and general-purpose AI systems take effect from August 2026, applying globally to any organisation whose systems affect EU residents. In the United States, the picture is more fragmented. President Trump’s December 2025 Executive Order signalled federal intent to consolidate AI oversight, but as legal analysis from Gunderson Dettmer makes clear, it does not preempt existing state laws.

    California, Colorado, and Texas have each enacted comprehensive AI governance statutes with distinct requirements for high-risk systems. New York’s RAISE Act imposes transparency obligations that do not apply elsewhere. For multinational enterprises deploying autonomous agents, the compliance surface is not one framework – it is dozens, with different definitions of high-risk, different disclosure requirements, and different enforcement timelines.

    This is where governance-as-policy meets governance-as-operations. A well-crafted internal policy cannot resolve the question of whether an agent deployed in London, which processes data from a New York customer and executes a transaction through a Singapore-based system, complies with three different regulatory regimes simultaneously. That requires technical infrastructure: jurisdictional routing, dynamic compliance rules, and audit trails that satisfy multiple authorities.
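
    At its simplest, that infrastructure is a rules layer the agent runtime consults before acting. The sketch below is hypothetical, and its rule content is illustrative shorthand rather than legal analysis; the point is that multi-regime compliance becomes a lookup-and-union operation performed in code, not a policy document an agent cannot read.

    ```python
    from dataclasses import dataclass

    @dataclass
    class Obligation:
        regime: str
        human_approval: bool = False      # must a person sign off?
        disclosure: bool = False          # must the user be told an agent acted?
        audit_retention_days: int = 365

    # Illustrative shorthand only; real obligations are defined by counsel
    # and versioned like any other control.
    RULES = {
        "UK": Obligation("UK ICO guidance", disclosure=True),
        "EU": Obligation("EU AI Act", human_approval=True, audit_retention_days=3650),
        "US-NY": Obligation("NY RAISE Act", disclosure=True),
        "SG": Obligation("MAS guidelines"),
    }

    def requirements(touchpoints: list[str]) -> dict:
        """Union of obligations across every jurisdiction an action touches."""
        hits = [RULES[t] for t in touchpoints if t in RULES]
        return {
            "regimes": [o.regime for o in hits],
            "human_approval": any(o.human_approval for o in hits),
            "disclosure": any(o.disclosure for o in hits),
            "retention_days": max((o.audit_retention_days for o in hits), default=0),
        }

    # The London / New York / Singapore transaction described above:
    print(requirements(["UK", "US-NY", "SG"]))
    ```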

    What Operational Governance Actually Requires

    Several CIOs interviewed by CIO.com this month offered a consistent message: governance cannot be separated from workflow design.

    Don Schuerman, CTO at Pega, put it directly: the expectation that thousands of agents can be deployed randomly across a business and left to operate is a myth. Successful deployments anchor agents in well-defined business processes with prescribed steps, high predictability, and clear audit requirements. The governance is not a layer added afterwards – it is embedded in how the agent’s workflow is designed.

    IBM CIO Matt Lyteson echoed the point, stressing that organisations need to understand the outcomes they are targeting, the data agents will require, and the controls needed to manage them before deployment – not after. Salesforce CIO Dan Shmitt added that without high-quality data and a unified governance model, agents produce unreliable results regardless of the policy framework around them.

    The emerging consensus among practitioners, distinct from the framework-level guidance, centres on three operational requirements.

    First, governance must be embedded in agent design, not bolted on. Decision boundaries, escalation rules, and compliance checks need to be part of the agent’s workflow architecture. Retrofitting governance onto an agent already in production is significantly harder and more expensive.

    Second, observability infrastructure is non-negotiable. As maddaisy has previously reported on agentic drift, agents that pass review at launch can behave differently months later. Continuous monitoring of reasoning chains, action sequences, and decision outcomes is the minimum viable governance stack – not periodic audits. A minimal form of that logging is sketched below.

    Third, governance requires dedicated roles, not committees. The Mayer Brown framework identified four governance functions: policy-setters, product teams, cybersecurity integration, and frontline escalation. Most enterprises have distributed these responsibilities informally. As agents scale beyond pilot stage, informal arrangements become liabilities.
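
    What logging the full chain can look like in practice: the sketch below writes one structured record per agent step, capturing the tool invoked and the agent's stated reason, not just inputs and outputs. The field names are a hypothetical schema, not a standard.

    ```python
    import io
    import json
    import time
    import uuid
    from dataclasses import dataclass, asdict

    @dataclass
    class TraceStep:
        run_id: str
        step: int
        tool: str       # which system the agent touched
        reason: str     # the agent's stated justification for the call
        inputs: dict
        outcome: str    # "ok", "escalated", "refused", ...
        ts: float

    def log_step(record: TraceStep, sink) -> None:
        sink.write(json.dumps(asdict(record)) + "\n")   # append-only audit trail

    sink = io.StringIO()  # stand-in for a real append-only audit store
    run = str(uuid.uuid4())
    log_step(TraceStep(run, 1, "crm.lookup", "resolve customer record",
                       {"customer_id": "C-1042"}, "ok", time.time()), sink)
    log_step(TraceStep(run, 2, "payments.refund", "amount exceeds mandate",
                       {"amount": 5200}, "escalated", time.time()), sink)
    print(sink.getvalue())
    ```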

    The Trajectory Ahead

    The governance conversation has moved faster than most observers expected. Twelve months ago, agentic AI governance was a theoretical concern. Today, it has dedicated regulatory guidance, published legal frameworks, and named positions on practitioners’ organisational charts. That is genuine progress.

    But the distance between knowing what governance should look like and operating it reliably is where the next phase of difficulty lies. The 40 per cent cancellation rate Gartner projects is not primarily a technology failure – it is a governance and operational maturity failure. The organisations that succeed with autonomous agents will not be those with the most sophisticated AI models. They will be the ones that built the operational infrastructure to govern them before they scaled.

    For consultants advising enterprise clients on agentic AI, the message has shifted. The question is no longer whether governance frameworks exist. It is whether the organisation has the skills, tooling, and organisational design to make those frameworks operational. That is a harder conversation, but it is now the one that matters.

  • OpenAI’s Frontier Alliance Is Not Just About Consulting. It Is a Bet Against the Enterprise Software Stack.

    When maddaisy examined OpenAI’s Frontier Alliance in February, the focus was on what it meant for the consulting firms — McKinsey, BCG, Accenture, and Capgemini — and the admission that AI vendors cannot scale enterprise deployments alone. That story was about the consulting industry. This one is about the companies the alliance is quietly aimed at: the enterprise software vendors that have built trillion-dollar businesses on per-seat licensing.

    The per-seat model under pressure

    OpenAI’s Frontier platform, launched in early February, is designed as an enterprise operating layer — a unified system where AI agents can log into applications, execute workflows, and make decisions across an organisation’s entire technology stack. CRM systems, HR platforms, ticketing tools, internal databases. The ambition is not to replace any single application but to sit above all of them.

    The threat to SaaS vendors is structural, not incremental. If AI agents execute the tasks that human employees currently perform inside Salesforce, ServiceNow, or Workday, the justification for per-seat licensing weakens. Fewer human users logging in means fewer seats to sell. And if agents can orchestrate workflows across multiple systems from a single platform, the case for buying specialised point solutions — each with its own subscription — becomes harder to make.

    The market has not waited for proof. Investors wiped roughly $2 trillion in market value from technology stocks in a single week over AI displacement concerns. ServiceNow shares fell more than 20% year-to-date by mid-February. IBM suffered its largest single-day decline in 25 years after Anthropic’s Claude demonstrated competency with legacy COBOL systems — the very maintenance work that underpins a significant portion of IBM’s consulting revenue.

    The consulting conduit

    What makes the Frontier Alliance specifically dangerous for SaaS incumbents is not the technology. It is the distribution channel.

    McKinsey, BCG, Accenture, and Capgemini are not just consulting firms. They are the primary implementation partners for the very software companies that Frontier could displace. When a Fortune 500 company deploys Salesforce, it typically hires one of these firms to manage the rollout. When it migrates to ServiceNow’s IT service management platform, the same consulting firms handle the integration. The relationships are deep, multi-year, and built on trust.

    OpenAI has effectively enlisted those relationships as a distribution network. Each of the four firms has established dedicated OpenAI practice groups, certified their teams on Frontier, and committed to multi-year alliances. OpenAI’s own forward-deployed engineers will sit alongside consulting teams in client engagements — a model borrowed from Palantir’s playbook for embedding in enterprise accounts.

    The result is a direct-to-enterprise pipeline that does not need SaaS vendors as intermediaries. A consulting firm advising a client on AI strategy can now recommend Frontier agents that orchestrate existing systems, rather than recommending new SaaS products that require their own implementation projects. The consulting firm earns either way. The SaaS vendor may not.

    SaaS is not dying. But the economics are shifting.

    The counterarguments deserve a hearing. Fortune 500 companies will not abandon decades of enterprise software investment overnight. Compliance requirements, audit trails, data sovereignty obligations, and the sheer operational complexity of large organisations create friction that no AI platform can simply wave away. As one analyst put it, “we are simply not going to see a complete unwinding of the past 50 years of enterprise software development.”

    The incumbents are also adapting. Salesforce has pioneered what it calls the “Agentic Enterprise Licence Agreement” — a fixed-price, consumption-based model designed to decouple revenue from headcount. ServiceNow and Microsoft are shifting toward outcome-based pricing. These moves acknowledge the threat and attempt to neutralise it by changing the unit of value from the human user to the business outcome.

    But adaptation comes at a cost. Per-seat licensing has been the engine of SaaS margins for two decades. Moving to consumption or outcome-based models compresses revenue predictability and margins in the short term, even if it preserves relevance in the long term. The transition is not painless, and investors know it.

    Where this connects

    This development sits at the intersection of several threads maddaisy has been tracking. Capgemini’s CEO, Aiman Ezzat, argued last week that organisations are deploying AI capabilities ahead of their ability to absorb them. He is right — but that does not mean the structural pressure on SaaS pricing will wait for organisations to catch up. The market reprices on expectations, not on deployment maturity.

    And the consulting pyramid piece from yesterday noted that the industry’s base is being reshaped as AI compresses the need for junior analytical work. A similar compression is now visible in enterprise software: the middle layer of the stack — the specialised tools that automate individual workflows — faces pressure from platforms that automate across workflows.

    The question for enterprise software vendors is not whether AI agents will change their businesses. It is whether they can shift their pricing, their value propositions, and their competitive moats fast enough to remain the platform of choice — rather than becoming the legacy infrastructure that sits beneath someone else’s agent layer.

    For practitioners evaluating their own technology stacks, the practical implication is this: the next software audit should not just ask what each tool costs per seat. It should ask what happens to that cost when half the seats belong to agents that do not need a licence.
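
    The arithmetic behind that question is worth making explicit. Every figure in the sketch below is invented; what matters is the structural shift from a headcount-driven cost to a metered one.

    ```python
    # Back-of-envelope illustration with invented numbers. If agents absorb
    # the work of half the licensed human seats, the comparison is no longer
    # seat price vs seat price but seat price vs the metered cost of agent
    # executions.
    human_seats = 1_000
    seat_cost = 1_800            # per seat, per year (hypothetical)
    status_quo = human_seats * seat_cost

    remaining_seats = 500
    agent_runs_per_year = 2_000_000
    cost_per_run = 0.15          # hypothetical consumption price
    agent_scenario = remaining_seats * seat_cost + agent_runs_per_year * cost_per_run

    print(f"per-seat today:  ${status_quo:>10,.0f}")
    print(f"hybrid scenario: ${agent_scenario:>10,.0f}")
    # The point is not which number is smaller (the inputs are invented)
    # but that vendor revenue now depends on a metered quantity the customer
    # controls, not a headcount that only grows.
    ```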

  • The Agentic AI Governance Playbook Is Taking Shape. Most Enterprises Are Not Ready for It.

    The governance playbook for agentic AI is starting to take shape — and it looks nothing like the frameworks most enterprises currently rely on.

    Over the past month, a cluster of regulatory bodies, law firms, and industry coalitions have published guidance specifically addressing agentic AI systems. The UK Information Commissioner’s Office released its first Tech Futures report on agentic AI in January. Mayer Brown published a comprehensive governance framework in February. The Partnership on AI identified agentic governance as the top priority among six for 2026. And FINRA’s 2026 oversight report now includes a dedicated section on AI agents as an emerging threat to financial markets.

    Taken individually, none of these publications is remarkable. Taken together, they represent something maddaisy has been tracking since mid-February: the governance conversation is finally catching up to the deployment reality.

    The problem these frameworks are trying to solve

    When maddaisy examined the agentic AI governance gap in February, the numbers were stark: three-quarters of enterprises planning to deploy agentic AI, but only one in five with a mature governance model. Subsequent analysis of agentic drift — where systems degrade gradually without triggering alarms — and shadow AI adoption revealed that the risks are not hypothetical. They are already materialising in production environments.

    The core issue, as these new frameworks make explicit, is that agentic AI does not fit within existing oversight models. Traditional AI governance assumes a human reviews the output before acting on it. Agentic systems are the actor. They plan, execute, and adapt autonomously — booking appointments, approving procurement, triaging complaints, managing collections. Governance cannot happen after the fact. It must be embedded in real time.

    What the emerging frameworks have in common

    Despite coming from different jurisdictions and institutions, the recent guidance converges on several principles that practitioners should note.

    Least privilege by default. Mayer Brown’s framework emphasises restricting what an agent can access — not just what it can do. Agents should not have standing access to sensitive databases, trade secrets, or systems beyond their immediate task scope. This mirrors the zero-trust approach that cybersecurity teams have adopted over the past decade, now applied to autonomous software.

    Human checkpoints at decision boundaries, not everywhere. The emerging consensus rejects both extremes: fully autonomous operation with no oversight, and human approval for every action (which would negate the point of agentic AI). Instead, the frameworks advocate for defined boundaries — moments where human approval is required before the agent proceeds. These include irreversible actions, decisions in regulated domains such as healthcare or financial services, and any step that falls outside the agent’s defined scope. A minimal version of such a boundary gate appears in the sketch after these principles.

    Real-time monitoring, not periodic audits. The ICO report and Mayer Brown both stress continuous behavioural monitoring after deployment. This addresses the drift problem directly: an agent that passed all review gates at launch may behave differently three months later after prompt adjustments, model updates, and tool changes. Logging the full chain of reasoning and actions — not just inputs and outputs — is becoming a baseline expectation.

    Transparency to users. Multiple frameworks now explicitly require that organisations disclose when a customer is interacting with an AI agent rather than a human. The ICO notes that agentic AI “magnifies existing risks from generative AI” because these systems can rapidly automate complex tasks and generate new personal information at scale. Users need to know what they are dealing with — and they need a route to a human at any point.

    Value-chain accountability. The Partnership on AI flags a gap that most enterprise governance programmes have not addressed: who is responsible when something goes wrong in a multi-agent system? If an agent calls another agent, which calls a third-party tool, which accesses a database — and the outcome is harmful — the liability chain is unclear. Their recommendation: establish an agreed taxonomy for the AI value chain before deployment, not after an incident.
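
    Two of these principles, least privilege and checkpoints at decision boundaries, translate directly into code. The sketch below is a minimal, hypothetical pre-execution gate; the tool names, scopes, and threshold are invented for illustration.

    ```python
    ALLOWED_TOOLS = {"kb.search", "crm.read", "payments.refund"}  # task-scoped, no standing access
    IRREVERSIBLE = {"payments.refund"}
    AUTO_APPROVE_LIMIT = 250.0  # small refunds proceed; larger ones cross a boundary

    def gate(tool: str, args: dict) -> str:
        if tool not in ALLOWED_TOOLS:
            return "refuse"                 # least privilege: outside task scope
        if tool in IRREVERSIBLE and args.get("amount", 0) > AUTO_APPROVE_LIMIT:
            return "escalate_to_human"      # decision boundary, not blanket approval
        return "execute"

    print(gate("crm.read", {}))                       # execute
    print(gate("hr.update_salary", {}))               # refuse
    print(gate("payments.refund", {"amount": 900}))   # escalate_to_human
    ```

    Note that neither extreme appears: the agent does not approve everything itself, and it does not ask a human for every read.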

    Where the frameworks fall short

    For all the convergence, there are notable gaps. None of the published guidance adequately addresses the measurement problem. Adobe’s 2026 AI and Digital Trends Report found that only 31% of organisations have implemented a measurement framework for agentic AI. Without clear metrics for what “good governance” looks like in practice, the frameworks risk becoming compliance theatre — policies that exist on paper but do not change how agents actually operate.

    There is also limited guidance on cross-border deployment. The Partnership on AI calls for international coordination, but the practical reality — as maddaisy’s analysis of America’s state-by-state regulatory fragmentation highlighted — is that even within a single country, compliance requirements vary dramatically. An agent deployed in New York now faces transparency requirements under the RAISE Act that do not apply in Texas, where the Responsible AI Governance Act takes a different approach. For multinational enterprises, the compliance surface is formidable.

    What this means for practitioners

    The practical takeaway is straightforward, if demanding. Organisations deploying or planning to deploy agentic AI should be doing four things now.

    First, audit existing governance frameworks against the agentic-specific requirements these publications outline. Most enterprises have AI policies designed for advisory systems. Those policies almost certainly do not cover autonomous execution, multi-agent coordination, or real-time behavioural monitoring.

    Second, define decision boundaries before deployment. Which actions require human approval? What constitutes an irreversible decision in your context? Where are the regulatory tripwires? These questions are easier to answer before an agent is in production than after it has been running for six months.

    Third, invest in observability infrastructure. As maddaisy noted in the agentic drift analysis, the systems that fail most dangerously are the ones that appear to be working. Full execution logging, behavioural baselines, and anomaly detection are not optional extras — they are the minimum viable governance stack for agentic systems.

    Fourth, assign clear ownership. Mayer Brown’s framework identifies four distinct governance roles: decision-makers who set policy, product teams who implement it, cybersecurity teams who integrate agents into security procedures, and frontline employees who can identify and escalate issues. Most organisations have not mapped these responsibilities for their agentic deployments.

    The governance race is just starting

    The gap between agentic AI deployment and agentic AI governance remains wide. But the direction of travel is now clear: regulators, industry bodies, and legal advisors are converging on a set of principles that will become the baseline expectation. Organisations that build these capabilities now — real-time monitoring, defined decision boundaries, value-chain accountability, and clear ownership — will be better positioned than those scrambling to retrofit governance after their first incident.

    The frameworks are not perfect. They will evolve. But the era of governing agentic AI with policies designed for chatbots is ending. For consultants and technology leaders advising enterprises on AI deployment, that shift should be shaping every engagement.

  • OpenAI’s Frontier Alliance Confirms What Consultants Already Knew: AI Vendors Cannot Scale Alone

    OpenAI announced on 23 February that it has formed multi-year “Frontier Alliances” with McKinsey, Boston Consulting Group, Accenture, and Capgemini. The four firms will help sell, implement, and scale OpenAI’s Frontier platform — an enterprise system for building, deploying, and governing AI agents across an organisation’s technology stack.

    For readers who have been following maddaisy’s coverage of the consulting industry’s AI pivot, this is not a surprise. It is the logical next step in a pattern that has been building for months — and it tells us more about the limits of AI vendors than about the ambitions of consulting firms.

    The vendor cannot scale alone

    The most revealing line in the announcement came from Capgemini’s chief strategy officer, Fernando Alvarez: “If it was a walk in the park, OpenAI would have done it by themselves, so it’s recognition that it takes a village.”

    That candour is worth pausing on. OpenAI’s enterprise business accounts for roughly 40% of revenue, with expectations of reaching 50% by the end of the year. The company has already signed enterprise deals with Snowflake and ServiceNow this year and appointed Barret Zoph to lead enterprise sales. Yet it still needs consulting firms — with their existing client relationships, implementation expertise, and organisational change capabilities — to get its technology into production at scale.

    This is not a story about OpenAI’s generosity in sharing the enterprise market. It is an admission that the gap between a capable AI platform and a working enterprise deployment remains stubbornly wide. As maddaisy reported last week, PwC’s 2026 CEO Survey found that 56% of chief executives still cannot point to measurable revenue gains from their AI investments. The technology is not the bottleneck. Integration, governance, and organisational readiness are.

    A clear division of labour

    The alliance structure reveals how OpenAI sees the enterprise AI value chain. McKinsey and BCG are positioned as strategy and operating model partners — helping leadership teams determine where agents should be deployed and how workflows need to be redesigned. BCG CEO Christoph Schweizer noted that AI must be “linked to strategy, built into redesigned processes, and adopted at scale with aligned incentives.”

    Accenture and Capgemini take the systems integration role: data architecture, cloud infrastructure, security, and the unglamorous work of connecting Frontier to the CRM platforms, HR systems, and internal tools that enterprises actually run on. Each firm is building dedicated practice groups and certifying teams on OpenAI technology. OpenAI’s own forward-deployed engineers will sit alongside them in client engagements.

    This two-tier model — strategy at the top, integration at the bottom — maps neatly onto the consulting industry’s existing hierarchy. It also creates a clear dependency: OpenAI provides the platform, the consultancies provide the last mile.

    The maddaisy continuity thread

    This announcement intersects with several stories maddaisy has been tracking. When we examined McKinsey’s 25,000 AI agent deployment, the question was whether the firm’s aggressive internal build-out was a first-mover advantage or an expensive experiment. The Frontier Alliance suggests McKinsey is now positioning that internal capability as a credential — evidence that it can deploy agentic AI at scale, which it can now offer to clients through the OpenAI partnership.

    Similarly, when maddaisy covered the shift from billable hours to outcome-based consulting, the question was how firms would make the economics work. Vendor alliances like this provide part of the answer: the consulting firm brings the implementation expertise, the AI vendor provides the platform, and the client pays for outcomes rather than hours. The risk is shared across the chain.

    And Capgemini’s dual bet — adding 82,300 offshore workers while simultaneously investing in AI — now makes more strategic sense. The offshore delivery capacity is precisely what is needed to operationalise Frontier at enterprise scale. The bodies and the bots are not competing; they are complementary.

    The SaaS vendors should be nervous

    As Fortune noted, the Frontier Alliance creates a specific tension for established software-as-a-service vendors. Salesforce, Microsoft, Workday, and ServiceNow all depend on these same consulting firms to market and deploy their products. Now those consultants will also be actively promoting an alternative platform — one that positions itself as a “semantic layer” sitting above the traditional SaaS stack.

    The consulting firms are not choosing sides. They are hedging. Accenture, for instance, signed a multi-year partnership with Anthropic in December 2025 and is now a Frontier Alliance member. The firms will sell whichever platform best fits a given client’s needs, which gives them leverage over the AI vendors rather than the other way around.

    For the SaaS incumbents, however, having McKinsey and BCG actively evangelise an AI-native alternative to C-suite buyers is a development they will not welcome. Investor anxiety in this space is already elevated — shares of several enterprise software companies have been punished over concerns that customers will choose AI-native platforms over traditional offerings.

    What to watch

    The Frontier Alliance is a partnership announcement, not a set of outcomes. The real test is whether this model — AI vendor plus consulting firm — can close the deployment gap that has kept enterprise AI adoption stubbornly below expectations.

    Three things matter from here. First, whether the certified practice groups produce measurably better outcomes than the piecemeal implementations enterprises have been attempting on their own. Second, whether Frontier’s “semantic layer” architecture genuinely simplifies agent deployment or simply adds another platform layer to an already complex stack. And third, whether the consulting firms’ simultaneous alliances with competing AI vendors — OpenAI, Anthropic, Google — create genuine client value or just a more complicated sales cycle.

    For practitioners, the immediate signal is clear: the enterprise AI market is consolidating around a vendor-plus-integrator model. If your organisation is planning an agentic AI deployment, the question is no longer which model to use. It is which combination of platform, integrator, and operating model redesign will actually get agents into production — and keep them there.

  • Agentic AI Drift: The Silent Production Risk No One Is Measuring

    When maddaisy examined the agentic AI governance gap last week, the focus was on a structural mismatch: three-quarters of enterprises planning to deploy agentic AI, but only one in five with a mature governance model. That gap remains wide. But a more specific — and arguably more dangerous — operational risk is now coming into focus: agentic AI systems do not fail suddenly. They drift.

    A recent analysis published by CIO makes the case plainly. Unlike earlier generations of AI, which tend to produce identifiable errors — a wrong classification, a hallucinated fact — agentic systems degrade gradually. Their behaviour evolves incrementally as models are updated, prompts are refined, tools are added, and execution paths adapt to real-world conditions. For long stretches, everything appears fine. KPIs hold. No alarms fire. But underneath, the system’s risk posture has already shifted.

    The Problem with Demo-Driven Confidence

    Most organisations still evaluate agentic AI the way they evaluate any software feature: through demonstrations, curated test scenarios, and human judgment of output quality. In controlled settings, this looks adequate. Prompts are fresh, tools are stable, edge cases are avoided, and execution paths are short and predictable.

    Production is different. Prompts evolve. Dependencies fail intermittently. Execution depth varies. New behaviours emerge over time. Research from Stanford and Harvard has examined why many agentic systems perform convincingly in demonstrations but struggle under sustained real-world use — a gap that grows wider the longer a system runs.

    The result is a pattern that will be familiar to anyone who has managed complex software in production: a system passes all its review gates, earns early trust, and then becomes brittle or inconsistent months later, without any single change that clearly broke it. The difference with agentic AI is that the degradation is harder to detect, because the system’s outputs can still look reasonable even as the reasoning behind them has shifted.

    What Drift Actually Looks Like

    The CIO analysis includes a telling case study from a credit adjudication pilot. An agent designed to support high-risk lending decisions initially ran an income verification step consistently before producing recommendations. Over time, a series of small, individually reasonable changes — prompt adjustments for efficiency, a new tool for an edge case, a model upgrade, tweaked retry logic — caused the verification step to be skipped in 20 to 30 per cent of cases.

    No single run produced an obviously wrong result. Reviewers often agreed with the recommendations. But the way the agent arrived at those recommendations had fundamentally changed. In a credit context, that difference carries real financial and regulatory consequences.

    This is the nature of agentic drift: it is not a bug. It is the predictable outcome of complex, adaptive systems operating in changing environments. Two executions of the same agent with the same inputs can legitimately differ — that stochasticity is inherent to how modern agentic systems work. But it also means that point-in-time evaluation, one-off tests, and spot checks are structurally insufficient for production risk management.

    From Policy to Diagnostics

    When maddaisy covered the shadow AI governance challenge earlier this month, one theme was clear: governance frameworks are necessary but not sufficient. They define ownership, policies, escalation paths, and controls. What they often lack is an operational mechanism to answer a deceptively simple question: has the agent’s behaviour actually changed?

    Without that evidence, governance operates in the dark. Policy defines what should happen. Diagnostics establish what is actually happening. When measurement is absent, controls develop blind spots in precisely the live systems where agentic risk tends to accumulate.

    The Cloud Security Alliance has begun framing this as “cognitive degradation” — a systemic risk that emerges gradually rather than through sudden failure. Carnegie Mellon’s Software Engineering Institute has similarly emphasised the need for continuous testing and evaluation discipline in complex AI-enabled systems, drawing parallels to how other high-risk software domains manage operational risk.

    What Practitioners Should Watch For

    The emerging consensus points toward several operational principles for managing agentic drift:

    Behavioural baselines over output checks. No single execution is representative. What matters is how behaviour shows up across repeated runs under similar conditions. Organisations need to establish baselines — not for what an agent should do in the abstract, but for how it has actually behaved under known conditions — and then monitor for sustained deviations. A minimal version of that monitoring appears in the sketch after these principles.

    Separate configuration changes from behavioural evidence. Prompt updates, tool additions, and model upgrades are important signals, but they are not evidence of drift on their own. What matters is persistence: transient deviations are often noise in stochastic systems, while sustained behavioural shifts across time and conditions are where risk begins to emerge.

    Treat agent behaviour as an operational signal. Internal audit teams are asking new questions about control and traceability. Regulators are paying closer attention to AI system behaviour. Platform teams are under growing pressure to demonstrate stability in live environments. “It looked fine in testing” is no longer a defensible operational posture, particularly in sectors — financial services, healthcare, compliance — where subtle behavioural changes carry real consequences.
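
    Returning to the credit adjudication case, the baseline idea can be made concrete. The sketch below tracks how often an income-verification step actually appears in execution traces and flags a sustained shift from the launch-period baseline rather than reacting to any single run; the step name, window size, and tolerance are all illustrative.

    ```python
    from collections import deque

    class StepRateMonitor:
        """Flag sustained deviation from a behavioural baseline, not single runs."""

        def __init__(self, baseline: float, window: int = 200, tolerance: float = 0.05):
            self.baseline = baseline            # e.g. 0.99, measured at launch
            self.recent = deque(maxlen=window)  # rolling record of recent runs
            self.tolerance = tolerance

        def observe(self, steps: list[str]) -> str | None:
            self.recent.append("verify_income" in steps)
            if len(self.recent) < self.recent.maxlen:
                return None                     # not enough evidence yet
            rate = sum(self.recent) / len(self.recent)
            if rate < self.baseline - self.tolerance:
                return (f"drift: verification ran in {rate:.0%} of recent "
                        f"cases (baseline {self.baseline:.0%})")
            return None

    monitor = StepRateMonitor(baseline=0.99)
    runs = [["fetch_application", "verify_income", "recommend"]] * 150
    runs += [["fetch_application", "recommend"]] * 50  # verification quietly skipped
    alert = None
    for trace in runs:
        alert = monitor.observe(trace) or alert
    if alert:
        print(alert)
    ```

    Transient dips within the tolerance stay quiet, which matches the persistence point above: in stochastic systems, single-run deviation is noise; sustained deviation is signal.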

    The Observability Gap

    This is, ultimately, the next chapter in the governance story maddaisy has been tracking. The first chapter — covered in the enforcement era analysis — was about moving from principles to rules. The second, examined through Deloitte’s enterprise data, was the gap between strategic confidence and operational readiness. This third chapter is more specific and more technical: the gap between having governance frameworks and having the observability infrastructure to make them work.

    The goal is not to eliminate drift. Drift is inevitable in adaptive systems. The goal is to detect it early — while it is still measurable, explainable, and correctable — rather than discovering it through incidents, audits, or post-mortems. Organisations that build this capability will be better positioned to deploy agentic AI at scale with confidence. Those that do not will continue to be surprised by systems that appeared stable, until they were not.

    For consultants advising on enterprise AI deployments, the implication is practical: governance reviews that stop at policy documentation are incomplete. The question to ask is not just whether a client has an AI governance framework, but whether they can tell you how their agents are behaving today compared to three months ago. If the answer is silence, that is where the work begins.

  • From Billable Hours to Shared Risk: Consulting’s AI-Driven Business Model Shift

    McKinsey’s 25,000 AI agents grabbed the headlines, but the more consequential number is the roughly one-third of the firm’s revenue now tied to outcome-based engagements. Across the industry, AI is not just changing how consultants work – it is rewriting how they get paid.

    When maddaisy examined McKinsey’s 25,000 AI agent deployment last week, the focus was on scale: was deploying one digital agent for every 1.6 human employees a first-mover advantage or an expensive experiment? The answer may depend less on the agent count and more on a quieter transformation happening in parallel – the shift from selling time to selling outcomes.

    The Model That Built an Industry Is Under Pressure

    For decades, consulting has run on a straightforward exchange: expertise for time, billed by the hour or the project. It is a model that has produced extraordinary margins, but it carries an inherent misalignment. Consultants profit from the complexity of a problem, not necessarily from solving it quickly.

    AI agents threaten that dynamic directly. When an algorithm can synthesise research in minutes that previously took analysts weeks, the hours-based model starts to look exposed. McKinsey’s own data – 1.5 million hours saved on search and synthesis work – quantifies exactly how much billable time AI has already removed from the equation.

    Rather than watching margins erode, McKinsey is pivoting. Speaking on the HBR IdeaCast in February, CEO Bob Sternfels confirmed that outcome-based engagements – where McKinsey co-invests alongside clients and ties fees to measurable business results – now account for roughly a third of the firm’s revenue. Two years ago, that figure was negligible.

    The Economics Only Work with AI

    This is where the 25,000 agents become strategically coherent. Outcome-based consulting is inherently riskier than fee-for-service; the firm only earns if the client succeeds. To make the economics work, you need two things: lower delivery costs and higher confidence in results.

    AI agents address both. QuantumBlack, McKinsey’s 1,700-person AI division, now drives 40% of the firm’s total work. Non-client-facing headcount has fallen 25%, while output from those teams has risen 10% – the “25 squared” model. The savings create the margin headroom needed to absorb the risk of outcome-based pricing.

    It is not just McKinsey making this calculation. BCG has deployed “forward-deployed consultants” who build AI tools directly within client organisations, effectively embedding methodology as software rather than slides. Capgemini has trained 310,000 employees on generative AI, though its agentic AI bookings reached only 10% of quarterly pipeline by Q4 2025. Accenture has stopped reporting AI bookings separately because, as the firm noted in its Q1 fiscal 2026 results, AI is now embedded in virtually every engagement.

    The Client Side of the Equation

    The timing is not coincidental. As maddaisy reported last week, PwC’s 2026 CEO Survey found that 56% of chief executives still cannot demonstrate revenue gains from AI. When the client cannot prove value, the consultancy offering to underwrite outcomes holds a powerful negotiating position – essentially saying, “we believe in this enough to stake our fees on it.”

    The irony is that consultancies are asking clients to trust their AI capabilities while the industry’s own track record on AI delivery remains uneven. Deloitte’s own 2026 State of AI report found that 42% of organisations consider their AI strategy “highly prepared” but feel markedly less ready on infrastructure, data governance, and talent.

    McKinsey faces this credibility gap from a different direction. The firm’s State of AI research found that only 5% of companies globally see AI hitting their bottom line. Positioning itself as the firm that can deliver measurable outcomes means McKinsey is, in effect, claiming to solve a problem that its own research says almost no one has solved.

    What Changes for Practitioners

    For consultants and the organisations that hire them, three practical implications stand out.

    Procurement shifts. If outcome-based pricing becomes the norm, procurement teams will need to evaluate consulting engagements more like joint ventures than service contracts. That means assessing the firm’s AI capabilities, data infrastructure, and delivery methodology – not just the partner’s credentials and the day rate.

    The talent model is splitting. Sternfels has been explicit that McKinsey wants “great consultants and/or great technologists, groomed to be both.” The traditional path – from analyst to associate to engagement manager – now runs alongside a technical track where consultants build and deploy AI systems. BCG’s vibe-coding consultants are an early version of this hybrid role.

    Governance becomes shared. When a consultancy co-invests in outcomes and deploys AI agents within a client’s operations, the governance question becomes bilateral. As maddaisy has covered extensively, the gap between AI deployment speed and governance maturity is already the defining risk of 2026. Outcome-based models widen this gap further, because neither party has clear precedent for who owns the risk when an AI agent produces flawed analysis that drives a business decision.

    The Bigger Picture

    The consulting industry has weathered previous disruptions – offshoring, automation, the rise of in-house strategy teams – by evolving its value proposition. This time, the change is structural. AI agents do not just reduce the cost of delivering advice; they make it possible to charge for results instead.

    McKinsey’s 25,000 agents are best understood not as a technology deployment but as a financial instrument – the infrastructure that underwrites a new revenue model. Whether the model works depends on something no agent can automate: whether clients actually achieve the outcomes both parties are betting on.

  • Lloyds Banking Group’s £100 Million AI Bet: What the UK’s First Agentic Financial Assistant Means for Enterprise AI

    Lloyds Banking Group expects its artificial intelligence programme to deliver more than £100 million in value this year — double the £50 million it attributed to generative AI in 2025. The figures, disclosed alongside the group’s annual results in January 2026, represent one of the more concrete attempts by a major financial institution to attach a number to what AI is actually worth.

    That specificity matters. As maddaisy examined last week, PwC’s 2026 Global CEO Survey found that 56% of chief executives still cannot point to measurable revenue gains from their AI investments. Lloyds is not claiming to have solved the ROI puzzle entirely, but it is doing something most enterprises have not: publishing the numbers and tying them to specific operational improvements rather than vague promises of transformation.

    The financial assistant: what it actually does

    The headline initiative is a customer-facing AI financial assistant, which Lloyds describes as the first agentic AI tool of its kind offered by a UK bank. Announced in November 2025 and scheduled for public rollout in early 2026, the assistant sits within the Lloyds mobile app and is designed to help customers manage spending, savings, and investments through natural conversation.

    The system uses a combination of generative AI for its conversational interface and agentic AI to process requests and execute actions. In practical terms, a customer can query a payment, ask for a spending breakdown, or request guidance on savings options — and the assistant will interpret the request, plan the necessary steps, and carry them out. Where it reaches the limits of what automated support can handle, it refers users to human specialists.
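
    In control-flow terms, the pattern described reads something like the sketch below. To be clear, this is not Lloyds’ implementation, which is not public; the intents and handlers are invented solely to show the interpret-execute-refer structure, in which anything outside the supported set defaults to a human.

    ```python
    # Invented intents and handlers; the referral default is the
    # governance-relevant detail.
    SUPPORTED = {
        "spending_breakdown": lambda req: f"Spending summary for {req['period']}",
        "query_payment": lambda req: f"Status of payment {req['payment_id']}",
    }

    def handle(request: dict) -> str:
        handler = SUPPORTED.get(request.get("intent"))
        if handler is None:
            return "Referred to a human specialist."  # beyond automated support
        return handler(request)

    print(handle({"intent": "spending_breakdown", "period": "January"}))
    print(handle({"intent": "mortgage_advice"}))  # regulated advice -> human
    ```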

    The scope is intended to expand. Lloyds has said the assistant will eventually cover its full product suite, from mortgages to car finance to protection products, serving its 28 million customer accounts across the Lloyds, Halifax, Bank of Scotland, and Scottish Widows brands.

    Testing at scale, not in a lab

    Before public launch, Lloyds tested the assistant with approximately 7,000 employees, who collectively completed around 12,000 trials. That is a meaningful pilot — large enough to surface edge cases and failure modes that a controlled lab environment would miss, and conducted with users who understand the bank’s products well enough to stress-test the system’s accuracy.

    The employee testing sits alongside a broader internal AI deployment that has already delivered measurable results. Athena, the group’s AI-powered internal search assistant, is used by 20,000 colleagues and has reduced information search times by 66%. GitHub Copilot, deployed to 5,000 engineers, has driven a 50% improvement in code conversion for legacy systems. An AI-powered HR assistant resolves 90% of queries correctly on the first attempt.

    These are not experimental pilots. They are production tools used at scale, and the fact that Lloyds is willing to attach specific performance metrics to each one distinguishes its approach from the many enterprises that describe AI impact in qualitative terms only.

    The ROI question: credible or convenient?

    The £100 million figure invites scrutiny, and it should. “Value” in corporate AI disclosures is notoriously slippery — it can mean cost savings, time savings converted to a monetary equivalent, revenue uplift, or some combination of all three. Lloyds has not published a detailed methodology for how it arrived at the £50 million figure for 2025 or how it projects the 2026 target.

    That said, the bank’s approach has features that lend it more credibility than many comparable claims. The internal tools have named user populations and specific performance benchmarks. The customer-facing assistant was tested with thousands of employees before launch, not unveiled as a concept. And the 2025 figure is presented as a delivered outcome, not a forecast — a distinction that matters when most enterprises are still struggling to prove any return at all.

    Lloyds also rose 12 places in the Evident AI Global Index last year — the strongest improvement of any UK bank — suggesting that external assessors see substance behind the claims.

    Agentic AI in financial services: the governance dimension

    The move to customer-facing agentic AI in banking raises governance questions that go beyond what internal productivity tools require. As maddaisy explored earlier this week, Deloitte’s 2026 AI report found that only one in five enterprises has a mature governance model for agentic systems. When those systems move from internal search assistants to customer-facing financial advice, the stakes escalate considerably.

    A banking AI that can execute transactions, provide savings guidance, and eventually handle mortgage queries operates in regulated territory. The Financial Conduct Authority’s expectations around suitability, fair treatment, and clear communication apply regardless of whether the advice comes from a human or an algorithm. Lloyds has acknowledged this by building in human referral pathways, but the real test will come at scale — when millions of customers interact with the system simultaneously, and edge cases multiply.

    Ron van Kemenade, the group’s chief operating officer, has framed the launch as “a pivotal step in our strategy as we continue to reimagine the Group for our customers and colleagues.” Ranil Boteju, chief data and analytics officer, has positioned it as a demonstration of responsible deployment, noting that the assistant “can understand and respond to specific, hyper-personalised customer requests and retains memory to offer a more holistic experience, ensuring the generated answer is safe to present to customers.”

    What this signals for the sector

    Lloyds is not the first bank to deploy AI, nor the first to make bold claims about its value. What distinguishes this move is the combination of a concrete financial baseline (£50 million delivered), a named and tested product (the financial assistant), a clear expansion roadmap (full product suite), and an institutional commitment to upskilling (a new AI Academy for its 67,000 employees).

    For practitioners watching the enterprise AI landscape, the Lloyds case offers a useful reference point against the prevailing narrative of deployment fatigue and unproven returns. It does not resolve the broader ROI question — one bank’s results do not establish an industry pattern — but it does suggest that organisations which invest in specific, measurable use cases and test rigorously before launch can move beyond the proof-of-concept purgatory that still traps most enterprises.

    The harder question is what happens next. An AI assistant that helps customers check spending patterns is useful. One that advises on mortgages and investment products enters a different category of risk and regulatory complexity. How Lloyds navigates that expansion — and whether the £100 million value target holds up under the scrutiny of real-world deployment — will be worth watching over the months ahead.

  • McKinsey’s 25,000 AI Agents: First-Mover Advantage or the Industry’s Biggest Experiment?

    McKinsey now counts 25,000 AI agents among its workforce — roughly one for every 1.6 human employees. That ratio, disclosed by CEO Bob Sternfels at the Consumer Electronics Show and confirmed by the firm, makes the consultancy’s internal agentic build-out one of the most aggressive in professional services.

    The numbers have moved quickly. Eighteen months ago, McKinsey operated a few thousand agents. Today, through its AI arm QuantumBlack, AI-related work accounts for 40% of the firm’s output. The agents have saved an estimated 1.5 million hours on search and synthesis tasks. Non-client-facing headcount has fallen 25%, yet output from those teams has risen 10%.

    Sternfels’s stated ambition is to pair every one of McKinsey’s 40,000 employees with at least one AI agent within the next 18 months.

    Scale versus substance

    The scale is eye-catching. Whether it is meaningful depends on what you count as an agent and how you measure the return.

    McKinsey’s rivals are openly sceptical. EY’s global engineering chief has argued that “a handful of agents do the heavy lifting” and that value should be tracked through efficiency KPIs, not headcount. PwC’s chief AI officer has called agent count “probably the wrong measure”, advocating instead for quality and workflow optimisation. Their counterargument is clear: a smaller fleet of high-performing agents, rigorously measured, may deliver more than a vast deployment still being calibrated.

    The critique lands on familiar ground. As maddaisy examined earlier today, PwC’s 2026 Global CEO Survey found that 56% of chief executives still cannot point to revenue gains from their AI investments. The deployment-versus-outcomes gap is the central tension in enterprise AI right now, and McKinsey’s bet raises the question of whether the firm is racing ahead of the same problem — or solving it.

    From advisory to infrastructure

    The more consequential shift may be in McKinsey’s business model. Sternfels described a move away from the firm’s traditional fee-for-service approach toward a model where McKinsey works with clients to identify joint business cases and then helps underwrite the outcomes.

    This is a significant departure for a firm built on advisory fees and billable hours. It positions McKinsey less as a strategic counsellor and more as an infrastructure partner — one that brings its own AI workforce to bear on client problems and shares in the measurable results.

    QuantumBlack, with 1,700 people, now drives all of McKinsey’s AI initiatives. Alex Singla, the senior partner who co-leads the unit, has described the firm’s evolving recruitment profile: candidates who can move fluidly between traditional consulting and engineering, and who can work alongside AI rather than simply directing it.

    Boston Consulting Group is pursuing a similar direction, deploying “forward-deployed consultants” who build AI tools directly on client projects. But McKinsey’s scale of internal adoption — 25,000 agents embedded across the firm — gives it a data advantage that is harder to replicate. Every internal deployment generates operational insight into what works, what fails, and how agentic systems behave at enterprise scale.

    The governance question maddaisy has been tracking

    The timing of McKinsey’s announcement is worth noting against the backdrop of the agentic AI governance gap maddaisy covered earlier this week. Deloitte’s data showed that only 21% of companies have mature governance models for agentic AI, even as three-quarters plan to deploy it within two years. And a broader pattern has emerged across maddaisy’s recent coverage: enterprises are strategically confident about AI but operationally underprepared.

    McKinsey, as both a deployer and an adviser, sits at the intersection of this tension. If the firm can demonstrate that 25,000 agents operate reliably at scale — with governance, measurement, and accountability frameworks to match — it will have built the most persuasive case study in the industry. If the agents outrun oversight, the reputational exposure is equally significant. When an AI agent produces an analysis and the recommendation proves wrong, the liability question is not academic.

    What practitioners should watch

    For consulting professionals and enterprise leaders watching this play out, three things matter more than the headline number.

    First, the metric that matters is not agent count but outcome attribution. McKinsey’s 1.5 million hours saved is a process metric. The firm’s shift to underwriting client outcomes suggests it understands the need to move beyond efficiency and toward measurable business impact — the same gap that PwC’s CEO Survey identified industry-wide.

    Second, the talent model is changing faster than many firms acknowledge. McKinsey’s search for hybrid consultant-engineers, and BCG’s forward-deployed model, signal that the traditional consulting skill set is being augmented, not just supported, by AI fluency. Firms that treat AI as a productivity tool rather than a workforce design challenge will fall behind.

    Third, scale creates its own governance requirements. As McKinsey’s own Carolyn Dewar argued in Fortune, the real risk is not the technology but how leaders manage the fear and trust dynamics that surround it. Deploying 25,000 agents without the organisational infrastructure to govern them would validate every concern the firm’s rivals have raised.

    McKinsey’s wager is that first-mover scale in agentic AI creates a compounding advantage — more data, better workflows, stronger client proof points. The industry is about to find out whether volume leads to value, or whether a smaller, sharper approach gets there first.

  • The Governance Gap No One Is Closing: Why Agentic AI Is Outrunning Enterprise Oversight

    Three-quarters of enterprises plan to deploy agentic AI within two years. Only one in five has a mature governance model for it. That arithmetic should concern anyone responsible for enterprise technology strategy.

    The figures come from Deloitte’s 2026 State of AI in the Enterprise report, and they represent something more specific than the familiar story of AI adoption outpacing regulation. The challenge with agentic AI is not that rules do not exist — as maddaisy recently examined, the EU AI Act, Colorado’s AI Act, and a growing patchwork of global regulations are creating real enforcement deadlines. The challenge is that agentic systems demand a fundamentally different kind of oversight, and most organisations have not built the operational machinery to provide it.

    What makes agentic AI different

    Conventional AI systems — including the generative AI tools that have dominated enterprise adoption over the past two years — operate in an advisory mode. They suggest, summarise, draft, and classify. A human reviews the output and decides what to do with it. Governance for these systems, while imperfect, fits within existing frameworks: you audit the model, monitor outputs, and maintain a human in the decision loop.

    Agentic AI breaks that model. These systems are designed to plan, execute, and adjust autonomously — booking flights, approving procurement decisions, triaging customer complaints, or managing collections workflows without waiting for human sign-off. Oracle’s new agentic banking platform, launched in February 2026, illustrates the trajectory: domain-specific agents handle loan originations, credit decisioning, and compliance checks, with human oversight positioned as a “human-in-the-loop” role rather than a gatekeeping one.

    The distinction matters because it changes where governance must operate. With advisory AI, oversight happens after the model produces an output and before a human acts on it. With agentic AI, the system is the actor. Governance must be embedded in real time — monitoring agent behaviour as it happens, enforcing boundaries on what an agent can and cannot do, and maintaining audit trails that capture not just decisions but the full chain of reasoning and actions that led to them.
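
    In code terms, the shift is small but fundamental. The sketch below is illustrative only (the names, thresholds, and action types are invented, not drawn from any vendor's platform), but it shows where the check now has to sit: before the action executes, rather than after an output is produced for human review.

    ```python
    # Minimal sketch of real-time governance for an agentic system: every
    # action the agent proposes passes a boundary check *before* execution.
    # All names (Action, Decision, govern) are illustrative, not a real API.
    from dataclasses import dataclass
    from enum import Enum

    class Decision(Enum):
        ALLOW = "allow"        # within the agent's autonomy
        ESCALATE = "escalate"  # requires human sign-off first
        DENY = "deny"          # prohibited outright

    @dataclass
    class Action:
        kind: str              # e.g. "rebook_flight", "approve_purchase"
        exposure: float        # monetary or risk exposure of the action

    def govern(action: Action) -> Decision:
        """Enforce boundaries at the moment of action, not after the output."""
        if action.kind in {"close_account", "change_credit_limit"}:
            return Decision.DENY
        if action.exposure > 1000:
            return Decision.ESCALATE
        return Decision.ALLOW

    def run(action: Action) -> str:
        decision = govern(action)
        if decision is Decision.ALLOW:
            return f"executed: {action.kind}"      # agent acts autonomously
        return f"{decision.value}: {action.kind}"  # a human takes over here

    print(run(Action("rebook_flight", exposure=180.0)))       # executed
    print(run(Action("approve_purchase", exposure=25000.0)))  # escalated
    ```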

    The 21% problem

    Deloitte’s finding that only 21% of companies have a mature agentic AI governance model is striking, but the detail beneath it is more revealing. In Singapore, where deployment ambitions are among the highest globally — 72% of businesses plan to deploy agentic AI across multiple operational areas within two years, up from 15% today — the mature governance figure drops to just 14%.

    As maddaisy noted in its analysis of the broader Deloitte report, this fits a wider pattern: organisations are growing more confident in their AI strategy even as their readiness on the operational foundations needed to execute it declines. The agentic governance gap is perhaps the sharpest expression of this paradox — a technology that is advancing from pilot to production while the controls needed to run it safely remain in early stages.

    Half of Singapore respondents reported using a patchwork of public and internal proprietary frameworks to assess agent risk and performance. That is not a governance model — it is improvisation.

    Why existing frameworks fall short

    The AI Trends Report 2026, published by statworx and AI Hub Frankfurt, identifies three operational disciplines that are becoming foundational for reliable agentic AI: AI governance, DataOps, and what the report terms AgentOps — the operational layer for managing autonomous AI agents in production.

    AgentOps is a useful concept because it captures what most enterprise governance frameworks currently lack. Traditional AI governance focuses on model development: training data quality, bias testing, documentation, and approval workflows before deployment. That is necessary but insufficient for systems that learn, adapt, and take actions in production environments.

    Agentic systems require runtime governance: clear boundaries on agent autonomy (what decisions can the agent make independently, and which require escalation?), real-time monitoring of agent behaviour against expected parameters, kill switches for when agents drift outside acceptable bounds, and comprehensive audit trails that regulators can inspect after the fact.
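
    None of these mechanisms needs to be exotic. As a minimal illustration, here is a sketch of a kill switch that halts an agent once its recent behaviour drifts outside agreed bounds; the window size and error threshold are invented for the example, not taken from any published framework.

    ```python
    # Sketch of a runtime kill switch: stop the agent when its observed
    # behaviour drifts outside pre-agreed bounds. Thresholds are illustrative.
    class AgentHalted(Exception):
        pass

    class RuntimeGuard:
        def __init__(self, max_error_rate: float = 0.05, window: int = 100):
            self.max_error_rate = max_error_rate
            self.window = window            # how many recent outcomes to judge on
            self.outcomes: list[bool] = []  # True means an acceptable outcome
            self.halted = False

        def record(self, ok: bool) -> None:
            self.outcomes.append(ok)
            recent = self.outcomes[-self.window:]
            error_rate = 1 - sum(recent) / len(recent)
            # Only trip the switch once enough evidence has accumulated.
            if len(recent) >= 20 and error_rate > self.max_error_rate:
                self.halted = True

        def check(self) -> None:
            if self.halted:
                raise AgentHalted("error rate outside acceptable bounds")

    guard = RuntimeGuard()
    try:
        for outcome in [True] * 18 + [False] * 4:
            guard.record(outcome)
            guard.check()  # would sit inside the agent's main loop
    except AgentHalted as exc:
        print(f"Agent stopped, escalating to humans: {exc}")
    ```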

    The EU AI Act’s requirements for high-risk systems — documented risk management, technical logging, human oversight mechanisms, and conformity assessments — implicitly assume this kind of operational infrastructure. But most organisations have not yet translated those requirements into engineering reality.

    Deployment is not waiting for governance

    The uncomfortable truth is that agentic AI is entering production regardless of whether governance is ready. Oracle is shipping banking agents now. Companies like AMD and Heathrow Airport are deploying autonomous agents in customer experience roles. Gartner predicts agentic systems will autonomously resolve 80% of customer service issues by 2028.

    Constellation Research offers a useful counterweight to the hype, arguing that agentic AI is “more of a feature than a revolution” and that the real measure of value is decision velocity — how quickly smaller decision trees and processes can be automated at scale. This framing is helpful because it reduces the abstraction. An AI agent rebooking a flight is not a paradigm shift; it is a process automation with a more sophisticated reasoning layer. But that reasoning layer is precisely what makes governance harder. The agent is not following a static script — it is making contextual judgements, and those judgements need oversight.

    What the governance gap actually costs

    The business case for closing the governance gap is not primarily about regulatory fines, though those are real. It is about operational risk. When an agentic system autonomously commits to a procurement decision, misprices a financial product, or gives a customer incorrect information with real-world consequences, the liability question is immediate and the reputational exposure is direct.

    It is also about scaling. Deloitte’s data shows that companies with stronger governance foundations are deploying agentic AI more successfully — they start with lower-risk use cases, build governance capabilities alongside deployment, and scale deliberately. Organisations that skip the governance step find themselves either slowing down when something goes wrong or, worse, not knowing that something has gone wrong until a regulator or customer tells them.

    What needs to happen

    The gap between agentic AI deployment and agentic AI governance is not going to close on its own. Three practical steps can narrow it.

    Define agent autonomy boundaries explicitly. For every agentic AI deployment, organisations need a clear specification of what the agent can do independently, what requires human approval, and what is prohibited. These boundaries should be codified in the system, not just written in a policy document. The Oracle banking platform’s “human-in-the-loop” architecture is one model, but even that needs specificity about when and how the loop engages.
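
    In practice, "codified in the system" can be as simple as a machine-readable policy that the agent runtime loads and enforces, with a safe default for anything the policy does not mention. The sketch below is hypothetical; the action names, modes, and limits are invented for illustration.

    ```python
    # Hypothetical autonomy-boundary specification, codified as data the
    # agent runtime enforces rather than prose in a policy document.
    AUTONOMY_POLICY = {
        "check_balance":   {"mode": "autonomous"},
        "dispute_charge":  {"mode": "autonomous", "max_amount": 100},
        "open_product":    {"mode": "human_approval"},  # always escalate
        "mortgage_advice": {"mode": "prohibited"},      # out of scope entirely
    }

    def boundary_for(action: str, amount: float = 0.0) -> str:
        # Unlisted actions fail closed: they default to human approval.
        rule = AUTONOMY_POLICY.get(action, {"mode": "human_approval"})
        if rule["mode"] == "autonomous" and amount > rule.get("max_amount", float("inf")):
            return "human_approval"  # autonomous only within the stated limit
        return rule["mode"]

    assert boundary_for("check_balance") == "autonomous"
    assert boundary_for("dispute_charge", amount=250) == "human_approval"
    assert boundary_for("mortgage_advice") == "prohibited"
    assert boundary_for("unknown_action") == "human_approval"  # fail closed
    print("boundaries behave as specified")
    ```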

    Invest in runtime monitoring, not just pre-deployment testing. The governance challenge with agentic AI is that it operates continuously and adapts to context. Pre-deployment audits are necessary but not sufficient. Organisations need real-time monitoring that tracks agent decisions against expected parameters and flags anomalies before they compound.
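
    What "expected parameters" means will differ by use case, but a simple version is a rolling statistical baseline that flags outliers the moment they appear. The sketch below shows one illustrative approach; the window size and z-score threshold are invented, and a production system would monitor many signals rather than one.

    ```python
    # Illustrative runtime monitor: compare each agent decision against a
    # rolling baseline and flag anomalies before they compound.
    from collections import deque
    from statistics import mean, stdev

    class DecisionMonitor:
        def __init__(self, window: int = 50, z_threshold: float = 3.0):
            self.values: deque[float] = deque(maxlen=window)
            self.z_threshold = z_threshold

        def observe(self, value: float) -> bool:
            """Return True if this decision value looks anomalous."""
            anomalous = False
            if len(self.values) >= 10:  # need a baseline before judging
                mu, sigma = mean(self.values), stdev(self.values)
                if sigma > 0 and abs(value - mu) / sigma > self.z_threshold:
                    anomalous = True  # flag now, not at the quarterly audit
            self.values.append(value)
            return anomalous

    monitor = DecisionMonitor()
    for amount in [100, 105, 98, 102, 99, 101, 103, 97, 100, 104]:
        monitor.observe(amount)        # builds a baseline of normal values
    print(monitor.observe(2000.0))     # True: far outside the baseline
    ```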

    Build audit trails as engineering infrastructure. When a regulator asks how an agent arrived at a specific decision — and under the EU AI Act, they will — the organisation needs to produce a complete chain of the agent’s reasoning, data inputs, and actions. This is not a reporting challenge; it is an engineering one that needs to be designed into the system from the start, not retrofitted after deployment.
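
    One engineering pattern that meets this bar, sketched below with illustrative field names, is a hash-chained audit log: each entry records the agent's inputs, a summary of its reasoning, and the action taken, and each is cryptographically linked to its predecessor, so any after-the-fact edit breaks the chain.

    ```python
    # Sketch of a tamper-evident audit trail. Each entry captures inputs,
    # reasoning, and action, hash-chained to its predecessor so that edits
    # made after deployment are detectable. Field names are illustrative.
    import hashlib
    import json
    from datetime import datetime, timezone

    def append_entry(trail: list, inputs: dict, reasoning: str, action: str) -> None:
        body = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "inputs": inputs,        # data the agent acted on
            "reasoning": reasoning,  # summary of the chain of reasoning
            "action": action,        # what the agent actually did
            "prev": trail[-1]["hash"] if trail else "genesis",
        }
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        trail.append(body)

    def verify(trail: list) -> bool:
        """Recompute every hash; any retrofitted edit breaks the chain."""
        prev = "genesis"
        for entry in trail:
            body = {k: v for k, v in entry.items() if k != "hash"}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if body["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True

    trail = []
    append_entry(trail, {"customer": "c-123"}, "balance below threshold", "suggest_savings_plan")
    append_entry(trail, {"customer": "c-123"}, "eligible under policy v2", "open_savings_pot")
    print(verify(trail))           # True: chain is intact
    trail[0]["action"] = "refund"  # simulate after-the-fact tampering
    print(verify(trail))           # False: the chain exposes the edit
    ```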

    The agentic AI governance gap is not a future problem. It is a present one, widening with every new deployment. The organisations that treat governance as a technical discipline — building it into the engineering of their agentic systems rather than bolting it on as a compliance afterthought — will have a structural advantage as the technology matures. Those that do not will discover, as many enterprises have with earlier waves of technology adoption, that the cost of retrofitting oversight always exceeds the cost of building it in.