Tag: ai-adoption

  • The Three-Tool Threshold: BCG Research Reveals Where AI Productivity Gains Turn Into Cognitive Overload

    For months, the evidence that AI tools are intensifying work rather than simplifying it has been accumulating. maddaisy has tracked this story from the UC Berkeley research showing employees absorbing more tasks under AI, through the organisational failures that leave workers unsupported, to the implementation problems that bake burnout into the system from day one. What was missing was a specific threshold — a number that tells enterprises where the gains end and the damage begins.

    Boston Consulting Group has now supplied one. In a study published in Harvard Business Review this month, researchers surveyed 1,488 full-time US workers and found a clean break point: employees using three or fewer AI tools reported genuine productivity gains. Those using four or more reported the opposite — declining productivity, increased mental fatigue, and higher error rates. BCG calls the phenomenon “AI brain fry.”

    The finding is not just academic. Among workers reporting brain fry, 34% expressed active intention to leave their employer, compared with 25% of workers who reported no such symptoms. For a workforce already under pressure from rapid technology deployment, that nine-percentage-point gap represents a tangible retention risk.

    The cognitive cost no one budgeted for

    The BCG research puts numbers to something the UC Berkeley study identified in qualitative terms earlier this year. When AI tools require high levels of oversight — reading, interpreting, and verifying LLM-generated content rather than simply delegating administrative tasks — workers expend 14% more mental effort. They experience 12% greater mental fatigue and 19% more information overload.

    Many respondents described a “fog” or “buzzing” sensation that forced them to step away from their screens. Others reported an increase in small mistakes — exactly the kind of errors that compound in professional services, financial analysis, and other high-stakes environments.

    “People were using the tool and getting a lot more done, but also feeling like they were reaching the limits of their brain power,” Julie Bedard, the study’s lead author and a managing director at BCG, told Fortune. “Things were moving too fast, and they didn’t have the cognitive ability to process all the information and make all the decisions.”

    This aligns with what maddaisy has previously described as the task expansion pattern: when AI makes certain tasks faster, employees do not use the freed-up time for strategic thinking. They absorb more work. The BCG data now suggests the breaking point arrives sooner than most organisations assume — at the fourth tool, not the tenth.

    The macro picture is equally sobering

    The three-tool threshold sits against a broader backdrop of underwhelming AI productivity data at scale. A Goldman Sachs analysis published this month found “no meaningful relationship between productivity and AI adoption at the economy-wide level,” with measurable gains confined to just two domains: customer service and software development.

    Separately, a survey of 6,000 C-suite executives found that 90% saw no evidence of AI impacting productivity or employment in their workplaces over the past three years. Their median forecast: a 1.4% productivity increase over the next three years. That is hardly the transformation narrative that justified billions in enterprise AI spending.

    These findings do not mean AI is useless. The Federal Reserve Bank of St. Louis estimated a 33% hourly productivity boost for workers during the specific hours they use generative AI. The problem is that this micro-level gain does not scale linearly. Adding more tools, more prompts, and more AI-generated outputs does not multiply the benefit — it multiplies the cognitive overhead.
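
    A back-of-envelope calculation makes the point concrete. In the sketch below, the 33% boost is the St. Louis Fed estimate cited above; the share of working hours spent using generative AI is a hypothetical assumption chosen purely for illustration, not a figure from the study.

    ```python
    # Why a strong hourly boost can all but disappear at the aggregate level.
    # The 33% lift is the St. Louis Fed estimate; the usage share is an
    # assumed, illustrative number, not a figure from the study.

    hourly_boost = 0.33     # productivity lift during hours actively using gen AI
    share_of_hours = 0.10   # assumption: one working hour in ten involves gen AI

    aggregate_lift = hourly_boost * share_of_hours
    print(f"Workforce-level lift: {aggregate_lift:.1%}")  # -> 3.3%
    ```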

    What the threshold means for enterprises

    The practical implications are straightforward, even if they run against the instincts of most technology procurement processes.

    First, fewer tools, better deployed. The BCG data suggests that organisations would get better results from consolidating around two or three well-integrated AI tools than from giving every team access to every available platform. This runs counter to the current market dynamic, where vendors push specialised AI tools for every function — writing, coding, data analysis, scheduling, customer interaction — and enterprises buy them all to avoid falling behind.

    Second, oversight design matters as much as tool selection. The highest cognitive costs were associated with tasks requiring workers to interpret and verify AI output, not with AI performing autonomous background work. Enterprises that can shift more AI usage toward the latter — automated workflows, pre-verified data processing, agent-completed administrative tasks — will impose less cognitive strain on their people.

    Third, training needs to include when not to use AI. As maddaisy has previously noted, most organisations treat AI capability-building as a deployment event rather than a sustained practice. The BCG researchers found that when managers provided ongoing training and support, brain-fry symptoms decreased. The Berkeley team suggested batching AI-intensive work into specific time blocks rather than leaving it on all day — a scheduling discipline that few organisations currently enforce.

    The next chapter in a familiar story

    The AI-productivity narrative is following a pattern that technology historians will recognise. Early adopters see real gains. Organisations rush to scale. The gains plateau or reverse as implementation complexity outpaces human capacity to manage it. Eventually, a more measured approach emerges — not abandoning the technology, but deploying it with greater discipline.

    The BCG three-tool threshold may turn out to be an early data point rather than a universal law. But it offers something that has been missing from the AI-adoption conversation: a concrete starting point for right-sizing the technology stack to what human cognition can actually sustain.

    For consultants advising on AI transformation, that is a message worth delivering — even when it runs counter to the vendor pitch deck.

  • The Venture Subsidy Era for AI Is Ending. Enterprise Budgets Are Not Ready.

    For the past three years, enterprises have been building their AI strategies on pricing that does not reflect reality. The era of venture-subsidised AI — where a ChatGPT query costs pennies despite burning roughly ten times the energy of a Google search — is approaching its expiry date. The question is not whether prices will rise. It is whether organisations have budgeted for what comes next.

    The subsidy model, laid bare

    The numbers tell the story clearly enough. OpenAI’s own internal projections show $14 billion in losses for 2026, against roughly $13 billion in revenue. Total spending is expected to reach approximately $22 billion this year. Across the 2023–2028 period, the company expects to lose $44 billion before turning cash-flow positive sometime around 2029 or 2030.

    Anthropic’s trajectory looks different but carries the same structural tension. The company hit $19 billion in annualised revenue by March 2026, growing more than tenfold annually. But its gross margins sit at around 40% — a long way from the 77% it needs to justify its $380 billion valuation. That gap has to close, and it will not close through efficiency gains alone.

    Both companies have raised staggering sums to sustain the current pricing. OpenAI’s $110 billion round in February valued it at $730 billion. Anthropic’s $30 billion Series G came from a coalition including GIC, Microsoft, and Nvidia. This is venture capital on a scale that makes the ride-hailing subsidy wars look modest — and, like those wars, it is designed to capture market share before the real pricing arrives.

    The millennial lifestyle subsidy, enterprise edition

    The pattern is familiar. Uber and DoorDash used investor capital to underwrite artificially cheap services, building habits and dependencies before gradually raising prices toward sustainable levels. AI providers are running the same playbook, but the stakes are larger. When Uber raised fares, consumers grumbled and occasionally took the bus. When AI API costs increase threefold — which industry analysts suggest may be the minimum adjustment needed for sustainable economics — enterprises will face a different kind of reckoning.

    The reckoning is already starting at the platform level. Microsoft will raise commercial pricing across its entire 365 suite from July 2026, with increases of 8–17% depending on the tier. The company attributed the rises to AI capabilities such as Copilot Chat being embedded into standard subscriptions. Its new $99-per-user E7 tier bundles Copilot, identity management, and agent orchestration tools — positioning AI not as an optional add-on but as a cost baked into the platform itself.

    The broader enterprise software market is following the same trajectory. Gartner forecasts enterprise software spend rising at least 40% by 2027, with generative AI as the primary accelerant. Average annual SaaS price increases now range from 8% to 12%, with aggressive movers implementing hikes of 15–25% at renewal.

    The budget gap nobody is discussing

    The disconnect between AI ambition and AI economics is widening. Organisations now spend an average of $7,900 per employee annually on SaaS tools — a 27% increase over two years. AI-native application spend has surged 108% year-on-year, reaching an average of $1.2 million per organisation. And these figures reflect the subsidised era.

    As Axios reported this week, the unusually low cost of many AI services will not survive the transition from venture-funded growth to public-market accountability. As OpenAI and Anthropic pursue potential IPOs, investors will demand the margins that current pricing cannot deliver. Subscription prices and usage-based costs are expected to rise across the industry.

    For enterprises that have been scaling AI adoption on the assumption that current costs are permanent, this represents a planning failure in the making. A consumer application with $2 in AI costs per user per month looks viable. The same application at $10 per user does not. High-volume automation workflows — precisely the use cases enterprises are most excited about — are the most vulnerable to cost increases.

    The pacing argument gains new weight

    This pricing trajectory adds a new dimension to arguments maddaisy has previously explored around pacing AI investment. When Capgemini’s CEO Aiman Ezzat cautioned against getting “too ahead of the learning curve,” his concern was primarily about deploying capabilities ahead of organisational readiness. The pricing question strengthens that case. Organisations that rush to embed AI across every workflow at subsidised rates may find themselves locked into architectures whose economics no longer work when the real costs arrive.

    Similarly, the enterprise scaling gap reported last week — where two-thirds of organisations cannot move AI past pilot stage — takes on a different character when viewed through an economic lens. The skills shortage and governance deficits that constrain scaling today may prove less urgent than the budget constraints that arrive tomorrow. Organisations struggling to scale AI at subsidised prices will find it considerably harder at market rates.

    What prudent organisations should do now

    The adjustment does not need to be dramatic, but it does need to start. Three measures stand out.

    First, stress-test AI budgets against realistic pricing. If API costs tripled tomorrow, which workflows would still deliver positive returns? The answer reveals which AI investments are genuinely valuable and which are artefacts of artificially cheap compute.
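
    A minimal sketch of that stress test follows, assuming hypothetical workflows and dollar figures; substitute your own usage data and per-workflow value estimates.

    ```python
    # A minimal AI unit-economics stress test. Every workflow name and dollar
    # figure here is a hypothetical placeholder.

    WORKFLOWS = {
        # name: (monthly AI cost in USD, estimated monthly value delivered in USD)
        "support-ticket triage": (4_000, 18_000),
        "marketing copy drafts": (2_500, 6_000),
        "bulk document summarisation": (9_000, 11_000),
    }

    def stress_test(workflows: dict, price_multiplier: float) -> None:
        """Report which workflows stay ROI-positive if AI costs rise."""
        for name, (cost, value) in workflows.items():
            stressed = cost * price_multiplier
            verdict = "still viable" if value > stressed else "under water"
            print(f"{name}: ${stressed:,.0f}/mo cost vs ${value:,.0f}/mo value -> {verdict}")

    stress_test(WORKFLOWS, price_multiplier=3.0)  # the 'costs tripled tomorrow' scenario
    ```

    At a 3x multiplier, the thin-margin, high-volume summarisation workflow goes under water first, matching the vulnerability noted above for automation-heavy use cases.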

    Second, build multi-provider flexibility into the architecture. Vendor lock-in has always been a risk in enterprise technology. In AI, where pricing models are still evolving and open-source alternatives like Llama and Mistral are improving rapidly, flexibility is not just prudent — it is a hedge against the cost increases that are coming.
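
    In practice, that flexibility means one abstraction boundary between workflows and model providers. The sketch below illustrates the idea only; the class names and canned responses are hypothetical, not any vendor's real SDK.

    ```python
    # A provider-agnostic seam between workflows and models. Routing lives in
    # one place, so a price rise becomes a config change, not a migration.

    from typing import Protocol

    class CompletionProvider(Protocol):
        def complete(self, prompt: str) -> str: ...

    class HostedAPIProvider:
        """Stand-in for a commercial API; a real version would call the vendor SDK."""
        def complete(self, prompt: str) -> str:
            return f"[hosted model reply to {prompt!r}]"

    class SelfHostedProvider:
        """Stand-in for an open model (e.g. Llama, Mistral) on internal infrastructure."""
        def complete(self, prompt: str) -> str:
            return f"[self-hosted model reply to {prompt!r}]"

    PROVIDERS = {"hosted": HostedAPIProvider, "self-hosted": SelfHostedProvider}

    def build_provider(name: str) -> CompletionProvider:
        """Single switch point: every workflow requests a provider by name."""
        return PROVIDERS[name]()

    llm = build_provider("self-hosted")  # swap to "hosted" without touching callers
    print(llm.complete("Summarise this contract clause."))
    ```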

    Third, watch the open-source floor. The existence of capable open models creates a price ceiling that limits how aggressively commercial providers can raise rates. Organisations that invest in the capability to run open models on their own infrastructure — or through commodity inference services — will have negotiating leverage that others will not.

    The correction, not the crisis

    None of this means AI is overvalued or that enterprise adoption will stall. The technology works. The productivity gains are real. But the current pricing does not reflect the true cost of delivering those gains, and the correction will arrive gradually over the next two to four years as the industry’s largest players transition from growth-at-all-costs to sustainable economics.

    The organisations best positioned for that transition will be those that treated the subsidy era as a window for experimentation — learning which AI applications genuinely transform their operations — rather than a permanent baseline for their technology budgets. The window is closing. The question is whether the planning has already begun.

  • Nearly Every Enterprise Wants AI. Two-Thirds Cannot Scale It Past a Pilot.

    The Appetite Is Not the Problem

    Nearly every large organisation wants AI. The technology works. The budgets are approved. The pilots are running. And yet, when it comes to deploying AI at scale – moving from a successful proof of concept to a production capability that changes how the business operates – two-thirds of enterprises are stuck.

    That is the central finding of Logicalis’s 2026 CIO Report, which surveyed more than 1,000 chief information officers globally. The numbers paint a stark picture of an industry that has solved the belief problem but not the execution one: 94% of CIOs report growing organisational appetite for AI. Over a third have accelerated AI initiatives based on early proof-of-concept results. But two-thirds say they cannot scale AI beyond those initial deployments.

    The gap between wanting AI and running AI at enterprise scale has become the defining challenge of 2026. And the constraints holding organisations back are not the ones most boardrooms are discussing.

    Skills, Not Budgets, Are the Bottleneck

    The most striking finding in the Logicalis data is what CIOs identify as their primary constraint. It is not funding. It is not technology. It is skills.

    A lack of internal technical capability is holding back AI ambitions in nearly nine out of ten organisations. This is not a shortage of data scientists or machine learning engineers in the abstract – it is a shortage of people who understand how to integrate AI into existing business processes, manage its outputs, and govern its behaviour in production environments.

    The skills gap becomes more consequential as AI moves from experimentation to operation. A pilot needs a small team of enthusiasts and a sandbox. A production deployment needs data engineers, integration specialists, change managers, and – critically – people who understand both the technology and the business domain well enough to know when the AI is wrong.

    This connects directly to what Harvard Business Review reported in February: 88% of companies now report regular AI use, yet adoption is stalling because employees experiment with tools without integrating them into how work actually gets done. The tools are present. The capability to use them well is not.

    Governance Is Being Compromised, Not Solved

    If the skills gap explains why organisations cannot scale, the governance gap explains why scaling carries risk. The Logicalis report found that 62% of CIOs have compromised on AI governance due to limited knowledge, and only 44% say they fully grasp the risks of AI adoption. Meanwhile, 76% describe unchecked AI as a serious concern.

    This is not a theoretical problem. As maddaisy.com has reported extensively, the governance question follows AI into every new domain – from agentic systems that drift in production to vibe coding tools that generate enterprise software without conventional oversight. Organisations are deploying AI faster than they can build the frameworks to manage it, and most acknowledge this openly. An overwhelming 89% of CIOs in the Logicalis survey describe their current approach as “learning as we go.”

    That phrase deserves attention. It means that the majority of enterprise AI programmes are operating without mature risk management, without clear accountability structures, and without the monitoring infrastructure needed to catch problems before they compound. For regulated industries – financial services, healthcare, defence – this is not a growing pain. It is an exposure.

    The Pattern maddaisy.com Has Been Tracking

    The Logicalis data does not exist in isolation. It quantifies a pattern that has been building across multiple data points this year.

    In February, maddaisy.com reported on PwC’s Global CEO Survey, which found that 56% of chief executives could not point to measurable revenue gains from their AI investments. The diagnosis then was a measurement problem as much as a technology one – organisations deploying AI without redesigning workflows or building the instrumentation to track outcomes.

    Earlier this month, Capgemini’s CEO Aiman Ezzat made the case for pacing AI investment, arguing that companies are deploying capabilities ahead of their organisation’s ability to absorb them. EY research cited in that analysis showed organisations failing to capture up to 40% of potential AI benefits – not because the technology underperformed, but because the surrounding processes, skills, and culture were not ready.

    And in maddaisy.com’s analysis of the consulting pyramid, Eden McCallum research revealed that 95% of AI pilots have failed to deliver returns – a figure that, while measuring failure at a different stage, points in the same direction as the Logicalis finding that two-thirds of organisations cannot move past the pilot stage.

    These are not isolated reports reaching coincidentally similar conclusions. They are different measurements of the same underlying problem: the bottleneck in enterprise AI has moved from technology capability to organisational readiness.

    The Managed Services Pivot

    One of the more telling details in the Logicalis report is that 94% of CIOs plan to lean on managed service providers over the next two to three years to help navigate AI governance, scaling, and sustainability. This is a quiet but significant shift in how enterprises relate to their technology infrastructure.

    It suggests that many CIOs have concluded they cannot build the required skills and governance frameworks internally – at least not at the pace the technology demands. Rather than owning and operating AI capabilities directly, they are moving toward orchestrating a network of external providers. The CIO role, in this model, becomes less about technology ownership and more about vendor management, risk oversight, and strategic coordination.

    For consulting and technology services firms, this creates a substantial market opportunity. But it also raises a question that the industry has not yet answered convincingly: if the clients cannot scale AI internally because they lack the skills and governance frameworks, and they outsource to service providers who are themselves still working out how to embed AI into their own operations, where does the actual expertise reside?

    What Practitioners Should Watch

    The adoption gap is unlikely to close quickly. Skills take time to develop. Governance frameworks take time to mature. Organisational change – the kind that turns a pilot into a production capability – is measured in quarters and years, not weeks.

    Three things are worth tracking. First, whether the 89% “learning as we go” figure starts to decline in subsequent surveys – that will be the clearest signal that enterprises are moving from experimentation to operational maturity. Second, whether the managed services pivot produces measurable outcomes or simply moves the scaling problem from one organisation to another. And third, whether the 67% of CIOs who expressed concern about an “AI bubble” translate that concern into more disciplined investment, or whether competitive pressure continues to override caution.

    The technology has arrived. The appetite is not in question. What remains unresolved is whether organisations can build the human and structural foundations fast enough to use what they have already bought.

  • The Consulting Pyramid Is Not Collapsing. It Is Being Quietly Redesigned.

    The consulting industry’s foundational staffing model — a small number of partners supported by large cohorts of junior analysts — is facing its most serious structural challenge in decades. But the narrative that AI is “dismantling” the pyramid overstates the pace and understates the complexity of what is actually happening.

    Nick Pye, managing partner at Mangrove Consulting, argued in Consultancy.uk this month that the traditional pyramid is “becoming increasingly difficult to justify.” His core thesis is straightforward: AI can now perform the analytical heavy lifting — research, financial modelling, scenario analysis — that once required rooms of graduates billing at premium rates. Clients are noticing. They want senior judgement, not junior analysis. And they increasingly have their own data capabilities, reducing dependence on external advisers for the work that used to fill the pyramid’s base.

    The argument is sound in principle. Where it risks overreach is in assuming the transition is further along than the evidence suggests.

    The data tells a more complicated story

    If AI were genuinely dismantling the consulting pyramid, one would expect to see mass reductions in headcount at the bottom. The picture is more mixed than that.

    As maddaisy reported in February, Capgemini ended 2025 with 423,400 employees — up 24% year-on-year — after adding 82,300 offshore workers in a single year. The company simultaneously announced €700 million in restructuring charges. It is not shrinking. It is reshuffling: eliminating some roles while creating others, primarily in AI engineering, data science, and agentic AI delivery.

    McKinsey, meanwhile, has begun testing new recruits on AI capabilities, and more than half of graduate roles now reportedly require AI skills. The Big Four have reduced student intakes, but the UK consulting market’s difficulties in 2025 — its worst year since lockdown — owe as much to broader demand softness as to structural AI disruption.

    And the technology itself is not yet delivering at the scale the hype implies. Eden McCallum research published this year found that while excitement for generative AI remains high, revenue impact remains minimal. Ninety-five per cent of AI pilots have failed to deliver returns, according to industry data cited by Consultancy.uk. That is not a technology that has already displaced the analyst class.

    What is genuinely changing

    None of this means the pyramid is safe. The direction of travel is clear, even if the pace is slower than the most breathless accounts suggest.

    Three shifts are converging. First, clients increasingly own and understand their own data. The analytical monopoly that consulting firms once held — gathering, processing, and synthesising information that clients could not access themselves — has eroded as organisations have built internal data teams and deployed their own AI tools.

    Second, the economics of the base are deteriorating. When an AI system can produce a comparable market analysis in minutes rather than weeks, it becomes progressively harder to justify billing rates for the same work performed by junior staff. This does not eliminate the need for human analysis, but it compresses the time and headcount required.

    Third, client expectations have shifted from deliverables to outcomes. As Pye puts it: clients want “decisions and performance,” not “decks and processes.” That shift favours experienced practitioners who can navigate organisational politics and drive implementation — not the analysts who assemble the slides.

    The diamond, the inverted pyramid, and the graduate question

    The replacement models being discussed are instructive. The most conservative is the “diamond” — wider in the middle, thinner at the base, with fewer entry-level analysts but more mid-level orchestration roles. It preserves hierarchy while acknowledging that the bottom of the pyramid has less to do.

    The more radical option is what Pye calls “flipping the pyramid”: small teams of senior and mid-level consultants tackling specific challenges, supported by AI systems rather than junior staff. Boutique consultancies have operated variations of this model for years. What AI changes is the scale at which it becomes viable.

    But neither model addresses the question that should concern the industry most: if junior roles contract, where do future senior consultants come from?

    The traditional pyramid functioned as a training pipeline. Graduates entered, learned the craft through years of analytical work, and developed into the experienced practitioners clients now prize. Close that entry point, and the industry faces a slow-motion skills crisis — a generation of senior consultants with no successors trained in the discipline.

    Pye’s answer is that consulting firms will increasingly recruit mid-career professionals who have already developed sector expertise elsewhere. The career path inverts: specialise first in an industry, then move into consulting.

    This is plausible but raises its own problems. The consulting skill set — structured problem solving, client management, the ability to diagnose organisational dysfunction — is not the same as industry expertise. A decade in financial services does not automatically produce someone who can run a transformation programme. The two capabilities overlap, but they are not identical.

    The real risk is not speed — it is the talent pipeline

    The firms that are moving fastest on AI are not necessarily the ones best positioned for the long term. Accenture is tracking AI logins for promotion decisions. OpenAI has formed alliances with McKinsey, BCG, Accenture, and Capgemini to deploy its enterprise AI platform. Capgemini’s CEO is counselling patience, arguing that deploying AI ahead of organisational readiness wastes both money and credibility.

    Each response reflects a different bet on how quickly the pyramid will change — and how to navigate the transition without breaking the firm’s ability to develop talent.

    The consulting industry is not being dismantled by AI. It is being redesigned, unevenly, firm by firm, with no consensus on the target operating model. The firms that get the balance right — reducing the base without severing the pipeline that produces tomorrow’s senior partners — will define what the industry looks like in a decade. The ones that treat this as a simple cost-cutting exercise will find, in five years, that they have cut too deep in exactly the wrong place.

  • Capgemini’s CEO Makes the Unfashionable Case for Pacing Your AI Investment

    There is a particular kind of courage in telling a room full of executives to slow down. Aiman Ezzat, CEO of Capgemini, has been doing exactly that – and his reasoning deserves more attention than the typical “move fast or die” narrative that dominates AI strategy discussions.

    “You don’t want to be too ahead of the learning curve,” Ezzat told Fortune in February. “If you are, you’re investing and building capabilities that nobody wants.”

    Coming from the head of a €22.5 billion consultancy that has trained 310,000 employees on generative AI and is actively building labs for quantum computing, 6G, and robotics, this is not a counsel of inaction. It is a strategic position on pacing – one that puts Ezzat at odds with much of the technology industry’s current mood.

    The FOMO problem

    The fear of missing out on AI has become a boardroom affliction. Boston Consulting Group reports that half of CEOs now believe their job is at risk if AI investments fail to deliver returns. That pressure creates a predictable dynamic: spend big, move fast, worry about outcomes later.

    The data suggests the worry-later approach is not working. EY research shows that while 88% of employees report using AI at work, organisations are failing to capture up to 40% of the potential benefits. In the UK, only 21% of workers felt confident using AI as of January 2026. The tools are arriving faster than the capacity to use them well.

    Ezzat’s argument is that this gap is not a technology problem. It is a pacing problem. Companies are deploying AI capabilities ahead of their organisation’s ability to absorb them – and ahead of genuine customer demand for the outcomes those capabilities promise.

    AI is a business, not a technology

    The more substantive part of Ezzat’s case is about framing. Too many leadership teams, he argues, treat AI as “a black box that’s being managed separately” – a technology initiative bolted onto the existing business rather than a force reshaping how the business operates.

    “The question you have to focus on is: ‘How can your business be significantly disrupted by AI?’” Ezzat says. “Not ‘How is your finance team going to become more efficient?’ I’m sure your CFO will deal with that at the end of the day.”

    The distinction matters. Departmental efficiency projects – automating invoice processing, summarising meeting notes, generating marketing copy – are the low-hanging fruit that most enterprises are picking right now. They deliver incremental gains but rarely transform a business model. The harder question, the one Ezzat wants CEOs to sit with, is whether AI fundamentally changes what a company sells, how it competes, or what its customers expect.

    That question takes time to answer well. Rushing it produces expensive experiments that solve the wrong problems.

    The trust deficit

    Perhaps the most underexplored part of Ezzat’s argument is about human trust. “How do you get humans to trust the agent?” he asks. “The agent can trust the human, but the human doesn’t really trust the agent.”

    This cuts to a practical reality that technology roadmaps tend to gloss over. Agentic AI – systems designed to take autonomous actions rather than simply generate content – is the next wave of enterprise deployment. As maddaisy.com noted when covering Capgemini’s role in the OpenAI Frontier Alliance, the gap between a capable AI platform and a working enterprise deployment remains stubbornly wide. Trust is a significant part of why.

    Employees who do not trust AI agents will find ways to work around them. Managers who cannot explain AI-driven decisions to clients will revert to manual processes. Organisations that deploy autonomous systems faster than their culture can absorb them will create friction, not efficiency.

    Ezzat draws an analogy to ergonomics – the mid-twentieth century discipline of designing tools for humans rather than forcing humans to adapt to tools. “Bad chairs lead to bad backs,” he observes. “Bad AI is likely to be far more consequential.”

    Consistent with Capgemini’s own playbook

    What makes Ezzat’s position credible is that Capgemini’s recent actions align with it. The company’s approach has been to invest broadly but scale selectively.

    As maddaisy.com’s analysis of the company’s 2025 results highlighted, generative AI bookings rose above 10% of total bookings in Q4 – meaningful but not yet dominant. The company maintains labs for emerging technologies including quantum and 6G, keeping a foot in multiple possible futures without betting the firm on any single one.

    Meanwhile, the company added 82,300 offshore workers in 2025 – largely through the WNS acquisition – while simultaneously earmarking €700 million for workforce restructuring. The message is clear: AI changes the shape of the workforce, but it does not eliminate the need for one. Building the human infrastructure to deliver AI at scale takes as much investment as the technology itself.

    The metaverse lesson

    Ezzat’s most pointed comparison is to the metaverse – a technology that commanded billions in corporate investment before the market concluded that customer demand had been dramatically overstated. Capgemini itself experimented with a metaverse lab. Mark Zuckerberg renamed his company around it. Now, as Ezzat puts it, “like air fryers, its time may now have passed.”

    The parallel is not that AI will follow the metaverse into irrelevance – the use cases are far more concrete, and the enterprise adoption data is already stronger. The point is about the cost of overcommitment. Companies that invested heavily in metaverse capabilities before the market was ready wrote off those investments. The same risk exists with AI, particularly in areas like agentic systems where the technology’s capability is advancing faster than organisational readiness to use it.

    Ezzat’s prescription is agility over ambition: small pilots, constant monitoring, and the willingness to scale rapidly when adoption genuinely accelerates. “We have to be investing – but not too much – to be able to be aware of the technology, following at the speed to make sure that we are ready to scale when the adoption starts to accelerate.”

    What this means for practitioners

    For consultants advising clients on AI strategy, Ezzat’s framework offers a useful counterweight to the prevailing urgency. The question is not whether to invest in AI – that debate is settled. The question is how to pace that investment so that capability, demand, and organisational readiness move roughly in step.

    Companies that get the pacing right will avoid the twin traps of overinvestment (building capabilities nobody wants) and underinvestment (being caught flat-footed when adoption accelerates). In a market where half of CEOs fear for their jobs over AI outcomes, the discipline to move at the right speed – rather than the fastest speed – may prove to be the more valuable skill.

  • Vibe Coding Enters the Enterprise. The Governance Question Follows It In.

    When Andrej Karpathy coined the term “vibe coding” in early 2025, he was describing something informal — a developer giving in to the flow of conversation with an AI assistant, accepting whatever code it generated, and iterating by feel rather than by specification. It was a shorthand for a new way of working that felt more like directing than engineering.

    Fourteen months later, the term has migrated from developer Twitter into enterprise press releases. Pegasystems announced this week that its Blueprint platform now offers an “end-to-end vibe coding experience” for designing mission-critical workflow applications. Salesforce has embedded similar capabilities into Agentforce. Gartner, in a May 2025 report titled Why Vibe Coding Needs to Be Taken Seriously, predicted that 40 per cent of new enterprise production software will be created using vibe coding techniques by 2028. What started as a solo developer’s guilty pleasure is being repackaged as an enterprise strategy.

    The question is whether the repackaging addresses the risks, or merely relabels them.

    From Slang to Sales Pitch

    The appeal of vibe coding in an enterprise context is straightforward. Natural language replaces formal specification. Business users can describe what they want in conversational terms — a workflow, an approval chain, a customer-facing process — and an AI assistant translates that intent into a working application. Development cycles that previously took months collapse into days or hours. Stakeholder alignment happens at the prototype stage rather than after months of requirements gathering.

    Pega’s implementation illustrates the model. Users converse with an AI assistant using text or speech to design applications, refine workflows, define data models, and build interfaces. They can switch between conversational input and traditional drag-and-drop modelling at any point. Completed designs deploy directly into Pega’s platform as live, governed workflows. The company’s chief product officer, Kerim Akgonul, framed it as “the excitement and speed of vibe coding” combined with “enterprise-grade governance, security, and predictability.”

    That framing is telling. Enterprise vendors are not adopting vibe coding wholesale — they are domesticating it. The original concept involved a developer accepting AI-generated code on trust, with minimal review. The enterprise version keeps the conversational interface but routes the output through structured frameworks, predefined best practices, and platform-level guardrails. Whether that still qualifies as vibe coding or is simply a new marketing label for low-code development with an AI front end is an open question.

    The Numbers Behind the Hype

    Gartner’s 40 per cent prediction is eye-catching, but it deserves scrutiny. The firm also projects that 90 per cent of enterprise software engineers will use AI coding assistants by 2028, up from under 14 per cent in early 2024. These are not niche forecasts — they describe a wholesale transformation of how software gets built.

    The market signals support the direction. Y Combinator reported that a quarter of its Winter 2025 startup cohort had codebases that were 95 per cent AI-generated. AI-native SaaS companies are achieving 100 per cent year-on-year growth rates compared with 23 per cent for traditional SaaS. Pega’s own Q4 2025 results showed 17 per cent annual contract value growth and a 33 per cent surge in cloud revenue, with management attributing much of the acceleration to Blueprint adoption.

    But there is a less comfortable set of numbers. A Veracode report from 2025 found that nearly 45 per cent of AI-generated code introduced at least one security vulnerability. Linus Torvalds, creator of Linux, publicly cautioned that vibe coding “may be a horrible idea from a maintenance standpoint” for production systems requiring long-term support. And Gartner’s own research acknowledges that only six per cent of organisations implementing AI become “high performers” achieving significant financial returns.

    The Shadow Already Has a Name

    For regular readers of maddaisy, these risks will sound familiar. When we examined shadow AI in February, the data showed 37 per cent of employees had already used AI tools without organisational permission — including coding assistants plugged into development environments without security review. Vibe coding, in its original ungoverned form, is essentially shadow AI with a better name.

    The enterprise vendors’ pitch — governed vibe coding, with guardrails — is a direct response to this problem. Rather than fighting the tide of developers and business users reaching for AI-assisted tools, platforms like Pega and Salesforce are channelling that energy through controlled environments. It is the same pattern that played out with cloud computing a decade ago: shadow IT became sanctioned cloud adoption once the governance frameworks caught up.

    The difference this time is speed. Cloud adoption played out over years. Vibe coding is moving in months. And as maddaisy’s coverage of agentic AI drift highlighted, AI-generated systems do not fail suddenly — they degrade gradually, in ways that are harder to detect than traditional software failures. An application built through conversational prompts, where the development team may not fully understand the underlying logic, amplifies that risk considerably.

    The Governance Gap Is the Real Story

    The enterprise vibe coding pitch rests on a critical assumption: that platform-level guardrails can substitute for developer-level understanding. In regulated industries — financial services, healthcare, government — this assumption will be tested quickly and publicly.

    The immediate challenge is not whether vibe coding works in a demo. It clearly does. The challenge is what happens six months into production, when the original conversational prompts have been refined dozens of times, the underlying models have been updated, and the people who designed the workflows have moved on. That is the maintenance problem Torvalds flagged, and it maps directly onto the agentic drift pattern: small, individually reasonable changes accumulating into a system whose behaviour no longer matches its original intent.

    Consultants and technology leaders evaluating vibe coding platforms should be asking three questions. First, can you audit the reasoning chain — not just the output, but why the system built what it built? Second, what happens when the AI model underneath is updated — does the application need to be revalidated? Third, who owns the maintenance burden when the person who “vibe coded” the application is no longer available?

    What to Watch

    Enterprise vibe coding is not a fad. The productivity gains are real, the vendor investment is substantial, and the Gartner forecasts — even if directionally approximate — point to a genuine shift in how software gets built. PegaWorld 2026, scheduled for June in Las Vegas, will likely showcase dozens of enterprise vibe coding implementations.

    But the narrative developing around it echoes the early days of every enterprise technology wave: speed first, governance second. The organisations that get this right will be those that treat vibe coding as a development interface, not a development shortcut — using the conversational speed to accelerate design while maintaining the engineering discipline to ensure what gets built can be understood, audited, and maintained over time.

    The vibes are entering the enterprise. The question is whether the rigour follows them in.

  • Insurtech’s AI-Fuelled Five Billion Dollar Comeback — And the Question the Industry Has Not Answered

    Global insurtech funding reached $5.08 billion in 2025, up 19.5% from $4.25 billion the year before. It is the first annual increase since 2021 — and, according to Gallagher Re’s latest quarterly report, it marks a fundamentally different kind of recovery from the one the sector last enjoyed.

    The 2021 boom was driven by venture capital chasing consumer-facing disruptors. The 2025 comeback is driven by insurers and reinsurers themselves investing in operational AI. That distinction matters far more than the headline number.

    The money is coming from inside the house

    In 2025, insurers and reinsurers made 162 private technology investments into insurtechs — more than in any prior year on record. This is not outside capital speculating on disruption. It is the industry itself funding its own modernisation, a shift Gallagher Re describes as a “changing of the guard” in the insurtech investor community.

    The fourth quarter was particularly striking. Funding hit $1.68 billion — a 66.8% increase over Q3 and the strongest quarterly figure since mid-2022. More than 100 insurtechs raised capital in the quarter, the first time that has happened since early 2024, and mega-rounds (deals exceeding $100 million) returned in force, with 11 such rounds totalling $1.43 billion for the full year, up from six in 2024.

    Property and casualty insurtech funding rebounded 34.9% to $3.49 billion, driven by companies like CyberCube, ICEYE, Creditas, Federato, and Nirvana, which collectively secured $663 million in Q4 alone. Life and health insurtech, by contrast, declined slightly — a 4.6% dip that underlines where the industry sees its most pressing operational gaps.

    Two-thirds of the money follows AI

    The most telling statistic in the report is this: two-thirds of all insurtech funding in 2025 — $3.35 billion across 227 deals — went to AI-focused firms. By Q4, that share had climbed to 78%.

    Andrew Johnston, Gallagher Re’s global head of insurtech, frames this as convergence rather than a trend: “Over time, we see AI becoming so integrated into insurtech that the two may well become synonymous — in much the same way as we could already argue that ‘insurtech’ is itself a meaningless label, because all insurers are technology businesses now.”

    That trajectory is visible in the deals themselves. mea, an AI-native insurtech, raised $50 million from growth equity firm SEP in February — its first external capital after years of profitable organic growth. The company’s platform, already processing more than $400 billion in gross written premium across 21 countries, automates end-to-end operations for carriers, brokers, and managing general agents. mea claims its AI can cut operating costs by up to 60%, targeting the roughly $2 trillion in annual industry operating expenses where manual workflows persist.

    At the seed stage, General Magic raised $7.2 million for AI agents that automate administrative tasks for insurance teams — reducing quote generation time from approximately 30 minutes to under three in early deployments with major insurers.

    Profitability, not just growth

    What separates the 2025 wave from the 2021 boom is that several insurtechs are now proving they can make money, not just raise it.

    Kin Insurance, which focuses on high-catastrophe-risk regions, reported $201.6 million in revenue for 2025 — a 29% increase — with a 49% operating margin and a 20.7% adjusted loss ratio. Hippo, another property-focused insurtech, reversed its 2024 net loss with $58 million in net income, driven by improved underwriting and a deliberate shift away from homeowners insurance toward more profitable lines.

    These are not unicorn-valuation stories. They are companies demonstrating operational discipline — the kind of results that explain why insurers and reinsurers, rather than venture capitalists, are now leading the investment.

    The B2B shift

    Gallagher Re’s data reveals another structural change worth watching. Nearly 60% of property and casualty deals in 2025 went to business-to-business insurtechs — a 12 percentage point increase from 2021’s funding boom. Meanwhile, the deal share for lead generators, brokers, and managing general agents fell to 35%, the lowest on record.

    The implication is clear: capital is flowing toward technology that improves how existing insurers operate, not toward new entrants trying to replace them. The disruptor narrative of the early 2020s has given way to something more pragmatic — and, arguably, more durable.

    This parallels a pattern visible across financial services. As maddaisy noted when examining Lloyds Banking Group’s AI programme, established institutions are increasingly treating AI not as an innovation experiment but as core operational infrastructure — and measuring it accordingly.

    The question the industry has not answered

    For all the funding momentum, Johnston raises a challenge that the sector has yet to confront seriously: the “so what” problem.

    “As the implementation of AI starts to deliver efficiency gains, it is imperative that the industry works out how to best use all of this newly freed up time and resource,” he writes.

    This is not a hypothetical. If mea can genuinely reduce operating costs by 60% for a carrier, that frees up a substantial portion of the 14 percentage points of combined ratio currently consumed by operations. The question is whether that freed capacity translates into better underwriting, deeper risk analysis, and improved customer outcomes — or whether it simply gets absorbed into margin without changing how insurance fundamentally works.
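
    The arithmetic behind that "substantial portion" is worth making explicit. In the sketch below, the 14-point operating load and the 60% reduction are the figures cited above; the baseline combined ratio is a hypothetical assumption added for illustration.

    ```python
    # Back-of-envelope impact of mea's claimed savings on a carrier's economics.
    # The 98% baseline combined ratio is an assumed, illustrative figure.

    operating_load_pts = 14.0      # points of combined ratio consumed by operations
    claimed_reduction = 0.60       # mea's claimed cut to operating costs

    freed_pts = operating_load_pts * claimed_reduction
    print(f"Combined ratio improvement: {freed_pts:.1f} points")  # -> 8.4 points

    baseline_ratio = 98.0          # assumed: a carrier barely making an underwriting profit
    print(f"Illustrative combined ratio: {baseline_ratio:.1f}% -> {baseline_ratio - freed_pts:.1f}%")
    ```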

    The broker market is already feeling the tension. In February, insurance broker stocks dropped roughly 9% after OpenAI approved the first AI-powered insurance apps on ChatGPT, enabling consumers to receive quotes and purchase policies within the conversation. Most analysts called the selloff overdone — commercial broking remains complex enough to resist near-term disintermediation — but the episode illustrated how quickly market sentiment can shift when AI moves from back-office tooling to customer-facing distribution.

    What to watch

    The $5 billion figure is a milestone, but the real signal is in its composition. Insurtech funding is no longer a venture capital bet on disruption. It is the insurance industry’s own investment in operational AI — led by incumbents, focused on B2B infrastructure, and increasingly backed by profitability rather than just promise.

    Whether that investment translates into genuinely better insurance — not just cheaper operations — depends on how the industry answers Johnston’s question. The money is flowing. The efficiency gains are materialising. What the sector does with them will determine whether this comeback is a lasting structural shift or just the next chapter of doing the same things with fewer people.

  • Accenture Will Track AI Logins for Promotions. The Risk Is Measuring Compliance, Not Competence.

    Accenture has begun tracking how often senior employees log into its AI tools — and will factor that usage into promotion decisions. An internal email, reported by the Financial Times, put it plainly: “Use of our key tools will be a visible input to talent discussions.” For a firm that sells AI transformation to the world’s largest organisations, the message to its own workforce is unmistakable: adopt or stall.

    The policy is not without logic. Accenture has invested heavily in AI readiness — 550,000 employees trained in generative AI, up from just 30 in 2022, backed by $1 billion in annual learning and development spending. But training people and getting them to change how they work are two very different problems. The promotion-linked tracking is an attempt to close that gap by force.

    The credibility problem

    This move arrives at a moment when Accenture’s external positioning makes internal adoption a matter of commercial credibility. As maddaisy.com reported last week, Accenture is one of four firms named in OpenAI’s Frontier Alliance — tasked with building the data architecture, cloud infrastructure, and systems integration work needed to deploy AI agents at enterprise scale. It is difficult to sell that capability convincingly if your own senior managers are not using the tools.

    The policy applies specifically to senior managers and associate directors, with leadership roles now requiring what Accenture calls “regular adoption” of AI. The firm is tracking weekly logins to its AI platforms for certain senior staff, though employees in 12 European countries and those on US federal contracts are excluded — a pragmatic nod to varying data protection regimes and security requirements.

    The resistance is the interesting part

    What makes this story more than a policy announcement is the reaction. Some senior employees have questioned the value of the tools outright, with one describing them as “broken slop generators.” Another told the Financial Times they would “quit immediately” if the tracking applied to them.

    That resistance is worth taking seriously, not dismissing. It maps directly onto a pattern maddaisy.com has been tracking. Research published earlier this month found that only 34% of employees say their organisation has communicated AI’s workplace impact “very clearly” — a figure that drops to 12% among non-senior staff. When people do not understand why they are being asked to use a tool, mandating its use tends to produce compliance rather than competence.

    Harvard Business Review research, cited in that same analysis, identified three psychological needs that determine whether employees embrace or resist AI: competence (feeling effective), autonomy (feeling in control), and relatedness (maintaining meaningful connections with colleagues). A policy that monitors logins and ties them to career progression addresses none of these. It measures activity. It says nothing about whether that activity is useful.

    Logins are not outcomes

    This is the core tension. Accenture’s leadership knows that senior adoption is a bottleneck — industry observers note that older managers are often “less comfortable with technology and more wedded to established working methods.” CEO Julie Sweet has framed AI adoption as existential, telling analysts that the company is “exiting employees” in areas where reskilling is not possible. The 11,000 layoffs announced in September reinforced the point.

    But tracking logins conflates presence with productivity. A senior manager who logs in weekly to check a dashboard is counted the same as one who has genuinely integrated AI into client delivery. The metric captures the floor, not the ceiling.

    This echoes a broader concern maddaisy.com has documented. UC Berkeley researchers found that employees using AI tools worked faster but not necessarily better — absorbing more tasks, blurring work-life boundaries, and entering a cycle of acceleration that resembled productivity but often was not. If Accenture’s policy drives more tool usage without more thoughtful tool usage, it risks producing exactly this outcome at scale.

    What this tells the rest of the industry

    Accenture is not alone in struggling with senior AI adoption. The challenge is structural across professional services. Deloitte’s 2026 CSO Survey found that while 95% of chief strategy officers expect AI to reshape their priorities, only 28% co-lead their organisation’s AI decisions. The people with the authority to mandate change are often the furthest from understanding it.

    Accenture’s approach is at least direct. Rather than hoping adoption trickles up from junior staff — who typically adopt new tools faster — it is applying pressure from the top. And the numbers suggest some urgency: with 750,000 employees and $70 billion in revenue, Accenture has grown enormously from its 275,000-person, $29 billion base in 2013. Maintaining that trajectory while its competitors embed AI into delivery models requires its own workforce to be fluent, not just trained.

    The risk, though, is that the policy optimises for the wrong signal. Organisations that have navigated AI adoption most effectively — and maddaisy.com has covered several — tend to share a common trait: they measure what AI enables people to do differently, not how often people open the application. Accenture’s policy would be considerably more compelling if it tracked client outcomes improved through AI-assisted work, or time freed for higher-value tasks, rather than weekly platform logins.

    The precedent matters more than the policy

    Whatever one makes of the specifics, Accenture has done something that most large organisations have avoided: it has made AI adoption an explicit, measurable condition of career advancement for senior leaders. That is a significant signal. It tells clients that Accenture is serious about practising what it sells. It tells employees that AI fluency is no longer optional at the leadership level.

    Whether it works depends on what happens next. If the tracking evolves toward measuring genuine integration — how AI changes the quality of work, not just the frequency of logins — Accenture could set a useful template for the industry. If it remains a blunt instrument that rewards compliance over competence, it will likely produce exactly the kind of performative adoption that gives AI transformation programmes a bad name.

    For consultants and enterprise leaders watching from outside, the lesson is practical: mandating AI adoption is easy; mandating it well is the hard part. The metric you choose to track will shape the behaviour you get.

  • PwC Built an AI That Can Actually Read Enterprise Spreadsheets. Here Is Why That Matters.

    Most enterprise AI demonstrations involve chatbots, code generation, or image synthesis — capabilities that are impressive but often disconnected from the workflows where organisations actually make decisions. PwC has taken a different approach. On 19 February, the firm announced a frontier AI agent that can reliably reason across complex, multi-sheet enterprise spreadsheets — the kind of messy, formula-dense workbooks that underpin deals, risk assessments, and financial modelling across virtually every large organisation.

    The announcement would be easy to dismiss as incremental. It is, in fact, one of the more practically significant AI developments of the year so far.

    The Spreadsheet Problem No One Talks About

    AI has made rapid progress with text, images, and code. But enterprise spreadsheets have remained stubbornly resistant. The reason is structural: a typical enterprise workbook is not a neatly formatted data table. It is a sprawling, multi-sheet artefact containing hundreds of thousands of rows, cross-sheet formulas, hidden dependencies, embedded charts, and formatting inconsistencies accumulated over years of manual editing by multiple authors.

    Conventional AI systems — including the most advanced large language models — struggle with this complexity. They can process a clean CSV file or answer questions about a simple table. But ask them to trace a formula chain across five sheets in a workbook with 200,000 rows and inconsistent column headers, and accuracy collapses. For regulated industries where precision is non-negotiable — auditing, tax, financial due diligence — this limitation has kept spreadsheet analysis firmly in the domain of human practitioners.

    PwC’s agent addresses this directly. Combining multimodal pattern recognition with a retrieval-augmented architecture, the system can process up to 30 workbooks containing nearly four million cells. In internal benchmarks, it achieved roughly three times the accuracy of previously published methods while consuming 50% fewer tokens — an efficiency gain that reduces both cost and energy consumption.

    How It Works, Without the Hype

    The technical approach mirrors how experienced analysts actually work. Rather than attempting to ingest an entire workbook at once — a strategy that overwhelms even million-token context windows — the agent scans, indexes, and selectively retrieves relevant sections. It can jump across tabs, trace logic through formula chains, integrate visual elements like charts, and explain its reasoning with what PwC describes as “defensible precision.”
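
    To make the pattern concrete, here is a minimal sketch of the index-then-retrieve idea, in the spirit of the use cases below. It is not PwC’s implementation (those internals are not public): it assumes Python with the openpyxl library, and the helper names and file paths are invented for illustration.

    ```python
    # Hedged sketch of index-then-retrieve over spreadsheets -- NOT PwC's system.
    # Assumes openpyxl; build_index/retrieve_sheets and the file names are invented.
    from openpyxl import load_workbook

    def build_index(paths):
        """Stage 1: map structure only (sheet names and header rows),
        without pulling cell data into a model's context window."""
        index = {}
        for path in paths:
            wb = load_workbook(path, read_only=True)
            for name in wb.sheetnames:
                header = next(
                    wb[name].iter_rows(min_row=1, max_row=1, values_only=True), ()
                )
                index[(path, name)] = [str(h).lower() for h in header if h is not None]
            wb.close()
        return index

    def retrieve_sheets(index, query_terms):
        """Stage 2: targeted retrieval -- return only the sheets whose headers
        match the query, so extraction sees a small slice, not millions of cells."""
        terms = {t.lower() for t in query_terms}
        return [loc for loc, headers in index.items()
                if terms & {tok for h in headers for tok in h.split()}]

    # Usage: index once, then route each question to a handful of sheets.
    index = build_index(["controls_fy25.xlsx", "risk_register.xlsx"])  # hypothetical
    print(retrieve_sheets(index, ["income", "verification"]))
    ```

    The design choice worth noticing is the separation of stages: the cheap structural pass happens once, and every subsequent question pays only for the sheets it actually touches.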

    Two internal use cases illustrate the practical impact. In engagement documentation, PwC teams work with large, nominally standardised workbooks that document business processes and controls. In practice, these files vary significantly — column names shift, fields appear in different orders, structures change between engagements. The agent handles this in two stages: first mapping the workbook’s structure, then extracting specific details using targeted retrieval rather than brute-force ingestion.

    In risk assessment, each new set of files could break existing programmatic approaches because of formatting variations, forcing weeks of custom development per engagement. The agent indexes and extracts directly, regardless of those inconsistencies; PwC reports that configuration which once took weeks is now completed in hours.

    The ROI Connection

    The timing of this announcement is worth noting. Earlier this month, maddaisy examined PwC’s own 2026 Global CEO Survey, which found that 56% of chief executives could not point to measurable revenue gains from their AI investments. Only 12% reported achieving both revenue growth and cost reduction from AI programmes.

    The spreadsheet agent is, in a sense, PwC’s answer to its own data. Rather than pursuing the kind of ambitious, organisation-wide AI transformation that the survey suggests most companies are failing at, this tool targets a specific, bounded problem: making AI useful where decisions actually get made. Spreadsheets are unglamorous, but they remain the substrate of enterprise decision-making across every industry. If AI cannot work reliably with them, the ROI gap that PwC’s own research documented will persist.

    Matt Wood, PwC’s Commercial Technology and Innovation Officer, was notably direct about the origin: “This didn’t start as a research project. It started because our teams were spending weeks manually tracing logic through workbooks that no existing tool could handle.”

    A Broader Pattern: Consulting Firms as Technology Builders

    This development fits a pattern that maddaisy has been tracking across the consulting industry. Firms are not merely advising clients on AI — they are building proprietary capabilities that change the economics of their own delivery. McKinsey’s 25,000 AI agents. Accenture’s ongoing automation of delivery operations. Now PwC, with a tool that converts weeks of manual work into hours.

    The competitive implications are significant. A firm that can process complex financial workbooks in hours rather than weeks can bid more aggressively on engagements, take on more work with the same headcount, and offer the outcome-based pricing models that clients increasingly prefer. The spreadsheet agent is not just a productivity tool — it is a structural advantage in the shifting economics of professional services.

    What Practitioners Should Watch

    For consultants and enterprise leaders, the PwC announcement carries a practical message: the AI value gap may start closing not through headline-grabbing deployments, but through targeted tools that tackle specific bottlenecks in existing workflows.

    The broader FP&A landscape is moving in the same direction. IBM’s 2026 analysis of financial planning trends highlights that 69% of CFOs now consider AI integral to their finance transformation strategy, with the primary applications centring on data ingestion, budget analysis, and narrative generation — precisely the kind of spreadsheet-adjacent work that PwC’s agent addresses.

    The question is no longer whether AI can handle enterprise data complexity. It is whether organisations will deploy these capabilities against the right problems — the mundane, time-intensive, precision-critical workflows where the return on investment is most measurable and most immediate.

    PwC appears to have started there. Given the firm’s own data on the AI ROI crisis, that is arguably the most credible place to begin.

  • Agentic AI Drift: The Silent Production Risk No One Is Measuring

    When maddaisy examined the agentic AI governance gap last week, the focus was on a structural mismatch: three-quarters of enterprises planning to deploy agentic AI, but only one in five with a mature governance model. That gap remains wide. But a more specific — and arguably more dangerous — operational risk is now coming into focus: agentic AI systems do not fail suddenly. They drift.

    A recent analysis published by CIO makes the case plainly. Unlike earlier generations of AI, which tend to produce identifiable errors — a wrong classification, a hallucinated fact — agentic systems degrade gradually. Their behaviour evolves incrementally as models are updated, prompts are refined, tools are added, and execution paths adapt to real-world conditions. For long stretches, everything appears fine. KPIs hold. No alarms fire. But underneath, the system’s risk posture has already shifted.

    The Problem with Demo-Driven Confidence

    Most organisations still evaluate agentic AI the way they evaluate any software feature: through demonstrations, curated test scenarios, and human judgment of output quality. In controlled settings, this looks adequate. Prompts are fresh, tools are stable, edge cases are avoided, and execution paths are short and predictable.

    Production is different. Prompts evolve. Dependencies fail intermittently. Execution depth varies. New behaviours emerge over time. Research from Stanford and Harvard has examined why many agentic systems perform convincingly in demonstrations but struggle under sustained real-world use — a gap that grows wider the longer a system runs.

    The result is a pattern that will be familiar to anyone who has managed complex software in production: a system passes all its review gates, earns early trust, and then becomes brittle or inconsistent months later, without any single change that clearly broke it. The difference with agentic AI is that the degradation is harder to detect, because the system’s outputs can still look reasonable even as the reasoning behind them has shifted.

    What Drift Actually Looks Like

    The CIO analysis includes a telling case study from a credit adjudication pilot. An agent designed to support high-risk lending decisions initially ran an income verification step consistently before producing recommendations. Over time, a series of small, individually reasonable changes — prompt adjustments for efficiency, a new tool for an edge case, a model upgrade, tweaked retry logic — caused the verification step to be skipped in 20 to 30% of cases.

    No single run produced an obviously wrong result. Reviewers often agreed with the recommendations. But the way the agent arrived at those recommendations had fundamentally changed. In a credit context, that difference carries real financial and regulatory consequences.

    This is the nature of agentic drift: it is not a bug. It is the predictable outcome of complex, adaptive systems operating in changing environments. Two executions of the same agent with the same inputs can legitimately differ — that stochasticity is inherent to how modern agentic systems work. But it also means that point-in-time evaluation, one-off tests, and spot checks are structurally insufficient for production risk management.
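
    Detecting this kind of drift is less about sophisticated tooling than about logging execution traces and comparing them over time. As a minimal sketch (the trace format, step names, and weekly bucketing are illustrative assumptions, not details from the CIO case study):

    ```python
    # Hedged sketch: track how often a given step actually runs, week by week.
    # The trace format and step names are invented for illustration.
    from collections import defaultdict

    def step_rate_by_week(traces, step="verify_income"):
        """For each week, the fraction of runs whose trace contains `step`.
        A stable agent should hold this near its baseline rate."""
        ran, total = defaultdict(int), defaultdict(int)
        for week, steps in traces:  # traces: [(week_label, [step_name, ...]), ...]
            total[week] += 1
            ran[week] += int(step in steps)
        return {w: ran[w] / total[w] for w in sorted(total)}

    traces = [
        ("2026-W01", ["fetch_docs", "verify_income", "recommend"]),
        ("2026-W01", ["fetch_docs", "verify_income", "recommend"]),
        ("2026-W09", ["fetch_docs", "recommend"]),  # verification silently skipped
        ("2026-W09", ["fetch_docs", "verify_income", "recommend"]),
    ]
    print(step_rate_by_week(traces))  # {'2026-W01': 1.0, '2026-W09': 0.5}
    ```

    Against a baseline near 100%, a rate drifting toward 70 to 80% is precisely the change the pilot above only recognised in hindsight.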

    From Policy to Diagnostics

    When maddaisy covered the shadow AI governance challenge earlier this month, one theme was clear: governance frameworks are necessary but not sufficient. They define ownership, policies, escalation paths, and controls. What they often lack is an operational mechanism to answer a deceptively simple question: has the agent’s behaviour actually changed?

    Without that evidence, governance operates in the dark. Policy defines what should happen. Diagnostics establish what is actually happening. When measurement is absent, controls develop blind spots in precisely the live systems where agentic risk tends to accumulate.

    The Cloud Security Alliance has begun framing this as “cognitive degradation” — a systemic risk that emerges gradually rather than through sudden failure. Carnegie Mellon’s Software Engineering Institute has similarly emphasised the need for continuous testing and evaluation discipline in complex AI-enabled systems, drawing parallels to how other high-risk software domains manage operational risk.

    What Practitioners Should Watch For

    The emerging consensus points toward several operational principles for managing agentic drift:

    Behavioural baselines over output checks. No single execution is representative. What matters is how behaviour shows up across repeated runs under similar conditions. Organisations need to establish baselines — not for what an agent should do in the abstract, but for how it has actually behaved under known conditions — and then monitor for sustained deviations.

    Separate configuration changes from behavioural evidence. Prompt updates, tool additions, and model upgrades are important signals, but they are not evidence of drift on their own. What matters is persistence: transient deviations are often noise in stochastic systems, while sustained behavioural shifts across time and conditions are where risk begins to emerge. A minimal sketch of such a persistence rule appears after the final principle below.

    Treat agent behaviour as an operational signal. Internal audit teams are asking new questions about control and traceability. Regulators are paying closer attention to AI system behaviour. Platform teams are under growing pressure to demonstrate stability in live environments. “It looked fine in testing” is no longer a defensible operational posture, particularly in sectors — financial services, healthcare, compliance — where subtle behavioural changes carry real consequences.
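
    Building on the trace-rate sketch earlier, here is one hedged way to encode that persistence rule. The tolerance and the consecutive-window threshold are illustrative assumptions, not an established standard:

    ```python
    # Hedged sketch: escalate only when deviation from baseline persists.
    # Tolerance and persistence values are illustrative, not industry norms.
    def sustained_drift(windowed_rates, baseline, tolerance=0.10, persistence=3):
        """Return the first window at which the rate has deviated from baseline
        by more than `tolerance` for `persistence` consecutive windows."""
        streak = 0
        for window, rate in windowed_rates:
            streak = streak + 1 if abs(rate - baseline) > tolerance else 0
            if streak >= persistence:
                return window
        return None  # deviations, if any, were transient

    rates = [("W05", 0.97), ("W06", 0.82), ("W07", 0.78),
             ("W08", 0.74), ("W09", 0.71)]
    print(sustained_drift(rates, baseline=0.98))  # -> 'W08'
    ```

    The point of the design is that the detector never reacts to a single run; it escalates only once a deviation survives several windows, which is the transient-versus-sustained distinction drawn above.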

    The Observability Gap

    This is, ultimately, the next chapter in the governance story maddaisy has been tracking. The first chapter — covered in the enforcement era analysis — was about moving from principles to rules. The second, examined through Deloitte’s enterprise data, was the gap between strategic confidence and operational readiness. This third chapter is more specific and more technical: the gap between having governance frameworks and having the observability infrastructure to make them work.

    The goal is not to eliminate drift. Drift is inevitable in adaptive systems. The goal is to detect it early — while it is still measurable, explainable, and correctable — rather than discovering it through incidents, audits, or post-mortems. Organisations that build this capability will be better positioned to deploy agentic AI at scale with confidence. Those that do not will continue to be surprised by systems that appeared stable, until they were not.

    For consultants advising on enterprise AI deployments, the implication is practical: governance reviews that stop at policy documentation are incomplete. The question to ask is not just whether a client has an AI governance framework, but whether they can tell you how their agents are behaving today compared to three months ago. If the answer is silence, that is where the work begins.