Anthropic's May release of 10 purpose-built AI agents for banking, insurance and asset management has already spooked the software incumbents—FactSet and Morningstar lost billions in market cap within hours. The agents handle pitch drafting, financial statement review and compliance escalation across regulated sectors. For mid-market UK firms, the headline matters less than the subtext: the market is now moving fast enough that generic AI models trained to refuse anything uncertain are being replaced by task-specific agents designed to actually complete work. But here is the uncomfortable truth the market is not discussing: Anthropic's agents inherit the same fundamental liability problem that every AI platform in regulated sectors faces. They will hallucinate. They will miss material detail. And under FCA Consumer Duty PS22/9, SRA Code 7.1.1, PRA SS1/23 and ICO UK GDPR principles, that risk sits squarely with the firm, not the vendor.
This story is part of a pattern. Harvey AI built legal agents. Microsoft released Copilot Pro with financial services modules. Luminance and Legora added agentic workflows to document review. Each announcement assumes the same thing: that regulated firms want agents that move fast and escalate to humans. But the pattern reveals a deeper misreading of what UK regulated practice actually needs. The FRC's ISA (UK) 315 and equivalent standards now require firms to understand their AI control environment with the same rigour as their data controls. Most mid-market practices do not have that infrastructure. They have models. They do not have auditable decision trails. They have workflows. They do not have systematic monitoring of where the model failed and why. The vendor race to deploy agents has outpaced the buyer's ability to safely govern them.
Trovix's view is direct: the gap is not between agents and non-agents. The gap is between agents with real governance and agents without it. Anthropic's announcement is technically competent. But it treats compliance escalation as a feature, not a safety mechanism. An agent that drafts a pitch deck or pre-screens a document only becomes valuable in a regulated firm when you can prove three things: one, that it made the decision you think it made; two, that it applied the rules it was supposed to apply; three, that when it failed, you caught it before a client or regulator did. Harvey and Luminance have built some of this infrastructure. Most commercial agents have not. Trovix Sift was designed precisely for this—to sit between the agent and the work, extracting and verifying what the agent actually found, not what it claims it found. Trovix Watch handles the monitoring layer, so you know when regulatory rules change faster than the agent's training data can keep up.
If your practice is evaluating agents, whether from Anthropic, Microsoft or anyone else, ask three non-negotiable questions before you deploy. First: can you see inside the decision? Does the system show you what data the agent reviewed, what rules it applied, and where it hesitated? Second: do you have the audit trail? Can you reconstruct, six months later, why the agent made a particular call on a particular client matter? Third: do you have monitoring? Are you tracking agent accuracy per task type, per user, per week, the way you would track a fee-earner's error rate? If the vendor cannot answer these clearly, you are not buying an agent. You are buying a lawsuit waiting to happen. The FCA is watching. The SRA is watching. And they will ask these questions if something goes wrong.
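For practices that want to see what the third question looks like in concrete terms, the following is a minimal Python sketch of accuracy tracking per task type, per user, per week. It is an illustration only, not how any vendor's product works: the `ReviewedDecision` record, its field names and the `weekly_accuracy` function are all hypothetical, and it assumes a human reviewer has already marked each agent output as correct or not.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReviewedDecision:
    """One agent output checked by a human reviewer (hypothetical schema)."""
    task_type: str    # e.g. "pitch_draft" or "statement_review" (assumed labels)
    user: str         # the fee-earner who ran the agent
    decided_on: date  # date the agent produced the output
    correct: bool     # the reviewer's verdict on that output


def weekly_accuracy(decisions):
    """Return {(task_type, user, iso_year, iso_week): share judged correct}."""
    totals = defaultdict(lambda: [0, 0])  # key -> [correct count, reviewed count]
    for d in decisions:
        iso = d.decided_on.isocalendar()
        key = (d.task_type, d.user, iso[0], iso[1])
        totals[key][0] += int(d.correct)
        totals[key][1] += 1
    return {key: good / seen for key, (good, seen) in totals.items()}


# Made-up sample data, purely to show the shape of the output.
sample = [
    ReviewedDecision("statement_review", "a.khan", date(2025, 6, 2), True),
    ReviewedDecision("statement_review", "a.khan", date(2025, 6, 4), False),
    ReviewedDecision("pitch_draft", "a.khan", date(2025, 6, 3), True),
]
rates = weekly_accuracy(sample)
for key, rate in sorted(rates.items()):
    print(key, f"{rate:.0%}")
```

The point of keying on task type, user and ISO week together is that a falling rate in one cell, say statement review for one team in one week, becomes visible instead of being averaged away. A real deployment would also need the reviewed records themselves to be retained, since they double as the audit trail raised in the second question.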
Source: Bloomberg News