The Legal AI Security Illusion: How Vendors Sell Contracts Instead of Protection
The fundamental technical risk nobody in the legal AI industry wants to talk about, and what law firms should be asking in 2026.
I’ve spent a significant portion of my career at the intersection of regulated industries and emerging technology. As the first Chief Innovation Officer at the FDIC, I watched financial institutions make the same mistake repeatedly: they would adopt new technology, get comfortable with the contractual language wrapped around it, and convince themselves that comfort was the same thing as security. It never was. The 2008 financial crisis had a technology component that most people still don’t fully understand. Firms had outsourced critical functions to third-party providers and discovered, under conditions of stress, that their contractual protections were worth exactly what they could recover in litigation, which, by then, was beside the point.
I’m watching the same mistake happen right now in the legal industry, at scale, with AI.
The legal AI market has exploded. Harvey has reached an $11 billion valuation. Thomson Reuters paid $650 million for Casetext. LexisNexis has partnered with Harvey, integrated OpenAI, Anthropic, Mistral, Google, and Microsoft into its Protégé platform, and is marketing the result as a “fully encrypted legal AI environment.” Every one of these companies has a security page and data privacy agreements, and every one will tell you with great confidence that your client data is safe. What none of them will tell you is that there is a fundamental architectural gap between what their contracts promise and what their technology can guarantee. And in the legal industry, where even the existence of a representation can be privileged information, that gap can be existential.
The Architecture Problem Nobody’s Discussing
Every major legal AI platform is built the same way at its core. They take foundation models (GPT-4o, Claude, Gemini, Mistral) and build an application layer on top. That application layer is where most of their engineering effort goes: the legal fine-tuning, the retrieval augmented generation, the workflow orchestration, the user interface. This is genuine value-add work, and the best platforms do it well.
But here’s what that architecture means in practice: when a lawyer uploads a client document and asks Harvey or CoCounsel or Protégé to analyze it, meaningful fragments of that document travel to a foundation model for inference. And that inference happens inside infrastructure that these platforms do not own, do not operate, and cannot technically control.
Harvey routes data through OpenAI’s infrastructure on Azure, Anthropic’s infrastructure on AWS Bedrock, and Google’s infrastructure on Vertex AI. CoCounsel runs on both Google Cloud and Amazon AWS, routing to different model providers depending on the task.
LexisNexis’s Protégé accesses five separate model providers — OpenAI, Anthropic, Mistral, Google, and Microsoft — using what it calls “Best Fit” auto-routing, meaning your client’s document could touch any of them, in any combination, based on what the system determines is optimal at that moment.
Each of these platforms will tell you that their agreements with these providers include strict data handling commitments: no retention, no training, no human review. I believe them. These are real contractual commitments that enterprise API channels provide, as distinct from consumer-facing products.
But a contract is not an architectural control. A contract tells you what recourse you have after something goes wrong. Signing a contract does not make data leaks a technical impossibility.
What does make data leaks technically impossible at the inference layer is confidential computing. This is a class of hardware and software technology (Intel TDX, AMD SEV-SNP, Azure Confidential Compute) where computation happens inside a hardware-isolated enclave that is cryptographically sealed. Even the cloud provider operating the underlying infrastructure cannot inspect what’s being processed inside that enclave. The model runs, the inference happens, the output comes out, and at no point does any human or system outside the enclave have technical access to what went in.
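To make the enclave guarantee concrete, here is a deliberately simplified Python sketch of the attestation step that turns "trust us" into "verify us." Everything in it is invented for illustration: real Intel TDX and AMD SEV-SNP attestation uses hardware-rooted certificate chains and signed reports, not a shared HMAC key. The point is the shape of the check, not the cryptography.

```python
import hashlib
import hmac

# Simplified sketch of a confidential-computing attestation check.
# An HMAC stands in for the hardware signature purely for illustration;
# real enclaves produce hardware-signed attestation reports.

TRUSTED_MEASUREMENTS = {
    # Hash of the exact enclave image the firm has approved.
    hashlib.sha256(b"approved-inference-enclave-v1").hexdigest(),
}

def verify_attestation(report: dict, hardware_key: bytes) -> bool:
    """Release data only if the report is authentically signed AND the
    measured code matches an approved build."""
    expected = hmac.new(
        hardware_key, report["measurement"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, report["signature"]):
        return False  # report was not produced by trusted hardware
    return report["measurement"] in TRUSTED_MEASUREMENTS

# A client would run this check before releasing any document for inference.
hw_key = b"simulated-hardware-root-key"
measurement = hashlib.sha256(b"approved-inference-enclave-v1").hexdigest()
report = {
    "measurement": measurement,
    "signature": hmac.new(hw_key, measurement.encode(), hashlib.sha256).hexdigest(),
}
assert verify_attestation(report, hw_key)      # approved enclave: send data
report["measurement"] = "anything-else"
assert not verify_attestation(report, hw_key)  # unapproved code: refuse
```

The structural difference from a contract is that the refusal happens before the data leaves the firm's control, not in litigation afterward.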
None of the major legal AI platforms have publicly committed to deploying confidential computing for model inference. Not Harvey. Not CoCounsel. Not LexisNexis.
The most secure technologies available could close this gap; instead, these platforms continue promising complete security while relying on architectures that cannot deliver it.
Harvey: The Best Technical Story with the Same Fundamental Gap
Harvey is the most technically sophisticated of the three companies I want to discuss, and I want to give credit where it’s due before I explain why their security architecture still has a critical weakness.
Harvey built something genuinely smart: a proprietary model proxy layer that sits between their application and every external model API. Every request to OpenAI, Anthropic, or Google routes back through Harvey’s own Kubernetes cluster before going out, meaning external model providers never receive a raw, direct connection from the end user. There’s always a Harvey-controlled intermediary handling the request, rotating API keys, and managing what gets logged at the Harvey layer.
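For illustration only — Harvey has not published its proxy code, and every name below is invented — here is a minimal sketch of the pattern just described: a single controlled chokepoint that rotates credentials and logs request metadata, never document content, on the proxy side.

```python
import itertools

class ModelProxy:
    """Minimal sketch of a model-proxy layer: all outbound calls pass
    through one controlled intermediary. An illustration of the pattern,
    not Harvey's actual implementation."""

    def __init__(self, api_keys, send_fn):
        self._keys = itertools.cycle(api_keys)  # simple key-rotation policy
        self._send = send_fn                    # outbound call, injected
        self.audit_log = []                     # proxy-side log only

    def infer(self, provider: str, prompt: str) -> str:
        key = next(self._keys)
        # Log metadata at the proxy layer; the prompt body is never logged.
        self.audit_log.append({"provider": provider, "key_id": key[:4]})
        return self._send(provider, key, prompt)

def fake_send(provider, key, prompt):
    # Stand-in for the real HTTPS call to the external model provider.
    return f"{provider}:ok"

proxy = ModelProxy(["key-aaaa", "key-bbbb"], fake_send)
assert proxy.infer("openai", "analyze this clause") == "openai:ok"
assert proxy.infer("anthropic", "same clause") == "anthropic:ok"
assert len(proxy.audit_log) == 2
```

Notice where the sketch ends: `fake_send` is the boundary. Everything after that call is the model provider's infrastructure, which is exactly the gap discussed below.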
Their workspace isolation architecture is also real engineering. Every law firm exists in a logically separated workspace with no cross-contamination possible. Harvey’s penetration testing, conducted by firms like NCC Group and Bishop Fox, specifically targets workspace isolation to validate that the separation holds, and tests for the scenario where one firm’s data could appear in another firm’s outputs.
The inference-only architecture addresses a real threat, as the biggest documented risk in AI data exposure lies in training, not inference. When a model is trained on data and then that data gets memorized and reproduced in outputs to other users, that’s a genuine threat. Harvey’s commitment to never using client data for training, enforced contractually with every model provider, eliminates this specific risk in a meaningful way.
On top of that, their customer-managed encryption keys are technically enforced, meaning Harvey is completely incapable of decrypting client data at rest without the firm’s keys.
So why am I still worried?
Because once Harvey’s proxy forwards a request to OpenAI’s Azure infrastructure, or Anthropic’s AWS Bedrock environment, or Google’s Vertex AI cluster, Harvey controls nothing technical about what happens inside those systems. The proxy has done its job, and the request has been handed off. What happens next is governed entirely by Harvey’s contractual arrangements with those providers, not by any cryptographic or hardware mechanism that Harvey controls.
And those providers are operating shared infrastructure serving thousands of enterprise clients simultaneously. Of course, they have security teams, policies, audit processes; I’m not saying someone at OpenAI or Anthropic is doing something nefarious with Harvey’s clients’ data. But the technical controls preventing that from being possible, at the model provider layer, are not Harvey’s to deploy or verify.
Harvey’s security page says, “Harvey contractually guarantees through our Security Addendum that your data stays yours.” That sentence is carefully written. The promise is contractual. The word “guarantee” implies something stronger than what the architecture actually provides at the model provider boundary.
For most law firms handling most matters, this risk level is probably acceptable. But for firms handling matters where privilege is paramount? Acceptable isn’t good enough.
Thomson Reuters CoCounsel: The Legacy Brand with a Multi-Cloud Problem
Thomson Reuters commands enormous institutional trust in the legal industry. After serving law firms for over a century, Westlaw is embedded in the workflow of virtually every American attorney. When they say something is secure, lawyers believe them. That trust is largely deserved. Thomson Reuters has real security infrastructure and a track record of handling sensitive legal information at scale.
But trust earned in one era does not automatically transfer to a new architectural paradigm, and CoCounsel’s architecture creates a specific problem that Thomson Reuters has not publicly reckoned with: they are routing client data through two separate cloud providers’ infrastructure, to multiple model providers, in ways that create more surface area than their legacy security reputation was built to cover.
CoCounsel’s core product runs on Google Cloud, while its Anthropic/Claude integration, which is used specifically for tax services, runs on Amazon AWS Bedrock. This means client data, depending on the task, is processed within Google’s infrastructure or Amazon’s infrastructure. Thomson Reuters has no control over either of these environments. They have enterprise agreements with both, which include strong data handling commitments, but Google and Amazon are each operating vast, complex infrastructure serving millions of enterprise clients. The controls preventing any given client’s legal matter from being accessible to anyone inside those organizations are, at the boundary, contractual.
CoCounsel has earned ISO 42001 certification, becoming one of the first generative AI systems in professional services to do so. Its features include zero-retention architecture for client data and SOC 2 Type II audits, showing real investments in security governance.
What they don’t have, as far as publicly available information reveals, is a confidential computing architecture that would make it technically impossible for anyone at Google or Amazon (or anyone who successfully breached Google or Amazon’s infrastructure) to access the legal matter data being processed for inference. The ISO certification covers Thomson Reuters’ own practices and nothing more. They have no control over what happens inside Google’s or Amazon’s data centers once the API call lands.
There is also a less-discussed issue specific to CoCounsel’s situation. The acquisition of Casetext brought with it the CoCounsel product and team, but the architectural decisions made at a startup don’t automatically upgrade to enterprise-grade infrastructure just because a large company acquires them. CoCounsel was built quickly, in a competitive market, with the goal of demonstrating capability. Some of those architectural shortcuts may still be present underneath the enterprise packaging.
I don’t believe Thomson Reuters is cavalier about security. But when their narrative leans heavily on institutional credibility and contractual arrangements, and offers few details on the specific architectural question of what happens at the model provider boundary, skepticism is warranted.
For the many firms that have relied on the Thomson Reuters brand as a proxy for security, this distinction is worth understanding before it becomes relevant in ways that are difficult to remediate.
LexisNexis Protégé: Five Providers, One Surface Area Problem
LexisNexis presents the most complex security picture of the three, and not in a good way.
The Protégé platform has expanded aggressively over the past year, integrating model providers at a pace that appears to be driven primarily by competitive pressure rather than security architecture. As of the most recent announcements, Protégé’s “Best Fit” mode can route queries to OpenAI (GPT-4o, GPT-5, o3), Anthropic (Claude Sonnet), Google, Mistral, and Microsoft. That’s five separate external model providers, each operating their own infrastructure, each with their own data handling practices, each with their own contractual arrangements with LexisNexis.
When LexisNexis describes Protégé as operating within a “fully encrypted Lexis+ AI environment,” that description is accurate for what it covers: the LexisNexis application layer. Data is encrypted at rest within LexisNexis’s systems and in transit.
But “fully encrypted Lexis+ AI environment” describes where data lives, not where it goes. And data goes to five different sets of third-party compute infrastructure for the actual AI processing. At each boundary, we again see a security guarantee transition from architectural to contractual. Multiply that by five, and you have five times the contractual surface area and five times the number of third-party environments where the technical controls are outside LexisNexis’s direct authority.
LexisNexis also introduced something called “identifiable information removal” from AI interactions, meaning they strip personally identifiable information before sending data to model providers. This is a meaningful effort and directionally correct. But legal matters don’t expose their sensitivity through obvious identifiers. Sensitivity is often in the substance of the document, the nature of the legal question, the specific combination of facts. Stripping a name and an address does not anonymize a merger agreement or a document subpoena. The legal community understands this intuitively, and LexisNexis’s PII removal, while better than nothing, does not address the deeper confidentiality concern.
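A toy example makes the limitation obvious. The scrubber below is an invented stand-in — LexisNexis has not published its mechanism — that removes names and Social Security numbers, yet the substance that makes the document sensitive survives untouched.

```python
import re

# Toy PII scrubber: an invented stand-in for "identifiable information
# removal" (the real mechanism is not public).
def scrub(text: str) -> str:
    text = re.sub(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", "[NAME]", text)  # naive full-name pattern
    text = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]", text)         # US SSN pattern
    return text

doc = ("Jane Rivera (SSN 123-45-6789) asks whether AcmeCo's planned "
       "acquisition of its largest competitor must be disclosed before Q3.")
clean = scrub(doc)
assert "Jane Rivera" not in clean and "123-45-6789" not in clean
# The identifiers are gone, but the fact of an undisclosed acquisition --
# the thing that actually makes this privileged -- survives intact:
assert "acquisition of its largest competitor" in clean
```

No identifier-stripping pass, however sophisticated, can redact "a planned acquisition exists" out of a question about a planned acquisition.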
There’s also a governance issue worth naming. LexisNexis’s parent company RELX has been an investor in Harvey since the Series D. LexisNexis subsequently formed a strategic partnership with Harvey, integrating its content into Harvey’s platform. This means LexisNexis is simultaneously operating its own competing legal AI product and providing critical content infrastructure to its primary competitor, while also being a financial stakeholder in that competitor. The incentive structures here are complicated, and firms relying on LexisNexis to be single-mindedly focused on the security of their client data should be aware that LexisNexis’s priorities in this space are multidimensional.
Again, I am not suggesting bad faith, but when a company’s security posture serves multiple conflicting business interests, independent verification becomes more important, not less.
The Confidential Computing Gap: Why This Matters More Than You Think
I want to be precise about what confidential computing actually solves. The threat model I’m describing has three distinct components, and each one is addressed specifically by making data access technically impossible with properly configured hardware.
The first is the insider threat. Major cloud providers have thousands of employees with varying levels of access to infrastructure. Enterprise API agreements include “no human review” commitments, but those commitments describe policy, not technical impossibility. A determined insider, or someone who has successfully compromised an insider’s credentials, can access infrastructure in ways that violate policy. Without hardware-level isolation, there is no technical barrier to this specific threat.
The second is the sophisticated external breach. The major cloud providers are among the most hardened infrastructure targets in the world, and they do get breached. The Microsoft Exchange hack by HAFNIUM, the Capital One breach through AWS misconfiguration, the SolarWinds compromise that affected cloud environments across the US government and private sector. These are not hypothetical threats. When they happen, it doesn’t matter what the data privacy agreement stated if your data was technically accessible. With confidential computing, the hardware enclave is cryptographically sealed even against the provider’s own infrastructure access. Without it, what gets leaked depends on how the breach occurred and what the attackers were able to access.
The third is the legal discovery risk. When a cloud provider receives a government subpoena or national security letter for data processed within their infrastructure, their data handling commitments to their enterprise clients may not protect them from compliance. The NSA’s PRISM program, revealed by Snowden in 2013, demonstrated that major technology providers had been compelled to provide access to data processed within their systems, at scale, under legal orders that prohibited disclosure to the affected clients. The legal frameworks have evolved since then, but the fundamental issue has not been resolved: data processed within a third party’s infrastructure is potentially subject to that third party’s legal obligations in ways that may override their contractual commitments to you.
Confidential computing addresses all three of these threat vectors by making the data technically inaccessible at the hardware level, even to the cloud provider’s own infrastructure. An insider at Amazon cannot read what’s being processed inside an AMD SEV-SNP enclave, because the hardware prevents it. An attacker who breaches Azure’s control plane cannot decrypt data being processed inside an Intel TDX enclave, because the keys are not available to the control plane. A subpoena to Google for data processed inside a hardware-isolated enclave will not produce useful results, because Google itself cannot access that data.
This is not futuristic technology. Confidential computing is in production today, with systems like Azure Confidential Compute, AWS’s Nitro Enclaves, and Google Cloud’s Confidential VMs being generally available. The question is whether legal AI companies have chosen to build their architectures on it.
The honest answer, based on everything publicly available, is that they haven’t. And the reason they haven’t is that confidential computing adds latency, adds complexity, and adds cost. In a market where every vendor is racing to add features and lower friction, the incentive to invest heavily in an architectural improvement that most customers don’t know to ask for is limited.
The Attorney-Client Privilege Problem Nobody Has Solved
Before I get to what the industry should be asking, I want to name a specific legal risk that is hiding inside the technical architecture discussion.
Attorney-client privilege is not just an ethical obligation, but a legal doctrine that can be waived, and once waived, it generally cannot be recovered. One of the established ways to waive privilege is to voluntarily disclose privileged communications to a third party who does not share the legal interest that created said privilege. Courts have spent decades developing doctrine around when disclosure to a third party does or does not constitute waiver.
The question of whether routing client communications through a third-party AI model provider constitutes a disclosure that could be used to challenge privilege has not been definitively resolved. Most legal technology providers will tell you it doesn’t, and there are reasonable arguments for that position. The analogy to using a telephone or email server, where data passes through third-party infrastructure without waiving privilege, has some force.
But those analogies were developed in an era when the third-party infrastructure was a dumb pipe. It received data, transmitted data, and did nothing with it. AI model providers are categorically different: they receive and process legal communications, run inference, produce analysis, and create outputs that reflect the substance of the privileged matter. The legal distinction between a telephone company that carries your call and an AI provider that reads your client’s document is not trivial.
I am not a practicing attorney, and I am not offering a legal opinion. I am observing that this question is live, that courts have not resolved it, and that every law firm adopting AI tools is implicitly taking a position on it. If a sophisticated opposing party in a major litigation decides to challenge the privilege over communications that passed through an AI inference infrastructure, the case they would make has a non-trivial basis, and the firms that would be most exposed are the ones that chose their AI vendor based on the DPA rather than the architecture.
This has not been litigated. But it will be.
This comes back to the confidential computing approach I described earlier, which may also be the stronger privilege-preservation argument. If the data never left an architecture where third-party access was a technical impossibility, the argument that privilege was waived by disclosure to a third party becomes significantly harder to make. Hardware-isolated inference is both a security control and a privilege argument.
What the Legal Industry Should Actually Be Asking
The legal industry has a peculiar relationship with technology risk. Law firms are simultaneously some of the most sophisticated buyers of professional services in the world and some of the slowest to update their understanding of what “secure” actually means in a given technological context.
The due diligence process most firms apply to legal AI vendors is borrowed from their process for evaluating other software vendors. SOC 2 Type II? Check. ISO 27001? Check. Data privacy agreement? Check. No training on client data? Check. These are necessary questions, but they’re insufficient.
The question that should be at the center of every legal AI security evaluation is this: at what layer does the security guarantee transition from technical enforcement to contractual assurance, and what is the threat model for that boundary?
For Harvey, the answer is: at the boundary between Harvey’s proxy infrastructure and the external model provider’s compute environment. Harvey controls everything before that boundary technically. After that boundary, protection becomes contractual.
For CoCounsel, the answer is: at two separate boundaries — one into Google’s cloud infrastructure, one into Amazon’s — depending on which model is being used for which task. After those boundaries, protection becomes contractual.
For LexisNexis Protégé, the answer is: at five separate boundaries, to five separate model providers, with auto-routing potentially determining which boundary your client’s data crosses on any given query. After every one of those boundaries, protection becomes contractual.
The follow-up question is: given that boundary, what is your threat model, and is a contractual assurance sufficient for the matters your firm handles?
For firms handling M&A transactions where even the identity of the parties is market-sensitive information before announcement, the answer should be that contractual assurance is not sufficient. For firms handling government investigations where national security implications may make legal process more likely, the answer should be that contractual assurance is not sufficient. For firms where a single matter represents hundreds of millions of dollars in fees and the reputational consequences of a breach would be existential, the answer should be that contractual assurance is not sufficient.
Legal AI is worth using, but the industry’s security narrative has gotten significantly ahead of its security architecture, and the people bearing the risk of that gap are the law firms and their clients who are trusting these platforms with their most sensitive information.
What Actually Good Looks Like
I want to be constructive here, because I don’t think the answer is for the legal industry to avoid AI. The efficiency gains are real. The competitive pressure to adopt is real. The technology will only get better.
But “what actually good looks like” is specific, and the industry should be able to articulate it.
Good looks like foundation model inference happening inside hardware-isolated confidential computing enclaves, where not even the cloud provider can access what’s being processed. This is technically achievable today.
Good looks like cryptographic proof of data handling — not just contractual commitments, but verifiable logs that can demonstrate what was processed where, when, and by what infrastructure, in a way that can be independently audited.
Good looks like the option for firms handling ultra-sensitive matters to have their data processed on dedicated, isolated infrastructure rather than shared multi-tenant cloud environments, even if that option comes at higher cost.
Good looks like transparency about exactly which model providers are being used for exactly which types of processing. It looks like the ability to limit that routing to specific, approved providers for specific matter types, not auto-routing that optimizes for performance without giving the firm visibility or control.
Good looks like third-party penetration testing that specifically targets the model provider boundary. It goes beyond the application layer and workspace isolation, and identifies what data is technically accessible to the model provider’s infrastructure during inference.
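The "cryptographic proof of data handling" item above can be made concrete with a hash-chained processing log, sketched here under invented field names: each entry commits to its predecessor, so any retroactive edit breaks the chain in a way an independent auditor can detect.

```python
import hashlib
import json

# Sketch of a tamper-evident processing log. Field names are invented;
# this is an illustration of the property, not any vendor's product.
def append(log, entry):
    """Append an entry whose hash commits to the entire prior history."""
    prev = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(entry, sort_keys=True)
    h = hashlib.sha256((prev + payload).encode()).hexdigest()
    log.append({"entry": entry, "hash": h})

def verify(log):
    """Recompute the chain; any edited or reordered entry fails."""
    prev = "0" * 64
    for rec in log:
        payload = json.dumps(rec["entry"], sort_keys=True)
        if hashlib.sha256((prev + payload).encode()).hexdigest() != rec["hash"]:
            return False
        prev = rec["hash"]
    return True

log = []
append(log, {"doc": "matter-123", "provider": "enclave-a", "ts": 1})
append(log, {"doc": "matter-123", "provider": "enclave-a", "ts": 2})
assert verify(log)
log[0]["entry"]["provider"] = "somewhere-else"  # tamper with history
assert not verify(log)
```

A contractual commitment says "we didn't route your data elsewhere"; a verifiable log lets the firm check.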
Some of this exists in embryonic form. Some of it doesn’t exist yet at scale in the legal AI market. The firms and vendors who build it will have a genuinely differentiated security story built on architectural guarantees, not contractual ones.
A Note on Why This Matters for Regulated Industries Generally
I’ve been writing primarily about law firms, because the legal AI market is where this conversation is most visible, but everything I’ve described applies with equal or greater force to the other regulated industries adopting AI: banking, healthcare, defense, intelligence.
At the FDIC, I watched financial institutions struggle with a version of this problem when they first moved to cloud infrastructure. The regulatory framework (OCC guidance, the Bank Service Company Act, vendor management requirements) was built around the assumption that regulators could examine third-party providers. The new AI infrastructure creates a version of this problem that existing regulatory frameworks are poorly equipped to address. The “vendor” has become a chain of vendors, each with their own infrastructure and contractual relationships, and the data flows between them are not always visible to the institution, let alone the regulator.
The banks that handled this well were the ones who refused to accept “we have a contract” as an answer to “what happens to our data.” They required architectural evidence: network diagrams, data flow documentation, technical controls assessments, penetration test results that specifically targeted the boundaries they were worried about. They made vendors explain, in technical terms, what was technically impossible rather than what was contractually prohibited.
The legal industry should adopt the same posture. The technology is different, but the underlying principle is identical: contractual protections and architectural protections are not the same thing, and in the moments when they diverge, only one of them actually protects your clients.
The Bottom Line
Harvey is the most technically sophisticated legal AI platform I’m aware of. Their proxy architecture, workspace isolation, and BYOK encryption are genuine engineering, not marketing. But once your data crosses into OpenAI’s, Anthropic’s, or Google’s infrastructure, Harvey’s technical controls end and their contractual ones begin.
CoCounsel has the benefit of Thomson Reuters’ institutional credibility and a century of trust in the legal industry. But that trust was built in an era before client data traveled to Google Cloud and Amazon AWS for AI inference, and the security architecture has not publicly caught up with that new reality.
LexisNexis Protégé offers the most capable multi-model environment in the market, but that capability comes with the most complex data flow. Five separate model providers, auto-routed, with client data potentially touching any of them in any combination, each with their own contractual, not architectural, protections at the boundary.
None of this means these platforms are reckless or that you shouldn’t use AI for legal work. It means you should be clear-eyed about what you’re buying, and you should be asking vendors the questions they are not yet used to being asked.
Here’s how I’d run the due diligence conversation if I were a law firm’s managing partner or CIO sitting across the table from any of these vendors:
First question: “At what specific point in your data flow does your security guarantee transition from a technical control to a contractual one? Walk me through the exact boundary.” Don’t let them answer with “end-to-end encryption” or “enterprise-grade security.” Make them name the boundary and describe what’s on either side of it.
Second question: “What is your confidential computing roadmap?” If they don’t have one, that tells you something. If they have one, ask when it will be in production and what it will cover. If they claim they already have it, ask for the technical documentation and have someone who understands Intel TDX or AMD SEV-SNP review it.
Third question: “Which specific model providers does my client data touch, and can I restrict that routing?” “Auto-routing to the best model” sounds like a feature. From a data governance perspective, it is a liability. You should know exactly whose infrastructure is processing your matters, and you should have the ability to say certain matters can only go to certain providers.
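What that restriction could look like in practice, as a hedged sketch — provider and matter-type names here are invented: an explicit per-matter allowlist consulted before any "best fit" choice, refusing outright rather than falling back silently.

```python
# Sketch of matter-level routing control: an explicit allowlist consulted
# before any "best fit" decision. All names are illustrative assumptions.
ROUTING_POLICY = {
    "general_research":   {"openai", "anthropic", "google", "mistral"},
    "ma_transaction":     {"dedicated-enclave"},  # never shared multi-tenant
    "govt_investigation": {"dedicated-enclave"},
}

def route(matter_type: str, candidates: list) -> str:
    """Return the first 'best fit' candidate the firm has approved,
    or refuse outright rather than fall back silently."""
    allowed = ROUTING_POLICY[matter_type]
    for provider in candidates:  # candidates arrive in best-fit order
        if provider in allowed:
            return provider
    raise PermissionError(f"no approved provider for {matter_type!r}")

assert route("general_research", ["mistral", "openai"]) == "mistral"
assert route("ma_transaction", ["openai", "dedicated-enclave"]) == "dedicated-enclave"
```

The design choice that matters is the `PermissionError`: a governance-aware router fails closed when no approved provider is available, instead of optimizing for performance across infrastructure the firm never cleared.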
Fourth question: “What does your penetration testing scope include, and can I see the most recent report?” If the scope doesn’t include the model provider boundary — specifically testing whether data processed at inference time is technically accessible to the provider’s infrastructure — then you haven’t tested the most important thing.
Fifth question: “Has your outside counsel reviewed the privilege implications of your data architecture?” Not the DPA language. The actual architecture. If they haven’t done that analysis, you should require them to before you put privileged matter content into their system.
The vendors who can answer these questions well deserve your business. The vendors who respond with marketing language and point you back to the DPA do not yet deserve your trust with your most sensitive matters, regardless of how good their product is at the legal task level.
While the technologies dominating the market are genuinely impressive, their security narratives have gotten ahead of their security architecture. The people who pay the price for that gap, if and when it closes badly, are not the AI vendors. They are the firms and the clients who trusted them.
“Do you have a DPA?” is a 2018 question.
“Show me your confidential computing architecture” is the 2026 question.
The firms and the vendors who understand the difference between those two are the ones who will get ahead of this problem. Everyone else is hoping the gap never matters.
But in regulated industries, hope is not a risk management strategy.
This article was written by Sultan Meghji, CEO of Frontier Foundry and former Chief Innovation Officer at the FDIC. Visit his LinkedIn here.
To stay up to date with Frontier Foundry’s work building AI solutions for regulated industries, visit our website, or follow us on LinkedIn, X, and Bluesky.
To learn more about the services we offer, please visit our product page.