Private LLM Stack for Law Firms

A practical private LLM architecture for law firms that need AI assistance without spraying privileged material into unmanaged tools.

A private LLM stack for a law firm should protect privileged material while giving lawyers the chat and document workflows they already expect. The goal is not to ban AI. The goal is to stop client and matter data from leaking into tools the firm cannot govern.

The minimum viable architecture

| Layer | Purpose | Example options |
| --- | --- | --- |
| Chat interface | Lawyer-facing assistant | Open WebUI, LM Studio, Jan |
| Model runtime | Runs approved models | Ollama, vLLM, llama.cpp |
| RAG/document layer | Matter-aware document Q&A | AnythingLLM, Haystack, LlamaIndex |
| Vector database | Stores document embeddings | Qdrant, Weaviate |
| Gateway | Routes requests and logs usage | LiteLLM, internal proxy |
| Identity/access | Keeps matters separated | SSO, groups, matter-level permissions |
| Deployment partner | Makes it usable and supportable | Legal/private AI implementation shop |
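The gateway row is the piece that ties the table together: every request passes through one chokepoint that enforces the model allowlist and writes the audit trail. A minimal in-memory sketch of that idea, assuming a hypothetical `Gateway` class and `APPROVED_MODELS` set (these names are invented for illustration, not taken from LiteLLM or any listed tool):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical allowlist; a real gateway would load this from config.
APPROVED_MODELS = {"llama3:70b", "mistral-small"}

@dataclass
class Gateway:
    audit_log: list = field(default_factory=list)

    def route(self, user: str, matter_id: str, model: str, prompt: str) -> str:
        # "Can models be restricted?" -- refuse anything off the allowlist.
        if model not in APPROVED_MODELS:
            raise PermissionError(f"model {model!r} is not approved")
        # Record who sent what kind of material where (the auditability need),
        # logging size and metadata rather than prompt content.
        self.audit_log.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "matter_id": matter_id,
            "model": model,
            "prompt_chars": len(prompt),
        })
        return f"routed to {model}"

gw = Gateway()
print(gw.route("associate1", "M-1042", "llama3:70b", "Summarise the lease."))
```

In a real deployment the same chokepoint is where exportable logs come from, which is why the vendor questions below ask about both.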

Why law firms are different

Law firms have three hard constraints:

  1. Privilege: client material cannot casually enter unmanaged third-party tools.
  2. Matter separation: one client’s corpus cannot leak into another client’s workflow.
  3. Auditability: leadership needs to know what tools received what kind of material.

Consumer AI tools were not designed around those constraints.
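Matter separation, in particular, has to be enforced at retrieval time, not by convention. A sketch of the filtering idea, using a hypothetical in-memory index and `MATTER_ACL` mapping (real stacks push the same filter into the vector database, e.g. as a payload filter on a matter ID, and drive the ACL from SSO groups):

```python
from dataclasses import dataclass

@dataclass
class Doc:
    matter_id: str
    text: str

# Hypothetical corpus: two matters for two different clients.
INDEX = [
    Doc("M-1042", "Lease assignment terms for Client A"),
    Doc("M-2099", "Merger diligence notes for Client B"),
]

# Which matters each user may see; names are illustrative only.
MATTER_ACL = {"associate1": {"M-1042"}}

def retrieve(user: str, query: str) -> list[str]:
    allowed = MATTER_ACL.get(user, set())
    # Filter BEFORE ranking: documents outside the user's matters are never
    # candidates, so one client's corpus cannot leak into another's workflow.
    candidates = [d for d in INDEX if d.matter_id in allowed]
    return [d.text for d in candidates if query.lower() in d.text.lower()]

print(retrieve("associate1", "lease"))   # Client A material only
print(retrieve("associate1", "merger"))  # empty: no access to M-2099
```

The key design choice is that access control sits in front of similarity search, so a broad query can never surface another matter's documents.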

Start with three workflows

Do not start by indexing the entire document management system.

Start with:

  • internal policy Q&A
  • public-law research support
  • one controlled matter-document workflow

The first win should prove that lawyers will use the interface and that IT can explain the data boundary.

What to block first

The first policy move is not “no AI.” It is:

  • no privileged material in personal AI accounts
  • no matter documents in unmanaged browser extensions
  • no client records in unapproved chat tools
  • no confidential material in image/video AI tools

Then give people a local/private path that does not feel worse.
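The block-first policy above can be enforced mechanically at the network edge rather than left to memory. A toy sketch of the decision logic, assuming hypothetical host lists (real enforcement lives in the firm's proxy, DNS filter, or CASB; all hostnames here are placeholders):

```python
# Illustrative lists only -- a real deployment maintains these in its
# egress proxy or DNS filter, not in application code.
APPROVED_AI_HOSTS = {"ai.internal.firm.example"}
BLOCKED_AI_HOSTS = {
    "public-chat.example",   # personal/consumer AI accounts
    "browser-ext.example",   # unmanaged browser extensions
}

def egress_decision(host: str) -> str:
    if host in APPROVED_AI_HOSTS:
        return "allow"   # the local/private path that does not feel worse
    if host in BLOCKED_AI_HOSTS:
        return "block"   # unmanaged AI endpoint
    return "review"      # unknown tools go to review, not silent allow

print(egress_decision("ai.internal.firm.example"))
```

Defaulting unknown hosts to "review" rather than "allow" matters: new AI tools appear faster than blocklists are updated.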

Vendor questions

Ask every vendor or partner:

  • Can matter access be separated?
  • Where are documents stored?
  • Where are embeddings stored?
  • Can logs be exported?
  • Can models be restricted?
  • Can the system run on-prem or in a private environment?
  • What is the offboarding process?
  • Who handles updates and support?

Bottom line

The best law-firm AI stack is boring from a security perspective and useful from a lawyer perspective. If it is secure but painful, lawyers will dodge it. If it is easy but unmanaged, risk moves faster than policy.

Run the AI egress audit before choosing tools.