How Doxi Protects Sensitive Documents: A Technical Deep Dive
Zero-Knowledge Document Processing: A Technical Deep Dive
How Doxi processes documents without ever seeing their contents. A look at our cryptographic approach.
The Challenge
Document processing is incredibly useful. Extracting data from contracts, summarizing reports, classifying filings — AI excels at these tasks. But there's a problem: the AI needs to see the document.
For sensitive documents — legal filings, medical records, immigration papers — this creates unacceptable risk. What if the AI provider is compromised? What if they're subpoenaed? What if their terms of service change?
Zero-Knowledge Architecture
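Zero-knowledge processing means the service provider never has access to document contents in plaintext: documents are decrypted only inside hardware-isolated enclaves that the provider's own staff and systems cannot inspect. This isn't about trusting the provider. It's about an architecture that makes trusting the provider unnecessary.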
How It Works
1. Client-Side Encryption: Documents are encrypted before leaving your device. The encryption key never leaves your control.
2. Secure Enclaves: Processing happens in hardware-isolated enclaves (such as Intel SGX or AWS Nitro Enclaves). Even with full server access, an attacker can't read the data being processed.
3. Encrypted Results: Outputs are encrypted before leaving the enclave. Only you can decrypt them.
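To make the client-side step concrete, here is a minimal sketch of encrypting a document with AES-256-GCM before upload. It assumes the third-party Python `cryptography` package; the function names, the `doc_id` associated data, and the nonce-prefix layout are illustrative choices, not Doxi's actual client code.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # 96-bit nonce, the size recommended for GCM


def encrypt_document(key: bytes, plaintext: bytes, doc_id: bytes) -> bytes:
    """Encrypt on the client; returns nonce || ciphertext-with-tag."""
    nonce = os.urandom(NONCE_SIZE)  # must never repeat under the same key
    # doc_id is authenticated but not encrypted (associated data).
    return nonce + AESGCM(key).encrypt(nonce, plaintext, doc_id)


def decrypt_document(key: bytes, blob: bytes, doc_id: bytes) -> bytes:
    """Reverse of encrypt_document; raises InvalidTag on a wrong key or tampering."""
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    return AESGCM(key).decrypt(nonce, ciphertext, doc_id)


# Hypothetical usage: the key is generated and kept on the client device.
key = AESGCM.generate_key(bit_length=256)
blob = encrypt_document(key, b"confidential filing", b"doc-001")

# Without the original key the blob cannot be opened.
try:
    decrypt_document(AESGCM.generate_key(bit_length=256), blob, b"doc-001")
except InvalidTag:
    print("wrong key: the document cannot be decrypted")
```

Only `blob` would ever leave the device in this sketch. Because GCM is authenticated, a modified ciphertext or a mismatched `doc_id` fails decryption instead of yielding corrupted plaintext.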
The Cryptographic Stack
- Encryption: AES-256-GCM for document encryption
- Key Exchange: X25519 for secure key establishment
- Attestation: Remote attestation to verify enclave integrity
- Signatures: Ed25519 for result authentication
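The way these primitives fit together can be sketched as follows, again assuming the Python `cryptography` package. The HKDF step, its `info` label, and the choice to sign the nonce plus ciphertext are assumptions made for illustration; the stack above does not specify them. In a real deployment the enclave's public keys would be bound to its attestation rather than generated in the same process.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF


def derive_session_key(own_private, peer_public) -> bytes:
    """X25519 shared secret -> 256-bit AES key via HKDF-SHA256 (assumed KDF)."""
    shared = own_private.exchange(peer_public)
    return HKDF(
        algorithm=hashes.SHA256(),
        length=32,
        salt=None,
        info=b"doxi-sketch session key",  # hypothetical context label
    ).derive(shared)


# Key exchange: each side combines its private key with the peer's public key.
client_priv = X25519PrivateKey.generate()
enclave_priv = X25519PrivateKey.generate()
client_key = derive_session_key(client_priv, enclave_priv.public_key())
enclave_key = derive_session_key(enclave_priv, client_priv.public_key())

# Enclave side: encrypt the result, then sign what will be sent.
enclave_signing_key = Ed25519PrivateKey.generate()
nonce = os.urandom(12)
encrypted_result = AESGCM(enclave_key).encrypt(nonce, b'{"doc_type": "contract"}', None)
signature = enclave_signing_key.sign(nonce + encrypted_result)

# Client side: verify the signature first (raises InvalidSignature on failure),
# then decrypt with the independently derived session key.
enclave_signing_key.public_key().verify(signature, nonce + encrypted_result)
result = AESGCM(client_key).decrypt(nonce, encrypted_result, None)
```

Both sides arrive at the same session key without it ever crossing the network, and the signature lets the client reject results that did not come from the expected enclave key.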
What We Can't Do
Let's be clear about the limitations:
- We can't read your documents. Ever.
- We can't provide them to law enforcement. We don't have them.
- We can't train on your data. We never see it.
- We can't recover your documents if you lose your keys.
This last point is important. Zero-knowledge means zero knowledge: if you lose your encryption keys, your documents are unrecoverable, because we hold no copy of the keys. We recommend robust key management, including key backups that you control.
The Processing Pipeline
1. Document Upload
Your client encrypts the document and establishes a secure channel to our enclave.
2. Enclave Attestation
Before sending anything, you verify you're talking to a genuine, unmodified enclave.
3. Secure Processing
The encrypted document is sent to the enclave and decrypted inside it, where even we can't see it. The enclave processes the document and encrypts the results before they leave.
4. Result Retrieval
You receive encrypted results that only you can decrypt.
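The ordering in this pipeline matters: attestation gates the upload. Below is a simplified sketch of that gate using only the Python standard library. `AttestationDocument`, `verify_attestation`, `upload`, and the SHA-384 measurement are hypothetical stand-ins; real attestation reports (SGX quotes, Nitro attestation documents) are signed by the vendor and carry more fields, and verifying that signature chain is omitted here.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationDocument:
    """Hypothetical, simplified stand-in for a vendor attestation report."""
    measurement: bytes         # hash of the code loaded into the enclave
    enclave_public_key: bytes  # key the client would encrypt to


class AttestationError(Exception):
    """Raised when the enclave does not match the expected build."""


def verify_attestation(doc: AttestationDocument, expected_measurement: bytes) -> bytes:
    """Return the enclave's public key only if the measurement matches.

    A real verifier would also validate the vendor signature over the document.
    """
    if not hmac.compare_digest(doc.measurement, expected_measurement):
        raise AttestationError("enclave measurement does not match the audited build")
    return doc.enclave_public_key


def upload(ciphertext: bytes, doc: AttestationDocument,
           expected_measurement: bytes, send) -> None:
    """Step 2 gates step 3: nothing is sent unless attestation succeeds."""
    verify_attestation(doc, expected_measurement)
    send(ciphertext)


# The client pins the measurement of a build it has audited (placeholder bytes).
audited_image = b"enclave image built from audited source"
expected = hashlib.sha384(audited_image).digest()

sent = []  # stands in for the network
genuine = AttestationDocument(hashlib.sha384(audited_image).digest(), b"enclave-pubkey")
upload(b"<ciphertext>", genuine, expected, sent.append)

tampered = AttestationDocument(hashlib.sha384(b"modified image").digest(), b"attacker-pubkey")
try:
    upload(b"<ciphertext>", tampered, expected, sent.append)
except AttestationError:
    pass  # the upload is refused and nothing reaches the modified enclave
```

In this sketch a modified enclave never receives even the ciphertext, which is the property step 2 is meant to provide.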
Use Cases
Immigration Document Processing
Immigration documents contain highly sensitive personal information. Doxi enables law firms to use AI document processing without creating discovery or breach risks.
Healthcare Records
HIPAA compliance is necessary but not sufficient. Zero-knowledge architecture means even a breach of our servers doesn't expose patient data in plaintext.
Legal Discovery
Law firms can use AI to process discovery documents without creating new custody chains or risking privilege.
Performance Considerations
Zero-knowledge adds overhead. Encryption/decryption, enclave context switches, and attestation all take time. For Doxi:
- Latency: ~15% overhead vs. plaintext processing
- Throughput: Limited by enclave memory, but parallel processing is supported
- Cost: Enclave instances are more expensive, but the security trade-off is worth it
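As a rough planning aid, the latency figure above can be turned into a back-of-the-envelope estimate. The 2,000 ms baseline, the batch size, and the number of parallel enclaves below are hypothetical inputs, and the model assumes every document takes the same time and that enclaves work independently.

```python
import math

OVERHEAD = 0.15  # the ~15% latency overhead quoted above


def enclave_latency_ms(plaintext_latency_ms: float, overhead: float = OVERHEAD) -> float:
    """Estimated per-document latency inside an enclave."""
    return plaintext_latency_ms * (1 + overhead)


def batch_time_ms(num_docs: int, plaintext_latency_ms: float, parallel_enclaves: int) -> float:
    """Estimated wall-clock time for a batch split across parallel enclaves."""
    rounds = math.ceil(num_docs / parallel_enclaves)
    return rounds * enclave_latency_ms(plaintext_latency_ms)


# Hypothetical workload: a document that takes 2,000 ms in plaintext.
per_doc = enclave_latency_ms(2000)   # roughly 2,300 ms
batch = batch_time_ms(10, 2000, 4)   # 3 rounds of 4, 4, and 2 documents
print(f"per document: {per_doc:.0f} ms, batch of 10: {batch:.0f} ms")
```

Under these assumptions the overhead stays a fixed fraction per document, so adding parallel enclaves is what shortens large batches.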
The Trust Model
With Doxi, you don't trust us. You trust:
- Enclave vendors (such as Intel, AMD, and AWS) for the enclave implementation
- Cryptographic primitives that have been publicly vetted
- Open-source code that you can audit
Our role is to build on these foundations, not to ask for your trust.
Getting Started
Doxi offers a free tier for individuals and small teams. For enterprise deployments, we offer on-premises options with your own enclave infrastructure.
Try Doxi free at /products/doxi or contact us for enterprise options.