Architecture

Why server-custodied AI keys beat per-laptop secrets

June 12, 2026 · 7 min read · By the EvolIDE team

Keeping provider keys on every laptop is the most common, and most overlooked, AI coding risk. EvolIDE flips the model: keys live encrypted on the gateway, and the client holds only a session token. If a device is lost, you revoke the token. No provider key leaks.

Ask a security team how they feel about an AI coding rollout and the answer almost never starts with the model. It starts with the key. Where does the OpenAI / Anthropic / Gemini key live? On every developer machine? In an env var? In a config file someone might check in? In a shared password manager? Across a fleet of a few hundred engineers, even one of these patterns is a breach waiting to happen.

Most AI IDEs default to the easy answer: put the key on the laptop. It works on day one. It does not survive an audit, and it does not survive an angry stolen-laptop incident report.

The hidden problem: keys leak with every laptop

A provider API key is a long-lived bearer credential. Whoever holds it can call the model: run up the bill, log the traffic, exfiltrate context through it. There is no per-call user identity attached. Three things follow:

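Rotation is theatre. Rotating a shared key means updating it on every machine, which takes coordination most orgs cannot achieve. The old key cannot be revoked without breaking someone until every developer has refreshed their config, so a leaked key stays usable in the meantime.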
Per-developer caps are a fiction. If the key is at the edge, you cannot enforce rate limits or budgets server-side without trusting the client.
Audit is post-hoc. Provider dashboards do not know which engineer made which call. You see aggregate spend, not who did what.

The inversion: zero keys on client

EvolIDE puts cloud provider keys in exactly one place: the gateway, encrypted at rest. The desktop client never sees them. When the agent needs to call a model, it asks the gateway, which:

Authenticates the session token attached to the request.
Resolves the right provider key from custody, in memory only.
Proxies the call to the provider, returning the response to the client.
Writes a usage ledger entry: tenant, user, model, tokens, cost, timestamp.
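
The four steps above can be sketched as a single request handler. This is a minimal illustration under stated assumptions, not EvolIDE's implementation: `Gateway`, `handle`, `decrypt_key`, and `call_provider` are hypothetical names, the session store is a plain dict, and decryption and the provider call are stubs supplied by the caller.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Tuple


@dataclass
class LedgerEntry:
    tenant: str
    user: str
    model: str
    tokens: int
    cost: float
    timestamp: datetime


@dataclass
class Gateway:
    # Hypothetical stores: session token -> (tenant, user); tenant -> encrypted key.
    sessions: Dict[str, Tuple[str, str]]
    encrypted_keys: Dict[str, bytes]
    # Stub for the custody layer: turns an encrypted blob into a usable key.
    decrypt_key: Callable[[bytes], str]
    # Stub for the provider call: (key, model, prompt) -> (text, tokens, cost).
    call_provider: Callable[[str, str, str], Tuple[str, int, float]]
    ledger: List[LedgerEntry] = field(default_factory=list)

    def handle(self, session_token: str, model: str, prompt: str) -> str:
        # 1. Authenticate the session token attached to the request.
        identity = self.sessions.get(session_token)
        if identity is None:
            raise PermissionError("unknown or revoked session token")
        tenant, user = identity

        # 2. Resolve the provider key from custody; it lives only in this local variable.
        provider_key = self.decrypt_key(self.encrypted_keys[tenant])

        # 3. Proxy the call; only the response text goes back to the client.
        text, tokens, cost = self.call_provider(provider_key, model, prompt)

        # 4. Write a usage ledger entry.
        self.ledger.append(
            LedgerEntry(tenant, user, model, tokens, cost, datetime.now(timezone.utc))
        )
        return text
```

In this sketch the provider key never appears in the return value or in the ledger entry, which mirrors the custody guarantee described here.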

The provider key is never returned to the client. It is never logged in plaintext. It is never written to disk on the developer machine. There is no provider key to leak from the laptop because none is ever stored on it.

What this changes for security review

Security teams ask three questions about any new tool. With server-custodied keys, the answers get much shorter.

1. What does the client have access to?

A scoped session token, expiring on a known schedule. That is the entire blast radius of a compromised device. Revoke the device in the dashboard and the bearer token stops working — without touching a single provider key.
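
To make the "expiring, revocable token" idea concrete, here is a small sketch of gateway-side verification. It uses a simplified HMAC-signed token built from the standard library rather than a full JWT implementation, and the claim names (`sub`, `device`, `exp`), the `SECRET` value, and the in-memory `REVOKED_DEVICES` set are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional, Set

SECRET = b"gateway-signing-secret"   # hypothetical; held only by the gateway
REVOKED_DEVICES: Set[str] = set()    # hypothetical revocation list


def _sign(payload: bytes) -> bytes:
    return hmac.new(SECRET, payload, hashlib.sha256).digest()


def issue_token(user: str, device: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    issued = time.time() if now is None else now
    claims = {"sub": user, "device": device, "exp": issued + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    signature = base64.urlsafe_b64encode(_sign(payload))
    return (payload + b"." + signature).decode()


def verify_token(token: str, now: Optional[float] = None) -> dict:
    current = time.time() if now is None else now
    payload_b64, _, signature_b64 = token.encode().partition(b".")
    expected = base64.urlsafe_b64encode(_sign(payload_b64))
    # Constant-time comparison; reject anything the gateway did not sign.
    if not hmac.compare_digest(signature_b64, expected):
        raise PermissionError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    if current >= claims["exp"]:
        raise PermissionError("token expired")
    # Revoking a device invalidates its token without touching any provider key.
    if claims["device"] in REVOKED_DEVICES:
        raise PermissionError("device revoked")
    return claims
```

Adding a device ID to `REVOKED_DEVICES` is the sketch's stand-in for the dashboard action: the next request carrying that device's token fails, and nothing about the provider keys changes.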

2. How do you know what we are spending?

Every cloud call writes a ledger entry on the gateway. Spend rolls up per tenant, per user, per model. The wallet enforces hard stops at the gateway, not the client — so a compromised or misconfigured client cannot burn through a budget.
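
A gateway-side hard stop can be sketched as a check that runs before the provider is ever called. The `Wallet` class, its method names, and the idea of authorizing against an estimated cost are assumptions for illustration; the article only states that enforcement happens at the gateway and that spend rolls up per tenant, user, and model.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


class BudgetExceeded(Exception):
    """Raised when a call would push a tenant past its hard limit."""


@dataclass
class Wallet:
    limits: Dict[str, float]  # tenant -> hard limit (currency unit is an assumption)
    entries: List[Tuple[str, str, str, float]] = field(default_factory=list)

    def spent(self, tenant: str) -> float:
        return sum(cost for t, _, _, cost in self.entries if t == tenant)

    def authorize(self, tenant: str, estimated_cost: float) -> None:
        # Hard stop: refuse on the server, whatever the client claims or wants.
        if self.spent(tenant) + estimated_cost > self.limits[tenant]:
            raise BudgetExceeded(f"{tenant} would exceed its budget")

    def record(self, tenant: str, user: str, model: str, cost: float) -> None:
        self.entries.append((tenant, user, model, cost))

    def rollup(self) -> Dict[Tuple[str, str, str], float]:
        # Aggregate spend per (tenant, user, model).
        totals: Dict[Tuple[str, str, str], float] = defaultdict(float)
        for tenant, user, model, cost in self.entries:
            totals[(tenant, user, model)] += cost
        return dict(totals)
```

Because `authorize` runs on the gateway, a client that skips its own checks still cannot get a call through once the limit is reached.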

3. Can we bring our own keys?

Yes. Enterprise BYOK keys are stored encrypted at rest, per tenant, on the gateway. The pattern is identical: keys never live on a laptop, only on the server that calls the provider.

What teams give up — and what they don’t

Two concerns come up consistently when teams hear “the keys are server-side.”

“Doesn’t this make me dependent on your gateway?” The gateway is the same software for managed cloud, self-hosted, and on-prem deployments. Air-gapped teams run a private gateway in their own VPC with no outbound to the EvolIDE cloud. The custody model does not change.

“What about local models?” Local runtimes — Ollama, LM Studio, vLLM, llama.cpp — have no provider keys to leak, so they continue working without involving the gateway at all. The custody question only applies to cloud calls, which is exactly where it should.
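
The split between local and cloud calls can be pictured as a small routing decision on the client. This is a sketch only: `route_for`, `Route`, and `GATEWAY_URL` are hypothetical names, and the localhost ports are the runtimes' commonly used defaults, assumed here and configurable in each runtime.

```python
from dataclasses import dataclass

# Assumed default local endpoints; each runtime lets you change its port.
LOCAL_RUNTIMES = {
    "ollama": "http://localhost:11434",
    "lm-studio": "http://localhost:1234",
    "vllm": "http://localhost:8000",
    "llama.cpp": "http://localhost:8080",
}
GATEWAY_URL = "https://gateway.example.com"  # hypothetical gateway address


@dataclass(frozen=True)
class Route:
    base_url: str
    needs_session_token: bool


def route_for(backend: str) -> Route:
    # Local runtimes have no provider key, so the call stays on the machine.
    if backend in LOCAL_RUNTIMES:
        return Route(LOCAL_RUNTIMES[backend], needs_session_token=False)
    # Anything else is a cloud call and goes through the gateway with a session token.
    return Route(GATEWAY_URL, needs_session_token=True)
```

For example, `route_for("ollama")` resolves to a localhost URL with no token required, while any cloud backend resolves to the gateway.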

Key takeaways

Provider keys on every laptop are a hidden, fleet-wide credential leak waiting to happen.
EvolIDE stores cloud keys only on the gateway, encrypted at rest. The client holds a session token.
Lost device? Revoke the token. Nothing else changes.
Spend is metered at the gateway, so budgets are enforced regardless of client state.
BYOK and on-prem deployments use the same custody model — the inversion is universal.

Frequently asked

Does the client ever hold a provider API key?

No. The desktop client only holds a session token (JWT). Every cloud LLM call is proxied through the gateway, which holds the encrypted provider keys and resolves them at runtime.

What happens if a laptop is stolen?

The attacker has at most an expiring session token. Revoke the device in the dashboard and nothing else needs to change — provider keys never left the gateway.

Can I bring my own keys (BYOK)?

Yes. Enterprise BYOK keys are stored encrypted at rest on the gateway per tenant. End users still never see them.
