AI Agent Security: Lock Down Claude Code, MCP Servers & OpenClaw
Stop Prompt Injection, Scope Your Credentials, and Ship Agents That Can't Be Owned
You gave your agent your own credentials just for testing, and now it runs in a loop at 3am with full access to everything you can touch. There is no security boundary, only a model that sometimes says no. This is the defensive field manual that hardens Claude Code, MCP servers, and OpenClaw by architecture, not by a model's mood: a threat model of your own agent, scoped credentials a prompt injection can't steal, a real sandbox, an untrusted-input gate, and a one-page hardening checklist you run on every agent you ship.
The agent-security shelf splits three ways: policy books that never touch your terminal, offensive pentest manuals, and zero-review templates padded with acronyms. None of them name Claude Code or OpenClaw. This one hands you a hardened fleet this weekend: by Chapter 2 a threat model with every untrusted input mapped, by Chapter 5 an agent boxed in gVisor or Firecracker with credentials it physically cannot exfiltrate, by Chapter 7 a cheap model that literally cannot send an email or spend a dollar, and by Chapter 10 a pen-test suite you fire at your own agent to watch each defense hold or fail. Over 40,000 words of copy-pasteable configs, real CVEs as case studies, and build steps you run on a live agent.
What You'll Build
Why model refusal is a probability that tracks your token budget, and only architecture is a real boundary.
Watch one prompt injection land step by step, then map your agent against the OWASP ASI01-ASI10 attack surface.
Write a deny-by-default least-privilege spec that treats the agent's access like a scoped API key, not your account.
Build a credential broker so the agent borrows five-minute tokens and holds no durable secret to steal.
Sandbox the agent with gVisor or Firecracker, a read-only filesystem, and a default-deny egress allowlist that catches exfiltration.
Stand up an input gate and a memory-write guard, then kill a planted payload at both the action and persistence points.
Split the agent into reader and actor so the untrusted path has no destructive tool to call, enforced in code.
Run four vetting gates (provenance, permissions, static scan, sandbox trial) that a real npm token-stealing trojan fails.
Give the agent its own identity and an on-behalf-of trail so every action answers who acted, passing a security review.
Fire a repeatable ASI attack suite at your hardened agent, score what holds, and wire it into CI as a required check.
Map your residual risk with a Swiss-cheese stack and a three-column register that names what you simply won't deploy.
Assemble the ten-step hardening checklist plus an isolate-revoke-audit-rotate incident playbook for your whole fleet.
Free Articles from this Book
The AI Agent Audit Log Lies. Here's How to Fix It
Your AI agent audit log says the human did it. Build distinct-identity attribution with an on-behalf-of trail so every agent action answers who really acted.
from: AI Agent Security: Lock Down Claude Code, MCP Servers & OpenClaw
Build a Credential Broker for Your AI Agent
Credential brokering stops your AI agent from holding a steal-able secret. Build a 30-line broker that mints scoped 5-minute tokens with a runnable test.
from: AI Agent Security: Lock Down Claude Code, MCP Servers & OpenClaw
How to Sandbox an AI Agent
How to sandbox an AI agent with four real walls: a default-deny egress allowlist, gVisor isolation, a read-only filesystem, and an approval gate you verify.
from: AI Agent Security: Lock Down Claude Code, MCP Servers & OpenClaw
MCP Server Security: Vet Any Server Before You Install
MCP server security for the people who install them: a four-gate vetting routine and the real package names, so a trojaned skill can't run as your agent.
from: AI Agent Security: Lock Down Claude Code, MCP Servers & OpenClaw
How to Stop Prompt Injection in Your AI Agent
Prompt injection defense for AI agents: build a PreToolUse gate that blocks a live payload plus a memory-write guard that kills the slow poisoning attack.
from: AI Agent Security: Lock Down Claude Code, MCP Servers & OpenClaw