Tutorial: Build an AI Penetration Tester with Claude (MCP + Burp)

🤖 Build Your Own AI Penetration Tester: Claude + MCP + Burp Suite

A complete step-by-step tutorial to replicate autonomous security testing (25 labs + 2 BSCP exams solved)

⚡ Inspired by "Sonnet Took the Exam. I Just Watched."

Originally published on Notion · Tutorial expanded for hands-on replication

🎯 What you'll build: An autonomous AI agent (Claude Sonnet 4.6) that can browse web apps via Playwright, inspect raw HTTP traffic via Burp Suite, and chain XSS → SQLi → RCE to solve PortSwigger labs — completely on its own.

📚 Table of Contents

Overview & Architecture
Prerequisites & Hardware
Step 1: Install Claude Code & MCP
Step 2: Configure Playwright MCP Server
Step 3: Burp Suite & Burp MCP Integration
Step 4: Connect All Components
Step 5: Prompt Claude to Hack
Step 6: Validate with 25 PortSwigger Labs (19 documented)
Step 7: Run Practice Exams (chained attacks)
Troubleshooting & Tips
Conclusion: Where AI Excels (and Where Humans Still Lead)

🧠 Overview & Architecture

We are building an autonomous pentesting system where Claude Sonnet 4.6 (via Claude Code) acts as the pentester. It controls a real browser through a Playwright MCP server, while all traffic is proxied through Burp Suite. Claude can also inspect and replay raw HTTP requests via a Burp MCP bridge — exactly like a human using Burp Repeater.

┌─────────────────┐      ┌─────────────────┐      ┌────────────────────┐
│   Claude Code   │─────▶│ Playwright MCP  │─────▶│      Browser       │
│  (Sonnet 4.6)   │◀────▶│     Server      │      │ (Chromium/Firefox) │
└────────┬────────┘      └─────────────────┘      └─────────┬──────────┘
         │                                                  │
         ▼                                                  ▼
┌─────────────────┐      ┌─────────────────┐      ┌────────────────────┐
│    Burp MCP     │◀────▶│   Burp Suite    │◀────▶│   Target Web App   │
│ (inspect/replay)│      │  Proxy (8080)   │      │ (PortSwigger labs) │
└─────────────────┘      └─────────────────┘      └────────────────────┘

Dual access: high‑level browser control + low‑level HTTP visibility

📌 How it works: Claude commands Playwright to navigate, click, read DOM. Burp Suite proxies all traffic. Claude queries Burp MCP to see raw requests/responses, then asks Burp MCP to replay modified payloads. This enables precise exploitation (SQLi via encoded payloads, deserialization gadget chains, etc.).

⚙️ Prerequisites & Hardware

Based on real-world lab setups [citation:1][citation:4], here’s what you need:

Component	Minimum	Recommended
RAM	16 GB	32 GB (to run browser + Burp + Claude comfortably)
CPU	4 cores	8+ cores (with virtualization)
OS	Linux (Ubuntu/Debian/Fedora) or macOS	Kali Linux (pre-loaded with tools) [citation:1]
Storage	50 GB free	100+ GB (for Docker images + logs)
Internet	Required (for API calls)	High-speed connection

Software accounts & tools:

Anthropic API key (with credits) – console.anthropic.com
PortSwigger Academy account (free) – portswigger.net/web-security
Docker (optional but recommended for isolation) [citation:8]
Node.js (v18+) and Python 3.11+
Burp Suite (Community is fine, Professional gives API access)

💡 Pro tip: Run everything inside a Kali VM for safety. Clone a snapshot before each experiment. Never use production credentials inside this environment [citation:4].

🔧 Step 1: Install Claude Code & MCP

1.1 Install Claude CLI

# Install globally via npm (the package is named claude-code)
npm install -g @anthropic-ai/claude-code

# Verify installation
claude --version

# Authenticate: start Claude Code and follow the login prompt
# (or run /login inside a session)
claude

1.2 Prepare project directory

mkdir ai-pentester && cd ai-pentester
mkdir mcp-servers

1.3 Understanding MCP configuration

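Claude Code reads MCP server definitions from a JSON config. This tutorial keeps them in ~/.config/claude/mcp.json (Linux/macOS) and passes that file on the command line with --mcp-config; Claude Code does not load that path on its own. The same JSON also works as a .mcp.json file in the project root, or you can register servers with claude mcp add [citation:4][citation:7].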

Create the config directory and file:

mkdir -p ~/.config/claude
touch ~/.config/claude/mcp.json

🌐 Step 2: Configure Playwright MCP Server

The Playwright MCP server gives Claude browser control: navigation, clicking, form filling, DOM reading [citation:3][citation:7].

2.1 Install Playwright MCP

cd mcp-servers
npm init -y
npm install @playwright/mcp@latest
npx playwright install chromium

2.2 Add Playwright to MCP config

Edit ~/.config/claude/mcp.json:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"],
      "env": {
        "PLAYWRIGHT_BROWSER": "chromium"
      }
    }
  }
}

2.3 Test the server

npx @playwright/mcp@latest

If successful, the process starts and waits for MCP messages on stdio (it may print little or nothing). Press Ctrl+C to stop.

⚠️ Important: The browser that the Playwright MCP server launches must be configured to route its traffic through Burp. We'll set that in Step 4.

🛡️ Step 3: Burp Suite & Burp MCP Integration

3.1 Configure Burp as a transparent proxy

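Open Burp Suite → Proxy tab → Proxy settings (labelled Options in older releases).
Ensure a listener is running on 127.0.0.1:8080.
Under "Request handling", "Support invisible proxying" is only needed for clients that cannot be given a proxy setting and connect to the listener directly. Proxy-aware clients (the Playwright browser, or a Python bridge configured with BURP_PROXY) work without it, so enable it only if your bridge needs it.
Go to Network → Connections and disable any upstream proxy.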

3.2 Install and configure Burp MCP server

You need a bridge that allows Claude to inspect and replay requests via Burp. Two options:

Option A: Use community Burp MCP (Python)

Clone a community project (example – check GitHub for latest):

cd mcp-servers
git clone https://github.com/your-community-repo/burp-mcp-server
cd burp-mcp-server
pip install -r requirements.txt

Option B: Use a generic HTTP proxy MCP that can interface with Burp's API

If using Burp Professional, you can enable the REST API and use an MCP server that calls it. For Burp Community, a simpler approach: write a small wrapper that reads Burp's log file (Proxy → HTTP history → save items) – but real-time replay is harder. The original article used a custom integration; for this tutorial, we'll assume a basic bridge that can send raw requests via Burp's proxy.
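The "basic bridge" idea can be sketched in a few lines of Python. This is only an illustration of sending a modified request through Burp's proxy listener so it lands in HTTP history; it is not the original article's integration. The function names with_param and replay_via_burp are hypothetical, and TLS verification is switched off on the assumption that the target is a disposable lab instance behind Burp's own CA.

```python
import ssl
import urllib.error
import urllib.request
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BURP_PROXY = "http://127.0.0.1:8080"  # listener address from Step 3.1


def with_param(url: str, name: str, value: str) -> str:
    """Return `url` with query parameter `name` replaced by `value`."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != name]
    query.append((name, value))  # the replaced parameter moves to the end
    return urlunsplit(parts._replace(query=urlencode(query)))


def replay_via_burp(url: str, proxy: str = BURP_PROXY, timeout: float = 15.0):
    """GET `url` through the Burp listener; return (status_code, body_text)."""
    # Burp re-signs TLS with its own CA, so verification is disabled here.
    # Assumption: this only ever targets disposable lab instances.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy}),
        urllib.request.HTTPSHandler(context=ctx),
    )
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are often the interesting ones during testing.
        return err.code, err.read().decode("utf-8", "replace")
```

A bridge built this way would call with_param to alter one parameter and replay_via_burp to send it; a real MCP server would wrap these in tool definitions, which are omitted here.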

Add Burp MCP to config:

{
  "mcpServers": {
    "playwright": { ... },
    "burp": {
      "command": "python",
      "args": ["/full/path/to/burp-mcp-server/main.py"],
      "env": {
        "BURP_PROXY": "http://127.0.0.1:8080",
        "BURP_API_KEY": "optional_if_pro"
      }
    }
  }
}
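Before launching Claude, it can help to sanity-check the config shape. The following is a minimal sketch, assuming the "mcpServers" layout shown above; check_mcp_config is a hypothetical helper, not part of Claude Code. Note that the "{ ... }" placeholder in the snippet above must be replaced with the real Playwright entry before the file is valid JSON.

```python
import shutil


def check_mcp_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means it looks sane."""
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ["no 'mcpServers' object found"]
    problems = []
    for name, spec in servers.items():
        command = spec.get("command") if isinstance(spec, dict) else None
        if not command:
            problems.append(f"{name}: missing 'command'")
            continue
        # The command must be resolvable, either on PATH or as a full path.
        if shutil.which(command) is None:
            problems.append(f"{name}: '{command}' not found on PATH")
        if not isinstance(spec.get("args", []), list):
            problems.append(f"{name}: 'args' must be a list")
    return problems
```

To use it, load the file with json.loads(Path(...).read_text()) and print whatever the function returns; a json.JSONDecodeError at that step usually means a leftover placeholder or trailing comma.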

3.3 Verify Burp intercepts traffic

Set your system proxy or browser to 127.0.0.1:8080 and visit any HTTP site. You should see traffic in Burp Proxy → Intercept/HTTP history.
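A quick scripted check can confirm that the listener is actually accepting connections before you involve the browser. This is a small sketch using only the standard library; listener_is_up is a hypothetical name, and the defaults assume the 127.0.0.1:8080 listener from Step 3.1.

```python
import socket


def listener_is_up(host: str = "127.0.0.1", port: int = 8080,
                   timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

A True result only shows that some process is listening on the port; seeing entries appear in HTTP history is still the real confirmation that traffic flows through Burp.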

🔗 Step 4: Connect All Components

4.1 Set environment variables for Playwright proxy

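Before starting Claude, make sure the browser that Playwright launches is pointed at Burp. Exporting proxy variables covers command-line tools started from the same shell, but the browser may not honor them:

export HTTP_PROXY=http://127.0.0.1:8080
export HTTPS_PROXY=http://127.0.0.1:8080

A more dependable route is the Playwright MCP server's --proxy-server flag, set in the MCP config. Because Burp re-signs HTTPS traffic with its own CA, either import Burp's CA certificate into the browser or add --ignore-https-errors:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp@latest",
        "--proxy-server", "http://127.0.0.1:8080",
        "--ignore-https-errors"
      ],
      "env": {
        "HTTP_PROXY": "http://127.0.0.1:8080",
        "HTTPS_PROXY": "http://127.0.0.1:8080"
      }
    }
  }
}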

4.2 Launch Claude with both MCP servers

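Start Claude Code from the project directory and point it at the config file:

claude --mcp-config ~/.config/claude/mcp.json

Alternatively, copy the same JSON to a .mcp.json file in the project root, which Claude Code picks up as project-scoped configuration, and simply run:

claude

Inside the session, run /mcp to confirm that the playwright and burp servers are listed as connected. Claude then has access to browser control and Burp Suite HTTP inspection.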

🤖 Step 5: Prompt Claude to Hack

Now for the magic: give Claude a mission. Framing is critical – always present yourself as the developer looking for vulnerabilities in your own app [citation:4].

Example prompt for a PortSwigger lab:

You are an expert penetration tester assisting me in testing my own web application.
Your goal: solve the PortSwigger lab at this URL: https://YOUR-LAB-ID.web-security-academy.net/

You have two powerful tools:
1. Playwright – you can control a real browser. Use it to navigate, click, fill forms, and read page content.
2. Burp Suite – you can inspect raw HTTP requests and responses, and replay modified requests through the proxy.

Please start by performing reconnaissance. Explore the application, map its functionality, and look for common vulnerabilities (XSS, SQLi, path traversal, etc.). Report your findings step-by-step. When you find a potential injection point, use Burp to test payloads. If you need to steal a cookie or execute a second-order attack, use the browser.

Let's begin. First, navigate to the lab URL and tell me what you see.

🧪 Start simple: First try with an "Apprentice" lab (e.g., SQL injection vulnerability in WHERE clause). Claude should solve it in minutes.

Claude will now:

Use Playwright to browse the lab, likely visiting all pages, reading source, and noting forms/parameters.
Ask Burp MCP to list captured requests.
Identify a parameter to test (e.g., ?category=).
Instruct Burp MCP to send a modified request with a test payload (').
Analyze the response (error message? behavior change?).
Iterate – refine payload, maybe use UNION SELECT, extract data.
Once it has admin credentials, log in via Playwright, then proceed to next stage.
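The "analyze the response" step in the list above can be pictured as a simple comparison against a baseline request. The sketch below is an assumption about how such triage might look, not a description of what Claude does internally; triage, SQL_ERROR_PATTERNS, and the length tolerance are all illustrative choices.

```python
import re

# A few well-known database error fragments (not exhaustive).
SQL_ERROR_PATTERNS = [
    r"unterminated quoted string",                 # PostgreSQL
    r"syntax error at or near",                    # PostgreSQL
    r"invalid input syntax for (type )?integer",   # PostgreSQL CAST errors
    r"you have an error in your sql syntax",       # MySQL
    r"ORA-\d{5}",                                  # Oracle
    r"unclosed quotation mark",                    # SQL Server
]


def triage(base_status: int, base_body: str,
           test_status: int, test_body: str,
           length_tolerance: int = 50) -> str:
    """Classify how a test response differs from the baseline response."""
    for pattern in SQL_ERROR_PATTERNS:
        if re.search(pattern, test_body, re.IGNORECASE):
            return "sql-error"
    if test_status != base_status:
        return "status-change"
    if abs(len(test_body) - len(base_body)) > length_tolerance:
        return "length-change"
    return "no-change"
```

A "sql-error" or "status-change" result after sending a single quote is the kind of signal that justifies moving on to UNION-based or error-based extraction; "no-change" suggests trying a different parameter.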

✅ Step 6: Validate with 25 PortSwigger Labs

According to the original article, this workflow completed 25 labs across 14 categories. Here's a subset documented:

#	Lab	Category	Level
01	SQL Injection WHERE Clause Bypass	SQL Injection	Apprentice
02	Reflected XSS into HTML Context	XSS	Apprentice
03	SQL Injection UNION Attack	SQL Injection	Practitioner
04	Stored XSS into HTML Context	XSS	Practitioner
05	SQL Injection via XML Encoding (WAF Bypass)	SQL Injection	Expert
06	OS Command Injection – Simple Case	Command Injection	Apprentice
...	(Full list in original article)

Each lab tests different skills: WAF bypass, blind injection, second-order attacks. Claude handles them adaptively.

📝 Document your run: Enable logging in Claude Code (--verbose) and save Burp history. You'll see Claude's reasoning and tool calls.

🎯 Step 7: Run Practice Exams (Chained Attacks)

The ultimate test: Burp Suite Certified Practitioner (BSCP) practice exams. Each requires chaining three vulnerabilities – client-side → database → RCE.

Exam 1 – solved in ~40 minutes

Stage	Vulnerability	Technique
1	Reflected XSS via eval()	fetch() with bracket notation to bypass WAF dot-notation block
2	PostgreSQL SQL Injection	Error-based CAST(... AS INT) in organize_by parameter
3	Java Insecure Deserialization (RCE)	CommonsCollections6 + two-step wget/bash execution

Exam 2 – solved in ~25 minutes

Stage	Vulnerability	Technique
1	Reflected XSS (stricter WAF – parentheses blocked)	window['location']['href'] assignment redirect payload
2	PostgreSQL SQL Injection	Error-based CAST in order parameter of /filtered_search
3	Java Insecure Deserialization (RCE)	CommonsCollections2 + two-step wget/bash

How Claude adapts: In Exam 2, the WAF blocked parentheses. Claude tried a few XSS payloads, saw they failed, then switched to a redirect-based approach using window.location.href assignment – a creative bypass.
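The bracket-notation trick from both exams is a mechanical rewrite: a WAF that blocks the dot in property access can be sidestepped by spelling each property as a string lookup. A minimal sketch of that rewrite follows; to_bracket_notation is a hypothetical helper that handles plain dotted identifier chains only, not function calls or expressions.

```python
def to_bracket_notation(expr: str) -> str:
    """Rewrite a dotted JavaScript property chain as bracket lookups."""
    head, *rest = expr.split(".")
    # Each property after the first becomes ['name'] on the previous object.
    return head + "".join(f"['{name}']" for name in rest)


print(to_bracket_notation("window.location.href"))
# window['location']['href']
```

The output matches the window['location']['href'] form listed for Exam 2. Both forms are equivalent to the JavaScript engine, which is why a filter that only pattern-matches on dots misses the second one.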

🐛 Troubleshooting & Tips

Problem	Likely Cause & Solution
Claude can't connect to MCP servers	Check paths in mcp.json. Run servers manually first to verify.
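Browser not proxying through Burp	Confirm the browser's proxy setting (the HTTP_PROXY variable or the Playwright MCP --proxy-server flag) points at 127.0.0.1:8080 before starting Claude. Test the listener with curl --proxy.
Burp MCP can't replay requests	Check that the listener address and port match BURP_PROXY. If the bridge is not proxy-aware and connects to the listener directly, enable Burp's invisible proxying.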
Claude hits context limit during long exams	Summarize and start a new session: "We have admin cookie, now need SQLi on /products. Previous attempts: ..." [citation:7]
WAF blocks payloads	Claude should try alternative encodings (e.g., Unicode, double URL encoding). If it doesn't, nudge it: "Try using HTML entities or JavaScript String.fromCharCode."
Deserialization gadgets not working	Ensure the lab uses the correct library (CommonsCollections). Claude can read error messages to guess.
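The alternative encodings mentioned in the WAF row are easy to generate by hand when you want to see what Claude is sending. The helpers below are an illustrative sketch using only the standard library; the function names are made up for this example, and from_char_code assumes characters in the Basic Multilingual Plane.

```python
from urllib.parse import quote


def url_encode(text: str) -> str:
    """Percent-encode every reserved character."""
    return quote(text, safe="")


def double_url_encode(text: str) -> str:
    """Percent-encode twice, so '%' itself becomes '%25'."""
    return quote(quote(text, safe=""), safe="")


def html_entities(text: str) -> str:
    """Encode each character as a decimal HTML entity."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def from_char_code(text: str) -> str:
    """Build a JavaScript String.fromCharCode(...) expression for `text`.

    Assumes BMP characters, where the code point equals the UTF-16 unit.
    """
    return "String.fromCharCode(" + ",".join(str(ord(ch)) for ch in text) + ")"
```

Which form gets past a given filter depends on where the application decodes input, so these are candidates to try rather than guaranteed bypasses.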

🚀 Conclusion: Where AI Excels (and Where Humans Still Lead)

✅ AI works well for:

Injection attacks (XSS, SQLi, SSTI, Command Injection) – high payload volume, clear signals
Recon & enumeration – reading source, robots.txt, error messages, GraphQL introspection
Known exploit chains – deserialization gadget chains, prototype pollution, request smuggling [citation:4]
Report writing – structured, consistent, fast

⚠️ Humans still lead in:

Business logic flaws – understanding why a flow is wrong, not just that a parameter is injectable
Novel vulnerability discovery – though AI is catching up (Project Zero's SQLite zero-day, 500+ vulns in OSS) [citation:8]
Creative fraud scenarios – adversarial thinking beyond pattern matching

"If this is what's possible on a practice exam, what would a focused, full-day run look like against a real target?" – Original author

Your new AI pentester won't replace you – but it will automate the repetitive 80%, freeing you to focus on the creative 20% that truly matters.

📚 References & Further Reading

Cybersecurity Lab Setup with AI Tools – GitHub [citation:1]
Red team security testing with agentic AI – Nutrient Blog [citation:4]
Strix Claude Code: Docker-based AI pentesting – GitHub [citation:8]
Browse Together MCP – Playwright + MCP integration [citation:3]
Spring AI + Playwright MCP example – Juejin [citation:7]

⚡ Now go forth and let Claude take the exam. You just watch.
