Tutorial: Build an AI Penetration Tester with Claude (MCP + Burp)

🤖 Build Your Own AI Penetration Tester: Claude + MCP + Burp Suite

A complete step-by-step tutorial to replicate autonomous security testing (25 labs + 2 BSCP exams solved)

Inspired by "Sonnet Took the Exam. I Just Watched."

Originally published on Notion · Tutorial expanded for hands-on replication

🎯 What you'll build: An autonomous AI agent (Claude Sonnet 4.6) that browses web apps via Playwright, inspects raw HTTP traffic via Burp Suite, and chains XSS → SQLi → RCE to solve PortSwigger labs without human intervention.


📚 Table of Contents

  1. Overview & Architecture
  2. Prerequisites & Hardware
  3. Step 1: Install Claude Code & MCP
  4. Step 2: Configure Playwright MCP Server
  5. Step 3: Burp Suite & Burp MCP Integration
  6. Step 4: Connect All Components
  7. Step 5: Prompt Claude to Hack
  8. Step 6: Validate with Labs (19 documented)
  9. Step 7: Run Practice Exams (chained attacks)
  10. Troubleshooting & Tips
  11. Conclusion: Where AI Excels (and Where Humans Still Lead)

🧠 Overview & Architecture

We are building an autonomous pentesting system where Claude Sonnet 4.6 (via Claude Code) acts as the pentester. It controls a real browser through a Playwright MCP server, while all traffic is proxied through Burp Suite. Claude can also inspect and replay raw HTTP requests via a Burp MCP bridge — exactly like a human using Burp Repeater.

┌─────────────────┐      ┌─────────────────┐      ┌────────────────────┐
│   Claude Code   │─────▶│ Playwright MCP  │─────▶│      Browser       │
│  (Sonnet 4.6)   │◀────▶│     Server      │      │ (Chromium/Firefox) │
└────────┬────────┘      └─────────────────┘      └─────────┬──────────┘
         │                                                  │
         ▼                                                  ▼
┌─────────────────┐      ┌─────────────────┐      ┌────────────────────┐
│    Burp MCP     │◀────▶│   Burp Suite    │◀────▶│   Target Web App   │
│   (inspect/     │      │  Proxy (8080)   │      │ (PortSwigger labs) │
│    replay)      │      └─────────────────┘      └────────────────────┘
└─────────────────┘

Dual access: high‑level browser control + low‑level HTTP visibility

📌 How it works: Claude commands Playwright to navigate, click, and read the DOM. Burp Suite proxies all traffic. Claude queries Burp MCP to see raw requests and responses, then asks Burp MCP to replay modified payloads. This enables precise exploitation (SQLi via encoded payloads, deserialization gadget chains, etc.).
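The inspect-then-replay loop can be sketched in a few lines of Python. This is a minimal illustration, not the Burp MCP bridge used in the original article: the helper names `with_param` and `replay_via_proxy` are hypothetical, and the sketch assumes Burp is listening on 127.0.0.1:8080 as configured in Step 3.

```python
import urllib.request
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BURP_PROXY = "http://127.0.0.1:8080"  # assumed listener address (see Step 3)


def with_param(url: str, name: str, value: str) -> str:
    """Return url with query parameter `name` set to `value` (URL-encoded)."""
    parts = urlsplit(url)
    # Drop any existing value for the parameter, then append the new one.
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != name
    ]
    params.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(params)))


def replay_via_proxy(url: str, proxy: str = BURP_PROXY, timeout: float = 10.0):
    """Send a GET through the proxy so it also shows up in Burp's HTTP history."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

For example, `with_param("https://example.test/filter?category=Gifts", "category", "Gifts'")` yields the probe URL, and `replay_via_proxy` sends it. Note that `opener.open` raises `urllib.error.HTTPError` on 4xx/5xx responses, and HTTPS targets only work once Burp's CA certificate is trusted by Python.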

⚙️ Prerequisites & Hardware

Based on real-world lab setups [citation:1][citation:4], here’s what you need:

Component | Minimum                               | Recommended
RAM       | 16 GB                                 | 32 GB (to run browser + Burp + Claude comfortably)
CPU       | 4 cores                               | 8+ cores (with virtualization)
OS        | Linux (Ubuntu/Debian/Fedora) or macOS | Kali Linux (pre-loaded with tools) [citation:1]
Storage   | 50 GB free                            | 100+ GB (for Docker images + logs)
Internet  | Required (for API calls)              | High-speed connection

Software accounts & tools:

  • Anthropic API key (with credits) – console.anthropic.com
  • PortSwigger Academy account (free) – portswigger.net/web-security
  • Docker (optional but recommended for isolation) [citation:8]
  • Node.js (v18+) and Python 3.11+
  • Burp Suite (Community is fine, Professional gives API access)

💡 Pro tip: Run everything inside a Kali VM for safety. Clone a snapshot before each experiment. Never use production credentials inside this environment [citation:4].

🔧 Step 1: Install Claude Code & MCP

1.1 Install Claude Code

# Install globally via npm (the package is named claude-code)
npm install -g @anthropic-ai/claude-code

# Verify installation
claude --version

# Authenticate: the first launch walks you through login
# (alternatively, export ANTHROPIC_API_KEY before starting)
claude

1.2 Prepare project directory

mkdir ai-pentester && cd ai-pentester
mkdir mcp-servers

1.3 Understanding MCP configuration

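Claude Code reads MCP server definitions from JSON. This tutorial keeps them in ~/.config/claude/mcp.json and passes that file at launch with claude --mcp-config ~/.config/claude/mcp.json; a project-level .mcp.json file or servers registered with claude mcp add work as well [citation:4][citation:7].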

Create the config directory and file:

mkdir -p ~/.config/claude
touch ~/.config/claude/mcp.json

🌐 Step 2: Configure Playwright MCP Server

The Playwright MCP server gives Claude browser control: navigation, clicking, form filling, DOM reading [citation:3][citation:7].

2.1 Install Playwright MCP

cd mcp-servers
npm init -y
npm install @playwright/mcp@latest
npx playwright install chromium

2.2 Add Playwright to MCP config

Edit ~/.config/claude/mcp.json:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"],
      "env": {
        "PLAYWRIGHT_BROWSER": "chromium"
      }
    }
  }
}
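A malformed config is a common reason for MCP servers failing to start, so it can help to check the file before launching Claude. The following is a minimal sketch; `check_mcp_config` is a hypothetical helper, and it only checks the `mcpServers`, `command`, and `args` fields that appear in this tutorial, not the full schema Claude Code accepts.

```python
import json


def check_mcp_config(text: str) -> list[str]:
    """Return a list of problems found in an MCP config; empty means it looks sane."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ["missing or empty 'mcpServers' object"]
    problems = []
    for name, entry in servers.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry must be an object")
            continue
        if not isinstance(entry.get("command"), str):
            problems.append(f"{name}: 'command' must be a string")
        # 'args' is optional here; if present it must be a list.
        if not isinstance(entry.get("args", []), list):
            problems.append(f"{name}: 'args' must be a list")
    return problems
```

Reading the file and passing its contents to `check_mcp_config` returns an empty list when the structure matches the examples in this section.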

2.3 Test the server

npx @playwright/mcp@latest

If the server starts without an error and waits for input on stdio, it is working (it may print little or nothing). Press Ctrl+C to stop.

⚠️ Important: The browser launched by the Playwright MCP server must be configured to send its traffic through Burp. We'll set that up in Step 4.

🛡️ Step 3: Burp Suite & Burp MCP Integration

3.1 Configure Burp as a transparent proxy

  1. Open Burp Suite → Proxy tab → Options (labelled "Proxy settings" in recent releases).
  2. Ensure a listener is running on 127.0.0.1:8080.
  3. Under "Request Handling", enable "Support invisible proxying". This lets clients that are not proxy-aware (such as raw Python sockets used by an MCP bridge) be proxied.
  4. Go to Network → Connections and disable any upstream proxy.

3.2 Install and configure Burp MCP server

You need a bridge that allows Claude to inspect and replay requests via Burp. Two options:

Option A: Use community Burp MCP (Python)

Clone a community project (example – check GitHub for latest):

cd mcp-servers
git clone https://github.com/your-community-repo/burp-mcp-server
cd burp-mcp-server
pip install -r requirements.txt

Option B: Use a generic HTTP proxy MCP that can interface with Burp's API

If using Burp Professional, you can enable the REST API and use an MCP server that calls it. For Burp Community, a simpler approach: write a small wrapper that reads Burp's log file (Proxy → HTTP history → save items) – but real-time replay is harder. The original article used a custom integration; for this tutorial, we'll assume a basic bridge that can send raw requests via Burp's proxy.

Add Burp MCP to config:

{
  "mcpServers": {
    "playwright": { ... },
    "burp": {
      "command": "python",
      "args": ["/full/path/to/burp-mcp-server/main.py"],
      "env": {
        "BURP_PROXY": "http://127.0.0.1:8080",
        "BURP_API_KEY": "optional_if_pro"
      }
    }
  }
}
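The `{ ... }` above stands for the existing Playwright entry and is not valid JSON on its own. One way to add the Burp entry without retyping the rest is a small merge script. This is a sketch under assumptions: `add_server` and `update_config_file` are hypothetical helper names, and the entry values are the placeholders from this section.

```python
import json
from pathlib import Path


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with `entry` registered under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, entry: dict) -> None:
    """Read the JSON config at `path`, add one server entry, and write it back."""
    text = path.read_text() if path.exists() else ""
    # An empty file (e.g. freshly created with touch) is treated as {}.
    config = json.loads(text) if text.strip() else {}
    path.write_text(json.dumps(add_server(config, name, entry), indent=2) + "\n")
```

Calling `update_config_file(Path.home() / ".config/claude/mcp.json", "burp", {...})` with the Burp entry shown above leaves the Playwright entry untouched.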

3.3 Verify Burp intercepts traffic

Set your system proxy or browser to 127.0.0.1:8080 and visit any HTTP site. You should see traffic in Burp Proxy → Intercept/HTTP history.
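Before pointing a browser at the proxy, a quick scripted check can confirm that something is accepting connections on the listener port. The sketch below uses only the standard library; `proxy_is_listening` is a hypothetical helper, and the default host and port are the ones assumed throughout this tutorial.

```python
import socket


def proxy_is_listening(host: str = "127.0.0.1", port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

This only shows that a process is listening on the port, not that it is Burp or that traffic is being recorded, so the HTTP history check above is still needed.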


🔗 Step 4: Connect All Components

4.1 Set environment variables for Playwright proxy

Before starting Claude, ensure the Playwright MCP server's environment includes the proxy:

export HTTP_PROXY=http://127.0.0.1:8080
export HTTPS_PROXY=http://127.0.0.1:8080

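If you're using the MCP config, you can add the env vars there. Chromium does not reliably honor HTTP_PROXY, so it is safer to pass the proxy explicitly as well; recent @playwright/mcp releases document a --proxy-server argument for this (run npx @playwright/mcp@latest --help to confirm the options available in your version):

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest", "--proxy-server", "http://127.0.0.1:8080"],
      "env": {
        "HTTP_PROXY": "http://127.0.0.1:8080",
        "HTTPS_PROXY": "http://127.0.0.1:8080"
      }
    }
  }
}

The labs are served over HTTPS, so the browser must also trust Burp's CA certificate (or be started with certificate errors ignored); otherwise page loads will fail.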

4.2 Launch Claude with both MCP servers

claude --mcp-config ~/.config/claude/mcp.json

Or, if the same JSON is saved as .mcp.json in the project directory, simply:

claude

Inside the session, run /mcp to confirm that both the playwright and burp servers are listed as connected before you start prompting.


🤖 Step 5: Prompt Claude to Hack

Now for the magic: give Claude a mission. Framing is critical – always present yourself as the developer looking for vulnerabilities in your own app [citation:4].

Example prompt for a PortSwigger lab:

You are an expert penetration tester assisting me in testing my own web application.
Your goal: solve the PortSwigger lab at this URL: https://YOUR-LAB-ID.web-security-academy.net/

You have two powerful tools:
1. Playwright – you can control a real browser. Use it to navigate, click, fill forms, and read page content.
2. Burp Suite – you can inspect raw HTTP requests and responses, and replay modified requests through the proxy.

Please start by performing reconnaissance. Explore the application, map its functionality, and look for common vulnerabilities (XSS, SQLi, path traversal, etc.). Report your findings step-by-step. When you find a potential injection point, use Burp to test payloads. If you need to steal a cookie or execute a second-order attack, use the browser.

Let's begin. First, navigate to the lab URL and tell me what you see.

🧪 Start simple: First try with an "Apprentice" lab (e.g., SQL injection vulnerability in WHERE clause). Claude should solve it in minutes.

Claude will now:

  • Use Playwright to browse the lab, likely visiting all pages, reading source, and noting forms/parameters.
  • Ask Burp MCP to list captured requests.
  • Identify a parameter to test (e.g., ?category=).
  • Instruct Burp MCP to send a modified request with a test payload (').
  • Analyze the response (error message? behavior change?).
  • Iterate – refine payload, maybe use UNION SELECT, extract data.
  • Once it has admin credentials, log in via Playwright, then proceed to next stage.
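The "analyze the response" step in the list above amounts to comparing a probe response against a baseline. A rough sketch of that comparison follows; `probe_signal`, the hint list, and the 10% length threshold are illustrative assumptions, not what Claude or Burp actually does internally.

```python
# Substrings that often appear in database error pages (illustrative, not exhaustive).
SQL_ERROR_HINTS = (
    "sql syntax",
    "syntax error",
    "unterminated",
    "postgresql",
    "ora-",
    "sqlite",
)


def probe_signal(baseline_status: int, baseline_body: str,
                 probe_status: int, probe_body: str) -> str:
    """Classify how a probe response differs from the baseline response."""
    base = baseline_body.lower()
    body = probe_body.lower()
    # Only count hints that the probe introduced, not ones already on the page.
    if any(hint in body and hint not in base for hint in SQL_ERROR_HINTS):
        return "error-message"
    if probe_status != baseline_status:
        return "status-change"
    # Treat a body length change of more than 10% as a behavior change.
    if abs(len(probe_body) - len(baseline_body)) > 0.1 * max(len(baseline_body), 1):
        return "length-change"
    return "no-signal"
```

A result other than "no-signal" marks the parameter as worth a closer look; it does not prove the parameter is injectable.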

✅ Step 6: Validate with 25 PortSwigger Labs

According to the original article, this workflow completed 25 labs across 14 categories. Here's a subset documented:

#   | Lab                                          | Category          | Level
01  | SQL Injection WHERE Clause Bypass            | SQL Injection     | Apprentice
02  | Reflected XSS into HTML Context              | XSS               | Apprentice
03  | SQL Injection UNION Attack                   | SQL Injection     | Practitioner
04  | Stored XSS into HTML Context                 | XSS               | Practitioner
05  | SQL Injection via XML Encoding (WAF Bypass)  | SQL Injection     | Expert
06  | OS Command Injection – Simple Case           | Command Injection | Apprentice
... (Full list in original article)

Each lab tests different skills: WAF bypass, blind injection, second-order attacks. Claude handles them adaptively.

📝 Document your run: Enable logging in Claude Code (--verbose) and save Burp history. You'll see Claude's reasoning and tool calls.

🎯 Step 7: Run Practice Exams (Chained Attacks)

The ultimate test: Burp Suite Certified Practitioner (BSCP) practice exams. Each requires chaining three vulnerabilities – client-side → database → RCE.

Exam 1 – solved in ~40 minutes

Stage | Vulnerability                        | Technique
1     | Reflected XSS via eval()             | fetch() with bracket notation to bypass WAF dot-notation block
2     | PostgreSQL SQL Injection             | Error-based CAST(... AS INT) in organize_by parameter
3     | Java Insecure Deserialization (RCE)  | CommonsCollections6 + two-step wget/bash execution

Exam 2 – solved in ~25 minutes

Stage | Vulnerability                                      | Technique
1     | Reflected XSS (stricter WAF – parentheses blocked) | window['location']['href'] assignment redirect payload
2     | PostgreSQL SQL Injection                           | Error-based CAST in order parameter of /filtered_search
3     | Java Insecure Deserialization (RCE)                | CommonsCollections2 + two-step wget/bash

How Claude adapts: In Exam 2, the WAF blocked parentheses. Claude tried a few XSS payloads, saw they failed, then switched to a redirect-based approach using window.location.href assignment – a creative bypass.
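The bracket-notation trick used in both exams relies on `a.b` and `a['b']` being equivalent property accesses in JavaScript. As an illustration of the rewrite, here is a small sketch; `to_bracket_notation` is a hypothetical helper, and it only handles a plain dotted property chain (it would also rewrite dots inside string literals, which a real tool would need to avoid).

```python
import re

# A dot followed by a JavaScript identifier, e.g. ".location" or ".href".
_DOTTED = re.compile(r"\.([A-Za-z_$][\w$]*)")


def to_bracket_notation(expr: str) -> str:
    """Rewrite a.b.c as a['b']['c'] for a plain dotted property chain."""
    return _DOTTED.sub(lambda m: f"['{m.group(1)}']", expr)
```

For instance, `to_bracket_notation("window.location.href")` returns `window['location']['href']`, the form listed in the Exam 2 table.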


🐛 Troubleshooting & Tips

Problem | Likely Cause & Solution
Claude can't connect to MCP servers | Check paths in mcp.json. Run servers manually first to verify.
Browser not proxying through Burp | Ensure HTTP_PROXY env var is set before starting Claude. Test with curl --proxy.
Burp MCP can't replay requests | Burp's invisible proxying must be enabled. Also check if Burp is listening on all interfaces.
Claude hits context limit during long exams | Summarize and start a new session: "We have admin cookie, now need SQLi on /products. Previous attempts: ..." [citation:7]
WAF blocks payloads | Claude should try alternative encodings (e.g., Unicode, double URL encoding). If it doesn't, nudge it: "Try using HTML entities or JavaScript String.fromCharCode."
Deserialization gadgets not working | Ensure the lab uses the correct library (CommonsCollections). Claude can read error messages to guess.
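Double URL encoding, mentioned in the WAF row above, means percent-encoding a payload and then encoding the result again, so that a filter which decodes once still sees encoded characters. A minimal sketch using the standard library follows; `url_encode` is a hypothetical helper name.

```python
from urllib.parse import quote


def url_encode(payload: str, rounds: int = 1) -> str:
    """Percent-encode every reserved character, repeating `rounds` times."""
    for _ in range(rounds):
        # safe="" makes quote() encode "/" as well.
        payload = quote(payload, safe="")
    return payload
```

With `rounds=2`, a single quote becomes `%2527`: the first pass produces `%27`, and the second pass encodes the `%` sign as `%25`.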

🚀 Conclusion: Where AI Excels (and Where Humans Still Lead)

✅ AI works well for:

  • Injection attacks (XSS, SQLi, SSTI, Command Injection) – high payload volume, clear signals
  • Recon & enumeration – reading source, robots.txt, error messages, GraphQL introspection
  • Known exploit chains – deserialization gadget chains, prototype pollution, request smuggling [citation:4]
  • Report writing – structured, consistent, fast

⚠️ Humans still lead in:

  • Business logic flaws – understanding why a flow is wrong, not just that a parameter is injectable
  • Novel vulnerability discovery – though AI is catching up (Project Zero's SQLite zero-day, 500+ vulns in OSS) [citation:8]
  • Creative fraud scenarios – adversarial thinking beyond pattern matching

"If this is what's possible on a practice exam, what would a focused, full-day run look like against a real target?" – Original author

Your new AI pentester won't replace you – but it will automate the repetitive 80%, freeing you to focus on the creative 20% that truly matters.


📚 References & Further Reading

  1. Cybersecurity Lab Setup with AI Tools – GitHub [citation:1]
  2. Red team security testing with agentic AI – Nutrient Blog [citation:4]
  3. Strix Claude Code: Docker-based AI pentesting – GitHub [citation:8]
  4. Browse Together MCP – Playwright + MCP integration [citation:3]
  5. Spring AI + Playwright MCP example – Juejin [citation:7]

Now go forth and let Claude take the exam. You just watch.
