Building a Kali Linux MCP Server That Actually Works
I learn new things best when I actually play with them and apply what I've learned. Enter the Model Context Protocol (MCP), the hottest new protocol on the market! After some head-scratching and iteration, I finally managed to connect my first MCP server, and the best part? It hacks! No, like really, it can hack for me!
I containerised Kali Linux, wired up the MCP bridge, and now run a reproducible, on-demand Kali toolbelt for my agents via Docker Compose.
Let me tell you the real story. I thought this would be a quick weekend hack. Three debugging sessions later, I learned more about Docker, privilege escalation, and MCP integration than any tutorial could teach me. Grab your coffee (or tea, I don't judge (maybe a little)).
What is MCP and Why Should You Care?
Model Context Protocol is Anthropic's standard for connecting AI systems to external tools and data sources. Think of it as a universal translator between your AI agents and the real world.
Instead of building custom integrations for every tool, MCP provides a standardised way for AI models to:
- Execute commands safely
- Access file systems
- Query databases
- Interact with APIs
- And yes, run security tools
The beauty? Your AI agents can now orchestrate complex workflows that would normally require manual intervention.
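To make that concrete, here's roughly what exposing a tool over MCP looks like with the official Python SDK (a minimal sketch; the toy ping tool stands in for the kind of real Kali wrappers discussed later):

```python
import subprocess

from mcp.server.fastmcp import FastMCP

# A minimal MCP tool server sketch (pip install mcp). The ping tool is a
# toy stand-in for the kind of wrappers MCP-Kali-Server provides.
mcp = FastMCP("demo-tools")

@mcp.tool()
def ping_host(host: str) -> str:
    """Ping a host once and return the raw output."""
    result = subprocess.run(
        ["ping", "-c", "1", host],
        capture_output=True, text=True, timeout=10,
    )
    return result.stdout or result.stderr

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```

Once a client connects, the model can discover and call ping_host like any other capability, no custom glue required.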
The Journey: From Idea to Working System
The Initial Plan
My plan was simple: use Kali as a toolbelt for an agent, so it can do some security stuff for me. Here were my steps:
- Spin up Kali Linux in Docker
- Install the MCP-Kali-Server from GitHub
- Configure opencode to connect
- Hack the planet
- ???
- Profit
Spoiler alert: I'm still working on step 4...
First Attempt: The Naive Approach
I searched for ready-to-use MCP servers for Kali, and I actually found a blog post. Noice! I can just copy and paste the instructions and use them. You know what, let my agent do this for me:
Hey Orchestrator, analyze the blog post here and set up Kali MCP server that I can use. Here is the link: https://medium.com/@sasisachins2003/penetration-testing-made-simple-kali-mcp-with-docker-and-claude-desktop-6d50a6a60300
Got it! Let me scan the blog post and set up a ready-to-use Kali Linux MCP server... analyzing blog post... creating plan... pulling data... doing some magic... casting fireball... writing files... I'm done! You now have a ready-to-use Kali MCP; simply run docker compose up and connect it to your agent.

Awesome! Now let's hack the planet! But from what I've learned before, let's verify first, before actually using it.
*Open file docker-compose.yaml*
Hmmm... I see...
*open file Dockerfile*
Yep, I saw that in the post... ah, OK... yes, yes... yeah, it's fucked.
My agent unfortunately messed up the setup. So...

The MCP Integration Challenge
Getting the MCP server running was its own adventure. The MCP-Kali-Server project provides a Flask API that wraps common Kali tools, but integrating it with opencode required understanding how MCP actually works.
MCP uses JSON-RPC for communication. Your AI agent sends structured requests to the MCP server, which translates them into tool executions. The server then returns structured responses that the AI can understand and act upon.
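To give you a feel for the wire format, here's a sketch of such an exchange. The tools/call method and result shape follow the MCP spec; the tool name and arguments are made up for illustration:

```python
import json

# What an MCP tool invocation looks like on the wire (JSON-RPC 2.0).
# "run_nmap" and its arguments are illustrative, not the actual tool
# names exposed by MCP-Kali-Server.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "run_nmap",
        "arguments": {"target": "10.0.0.5", "flags": "-sV"},
    },
}

# The server answers with a structured result the model can read:
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "22/tcp open ssh ..."}],
    },
}

print(json.dumps(request, indent=2))
```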
The Technical Implementation
Dockerfile: Building the Foundation
Let me show you the Dockerfile that finally worked. This beast installs comprehensive Kali toolsets and sets up the MCP server environment:
/docker/Dockerfile

```dockerfile
# Build an image that contains Kali tools + the MCP-Kali-Server repo
FROM kalilinux/kali-rolling:latest

ENV DEBIAN_FRONTEND=noninteractive

# Basic tooling + Python + git
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        ca-certificates curl gnupg lsb-release git python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Install the Kali MCP package plus the full kali-tools metapackages
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        mcp-kali-server \
        kali-tools-top10 \
        kali-tools-web \
        kali-tools-database \
        kali-tools-passwords \
        kali-tools-wireless \
        kali-tools-reverse-engineering \
        kali-tools-exploitation \
        kali-tools-social-engineering \
        kali-tools-sniffing-spoofing \
        kali-tools-post-exploitation \
        kali-tools-forensics \
        kali-tools-hardware \
        kali-tools-crypto-stego \
        kali-tools-vulnerability \
        kali-tools-information-gathering \
    && rm -rf /var/lib/apt/lists/*

# Clone the MCP-Kali-Server repository into the image
RUN git clone https://github.com/Wh0am123/MCP-Kali-Server.git /opt/MCP-Kali-Server

# Install Python requirements from the repo (it ships a requirements.txt).
# Allow failures (|| true) if pip trips over platform-specific wheels; the build log will show errors.
RUN pip3 install --no-cache-dir -r /opt/MCP-Kali-Server/requirements.txt || true

# Add entrypoint
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh

# Expose the Kali API port; we map it to localhost in docker-compose
EXPOSE 5000

# Any additional packages
RUN apt-get update && \
    apt-get install -y --no-install-recommends gobuster \
    && rm -rf /var/lib/apt/lists/*

# Run the entrypoint, which starts the Flask API
CMD ["/usr/local/bin/entrypoint.sh"]
```
Why this works: We're installing comprehensive Kali toolsets, not just the base image. The kali-tools-* metapackages give us everything from nmap and sqlmap to Metasploit, with a final apt layer for stragglers like gobuster. The Python environment setup ensures our MCP server has all its dependencies.
Key insight: Installing tools at build time makes the container larger but eliminates runtime dependency issues. In security tooling, reliability trumps container size.
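If you want to sanity-check that the bake-everything-in approach worked, a quick smoke test from the host does the trick. A small sketch, assuming the container is already running as kali-mcp (the name set in the compose file below):

```python
import subprocess

# Quick smoke test: verify a few tools were baked into the image.
# Assumes the container is running as "kali-mcp" (see docker-compose).
TOOLS = ["nmap", "gobuster", "sqlmap"]

for tool in TOOLS:
    result = subprocess.run(
        ["docker", "exec", "kali-mcp", "which", tool],
        capture_output=True, text=True,
    )
    status = result.stdout.strip() if result.returncode == 0 else "MISSING"
    print(f"{tool}: {status}")
```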
Docker Compose: Orchestrating the Chaos
The Docker Compose configuration handles the networking and privilege requirements that gave me so much trouble:
/docker/docker-compose.yaml

```yaml
services:
  kali:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: kali-mcp
    privileged: true
    ports:
      - "127.0.0.1:5000:5000" # bind the API to localhost only
    volumes:
      - ./share:/root/share
    environment:
      - PYTHONUNBUFFERED=1
    tty: true
    restart: unless-stopped
    stdin_open: true
    cap_add:
      - NET_ADMIN # for network interface manipulation
      - SYS_ADMIN # for mounting filesystems and other system operations
```
The privileged flag: Yes, this breaks Docker's security model. But we're running security tools that need raw network access, and in a controlled environment that's an acceptable trade-off. (Strictly speaking, privileged: true already grants every capability, so the cap_add entries mainly document which ones the tools actually need.)

Volume mounts: Persistent storage for scan results and wordlists. If you don't know volume mounts yet, learn them and use them!
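To show the pattern: a scan run inside the container can write straight to the shared volume, and the host reads it back without any docker cp gymnastics. The target and file names here are just examples:

```python
import subprocess
from pathlib import Path

# Run a scan inside the container, writing output to the shared volume
# (target and file name are illustrative).
subprocess.run([
    "docker", "exec", "kali-mcp",
    "nmap", "-sV", "-oN", "/root/share/scan-10.0.0.5.txt", "10.0.0.5",
])

# ...then read the result directly on the host, relative to the compose dir.
report = Path("share/scan-10.0.0.5.txt").read_text()
print(report.splitlines()[0])
```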
Entrypoint Script: Simple but Essential
The entrypoint script is deceptively simple:
/docker/entrypoint.sh

```bash
#!/bin/bash
set -euo pipefail

exec python3 /opt/MCP-Kali-Server/kali_server.py --ip 0.0.0.0 --port 5000
```
Why a separate script? Flexibility. We can add initialisation logic, environment checks, or tool updates without rebuilding the container.
Opencode Configuration: The MCP Bridge
The final piece is configuring opencode to connect to our containerised Kali server:
/config/opencode.json

```json
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "kali_mcp": {
      "type": "local",
      "command": [
        "docker",
        "exec",
        "-i",
        "kali-mcp",
        "python3",
        "/opt/MCP-Kali-Server/mcp_server.py",
        "--server",
        "http://127.0.0.1:5000"
      ],
      "environment": {
        "PYTHONUNBUFFERED": "1"
      },
      "enabled": true,
      "timeout": 30000
    }
  }
}
```
The localhost binding: Notice the 127.0.0.1:5000 address. This ensures the MCP server is only accessible locally, adding a security layer while maintaining functionality.
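A quick way to convince yourself of that binding: port 5000 should accept connections on loopback and be unreachable on your LAN address. A small sketch (substitute your own LAN IP for the placeholder):

```python
import socket

# Port 5000 should answer on loopback and stay closed on the LAN side.
# "192.168.1.50" is a placeholder; substitute your host's LAN address.
for label, host in [("loopback", "127.0.0.1"), ("LAN", "192.168.1.50")]:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(2)
    is_open = sock.connect_ex((host, 5000)) == 0
    sock.close()
    print(f"{label} ({host}): {'open' if is_open else 'closed'}")
```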
The Results: What We Achieved
Reproducible Security Workflows
With the MCP bridge in place, I can now ask my AI agents to:
- "Scan this IP range and identify open services"
- "Check this web application for common vulnerabilities"
- "Generate a wordlist for this target domain"
- "Run a comprehensive security assessment"
The AI orchestrates the tools, interprets results, and provides actionable insights. No more manual tool switching or result copying.
Faster Reconnaissance and Triage
What used to take hours of manual tool execution now happens in minutes. The AI can run multiple tools in parallel, correlate results, and highlight the most critical findings.
Controlled Environment Benefits
Everything runs in a containerised environment. No tool conflicts. No host system pollution. Spin up, test, tear down. Perfect for both learning and production security assessments.
Automated Tool Orchestration
The real magic happens when tools work together. The AI can use nmap results to inform gobuster scans, use discovered endpoints for vulnerability testing, and chain together complex attack scenarios.
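As a rough sketch of what that chaining looks like in code (the endpoint paths and response fields here are hypothetical; the real Flask routes live in MCP-Kali-Server's kali_server.py), the pattern is to feed one tool's structured output into the next tool's arguments:

```python
import requests

API = "http://127.0.0.1:5000"

# Endpoint paths and response fields below are hypothetical; check
# MCP-Kali-Server's kali_server.py for the actual Flask routes.
nmap = requests.post(
    f"{API}/api/tools/nmap",
    json={"target": "10.0.0.5", "flags": "-sV"},
).json()

# Feed discovered web ports into gobuster runs.
web_ports = [p for p in nmap.get("open_ports", []) if p in (80, 443, 8080)]
for port in web_ports:
    requests.post(
        f"{API}/api/tools/gobuster",
        json={
            "url": f"http://10.0.0.5:{port}",
            "wordlist": "/root/share/common.txt",
        },
    )
```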
The Reality Check: Agent Limitations and Long-Running Scans
Now, let me be honest about something that bit me during this project. Agents are impatient. Really impatient.
You know how humans get restless waiting for a long scan to finish? Agents are worse. They'll start a comprehensive nmap scan, wait about three minutes, then basically throw their hands up and say "this is taking too long, let me try something else" and abort the operation.
This creates real challenges for security workflows. A thorough network scan might take several hours. A comprehensive wordlist attack could run overnight. Vulnerability assessments often require patience and persistence. But current agent architectures? They're built for quick interactions, not marathon operations.
The real-world impact is significant. You can't rely on agents for:
- Large-scale network discovery that takes hours
- Comprehensive wordlist attacks with massive dictionaries
- Deep vulnerability scans that methodically test every endpoint
- Any operation where "set it and forget it" is the expected workflow
But here's where it gets interesting. The solution isn't to make agents more patient. No! It's to build better systems. I've been thinking about experimenting with pipeline tools like n8n to trigger scans and continue workflows when they complete. The idea is simple: separate scan initiation from result processing.
Instead of asking an agent to "run this scan and tell me the results," you design asynchronous workflows. The agent kicks off the scan, the pipeline monitors completion, and a fresh agent processes the results when they're ready. It's like having a relay team instead of asking one runner to complete a marathon.
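Here's a minimal sketch of that relay pattern in plain Python; in practice the pipeline tool would own the polling, and the state file is a stand-in for whatever handoff mechanism you use:

```python
import json
import subprocess
from pathlib import Path

SHARE = Path("share")  # the ./share volume from docker-compose

def start_scan(target: str) -> None:
    """Step 1: kick off a long scan and return immediately."""
    report = f"scan-{target}.txt"
    subprocess.Popen([
        "docker", "exec", "kali-mcp",
        "nmap", "-sV", "-p-", "-oN", f"/root/share/{report}", target,
    ])
    # Persist a handle so a later, separate run knows what to look for.
    (SHARE / "scan-state.json").write_text(
        json.dumps({"target": target, "report": report}))

def collect_results() -> str | None:
    """Step 2: a later pipeline step (n8n, cron, a fresh agent) picks up."""
    state = json.loads((SHARE / "scan-state.json").read_text())
    report = SHARE / state["report"]
    # Simplification: a real pipeline would watch for process exit or an
    # explicit "done" marker instead of just checking file existence.
    return report.read_text() if report.exists() else None
```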
This limitation actually points to something important about the future of AI-powered security testing. We need orchestration systems that can handle long-running operations gracefully. The current "conversational" model of AI interaction doesn't map well to security workflows that might span hours or days.
For now, I work around this by asking the agent to handle all the quick operations and to tell me whenever there's a long-running task that I need to trigger myself and feed the results back in later. Manually! Urgh... I suspect this workaround is only temporary...

Security Considerations
Important: This setup is designed for controlled environments and ethical security testing. Never use these tools against systems you don't own or lack permission to test.
Best practices:
- Run on isolated networks
- Use VPNs for external testing
- Document all activities
- Follow responsible disclosure practices
- Respect rate limits and target resources
Conclusion
Building this Kali Linux MCP server showed me the power of bridging traditional security tools with modern AI capabilities.
The debugging journey was frustrating but educational. Every failed attempt taught me something new about containerisation, networking, or security tool requirements. The final working system feels like a genuine milestone – not just because it works, but because of everything I learned building it.
Remember: The goal isn't just to automate security tools. It's to create reproducible, scalable workflows that enhance human security professionals rather than replace them. AI agents can handle the repetitive scanning and correlation work, freeing us to focus on analysis, strategy, and creative problem-solving.
For any questions, hit me up on Mastodon, LinkedIn or check out the code repository. The complete setup is available, and I'd love to see what improvements you come up with.
Happy Coding and Hack the Planet!
Disclaimer: This setup is intended for authorised security testing and educational purposes only. Always ensure you have proper permission before testing any systems, and follow responsible disclosure practices for any vulnerabilities discovered.