Architecture

protoPen is an autonomous pen-testing and security research agent that runs on a Steam Deck with attached RF/WiFi/RFID peripherals. It combines hardware-in-the-loop security assessments with threat intelligence capabilities.

System Diagram

┌─────────────────────────────────────────────────────────────────┐
│  Clients                                                        │
│  ┌──────────┐  ┌──────────────┐  ┌───────────────────────────┐  │
│  │ Chat UI  │  │ OpenAI API   │  │ A2A (protoWorkstacean)    │  │
│  │ (Gradio) │  │ /v1/chat     │  │ JSON-RPC /a2a             │  │
│  └────┬─────┘  └──────┬───────┘  └─────────────┬─────────────┘  │
│       └───────────────┬┘                        │                │
│                       ▼                         │                │
│              ┌────────────────┐                  │                │
│              │   server.py    │◄─────────────────┘                │
│              │   (FastAPI)    │                                   │
│              └───────┬────────┘                                   │
│                      │                                           │
│                      │                                           │
│                      ▼                                           │
│              ┌───────────────────┐                               │
│              │    LangGraph       │                               │
│              │  create_agent()    │                               │
│              │  + middleware      │                               │
│              └───────────────────┘                               │
│                      │                                           │
│                      ▼                                           │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                   Subagents (task tool)                  │    │
│  │  ┌──────────┐ ┌─────────┐ ┌────────┐                    │    │
│  │  │ Threat   │ │  Vuln   │ │ Intel  │  (Security          │    │
│  │  │ Scanner  │ │ Analyst │ │Reporter│   Research)         │    │
│  │  ├──────────┤ ├─────────┤ ├────────┤                    │    │
│  │  │  Recon   │ │ Exploit │ │Reporter│  (Pentest)         │    │
│  │  └──────────┘ └─────────┘ └────────┘                    │    │
│  └─────────────────────┬───────────────────────────────────┘    │
│                        ▼                                         │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                     Tool Layer                           │    │
│  │  portapack │ flipper │ marauder │ blackarch              │    │
│  │  device_manager │ engagement │ target_intel              │    │
│  │  cve_search │ security_feeds │ github_trending            │    │
│  │  browser │ security_memory │ discord_feed                │    │
│  └─────────────────────┬───────────────────────────────────┘    │
│                        ▼                                         │
│  ┌───────────────┐  ┌───────────────┐  ┌──────────────────┐    │
│  │  USB Serial   │  │  Network      │  │  Knowledge       │    │
│  │  (PortaPack,  │  │  (nmap, WiFi, │  │  Store (SQLite   │    │
│  │   Flipper,    │  │   bettercap,  │  │  + sqlite-vec    │    │
│  │   Marauder)   │  │   web)        │  │  + FTS5)         │    │
│  └───────────────┘  └───────────────┘  └──────────────────┘    │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                   Observability                          │    │
│  │  audit.py (JSONL) │ metrics.py (Prometheus)              │    │
│  │  tracing.py (Langfuse) │ Discord alerts                  │    │
│  └─────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────┘

Agent Backend

protoPen's lead agent is built with LangChain's create_agent() and runs as a compiled LangGraph state machine. Key features:

  • Middleware chain: Knowledge injection, audit logging, memory consolidation, message capture
  • Persistent sessions: SQLite-backed checkpointer that survives container restarts
  • Streaming: astream_events for real-time tool progress and text generation
  • Subagent delegation: The task tool spawns specialized create_react_agent subgraphs
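As a rough illustration of the middleware chain idea, the sketch below wraps a stand-in model step with two middlewares in plain Python. It does not use LangChain's middleware API; `AgentState`, `build_chain`, and the two middleware functions are hypothetical names chosen for this example only.

```python
from dataclasses import dataclass, field
from typing import Callable


# Hypothetical state passed through the chain; not protoPen's actual schema.
@dataclass
class AgentState:
    messages: list[str] = field(default_factory=list)
    audit_log: list[str] = field(default_factory=list)


Handler = Callable[[AgentState], AgentState]
Middleware = Callable[[AgentState, Handler], AgentState]


def knowledge_injection(state: AgentState, next_handler: Handler) -> AgentState:
    # Prepend context before the model sees the conversation.
    state.messages.insert(0, "[context] recent advisories (placeholder)")
    return next_handler(state)


def audit_logging(state: AgentState, next_handler: Handler) -> AgentState:
    # Record entry and exit around everything further down the chain.
    state.audit_log.append("enter")
    result = next_handler(state)
    result.audit_log.append("exit")
    return result


def build_chain(middlewares: list[Middleware], core: Handler) -> Handler:
    # Wrap the core handler from the inside out so the first middleware runs first.
    handler = core
    for mw in reversed(middlewares):
        handler = (lambda m, nxt: lambda s: m(s, nxt))(mw, handler)
    return handler


def model_step(state: AgentState) -> AgentState:
    # Stand-in for the LLM call.
    state.messages.append(f"[model] saw {len(state.messages)} message(s)")
    return state


run = build_chain([knowledge_injection, audit_logging], model_step)
final = run(AgentState(messages=["scan 10.0.0.0/24"]))
```

Each middleware decides what happens before and after the rest of the chain, which is why audit logging can bracket the model call while knowledge injection only acts on the way in.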

Subagents

The lead agent delegates work to eleven specialized subagents via the task tool (synchronously, or detached with run_in_background=True; see Autonomy).
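The difference between synchronous and detached delegation can be sketched with asyncio. This is only an assumption about the general shape: the `run_in_background` parameter name comes from the text, while `run_subagent`, the job-ID scheme, and the `_background` registry are hypothetical.

```python
import asyncio


async def run_subagent(name: str, prompt: str) -> str:
    # Stand-in for invoking a subagent graph.
    await asyncio.sleep(0)
    return f"{name}: done ({prompt})"


# Hypothetical registry of detached jobs, keyed by job ID.
_background: dict[str, asyncio.Task] = {}


async def task(subagent: str, prompt: str, run_in_background: bool = False) -> str:
    """Run a subagent and return its result, or detach it and return a job ID."""
    if run_in_background:
        job_id = f"job-{len(_background) + 1}"
        _background[job_id] = asyncio.create_task(run_subagent(subagent, prompt))
        return job_id
    return await run_subagent(subagent, prompt)


async def main():
    sync_result = await task("recon", "survey 2.4 GHz")
    job_id = await task("threat-scanner", "check new CVEs", run_in_background=True)
    bg_result = await _background[job_id]  # collect the detached result later
    return sync_result, job_id, bg_result


sync_result, job_id, bg_result = asyncio.run(main())
```

In the detached case the caller gets a handle back immediately and can keep working until it needs the result.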

Security Research Domain

| Subagent | Role | Tools |
|---|---|---|
| Threat Scanner | Scans CVE feeds, Exploit-DB, security RSS, GitHub for new threats | cve_search, security_feeds, github_trending, browser, security_memory |
| Vuln Analyst | Deep-reads advisories and PoCs, correlates with target intel, rates exploitability | cve_search, security_feeds, security_memory, browser |
| Intel Reporter | Synthesizes threat intel reports, publishes security digests to Discord | security_memory, discord_feed |

Pentest Domain

| Subagent | Role | Tools |
|---|---|---|
| Orchestrator | Meta-orchestrator: runs a full scripted pipeline, interprets findings, probes high-value targets | (delegates across the pentest tools) |
| Recon | Passive reconnaissance -- RF survey, WiFi scan, network enum | device_manager, portapack, flipper, marauder, blackarch, engagement |
| Exploit | Active exploitation -- PMKID capture, signal replay, vuln scan (mode-gated) | device_manager, portapack, flipper, marauder, blackarch, websocket_test, opsec, engagement |
| Reporter | Finding synthesis -- triage, correlation, report generation | engagement, security_memory |

Blue / Purple Team Domain

| Subagent | Role | Tools |
|---|---|---|
| Defender | CIS audits, hardening checks, container security, port baselines | cis_audit, hardening_check, container_audit, engagement |
| Incident Responder | Log analysis, IOC matching, timeline reconstruction, containment | ir_toolkit, net_monitor, engagement, security_memory |
| Purple Team | Correlates red+blue results, measures ATT&CK coverage gaps | purple_team, cis_audit, container_audit, net_monitor, ir_toolkit, engagement, security_memory |

Self-Curation Domain

| Subagent | Role | Tools |
|---|---|---|
| Dream | Memory consolidation -- prune stale/duplicate/superseded facts (no shell, no SQL) | memory_list, forget_memory, recent_activity, security_memory |

Each subagent gets a filtered tool set and a focused system prompt. The lead agent decides which subagent to invoke based on the task type.
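One way to picture the per-subagent tool filtering is an allowlist lookup against a shared registry. The tool names below are taken from the subagent listings in this section. The registry, the subagent keys, and the `tools_for` helper are hypothetical and only illustrate the idea.

```python
# Shared registry of callables; each entry is a stand-in for a real tool.
ALL_TOOLS = {
    name: (lambda n: lambda **kwargs: f"{n} called")(name)
    for name in [
        "cve_search", "security_feeds", "github_trending", "browser",
        "security_memory", "discord_feed", "engagement",
    ]
}

# Allowlist per subagent (keys are illustrative identifiers).
SUBAGENT_TOOLS = {
    "threat-scanner": ["cve_search", "security_feeds", "github_trending",
                       "browser", "security_memory"],
    "intel-reporter": ["security_memory", "discord_feed"],
    "reporter": ["engagement", "security_memory"],
}


def tools_for(subagent: str) -> dict:
    """Return only the tools the named subagent is allowed to call."""
    allowed = SUBAGENT_TOOLS[subagent]
    missing = [t for t in allowed if t not in ALL_TOOLS]
    if missing:
        raise KeyError(f"unknown tools for {subagent}: {missing}")
    return {name: ALL_TOOLS[name] for name in allowed}


reporter_tools = tools_for("intel-reporter")
```

Keeping the allowlist in data rather than code makes it easy to review which subagent can reach which hardware or network tool.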

Engagement Lifecycle

  1. Start -- Name the engagement, define scope, set mode (passive/active/redteam)
  2. Recon -- Map the environment. Subagent: Recon
  3. Exploit -- Test vulnerabilities. Subagent: Exploit (mode-gated)
  4. Report -- Synthesize findings. Subagent: Reporter
  5. End -- Close the engagement, persist findings and report
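The lifecycle above can be sketched as a small state machine. The phase order and the three mode names come from the list. The gating rule (exploitation is skipped in passive mode) is an assumption made for illustration, and the `Engagement` class is hypothetical.

```python
from enum import Enum


class Mode(Enum):
    PASSIVE = "passive"
    ACTIVE = "active"
    REDTEAM = "redteam"


PHASES = ["start", "recon", "exploit", "report", "end"]
# Assumption: only these modes permit the exploit phase.
EXPLOIT_MODES = {Mode.ACTIVE, Mode.REDTEAM}


class Engagement:
    def __init__(self, name: str, scope: list[str], mode: Mode):
        self.name, self.scope, self.mode = name, scope, mode
        self.phase = "start"

    def advance(self) -> str:
        """Move to the next phase, skipping exploit when the mode forbids it."""
        idx = PHASES.index(self.phase)
        if idx == len(PHASES) - 1:
            raise RuntimeError("engagement already ended")
        nxt = PHASES[idx + 1]
        if nxt == "exploit" and self.mode not in EXPLOIT_MODES:
            nxt = "report"
        self.phase = nxt
        return nxt


passive = Engagement("lab-wifi", ["192.168.1.0/24"], Mode.PASSIVE)
passive_path = [passive.advance() for _ in range(3)]

active = Engagement("lab-wifi", ["192.168.1.0/24"], Mode.ACTIVE)
active_path = [active.advance() for _ in range(4)]
```

Here `passive_path` never includes the exploit phase, while `active_path` walks through all four transitions.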

All findings are logged in real time. Critical/high findings trigger Discord alerts.
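A minimal sketch of that logging and alerting rule, assuming findings are dictionaries with `severity` and `title` keys (a hypothetical shape). The `send_alert` callback stands in for the Discord webhook call.

```python
import json

ALERT_SEVERITIES = {"critical", "high"}


def record_finding(finding: dict, log: list, send_alert) -> bool:
    """Log every finding; alert only on critical/high. Returns True if alerted."""
    log.append(json.dumps(finding, sort_keys=True))
    if finding.get("severity", "").lower() in ALERT_SEVERITIES:
        send_alert(f"[{finding['severity'].upper()}] {finding['title']}")
        return True
    return False


log, alerts = [], []
record_finding({"severity": "low", "title": "Open SSID"}, log, alerts.append)
record_finding({"severity": "high", "title": "WPA2 PMKID captured"}, log, alerts.append)
```

Both findings land in the log, but only the high-severity one produces an alert.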

System Prompt Composition

The system prompt is assembled from multiple sources:

  1. SOUL.md -- Agent identity, personality, values
  2. Hardware status -- Boot-time sitrep (devices connected, network, engagement state)
  3. Skills -- Research methodology and pentest playbooks from skills/ directory
  4. Subagent instructions -- Available subagent types and delegation rules
  5. Security context -- Dynamic injection via KnowledgeMiddleware (recent advisories, threat intel)
  6. Guidelines -- Operational rules and output conventions
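The assembly order above could be implemented as a simple ordered join. This is a sketch under stated assumptions: the section headers, the `compose_system_prompt` helper, and the sample bodies are all hypothetical, and a real build would read SOUL.md and the skills/ directory from disk rather than use literals.

```python
def compose_system_prompt(sections: list[tuple[str, str]]) -> str:
    """Join non-empty sections in order, each under a plain-text header."""
    parts = [
        f"## {title}\n{body.strip()}"
        for title, body in sections
        if body.strip()  # skip sources that produced nothing
    ]
    return "\n\n".join(parts)


prompt = compose_system_prompt([
    ("Identity", "You are protoPen."),            # would come from SOUL.md
    ("Hardware status", "No devices attached."),  # boot-time sitrep
    ("Skills", ""),                               # empty here, so omitted
    ("Guidelines", "Stay in scope."),
])
```

Dropping empty sections keeps the prompt compact when, for example, no skills are installed or no engagement is active.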

Observability Stack

| Component | Purpose | Storage |
|---|---|---|
| audit.py | JSONL log of every tool call with args, result, duration, session | /sandbox/audit/audit.jsonl |
| metrics.py | Prometheus counters/histograms for LLM calls, tool latency, sessions | /metrics endpoint |
| tracing.py | Langfuse spans for tool calls, organized by research phase | Langfuse server |
| Discord alerts | Real-time webhook notifications for critical/high findings | Discord channel |
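To make the JSONL audit format concrete, here is a sketch of a decorator that appends one JSON line per tool call. It is not the contents of audit.py: the `audited` decorator and the field names are assumptions based on the fields listed for the audit log (args, result, duration, session), and the example writes to a temporary file rather than /sandbox/audit/audit.jsonl.

```python
import functools
import json
import tempfile
import time
from pathlib import Path


def audited(log_path: Path, session: str):
    """Decorator that appends one JSON line per tool call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(**kwargs):
            start = time.perf_counter()
            result = fn(**kwargs)
            entry = {
                "tool": fn.__name__,
                "args": kwargs,
                "result": repr(result)[:200],  # truncate large outputs
                "duration_s": round(time.perf_counter() - start, 4),
                "session": session,
            }
            with log_path.open("a", encoding="utf-8") as fh:
                fh.write(json.dumps(entry) + "\n")
            return result
        return wrapper
    return decorator


# Temporary location for the demo log file.
log_file = Path(tempfile.mkdtemp()) / "audit.jsonl"


@audited(log_file, session="demo")
def cve_search(query: str) -> list:
    return [f"result for {query}"]  # stand-in for a real lookup


cve_search(query="openssl")
entries = [json.loads(line) for line in log_file.read_text(encoding="utf-8").splitlines()]
```

One record per line means the log can be tailed, grepped, or streamed into other tooling without parsing the whole file.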

Part of the protoLabs autonomous development studio.