
Building a Local Agentic Control Plane with Mac Mini, Mac Studio, Ollama, Qwen, Matrix, and iPhone Approval

How to build a local agentic control plane: architecture, MCP server integration, and self-hosted AI orchestration patterns.

February 5, 2026·17 min read

Introduction

This post documents the build-out of a local, human-in-the-loop AI agent system. The goal was not to create another chatbot. The goal was to create a practical control plane where a local AI agent can receive work, reason over the task using a local model, pause for human approval, notify me on my phone, and then execute only after I approve.

The system is intentionally local-first. It does not rely on paid cloud AI APIs. The Mac Studio runs the model. The Mac Mini runs the agent controller. Matrix and Element provide the messaging and approval interface. The iPhone becomes a mobile approval console.

At the end of this implementation, the working flow became:

Drop task file → watcher detects it → approval required → Matrix notification → approve from phone → agent runs → summary generated

This is the foundation for spec-driven agentic development.


Why Build This?

Most AI coding tools are interactive. You open an IDE, type a prompt, and wait for code. That is useful, but it is not the same as a durable automation system.

The target system here is closer to an internal DevOps pipeline:

  1. A task arrives.
  2. The agent reads a spec.
  3. The agent asks a local model for a plan.
  4. The agent pauses if approval is required.
  5. A human approves from a phone.
  6. The agent runs.
  7. The system records output.

This design is useful for local coding agents, infrastructure automation, benchmark pipelines, future DGX GPU job triggers, personal AI operations, and spec-driven development workflows.

The core motivation is cognitive load reduction. When you are running multiple projects simultaneously, context-switching is the real cost. An async approval model means work moves forward without requiring you to be at a terminal. You approve from your phone, and the system handles the rest.


How This Compares to Other Agent Systems

Several open-source systems tackle autonomous coding agents. Understanding where they fit helps clarify what this system is optimizing for.

| System | Trigger model | Approval gate | Isolation | Operator model | Spec format |
|---|---|---|---|---|---|
| OpenClaw | Chat message to gateway | None (general routing) | Not specified | Multi-user | Prompt-based |
| OpenHands | API call or web UI | None (runs end-to-end) | Docker sandbox | Multi-user | GitHub issue |
| SWE-agent | GitHub issue | None (auto-resolves) | Docker container | Research pipeline | GitHub issue |
| Aider | Interactive CLI prompt | None (interactive) | None | Single user | Chat turn |
| AutoGPT | Config file | None (loop until done) | Docker | Multi-user | Task file |
| This system | File drop in inbox/ | First-class SDLC gate | Docker sandbox (planned; v1 runs on the host) | Single operator | execution-plan.md |

What makes this different

File-based inbox trigger. Dropping a .md file into a folder is auditable, git-trackable, and requires zero API surface. You can inspect every pending task as a plain text file. You can version-control your task queue. This is intentional.

SDLC gating at predefined stages. OpenClaw, OpenHands, and most other systems treat the entire run as one atomic operation. This system introduces the concept of stages (dev, staging, prod) where approval is only required at gates that matter. A dev-stage task runs automatically. A staging task requires your sign-off. The blast radius stays small.

Single-operator design. Most agent frameworks optimize for team collaboration and multi-user orchestration. This system optimizes for one person managing multiple parallel projects with minimal context-switching. The iPhone is a first-class interface because that is where approval happens in practice.

Local-first without apology. Ollama and Qwen run on dedicated hardware. No cloud API key rotation. No usage limits. No data leaving the Tailscale network. This matters for long-running workloads on hardware like the DGX.


Hardware and Services Used

| Device | Role |
|---|---|
| Mac Mini | Always-on agent runner and control plane |
| Mac Studio | Ollama model server running Qwen |
| MacBook Pro | Operator workstation using terminal, VS Code, and Element |
| iPhone | Mobile approval UI through Element |
| DGX | Future heavy GPU workload execution target |

The Mac Mini is the control plane because it can remain always-on. The Mac Studio is reserved for model serving. The MacBook Pro can move around without interrupting agent jobs.


High-Level Architecture

                 iPhone / MacBook Pro
                  Element client
                       |
                       | Matrix messages
                       v
             Matrix Synapse on Mac Studio
                       |
                       v
              Mac Mini Agent Runner
       watcher.py / spec_runner.py / listener.py
                       |
                       | Ollama HTTP API
                       v
              Mac Studio Local Models
                 qwen2.5-coder:32b
                       |
                       v
            Workspace files and outputs

The system has three important loops:

  1. File loop: trigger files appear in ~/agentic/inbox.
  2. Approval loop: approval requests move to ~/agentic/inbox/pending and notify Matrix.
  3. Execution loop: approved files return to inbox and the agent runs.

Why three separate loops instead of one

Most agent frameworks collapse these into one flow: receive task → run task → done. Splitting them into three distinct loops gives you control points.

The file loop means the queue is always inspectable. You can look at inbox/ at any time and see exactly what is waiting. There is no hidden in-memory state.

The approval loop means the system is async by default. You do not have to be at a terminal when a task arrives. The file sits in pending/ until you act on it. This is what makes the iPhone a viable interface instead of an afterthought.

The execution loop means approval and execution are decoupled. The watcher knows nothing about the Matrix protocol; it only shells out to send_matrix.sh. The Matrix listener knows nothing about spec_runner.py; it only calls approve.sh and reject.sh. Each component has one job and fails independently.


Folder Layout

~/agentic/
├── configs/
│   ├── agent.env
│   └── hosts.env
├── inbox/
│   └── pending/
├── logs/
├── outbox/
│   └── rejected/
├── repos/
├── runner/
│   ├── watcher.py
│   ├── spec_runner.py
│   └── matrix_approval_listener.py
├── scripts/
│   ├── approve.sh
│   ├── reject.sh
│   ├── pending.sh
│   ├── status.sh
│   └── send_matrix.sh
├── secrets/
│   └── matrix-bot.env
└── workspaces/
    └── mini-control-plane/
        ├── .venv/
        ├── .project/
        │   └── specs/
        │       └── execution-plan.md
        ├── hello.txt
        └── agent-summary.md

This structure matters because it separates concerns. The inbox is not the workspace. The runner is not the model. The Matrix listener is not the executor. Each part has one job.


Step 1: Use Stable Hostnames Instead of IP Addresses

The system uses Tailscale MagicDNS names instead of raw DHCP IP addresses. This matters because home router DHCP assignments can change after reboots, outages, or network changes.

Create the host config:

mkdir -p ~/agentic/configs
vi ~/agentic/configs/hosts.env

Add:

# Tailscale MagicDNS hostnames
MACBOOK_HOST=your-macbook-hostname
MACMINI_HOST=your-mac-mini-hostname
MACSTUDIO_HOST=your-mac-studio-hostname
IPHONE_HOST=your-iphone-hostname

# Service endpoints
OLLAMA_BASE_URL=http://your-mac-studio-hostname:11434
MATRIX_HOMESERVER=http://your-mac-studio-hostname:8008
MATRIX_CLIENT_URL=http://your-mac-studio-hostname:8008
AGENT_RUNNER_HOST=your-mac-mini-hostname

# Paths
AGENTIC_HOME=$HOME/agentic
WORKSPACE_ROOT=$HOME/agentic/workspaces
REPO_ROOT=$HOME/agentic/repos
LOG_DIR=$HOME/agentic/logs
INBOX_DIR=$HOME/agentic/inbox
OUTBOX_DIR=$HOME/agentic/outbox

Then create:

vi ~/agentic/configs/agent.env

Add:

source $HOME/agentic/configs/hosts.env

OLLAMA_MODEL=qwen2.5-coder:32b
AGENT_MODE=local
SAFE_MODE=true

Validate that Mac Mini can reach Ollama on Mac Studio:

source ~/agentic/configs/agent.env
curl $OLLAMA_BASE_URL/api/tags

The successful result showed models such as qwen2.5-coder:32b, confirming that the Mac Mini could call the Mac Studio model server.
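
The same reachability check can be done from Python, which is useful if the runner should fail fast when the expected model is missing. This is a minimal sketch using only the standard library. The function names model_names and ollama_has_model are hypothetical, and the code assumes the /api/tags response carries a top-level "models" list whose entries have a "name" field.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from a parsed /api/tags response."""
    return [m["name"] for m in tags_payload.get("models", []) if "name" in m]


def ollama_has_model(base_url: str, model: str, timeout: float = 10.0) -> bool:
    """Return True if the Ollama server at base_url lists the given model."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        payload = json.load(resp)
    return model in model_names(payload)
```

Calling ollama_has_model with the OLLAMA_BASE_URL and OLLAMA_MODEL values from the config files gives a yes/no answer before any task is accepted.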


Step 2: Create the Agent Workspace

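mkdir -p ~/agentic/{configs,inbox,logs,outbox,repos,runner,scripts,secrets,workspaces}
mkdir -p ~/agentic/inbox/pending
mkdir -p ~/agentic/outbox/rejected
mkdir -p ~/agentic/workspaces/mini-control-plane/.project/specs

# Later steps expect a virtual environment with requests installed and a git repository in the workspace
python3 -m venv ~/agentic/workspaces/mini-control-plane/.venv
~/agentic/workspaces/mini-control-plane/.venv/bin/pip install requests
git -C ~/agentic/workspaces/mini-control-plane init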

Create the first spec file:

vi ~/agentic/workspaces/mini-control-plane/.project/specs/execution-plan.md

Add:

# Execution Plan

## Metadata
- id: test-001
- priority: low
- safe_mode: true

## Objective
Run a simple agent workflow test.

## Tasks
1. Create hello.txt with a message
2. Run git status
3. Generate a summary file

## Constraints
- Do not delete files
- Do not run destructive commands

## Expected Output
- hello.txt created
- agent-summary.md generated

The key design idea is that the spec is separate from the runner. The runner does not hardcode a human prompt. It reads a project-level execution plan.
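
Because the spec is a plain file, the runner can read structured fields out of it before ever calling the model. The sketch below is not part of the runner shown in this post. It assumes the Metadata section keeps the "- key: value" bullet format used above, and parse_metadata is a hypothetical helper name.

```python
def parse_metadata(spec_text: str) -> dict[str, str]:
    """Collect '- key: value' bullets found under the '## Metadata' heading."""
    metadata = {}
    in_metadata = False
    for line in spec_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Any second-level heading starts a new section.
            in_metadata = stripped[3:].strip().lower() == "metadata"
            continue
        if in_metadata and stripped.startswith("- ") and ":" in stripped:
            key, value = stripped[2:].split(":", 1)
            metadata[key.strip()] = value.strip()
    return metadata


SPEC = """# Execution Plan

## Metadata
- id: test-001
- priority: low
- safe_mode: true

## Objective
Run a simple agent workflow test.
"""

print(parse_metadata(SPEC))
# {'id': 'test-001', 'priority': 'low', 'safe_mode': 'true'}
```

Values come back as strings, so a flag such as safe_mode still needs an explicit comparison against "true" before it is used in a decision.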


Step 3: Create the Agent Runner

Create:

vi ~/agentic/runner/spec_runner.py

Add:

import os
from pathlib import Path
from datetime import datetime
import requests
import subprocess

AGENTIC_HOME = Path.home() / "agentic"
WORKSPACE_ROOT = Path(os.getenv("WORKSPACE_ROOT", AGENTIC_HOME / "workspaces"))
WORKSPACE_NAME = os.getenv("AGENT_WORKSPACE", "mini-control-plane")
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://your-mac-studio-hostname:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "qwen2.5-coder:32b")

project = WORKSPACE_ROOT / WORKSPACE_NAME
spec_file = project / ".project" / "specs" / "execution-plan.md"

if not spec_file.exists():
    raise FileNotFoundError(f"Missing spec file: {spec_file}")

plan = spec_file.read_text()

prompt = f"""
You are a local spec-driven coding agent.

Read this execution plan and create a short implementation plan.
Do not run destructive commands.

Execution plan:
{plan}
"""

response = requests.post(
    f"{OLLAMA_BASE_URL}/api/generate",
    json={
        "model": OLLAMA_MODEL,
        "prompt": prompt,
        "stream": False,
    },
    timeout=180,
)

response.raise_for_status()
model_output = response.json()["response"]

hello_file = project / "hello.txt"
hello_file.write_text("Hello from Mac Mini agent runner using Qwen on Mac Studio.\n")

git_status = subprocess.check_output(
    ["git", "status", "--short"],
    cwd=project,
    text=True,
)

summary = f"""# Agent Run Summary

Generated: {datetime.now().isoformat()}

Workspace: {WORKSPACE_NAME}
Model: {OLLAMA_MODEL}

## Model Plan

{model_output}

## Git Status

{git_status}
"""

(project / "agent-summary.md").write_text(summary)

print(summary)

Check syntax:

~/agentic/workspaces/mini-control-plane/.venv/bin/python -m py_compile ~/agentic/runner/spec_runner.py

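Run it. A plain source only sets shell variables; set -a exports them so that os.getenv in the Python process can see OLLAMA_BASE_URL and OLLAMA_MODEL:

set -a; source ~/agentic/configs/agent.env; set +a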
~/agentic/workspaces/mini-control-plane/.venv/bin/python ~/agentic/runner/spec_runner.py

This created hello.txt and agent-summary.md.


Step 4: Create the Watcher

The watcher is the event loop. It keeps checking the inbox for new .md task files.

Create:

vi ~/agentic/runner/watcher.py

Add:

import time
import os
from pathlib import Path
import subprocess

INBOX = Path(os.path.expanduser("~/agentic/inbox"))
LOG = Path(os.path.expanduser("~/agentic/logs/watcher.log"))

def log(msg):
    with open(LOG, "a") as f:
        f.write(msg + "\n")
    print(msg)

def run_agent(trigger_file):
    log(f"[+] Processing {trigger_file}")

    data = {}
    for line in trigger_file.read_text().splitlines():
        if ":" in line:
            k, v = line.split(":", 1)
            data[k.strip()] = v.strip()

    workspace = data.get("workspace", "mini-control-plane")
    action = data.get("action", "run")
    approval_required = data.get("approval_required", "false").lower()

    log(f"[>] workspace={workspace} action={action}")
    log(f"[>] approval_required={approval_required}")

    if approval_required == "true":
        pending_dir = Path(os.path.expanduser("~/agentic/inbox/pending"))
        pending_dir.mkdir(parents=True, exist_ok=True)

        new_path = pending_dir / trigger_file.name
        trigger_file.rename(new_path)

        log(f"[!] Waiting for approval → {new_path}")

        subprocess.run([
            os.path.expanduser("~/agentic/scripts/send_matrix.sh"),
            f"Approval required: {trigger_file.name}"
        ])

        return

    try:
        subprocess.run(
            [
                os.path.expanduser("~/agentic/workspaces/mini-control-plane/.venv/bin/python"),
                os.path.expanduser("~/agentic/runner/spec_runner.py")
            ],
            check=True
        )
        log("[✓] Agent run complete")
    except Exception as e:
        log(f"[!] Error: {e}")

    trigger_file.unlink()

def main():
    log("[*] Watcher started")

    while True:
        for f in INBOX.glob("*.md"):
            run_agent(f)
        time.sleep(3)

if __name__ == "__main__":
    main()

Run:

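# Export the config first so the spec_runner.py child process inherits OLLAMA_BASE_URL and OLLAMA_MODEL
set -a; source ~/agentic/configs/agent.env; set +a
cd ~/agentic/runner
python3 watcher.py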

Expected:

[*] Watcher started
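
The trigger file format the watcher reads is just key: value lines. Pulling that parsing into a small function makes it easy to test without starting the loop. This is a sketch, not the watcher's actual code: parse_trigger and needs_approval are hypothetical names, and unlike the listing above it also skips blank lines and # comments and lowercases keys.

```python
def parse_trigger(text: str) -> dict[str, str]:
    """Parse 'key: value' lines from a trigger file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a key: value pair.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key.strip().lower()] = value.strip()
    return fields


def needs_approval(fields: dict[str, str]) -> bool:
    """Mirror the watcher's default: no approval unless explicitly requested."""
    return fields.get("approval_required", "false").lower() == "true"


TRIGGER = """workspace: mini-control-plane
action: run
approval_required: true
"""

fields = parse_trigger(TRIGGER)
print(fields["workspace"], needs_approval(fields))
# mini-control-plane True
```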

Step 5: Add Local Approval Scripts

approve.sh

vi ~/agentic/scripts/approve.sh

Add:

#!/bin/bash
set -e

PENDING_DIR="$HOME/agentic/inbox/pending"
INBOX_DIR="$HOME/agentic/inbox"

FILE_NAME="$1"

if [ -z "$FILE_NAME" ]; then
  echo "Usage: approve.sh <pending-file-name>"
  echo "Example: approve.sh test-approval.md"
  exit 1
fi

PENDING_FILE="$PENDING_DIR/$FILE_NAME"
TARGET_FILE="$INBOX_DIR/$FILE_NAME"

if [ ! -f "$PENDING_FILE" ]; then
  echo "Pending file not found: $PENDING_FILE"
  exit 1
fi

sed 's/approval_required: true/approval_required: false/g' "$PENDING_FILE" > "$TARGET_FILE"
rm "$PENDING_FILE"

echo "Approved and moved back to inbox: $TARGET_FILE"

Make it executable:

chmod +x ~/agentic/scripts/approve.sh

reject.sh

vi ~/agentic/scripts/reject.sh

Add:

#!/bin/bash
set -e

PENDING_DIR="$HOME/agentic/inbox/pending"
REJECTED_DIR="$HOME/agentic/outbox/rejected"

FILE_NAME="$1"

if [ -z "$FILE_NAME" ]; then
  echo "Usage: reject.sh <pending-file-name>"
  exit 1
fi

mkdir -p "$REJECTED_DIR"

PENDING_FILE="$PENDING_DIR/$FILE_NAME"
TARGET_FILE="$REJECTED_DIR/$FILE_NAME"

if [ ! -f "$PENDING_FILE" ]; then
  echo "Pending file not found: $PENDING_FILE"
  exit 1
fi

mv "$PENDING_FILE" "$TARGET_FILE"

echo "Rejected and moved to: $TARGET_FILE"

Make it executable:

chmod +x ~/agentic/scripts/reject.sh

pending.sh

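vi ~/agentic/scripts/pending.sh

Add:

#!/bin/bash
echo "Pending approvals:"
# ls exits 0 on an empty directory, so test its output instead of its exit code
PENDING=$(ls -1 "$HOME/agentic/inbox/pending" 2>/dev/null)
echo "${PENDING:-None}"

Make it executable:

chmod +x ~/agentic/scripts/pending.sh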

status.sh

vi ~/agentic/scripts/status.sh

Add:

#!/bin/bash

echo "=== Agentic Status ==="
echo

# ls exits 0 on an empty directory, so test its output instead of its exit code
echo "Pending approvals:"
PENDING=$(ls -1 "$HOME/agentic/inbox/pending" 2>/dev/null)
echo "${PENDING:-None}"
echo

echo "Rejected:"
REJECTED=$(ls -1 "$HOME/agentic/outbox/rejected" 2>/dev/null)
echo "${REJECTED:-None}"
echo

echo "Recent watcher log:"
tail -20 "$HOME/agentic/logs/watcher.log" 2>/dev/null || echo "No watcher log found"

Make it executable:

chmod +x ~/agentic/scripts/status.sh

Step 6: Configure Matrix Notifications

Create the Matrix bot user through the Synapse client API (this requires registration to be enabled on the homeserver):

source ~/agentic/configs/agent.env

curl -s -X POST "$MATRIX_HOMESERVER/_matrix/client/v3/register" \
  -H "Content-Type: application/json" \
  -d '{
    "username": "agentbot",
    "password": "<your-strong-password>",
    "auth": { "type": "m.login.dummy" }
  }' | jq

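Store bot secrets. MATRIX_BOT_TOKEN is the access_token field from the registration response. MATRIX_ROOM_ID is the room_id returned when the room is created in the next step, so leave it empty until then:

vi ~/agentic/secrets/matrix-bot.env

Add:

MATRIX_BOT_USER=@agentbot:<your-tailscale-domain>
MATRIX_BOT_TOKEN=<token>
MATRIX_ROOM_ID=<room_id>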

Secure it:

chmod 600 ~/agentic/secrets/matrix-bot.env

Create the Matrix room:

source ~/agentic/configs/agent.env
source ~/agentic/secrets/matrix-bot.env

curl -s -X POST "$MATRIX_HOMESERVER/_matrix/client/v3/createRoom" \
  -H "Authorization: Bearer $MATRIX_BOT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "preset": "private_chat",
    "name": "Agentic Control Plane",
    "topic": "Mac Mini agent approvals and notifications"
  }' | jq

Invite your Matrix user if necessary:

curl -s -X POST "$MATRIX_HOMESERVER/_matrix/client/v3/rooms/$MATRIX_ROOM_ID/invite" \
  -H "Authorization: Bearer $MATRIX_BOT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "@youruser:<your-tailscale-domain>"
  }' | jq

Step 7: Create the Matrix Send Script

vi ~/agentic/scripts/send_matrix.sh

Add:

#!/bin/bash

source "$HOME/agentic/configs/agent.env"
source "$HOME/agentic/secrets/matrix-bot.env"

MESSAGE="$1"

# Build the JSON with jq so quotes or backslashes in the message cannot break the payload
PAYLOAD=$(jq -n --arg body "$MESSAGE" '{msgtype: "m.text", body: $body}')

curl -s -X POST "$MATRIX_HOMESERVER/_matrix/client/v3/rooms/$MATRIX_ROOM_ID/send/m.room.message" \
  -H "Authorization: Bearer $MATRIX_BOT_TOKEN" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"

Make it executable and send a test message:

chmod +x ~/agentic/scripts/send_matrix.sh
~/agentic/scripts/send_matrix.sh "TEST FROM MINI"

Expected: the test message appears in Element.


Step 8: Create the Matrix Approval Listener

vi ~/agentic/runner/matrix_approval_listener.py

Add:

import os
import time
import subprocess
import requests
from pathlib import Path

HOME = Path.home()
AGENTIC = HOME / "agentic"

def load_env(path):
    env = {}
    if not path.exists():
        return env
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        k, v = line.split("=", 1)
        env[k.strip()] = v.strip()
    return env

agent_env = load_env(AGENTIC / "configs" / "hosts.env")
bot_env = load_env(AGENTIC / "secrets" / "matrix-bot.env")

MATRIX_HOMESERVER = agent_env.get("MATRIX_HOMESERVER", "http://your-mac-studio-hostname:8008")
MATRIX_BOT_TOKEN = bot_env["MATRIX_BOT_TOKEN"]
MATRIX_ROOM_ID = bot_env["MATRIX_ROOM_ID"]

APPROVE_SCRIPT = str(AGENTIC / "scripts" / "approve.sh")
REJECT_SCRIPT = str(AGENTIC / "scripts" / "reject.sh")

next_batch = None

def log(msg):
    print(msg, flush=True)

def handle_message(body):
    body = body.strip()

    if body.startswith("approve "):
        filename = body.replace("approve ", "", 1).strip()
        log(f"[+] Approving {filename}")
        subprocess.run([APPROVE_SCRIPT, filename], check=False)

    elif body.startswith("reject "):
        filename = body.replace("reject ", "", 1).strip()
        log(f"[-] Rejecting {filename}")
        subprocess.run([REJECT_SCRIPT, filename], check=False)

def sync_once():
    global next_batch

    url = f"{MATRIX_HOMESERVER}/_matrix/client/v3/sync"
    params = {"timeout": "30000"}
    if next_batch:
        params["since"] = next_batch

    r = requests.get(
        url,
        headers={"Authorization": f"Bearer {MATRIX_BOT_TOKEN}"},
        params=params,
        timeout=40,
    )
    r.raise_for_status()

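    data = r.json()
    first_sync = next_batch is None
    next_batch = data.get("next_batch", next_batch)

    # The initial sync returns recent room history. Skip it so approvals
    # sent before this process started are not replayed after a restart.
    if first_sync:
        return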

    rooms = data.get("rooms", {}).get("join", {})
    room = rooms.get(MATRIX_ROOM_ID, {})
    events = room.get("timeline", {}).get("events", [])

    for event in events:
        if event.get("type") != "m.room.message":
            continue

        content = event.get("content", {})
        body = content.get("body", "")

        sender = event.get("sender", "")
        if sender == bot_env.get("MATRIX_BOT_USER"):
            continue

        handle_message(body)

def main():
    log("[*] Matrix approval listener started")
    log("[*] Commands: approve <file.md> | reject <file.md>")

    while True:
        try:
            sync_once()
        except Exception as e:
            log(f"[!] Listener error: {e}")
            time.sleep(5)

if __name__ == "__main__":
    main()

Check syntax:

~/agentic/workspaces/mini-control-plane/.venv/bin/python -m py_compile ~/agentic/runner/matrix_approval_listener.py

Run:

cd ~/agentic/runner
~/agentic/workspaces/mini-control-plane/.venv/bin/python matrix_approval_listener.py
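
The listener hands whatever follows approve or reject straight to the shell scripts as a file name, so a message such as approve ../something.md would point outside the pending folder. One way to tighten this is to validate the command before calling the scripts. The sketch below is an assumption about how that could look, not code from the listener above: parse_command and the allowed-name pattern are hypothetical.

```python
import re
from typing import Optional, Tuple

# Hypothetical policy: plain .md file names only, no directories, no leading dot.
_SAFE_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*\.md$")


def parse_command(body: str) -> Optional[Tuple[str, str]]:
    """Return (action, filename) for a well-formed command, otherwise None."""
    parts = body.strip().split()
    if len(parts) != 2:
        return None
    action, filename = parts
    if action not in ("approve", "reject"):
        return None
    if not _SAFE_NAME.match(filename):
        return None
    return action, filename


print(parse_command("approve phone-approval-test.md"))
# ('approve', 'phone-approval-test.md')
print(parse_command("approve ../secrets/matrix-bot.md"))
# None
```

With a helper like this, handle_message would only invoke approve.sh or reject.sh when parse_command returns a tuple, and could log and ignore everything else.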

Step 9: Full End-to-End Test

Start watcher:

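# Export the config first so the spec_runner.py child process inherits OLLAMA_BASE_URL and OLLAMA_MODEL
set -a; source ~/agentic/configs/agent.env; set +a
cd ~/agentic/runner
python3 watcher.py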

Start listener:

cd ~/agentic/runner
~/agentic/workspaces/mini-control-plane/.venv/bin/python matrix_approval_listener.py

Create approval task:

cat > ~/agentic/inbox/phone-approval-test.md <<'TASK'
workspace: mini-control-plane
action: run
approval_required: true
TASK

Expected Matrix message:

Approval required: phone-approval-test.md

Reply in Element:

approve phone-approval-test.md

Expected final watcher output:

[>] approval_required=false
[✓] Agent run complete

Expected output files:

cat ~/agentic/workspaces/mini-control-plane/hello.txt
cat ~/agentic/workspaces/mini-control-plane/agent-summary.md

What Was Achieved

The final working flow is:

Task file created
  ↓
Watcher detects it
  ↓
Approval required=true
  ↓
File moved to pending
  ↓
Matrix notification sent
  ↓
Approve from phone
  ↓
Matrix listener runs approve.sh
  ↓
File moves back to inbox with approval_required=false
  ↓
Watcher runs spec_runner.py
  ↓
Qwen generates implementation plan
  ↓
Agent writes hello.txt and agent-summary.md

This is a local, human-approved, AI-assisted execution pipeline.


Design Philosophy

Local-first

The model runs locally on Mac Studio through Ollama. The agent controller runs locally on Mac Mini. Matrix is self-hosted. This avoids cloud AI API dependency.

Human-in-the-loop

The agent does not execute automatically when approval is required. It pauses and asks for approval.

File-driven control

Files are simple and inspectable. A file in inbox is a request. A file in pending is waiting. A file in rejected was denied.
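
Since the folder a file sits in is its state, the whole queue can be inspected with a directory listing. The following sketch shows one way to turn that into a snapshot. The state labels and the queue_snapshot name are hypothetical; the folder paths are the ones from the layout above.

```python
from pathlib import Path

# State label -> folder relative to ~/agentic (labels are illustrative).
STATE_DIRS = {
    "requested": "inbox",
    "pending": "inbox/pending",
    "rejected": "outbox/rejected",
}


def queue_snapshot(agentic_home: Path) -> dict[str, list[str]]:
    """List the .md task files in each state folder, sorted by name."""
    snapshot = {}
    for state, rel in STATE_DIRS.items():
        folder = agentic_home / rel
        if folder.is_dir():
            # glob("*.md") does not recurse, so inbox/ excludes inbox/pending/.
            snapshot[state] = sorted(p.name for p in folder.glob("*.md"))
        else:
            snapshot[state] = []
    return snapshot
```

Calling queue_snapshot(Path.home() / "agentic") returns a dict with one list of file names per state, which is the same information status.sh prints.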

Separation of concerns

| Component | Responsibility |
|---|---|
| watcher.py | Detect and route trigger files |
| spec_runner.py | Execute approved task |
| send_matrix.sh | Send notification |
| matrix_approval_listener.py | Convert Matrix replies into local approvals |
| approve.sh | Approve pending task |
| reject.sh | Reject pending task |

Security Model and Why It Matters

The current system runs spec_runner.py directly on the Mac Mini host. This means the agent has the same filesystem access as the user who launched it. For a personal system in a trusted environment, that is acceptable.

The next evolution is Docker containerization. The goal is not to prevent the agent from doing work — it is to make the attack surface explicit and bounded.

The planned container security model:

Container mounts (explicit, bounded):
  ~/agentic/inbox       → /inbox       (read-write)
  ~/agentic/outbox      → /outbox      (read-write)
  ~/agentic/workspaces  → /workspaces  (read-write)
  ~/agentic/logs        → /logs        (read-write)
  ~/agentic/configs     → /configs     (read-only)
  ~/agentic/secrets     → /secrets     (read-only)

Host filesystem: not mounted
Docker socket: not mounted (agent cannot spawn containers)
Network: Tailscale host names only (Ollama, Matrix)
User: non-root inside container

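This means that if an agent goes off-script, whether from a bad prompt or a confused model, it can only affect the explicitly mounted directories. It cannot read SSH keys or any other file outside the mounts, and it cannot see the host's environment variables or other host processes. Read-only mounts prevent modification, not reading, so anything under /configs and /secrets is still visible to the agent. This is the principle of least privilege applied to AI agents.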

The phrase "AI should not have keys to the kingdom" is the right mental model. The kingdom is the host filesystem. The keys are unrestricted mount access. Docker removes the keys.
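
To make the mount list concrete, the sketch below builds a docker run command line from it. It is an illustration under several assumptions: the image name agentic-runner:local and the 1000:1000 user are placeholders, docker_run_args is a hypothetical helper, and the network restriction from the plan is not covered here.

```python
from pathlib import Path

# (folder under ~/agentic, path inside the container, access mode)
MOUNTS = [
    ("inbox", "/inbox", "rw"),
    ("outbox", "/outbox", "rw"),
    ("workspaces", "/workspaces", "rw"),
    ("logs", "/logs", "rw"),
    ("configs", "/configs", "ro"),
    ("secrets", "/secrets", "ro"),
]


def docker_run_args(
    agentic_home: Path,
    image: str = "agentic-runner:local",  # placeholder image name
    user: str = "1000:1000",              # placeholder non-root uid:gid
) -> list[str]:
    """Build a docker run argument list with only the bounded mounts."""
    args = ["docker", "run", "--rm", "--user", user]
    for name, target, mode in MOUNTS:
        args += ["--volume", f"{agentic_home / name}:{target}:{mode}"]
    args.append(image)
    return args


print(" ".join(docker_run_args(Path("/Users/me/agentic"))))
```

Nothing outside the six listed folders appears in the argument list, and the Docker socket is never mounted, which matches the plan above.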


SDLC Gating Philosophy

The approval gate in this system is not a general-purpose safety check. It is an SDLC gate, and that distinction matters.

In a software development lifecycle, there are predefined promotion stages:

dev → staging → prod

Each stage represents a different risk profile:

| Stage | Blast radius | Approval | Examples |
|---|---|---|---|
| dev | Low (isolated sandbox) | None needed | Try an idea, generate a file, draft code |
| staging | Medium (shared environment) | Required | Integration tests, deploy to staging, config changes |
| prod | High (live systems) | Required | Deploy to production, run DGX jobs, push to main |

The watcher now reads a stage field from the trigger file and derives whether approval is needed (the Step 4 listing shows the earlier version, which checks only approval_required):

workspace: mini-control-plane
action: run
stage: staging

A dev stage task runs immediately. A staging or prod task moves to pending/ and notifies Matrix. You do not need to set approval_required: true explicitly — the stage tells the system what to do.
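
The stage-to-gate mapping can be sketched as a small lookup. This is an illustration of the idea rather than the watcher's actual code. It assumes a missing stage defaults to dev, that unknown stages require approval, and that the approval step writes a hypothetical approved: true marker so an approved task is not gated a second time; the approve.sh shown earlier only rewrites approval_required and would need a matching change.

```python
STAGE_NEEDS_APPROVAL = {"dev": False, "staging": True, "prod": True}


def needs_approval(fields: dict[str, str]) -> bool:
    """Decide from parsed trigger fields whether a task must wait for approval."""
    # Hypothetical marker set by the approval step.
    if fields.get("approved", "").lower() == "true":
        return False
    # An explicit flag still forces the gate, as in the original trigger format.
    if fields.get("approval_required", "").lower() == "true":
        return True
    stage = fields.get("stage", "dev").lower()
    # Unknown stages fail closed and wait for a human.
    return STAGE_NEEDS_APPROVAL.get(stage, True)


print(needs_approval({"stage": "dev"}))      # False
print(needs_approval({"stage": "staging"}))  # True
```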

This design is deliberate. It means you can run dozens of dev tasks without any interruption while still maintaining hard gates on the things that matter. The system stays out of your way until it needs to involve you.

The key insight: approval should be rare and meaningful, not frequent and routine. If every task requires approval, the gate becomes noise and operators stop paying attention. Gates only work when they protect something real.


Where This Is Going

The current system is a working v1. Here is the roadmap for what comes next.

Gradio Dashboard

A browser-based control panel running on the Mac Mini. Four panels:

  • Live log viewer — streams watcher.log in real time so you can see what the system is doing without SSH
  • Pending approvals — lists files in inbox/pending/ with one-click Approve/Reject buttons (no more typing Matrix commands)
  • Config viewer — shows hosts.env and agent.env with secrets masked
  • Process status — shows whether watcher.py and matrix_approval_listener.py are running, with start/stop buttons

This is the "one board to rule them all" — a single URL on the local network that gives full system visibility without opening a terminal.
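
The config viewer needs to hide secret values before anything is rendered. A minimal sketch of that masking step, independent of any UI library, could look like this. The marker list and the mask_env name are assumptions, and a real deployment may need a stricter rule.

```python
# Key fragments treated as sensitive (illustrative, not exhaustive).
SENSITIVE_MARKERS = ("TOKEN", "PASSWORD", "SECRET", "KEY")


def mask_env(text: str) -> str:
    """Return env-file text with the values of sensitive keys replaced by ****."""
    masked = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key, value = stripped.split("=", 1)
            if value and any(m in key.upper() for m in SENSITIVE_MARKERS):
                line = f"{key}=****"
        masked.append(line)
    return "\n".join(masked)


print(mask_env("MATRIX_BOT_TOKEN=abc123\nOLLAMA_MODEL=qwen2.5-coder:32b"))
# MATRIX_BOT_TOKEN=****
# OLLAMA_MODEL=qwen2.5-coder:32b
```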

Docker Containerization

Three containers:

  • runner — watcher.py with bounded filesystem mounts
  • matrix-listener — matrix_approval_listener.py
  • dashboard — Gradio dashboard on port 7860

One command to start everything:

docker-compose up

And one command to bring it to any machine:

docker-compose -f docker/docker-compose.yml up

This means the same stack can run on Mac Mini today, DGX tomorrow, and a cloud VM when needed.

Multi-Workspace Support

Today the watcher parses the workspace field but never passes it on, so spec_runner.py always falls back to mini-control-plane. The next version will use the workspace field in the trigger file to route to any project directory under ~/agentic/workspaces/. Each workspace carries its own spec at .project/specs/execution-plan.md.
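
Routing by workspace name means turning a string from a trigger file into a directory, which is worth doing defensively. The sketch below shows one possible approach and is not part of the current runner. resolve_workspace is a hypothetical helper; it rejects names that escape the workspace root and requires the spec file to exist.

```python
from pathlib import Path


def resolve_workspace(workspace_root: Path, name: str) -> Path:
    """Map a trigger file's workspace field to a directory under workspace_root."""
    root = workspace_root.resolve()
    candidate = (root / name).resolve()
    # Reject names such as "../other" or absolute paths that escape the root.
    if root not in candidate.parents:
        raise ValueError(f"workspace escapes {root}: {name!r}")
    spec = candidate / ".project" / "specs" / "execution-plan.md"
    if not spec.is_file():
        raise FileNotFoundError(f"Missing spec file: {spec}")
    return candidate
```

The watcher could then pass the resolved directory name to spec_runner.py through the AGENT_WORKSPACE environment variable, which the runner already reads.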

Plugin / IDE Integration

The end goal is a Claude Code hook or VS Code extension that can drop trigger files into the inbox directly from the editor. You select a spec, choose a stage, and the agent picks it up — without leaving your current context. The editor becomes the control surface; the agent system handles the async execution.


Lessons Learned

  1. Verify scripts exist before wiring them into workflows.
  2. Use the correct Python virtual environment.
  3. Avoid raw IPs; use Tailscale MagicDNS names.
  4. Test local approval before phone approval.
  5. Ensure Matrix room membership before expecting messages in Element.
  6. Keep the AI planning role separate from shell execution.
  7. Separate approval logic from execution logic — each component should fail independently.
  8. Design the approval gate to be rare. Frequent gates become noise. Gates only protect what is real.
  9. File-based queues are more debuggable than in-memory queues. You can always look in the folder.
  10. Tailscale MagicDNS eliminates an entire class of "works on my machine" network bugs.

Conclusion

This project built a working local agentic control plane. It combines local AI models, file-based workflows, Matrix messaging, and human approval into one practical system.

The important achievement is not the test file hello.txt. The important achievement is the control loop:

AI can propose and execute work, but only after human approval.

That is the foundation of trustworthy agentic automation.