Z Skills System

`/research-and-go`

The full pipeline in one command. Decomposes a broad goal into sub-plans, drafts each with adversarial review, then executes all of them autonomously. One command, walk away.

Plan

`/research-and-plan`

Decomposes broad goals into focused sub-plans via domain research, dependency analysis, and scope bounding. Each sub-plan drafted upfront via /draft-plan; outputs a meta-plan for /run-plan.

`/draft-plan`

Adversarial plan drafting. Multiple agents research, draft, review, play devil's advocate, refine over up to 3 rounds. Scope check escalates to /research-and-plan when the task is too big.

`/plans`

Plan dashboard. View index, find next ready plan, batch-execute.

Build

`/run-plan`

The plan execution engine. Dispatches implementation in a worktree, verifies with a separate agent, updates progress, writes reports. Staleness check refreshes dependent plans. Rebase-before-commit for clean history. Supports finish mode and cron scheduling.

`/do`

Lightweight dispatcher for ad-hoc tasks. Worktree isolation, auto-push, cron scheduling. For work too small for /run-plan.

`/commit`

Inventories changes, classifies by scope, traces imports recursively, protects other agents' work. Pre-staging index check. Supports push and land modes.
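
The recursive import trace can be pictured with a small sketch. This is an illustration in Python, not the skill's actual implementation: the `trace_imports` name, the regex, and the restriction to relative ES-module imports with explicit file extensions are all assumptions made for the example.

```python
import re
from pathlib import Path

# Hypothetical pattern: matches relative ES-module specifiers such as
#   import { b } from './B.js'   or   import './B.js'
# Bare package imports (e.g. 'fs') are ignored because they are not repo files.
IMPORT_RE = re.compile(r"""import\s+(?:[^'"]*?\s+from\s+)?['"](\.{1,2}/[^'"]+)['"]""")


def trace_imports(entry, seen=None):
    """Return the set of files reachable from `entry` via relative imports."""
    seen = set() if seen is None else seen
    path = Path(entry).resolve()
    # Skip files already visited (handles import cycles) and missing targets.
    if path in seen or not path.is_file():
        return seen
    seen.add(path)
    for spec in IMPORT_RE.findall(path.read_text(encoding="utf-8")):
        trace_imports(path.parent / spec, seen)
    return seen
```

Calling `trace_imports("A.js")` on a file that imports `B.js` returns both paths, which is the set a commit would need to stage together.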

Quality

`/verify-changes`

Pre-commit quality gate. Reviews diffs, checks test coverage, runs all suites, manually verifies UI with playwright-cli, fixes problems, re-verifies until clean. Pre-existing failure protocol.

`/qe-audit`

Commit audit: reviews recent commits for test gaps. Bash mode: adversarial stress-testing. Both file GitHub issues.

`/manual-testing`

Playwright-cli recipes with exact CSS selectors, event sequences, and auth bypass for browser-based verification.

Fix

`/fix-issues`

Batch bug-fixing sprints with auto-sync from GitHub. Prioritizes by severity, dispatches parallel worktree agents, collects results, auto-lands passing fixes. Supports cron for overnight execution.

`/fix-report`

Interactive companion to /fix-issues. Presents sprint results with diffs and test results. Every step ends with STOP AND WAIT for user judgment.

`/review-feedback`

Triages exported user feedback JSON, deduplicates against existing issues, files via gh.

`/investigate`

Root-cause debugging: reproduce, trace, prove the cause with evidence, write regression test, then fix. No guessing.

Utility & Reference

`/briefing`

Project status briefing: summary (triage), report (markdown), verify (sign-off items), current (in-flight), worktrees (cleanup readiness).

`/doc`

Documentation audit and gap-filling across modules, examples, and high-level docs. Three modes, including newsletter generation.

`/setup-zskills`

Bootstrap the Z Skills system into a new project: copies skill files, configures settings, adapts instructions to the target codebase.

Block Diagram Add-on

`/add-block`

Full 13-step lifecycle for new block types: plan, implement, register, UI, docs, tests, example, codegen, verification, landing.

`/add-example`

Example model creation: research the concept, design layout, build model file, register, test, screenshot, verify.

`/model-design`

Layout guidelines for block diagrams and state charts, based on MAAB/NASA standards.

Typical Workflow

Command	What It Does	When
`/briefing`	See what happened, what needs attention, what's in flight	Start of session
`/draft-plan plans/FOO.md`	Research, draft, and refine a plan through adversarial review	Starting a feature
`/run-plan plans/FOO.md finish auto`	Execute all remaining phases autonomously	Implementing
`/verify-changes`	Audit diffs, run tests, manual-test UI, fix issues	Before committing
`/commit`	Stage with dependency tracing, protect other agents' work	Landing code

How They Work Together

1. Build Features

/research-and-plan

Decompose broad goals into sub-plans with dependency analysis.

/draft-plan

Research, draft, review, devil's advocate. Up to 3 rounds.

/run-plan

Worktree agent implements each phase. Delegate mode for meta-plans.

/verify-changes

Diff review, test coverage, manual test. Recursive until clean.

/commit land

Trace dependencies, cherry-pick to main, protect other agents' work.

For broad goals, start with /research-and-plan, which decomposes them into sub-plans. For focused features, start directly with /draft-plan. Meta-plan phases use delegate mode: /run-plan invokes the delegated skill on main instead of in a worktree.

2. Add Components and Examples

/add-block UserAuth

Research

Check for plan, spawn research agent if needed.

Implement

Worktree: component class, registration, UI, docs, tests.

Test

Unit tests, /add-example, manual testing, screenshots.

Verify & Land

Fresh agent verifies. Report with sign-off. Cherry-pick to main.

3. Quality Assurance

/qe-audit every day at 9am • /verify-changes • /manual-testing

/qe-audit

Scheduled commit audit. Files GitHub issues for gaps.

/verify-changes

Recursive: audit diffs, run tests, manual-test, fix, re-verify.

/manual-testing

Exact selectors, real events. No page.evaluate().

/fix-issues

QA issues feed into batch bug-fix sprints.

4. Fix Bugs at Scale

/fix-issues sync → /fix-issues 30 correctness auto every 4h → /fix-issues plan

Sync

Update trackers from GitHub. Research new issues.

Fix

Parallel worktree agents, 1 per issue. Reproduce, fix, test.

Report

SPRINT_REPORT.md. /fix-report for user sign-off.

Plan

"Too complex" skips → /draft-plan → /run-plan.

Schedulable Skills

Six skills support cron scheduling with a common API pattern.

Skill	Example	What Gets Scheduled
`/run-plan`	`/run-plan plans/FOO.md auto every 4h now`	One phase per cron fire, auto-advances to next
`/fix-issues`	`/fix-issues 10 auto every 4h now`	Full sprint each fire (sync, prioritize, fix, land)
`/do`	`/do Check docs every day at 9am`	Repeats the task description each fire
`/qe-audit`	`/qe-audit every day at 9am now`	Commit audit + issue filing each fire
`/plans`	`/plans work 3 auto every 6h`	Execute next 3 ready plans each fire
`/briefing`	`/briefing report 24h every day at 9am`	Daily briefing report

Common Flags

Flag	Meaning
`auto`	Bypass user approval gates (implied by `every`)
`every <schedule>`	`4h`, `12h`, `day at 9am`, `weekday at 2pm`
`now`	Run immediately AND schedule (without `now`, only schedules)
`stop`	Cancel the cron
`next`	Show when the next fire is

All crons are session-scoped (die when the session ends), self-deduplicating (new schedule replaces old), and self-perpetuating (each fire re-registers the cron).
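
The flag rules above can be made concrete with a small parser. This is only a sketch in Python: the `Invocation` type, the `parse_invocation` function, and the assumption that flags appear as trailing whitespace-separated tokens are hypothetical, and `stop` and `next` are left out for brevity.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Schedule forms listed in the flags table: "4h", "day at 9am", "weekday at 2pm".
SCHEDULE_RE = re.compile(r"\d+h|(?:day|weekday) at \d{1,2}(?:am|pm)")


@dataclass
class Invocation:
    task: str                 # what remains once the flags are stripped
    schedule: Optional[str]   # e.g. "4h" or "day at 9am"; None if unscheduled
    auto: bool                # approval gates bypassed
    run_now: bool             # execute immediately in this session


def parse_invocation(args: str) -> Invocation:
    tokens = args.split()
    explicit_now = bool(tokens) and tokens[-1] == "now"
    if explicit_now:
        tokens.pop()

    schedule = None
    if "every" in tokens:
        # Use the last "every" so a task description may contain the word.
        idx = len(tokens) - 1 - tokens[::-1].index("every")
        candidate = " ".join(tokens[idx + 1:])
        if SCHEDULE_RE.fullmatch(candidate):
            schedule = candidate
            tokens = tokens[:idx]

    explicit_auto = bool(tokens) and tokens[-1] == "auto"
    if explicit_auto:
        tokens.pop()

    return Invocation(
        task=" ".join(tokens),
        schedule=schedule,
        auto=explicit_auto or schedule is not None,  # `every` implies `auto`
        run_now=explicit_now or schedule is None,    # without `now`, a schedule only schedules
    )
```

For example, `parse_invocation("plans/FOO.md auto every 4h now")` yields a `4h` schedule that also runs immediately, while `parse_invocation("Check docs every day at 9am")` is scheduled without an immediate run.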

Shared Patterns

Seven patterns recur across the system.

Pattern	What It Does
Worktree isolation	Each agent works in a separate git worktree. `scripts/port.js` gives each a deterministic dev server port.
Cron scheduling	`every 2h`, `now`, `stop`, `next` -- six skills self-schedule for autonomous execution.
Transcript-based hooks	The hook reads the session transcript (written by the runtime, not the agent) to verify tests actually ran.
Fresh-agent verification	The agent that wrote the code must not verify it. A separate agent with no memory audits diffs.
User gates	Critical decisions (landing code, closing issues) require explicit user approval. Skills STOP and wait.
Report generation	Persistent Markdown reports with sign-off checklists, screenshots, and verification instructions.
Dependency tracing	Before staging, recursively trace imports. If `A.js` imports `B.js`, both get committed.
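
The table says `scripts/port.js` assigns each worktree a deterministic port but does not say how. One plausible approach, sketched here in Python purely as an assumption, hashes the worktree path into a fixed range; `PORT_BASE`, `PORT_SPAN`, and `port_for_worktree` are hypothetical names and values.

```python
import hashlib

PORT_BASE = 3000   # hypothetical lower bound of the dev-server port range
PORT_SPAN = 1000   # hypothetical number of ports available


def port_for_worktree(worktree_path: str) -> int:
    """Map a worktree path to a stable port in [PORT_BASE, PORT_BASE + PORT_SPAN)."""
    digest = hashlib.sha256(worktree_path.encode("utf-8")).digest()
    # The first four bytes of the hash give a stable integer for this path.
    return PORT_BASE + int.from_bytes(digest[:4], "big") % PORT_SPAN
```

The same path always maps to the same port, so an agent can find its dev server without coordination. Two paths can still collide in a range this small, so a real script would need some way to detect or avoid that.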

Safety and Reliability

Agents work in worktrees, but they still need guardrails. The system uses a PreToolUse hook (block-unsafe.sh) that reads the session transcript to enforce key invariants:

What's Enforced	How
Tests before committing code	Transcript must contain test runner invocation
Manual testing before committing UI changes	Transcript must contain `playwright-cli`
Tests before cherry-picking to main	Transcript must contain test runner invocation
No destructive or sweeping git commands	`git checkout --`, `git restore`, `git stash drop`/`clear`, `git add .`/`-A` all blocked
No force pushes, recursive deletes, or kill commands	`git push --force`, `rm -rf`, `kill -9` blocked
No piping test output	Blocks test commands with `|` -- must capture to file
No directory-level log staging	Blocks `git add .claude/logs/`

The hook runs on every Bash tool call. It reads the transcript (written by the Claude Code runtime, not the agent) so evidence cannot be fabricated.
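
The shape of such a check can be sketched as follows. This is a Python illustration, not the contents of block-unsafe.sh: the regexes, the `check_command` function, the use of `npm test` as the test runner, and the treatment of the transcript as plain text are all assumptions. The `playwright-cli` rule for UI changes is omitted.

```python
import re

# Rules taken from the enforcement table; the regexes themselves are illustrative.
BLOCKED = [
    (r"\bgit\s+checkout\s+(\S+\s+)?--(\s|$)", "git checkout -- discards uncommitted work"),
    (r"\bgit\s+restore\b", "git restore discards uncommitted work"),
    (r"\bgit\s+stash\s+(drop|clear)\b", "stash drop/clear destroys stashed work"),
    (r"\bgit\s+add\s+(\.|-A)(\s|$)", "stage files explicitly"),
    (r"\bgit\s+add\s+\S*\.claude/logs/?(\s|$)", "no directory-level log staging"),
    (r"\bgit\s+push\b.*--force", "no force pushes"),
    (r"\brm\s+-rf\b", "no recursive force deletes"),
    (r"\bkill\s+-9\b", "no kill -9"),
]

TEST_RUNNER = r"\bnpm\s+test\b"  # hypothetical; substitute the project's real test command


def check_command(command: str, transcript: str):
    """Return a reason string if the command should be blocked, else None."""
    for pattern, reason in BLOCKED:
        if re.search(pattern, command):
            return reason
    # Any pipe character after the test command counts, including `||`.
    if re.search(TEST_RUNNER + r"[^\n]*\|", command):
        return "capture test output to a file instead of piping"
    # Commits and cherry-picks need evidence of a test run in the transcript.
    if re.search(r"\bgit\s+(commit|cherry-pick)\b", command) and not re.search(TEST_RUNNER, transcript):
        return "no test run found in the session transcript"
    return None
```

Because the evidence comes from the transcript argument rather than from anything the agent asserts, a commit is allowed only when a test invocation was actually recorded.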

Skills add softer enforcement at decision points: pre-landing checklists, verification timeouts (45 min for verifiers, 2h for implementers), and a Failure Protocol that kills crons and preserves state on errors.

Incident Log and Lessons Learned

Every safety rule traces back to a real incident. Here are the ones that shaped the system.

Incident	What Happened	What It Drove
17-File Wipe	Agent "cleaned up" 17 files of other agents' uncommitted work	Hook blocks `git checkout --`, `git restore`
9-File Stash Drop	`git stash pop` failed; agent ran `git stash drop`, destroying the only copy	Hook blocks `git stash drop/clear`
Silent 13-Fix Revert	Agent checked out old files to investigate, never restored them, committed -- reverting 13 fixes	Hook blocks `git checkout <commit> -- <file>`
50-Fix Sweep	`git add .` swept another session's unfinished changes into a 50-fix commit	Hook blocks `git add .` / `git add -A`
Visual Feature Disaster	A purely visual feature landed with zero manual testing or screenshots	Transcript hooks require `playwright-cli` for UI commits
Issue Misread	"Reset button" paraphrased as "clear canvas" -- wrong feature built	Verbatim issue body + plan text in all agent dispatches
3-Hour Test Thrash	Agents wasted 3 hours trying to run tests in worktrees without dev servers	Verbatim test recipe included in every agent dispatch
Auto-mode Bleed	After auto-mode execution, agent kept committing without permission in interactive mode. Context compaction preserved the auto-commit pattern.	Explicit mode reset after auto runs
Pre-staged Sweep	Commit swept 146 lines of another session's work because files were pre-staged in the index	Pre-staging index check in `/commit`
Force-removed Worktree Logs	Agent dismissed modified logs as "just log files" and force-removed worktree, losing the build record	Log extraction requirement before worktree removal
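
The pre-staging index check that came out of the Pre-staged Sweep incident can be sketched as below. This is an assumed Python illustration, not the `/commit` skill's code: the function names are hypothetical, and the only real interface used is `git diff --cached --name-only`, which lists paths staged in the index.

```python
import subprocess


def staged_files(repo_dir="."):
    """List paths currently staged in the index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def unexpected_staged(staged, intended):
    """Return staged paths that this commit did not intend to include."""
    wanted = set(intended)
    return sorted(p for p in staged if p not in wanted)
```

A commit step could call `unexpected_staged(staged_files(), intended)` before staging anything itself and stop if the result is non-empty, since those paths were staged by someone else.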

What we've learned about LLM compliance

Approach	Effectiveness
Hook hard blocks	High
Concrete recipes at decision points	High
Verbatim text requirements	High
Transcript-based verification	High
Past failure stories in prompts	Low-Med
Rules in long appendix sections	Low

17 skills that plan, build, test, fix, and ship — so one developer can run a full engineering team.

github.com/zeveck/zskills

Built with Claude Code (Claude Opus 4.6) — Z Skills System.