Z Skills System

Z Skills turns Claude Code into a disciplined engineering team: 17 core skills that plan, build, test, fix, and ship, with verification at every step. Each skill encodes a workflow that used to need hands-on guidance, so it now runs with a single command.

Currently: 17 core skills and 3 block-diagram extensions covering plan drafting, feature implementation, bug fixing, verification, testing, documentation, and safe code landing. Six support cron scheduling for unattended execution.

/research-and-go

The full pipeline in one command. Decomposes a broad goal into sub-plans, drafts each with adversarial review, then executes all of them autonomously. One command, walk away.

Plan

/research-and-plan

Decomposes broad goals into focused sub-plans via domain research, dependency analysis, and scope bounding. Each sub-plan drafted upfront via /draft-plan; outputs a meta-plan for /run-plan.

/draft-plan

Adversarial plan drafting. Multiple agents research, draft, review, play devil's advocate, refine over up to 3 rounds. Scope check escalates to /research-and-plan when the task is too big.

/plans

Plan dashboard. View index, find next ready plan, batch-execute.
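
As a rough illustration of how "find next ready plan" could build on the dependency analysis from /research-and-plan, the sketch below treats the plan index as a mapping from each plan to the plans it depends on. The index contents, the file names, and both helper functions are hypothetical; the actual skill's data format is not described here.

```python
from graphlib import TopologicalSorter

# Hypothetical plan index: plan file -> plans it depends on.
DEPENDS_ON = {
    "plans/AUTH.md": [],
    "plans/API.md": ["plans/AUTH.md"],
    "plans/UI.md": ["plans/API.md"],
    "plans/DOCS.md": [],
}


def ready_plans(depends_on, done):
    """Return unfinished plans whose dependencies are all finished."""
    return sorted(
        plan
        for plan, deps in depends_on.items()
        if plan not in done and all(dep in done for dep in deps)
    )


def execution_order(depends_on):
    """Return one valid order; raises graphlib.CycleError on circular plans."""
    return list(TopologicalSorter(depends_on).static_order())


print(ready_plans(DEPENDS_ON, done={"plans/AUTH.md"}))
# ['plans/API.md', 'plans/DOCS.md']
```

A batch run would then take the first few entries of `ready_plans` and re-evaluate after each plan finishes.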

Build

/run-plan

The plan execution engine. Dispatches implementation in a worktree, verifies with a separate agent, updates progress, writes reports. Staleness check refreshes dependent plans. Rebase-before-commit for clean history. Supports finish mode and cron scheduling.

/do

Lightweight dispatcher for ad-hoc tasks. Worktree isolation, auto-push, cron scheduling. For work too small for /run-plan.

/commit

Inventories changes, classifies by scope, traces imports recursively, protects other agents' work. Pre-staging index check. Supports push and land modes.
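
The recursive import tracing can be pictured with a small sketch. It assumes JavaScript-style relative imports and works on an in-memory mapping of file path to source text; the regex, the `trace_imports` name, and the sample files are illustrative assumptions, not the skill's actual implementation.

```python
import posixpath
import re

# Matches relative specifiers in `import x from './y.js'` and `import './y.js'`.
IMPORT_RE = re.compile(r"""(?:import|from)\s+['"](\.{1,2}/[^'"]+)['"]""")


def trace_imports(entry, sources):
    """Return the entry file plus every known file reachable via relative imports."""
    seen = set()
    stack = [entry]
    while stack:
        path = stack.pop()
        if path in seen or path not in sources:
            continue  # already visited, or outside the known file set
        seen.add(path)
        base = posixpath.dirname(path)
        for spec in IMPORT_RE.findall(sources[path]):
            stack.append(posixpath.normpath(posixpath.join(base, spec)))
    return sorted(seen)


# Hypothetical sources, keyed by repository-relative path.
SOURCES = {
    "src/A.js": "import { b } from './B.js';\n",
    "src/B.js": "import './util/C.js';\nexport const b = 1;\n",
    "src/util/C.js": "export const c = 2;\n",
    "src/D.js": "export const d = 3;\n",
}

print(trace_imports("src/A.js", SOURCES))
# ['src/A.js', 'src/B.js', 'src/util/C.js']
```

Staging only the returned set, rather than everything that changed, is what keeps an unrelated file such as `src/D.js` out of the commit.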

Quality

/verify-changes

Pre-commit quality gate. Reviews diffs, checks test coverage, runs all suites, manually verifies UI with playwright-cli, fixes problems, re-verifies until clean. Pre-existing failure protocol.

/qe-audit

Commit audit: reviews recent commits for test gaps. Bash mode: adversarial stress-testing. Both file GitHub issues.

/manual-testing

Playwright-cli recipes with exact CSS selectors, event sequences, and auth bypass for browser-based verification.

Fix

/fix-issues

Batch bug-fixing sprints with auto-sync from GitHub. Prioritizes by severity, dispatches parallel worktree agents, collects results, auto-lands passing fixes. Supports cron for overnight execution.
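
A minimal sketch of the "prioritizes by severity" step, assuming each issue carries a severity label and a category. The label names, the `Issue` fields, and `pick_sprint` are hypothetical and only illustrate how an invocation such as /fix-issues 30 correctness might select its batch.

```python
from dataclasses import dataclass

# Assumed severity labels, most urgent first.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Issue:
    number: int
    severity: str
    category: str


def pick_sprint(issues, limit, focus=None):
    """Pick up to `limit` issues, most severe first, optionally from one category."""
    candidates = [i for i in issues if focus is None or i.category == focus]
    # Unknown severities sort last; ties go to the older (lower-numbered) issue.
    candidates.sort(
        key=lambda i: (SEVERITY_RANK.get(i.severity, len(SEVERITY_RANK)), i.number)
    )
    return candidates[:limit]


BACKLOG = [
    Issue(101, "low", "correctness"),
    Issue(102, "critical", "correctness"),
    Issue(103, "high", "ui"),
    Issue(104, "high", "correctness"),
]

print([i.number for i in pick_sprint(BACKLOG, limit=2, focus="correctness")])
# [102, 104]
```

Each selected issue would then go to its own worktree agent.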

/fix-report

Interactive companion to /fix-issues. Presents sprint results with diffs and test results. Every step ends with STOP AND WAIT for user judgment.

/review-feedback

Triages exported user feedback JSON, deduplicates against existing issues, files via gh.

/investigate

Root-cause debugging: reproduce, trace, prove the cause with evidence, write regression test, then fix. No guessing.

Utility & Reference

/briefing

Project status briefing: summary (triage), report (markdown), verify (sign-off items), current (in-flight), worktrees (cleanup readiness).

/doc

Documentation audit and gap-filling across modules, examples, and high-level docs. Three modes, including newsletter generation.

/setup-zskills

Bootstrap the Z Skills system into a new project: copies skill files, configures settings, adapts instructions to the target codebase.

Block Diagram Add-on

/add-block

Full 13-step lifecycle for new block types: plan, implement, register, UI, docs, tests, example, codegen, verification, landing.

/add-example

Example model creation: research the concept, design layout, build model file, register, test, screenshot, verify.

/model-design

Layout guidelines for block diagrams and state charts, based on MAAB/NASA standards.

Typical Workflow

| Command | What It Does | When |
| --- | --- | --- |
| /briefing | See what happened, what needs attention, what's in flight | Start of session |
| /draft-plan plans/FOO.md | Research, draft, and refine a plan through adversarial review | Starting a feature |
| /run-plan plans/FOO.md finish auto | Execute all remaining phases autonomously | Implementing |
| /verify-changes | Audit diffs, run tests, manual-test UI, fix issues | Before committing |
| /commit | Stage with dependency tracing, protect other agents' work | Landing code |

How They Work Together

1. Build Features

/research-and-plan

Decompose broad goals into sub-plans with dependency analysis.

/draft-plan

Research, draft, review, devil's advocate. Up to 3 rounds.

/run-plan

Worktree agent implements each phase. Delegate mode for meta-plans.

/verify-changes

Diff review, test coverage, manual test. Recursive until clean.

/commit land

Trace dependencies, cherry-pick to main, protect other agents' work.

For broad goals, start with /research-and-plan, which decomposes the goal into sub-plans. For focused features, start directly with /draft-plan. Meta-plan phases use delegate mode: /run-plan invokes the delegated skill on main instead of in a worktree.

2. Add Components and Examples

/add-block UserAuth

Research

Check for plan, spawn research agent if needed.

Implement

Worktree: component class, registration, UI, docs, tests.

Test

Unit tests, /add-example, manual testing, screenshots.

Verify & Land

Fresh agent verifies. Report with sign-off. Cherry-pick to main.

3. Quality Assurance

/qe-audit every day at 9am

/verify-changes

/manual-testing

/qe-audit

Scheduled commit audit. Files GitHub issues for gaps.

/verify-changes

Recursive: audit diffs, run tests, manual-test, fix, re-verify.

/manual-testing

Exact selectors, real events. No page.evaluate().

/fix-issues

QA issues feed into batch bug-fix sprints.

4. Fix Bugs at Scale

/fix-issues sync

/fix-issues 30 correctness auto every 4h

/fix-issues plan

Sync

Update trackers from GitHub. Research new issues.

Fix

Parallel worktree agents, 1 per issue. Reproduce, fix, test.

Report

SPRINT_REPORT.md. /fix-report for user sign-off.

Plan

"Too complex" skips → /draft-plan, then /run-plan.

Schedulable Skills

Six skills support cron scheduling with a common API pattern.

| Skill | Example | What Gets Scheduled |
| --- | --- | --- |
| /run-plan | /run-plan plans/FOO.md auto every 4h now | One phase per cron fire, auto-advances to next |
| /fix-issues | /fix-issues 10 auto every 4h now | Full sprint each fire (sync, prioritize, fix, land) |
| /do | /do Check docs every day at 9am | Repeats the task description each fire |
| /qe-audit | /qe-audit every day at 9am now | Commit audit + issue filing each fire |
| /plans | /plans work 3 auto every 6h | Execute next 3 ready plans each fire |
| /briefing | /briefing report 24h every day at 9am | Daily briefing report |

Common Flags

| Flag | Meaning |
| --- | --- |
| auto | Bypass user approval gates (implied by every) |
| every <schedule> | 4h, 12h, day at 9am, weekday at 2pm |
| now | Run immediately AND schedule (without now, only schedules) |
| stop | Cancel the cron |
| next | Show when the next fire is |

All crons are session-scoped (die when the session ends), self-deduplicating (new schedule replaces old), and self-perpetuating (each fire re-registers the cron).
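
To make the flag grammar concrete, here is a minimal parsing sketch for the trailing `auto`, `every <schedule>`, and `now` flags. It covers only the schedule forms listed above and leaves out `stop` and `next`; the `Invocation` type, the regex, and the choice to treat an unscheduled command as "run once now" are assumptions for illustration, not the skills' actual parser.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Trailing "every <schedule> [now]", limited to the forms shown in the flag table.
SCHEDULE_RE = re.compile(
    r"\s+every\s+(?P<schedule>\d+h|(?:day|weekday)\s+at\s+\d{1,2}(?:am|pm))"
    r"(?P<now>\s+now)?\s*$"
)


@dataclass
class Invocation:
    args: str                # command and arguments with scheduling flags removed
    auto: bool               # approval gates bypassed
    schedule: Optional[str]  # e.g. "4h" or "day at 9am"; None if unscheduled
    run_now: bool            # whether to execute immediately


def parse_invocation(text):
    match = SCHEDULE_RE.search(text)
    schedule = match.group("schedule") if match else None
    # Assumption: an unscheduled command simply runs once, right away.
    run_now = match is None or match.group("now") is not None
    tokens = (text[: match.start()] if match else text).split()
    explicit_auto = bool(tokens) and tokens[-1] == "auto"
    if explicit_auto:
        tokens = tokens[:-1]
    # "every" implies "auto", per the flag table.
    return Invocation(" ".join(tokens), explicit_auto or schedule is not None, schedule, run_now)


print(parse_invocation("/run-plan plans/FOO.md auto every 4h now"))
# Invocation(args='/run-plan plans/FOO.md', auto=True, schedule='4h', run_now=True)
```

Anchoring the pattern at the end of the line keeps a free-text task such as "/do Check docs every day at 9am" intact while still peeling off the schedule.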

Shared Patterns

Seven patterns recur across the system.

| Pattern | What It Does |
| --- | --- |
| Worktree isolation | Each agent works in a separate git worktree. scripts/port.js gives each a deterministic dev server port. |
| Cron scheduling | every 2h, now, stop, next: six skills self-schedule for autonomous execution. |
| Transcript-based hooks | The hook reads the session transcript (written by the runtime, not the agent) to verify tests actually ran. |
| Fresh-agent verification | The agent that wrote the code must not verify it. A separate agent with no memory audits diffs. |
| User gates | Critical decisions (landing code, closing issues) require explicit user approval. Skills STOP and wait. |
| Report generation | Persistent Markdown reports with sign-off checklists, screenshots, and verification instructions. |
| Dependency tracing | Before staging, recursively trace imports. If A.js imports B.js, both get committed. |
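
One way a deterministic per-worktree port could be derived is by hashing the worktree path into a fixed range. The sketch below is not a port of scripts/port.js, whose contents are not shown here; the hash choice, the port range, and the function name are all assumptions.

```python
import hashlib

PORT_BASE = 3000  # assumed start of the dev-server port range
PORT_SPAN = 1000  # assumed size of the range (3000-3999)


def port_for_worktree(worktree_path: str) -> int:
    """Map a worktree path to a stable port inside the assumed range."""
    digest = hashlib.sha256(worktree_path.encode("utf-8")).digest()
    return PORT_BASE + int.from_bytes(digest[:4], "big") % PORT_SPAN


# The same path always yields the same port, so an agent can find its server again.
print(port_for_worktree("/repo/.worktrees/fix-123"))
```

Two different paths can still hash to the same port, so a real implementation would need some collision handling on top of this.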

Safety and Reliability

Agents work in worktrees, but they still need guardrails. The system uses a PreToolUse hook (block-unsafe.sh) that reads the session transcript to enforce key invariants:

| What's Enforced | How |
| --- | --- |
| Tests before committing code | Transcript must contain test runner invocation |
| Manual testing before committing UI changes | Transcript must contain playwright-cli |
| Tests before cherry-picking to main | Transcript must contain test runner invocation |
| No destructive git commands | git checkout --, git restore, git stash drop, and git add . are all blocked |
| No force pushes or kill commands | git push --force, rm -rf, and kill -9 are blocked |
| No piping test output | Blocks test commands that contain a pipe; output must be captured to a file |
| No directory-level log staging | Blocks git add .claude/logs/ |

The hook runs on every Bash tool call. It reads the transcript (written by the Claude Code runtime, not the agent) so evidence cannot be fabricated.
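
The decision logic of such a hook can be sketched as a pure function over the proposed command and the transcript text. The real hook is a shell script (block-unsafe.sh) and its patterns are not shown here, so everything below is illustrative: the regexes, the `npm test` marker for "test runner invocation", and the `touches_ui` parameter are assumptions.

```python
import re

# Commands refused outright, with a reason to report back to the agent.
BLOCKED = [
    (re.compile(r"\bgit\s+checkout\s+(?:\S+\s+)?--(?:\s|$)"), "git checkout -- discards uncommitted work"),
    (re.compile(r"\bgit\s+restore\b"), "git restore discards uncommitted work"),
    (re.compile(r"\bgit\s+stash\s+(?:drop|clear)\b"), "dropping a stash can destroy the only copy"),
    (re.compile(r"\bgit\s+add\s+(?:\.|-A)(?:\s|$)"), "blanket staging sweeps in other sessions' changes"),
    (re.compile(r"\bgit\s+add\s+\.claude/logs/?(?:\s|$)"), "stage individual log files, not the directory"),
    (re.compile(r"\bgit\s+push\b.*--force\b"), "force pushes are not allowed"),
    (re.compile(r"\brm\s+-rf\b"), "rm -rf is not allowed"),
    (re.compile(r"\bkill\s+-9\b"), "kill -9 is not allowed"),
]

TEST_RE = re.compile(r"\bnpm\s+(?:run\s+)?test\b")  # assumed test runner marker
PIPE_RE = re.compile(r"(?<!\|)\|(?!\|)")            # a single pipe, not "||"
LANDING_RE = re.compile(r"\bgit\s+(?:commit|cherry-pick)\b")


def check_command(command, transcript, touches_ui=False):
    """Return a reason string if the command should be blocked, else None."""
    for pattern, reason in BLOCKED:
        if pattern.search(command):
            return reason
    if TEST_RE.search(command) and PIPE_RE.search(command):
        return "capture test output to a file instead of piping it"
    if LANDING_RE.search(command):
        # Evidence comes from the runtime-written transcript, not from the agent's claims.
        if not TEST_RE.search(transcript):
            return "no test run found in the transcript"
        if touches_ui and "playwright-cli" not in transcript:
            return "UI change without playwright-cli in the transcript"
    return None


print(check_command("git commit -m 'fix'", transcript=""))
# no test run found in the transcript
```

The patterns are deliberately conservative: a commit message that merely mentions `rm -rf` would also be refused, which trades occasional false positives for fewer escapes.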

Skills add softer enforcement at decision points: pre-landing checklists, verification timeouts (45 min for verifiers, 2h for implementers), and a Failure Protocol that kills crons and preserves state on errors.

Incident Log and Lessons Learned

Every safety rule traces back to a real incident. Here are the ones that shaped the system.

| Incident | What Happened | What It Drove |
| --- | --- | --- |
| 17-File Wipe | Agent "cleaned up" 17 files of other agents' uncommitted work | Hook blocks git checkout --, git restore |
| 9-File Stash Drop | git stash pop failed; agent ran git stash drop, destroying the only copy | Hook blocks git stash drop/clear |
| Silent 13-Fix Revert | Agent checked out old files to investigate, never restored them, and committed, reverting 13 fixes | Hook blocks git checkout <commit> -- <file> |
| 50-Fix Sweep | git add . swept another session's unfinished changes into a 50-fix commit | Hook blocks git add . / git add -A |
| Visual Feature Disaster | 100% visual feature landed with zero manual testing or screenshots | Transcript hooks require playwright-cli for UI commits |
| Issue Misread | "Reset button" paraphrased as "clear canvas", so the wrong feature was built | Verbatim issue body + plan text in all agent dispatches |
| 3-Hour Test Thrash | Agents wasted 3 hours trying to run tests in worktrees without dev servers | Verbatim test recipe included in every agent dispatch |
| Auto-mode Bleed | After auto-mode execution, agent kept committing without permission in interactive mode. Context compaction preserved the auto-commit pattern. | Explicit mode reset after auto runs |
| Pre-staged Sweep | Commit swept 146 lines of another session's work because files were pre-staged in the index | Pre-staging index check in /commit |
| Force-removed Worktree Logs | Agent dismissed modified logs as "just log files" and force-removed worktree, losing the build record | Log extraction requirement before worktree removal |

What we've learned about LLM compliance

| Approach | Effectiveness |
| --- | --- |
| Hook hard blocks | High |
| Concrete recipes at decision points | High |
| Verbatim text requirements | High |
| Transcript-based verification | High |
| Past failure stories in prompts | Low-Med |
| Rules in long appendix sections | Low |

17 skills that plan, build, test, fix, and ship — so one developer can run a full engineering team.

github.com/zeveck/zskills

Built with Claude Code (Claude Opus 4.6) — Z Skills System.