AI-NATIVE ENGINEERING
The system behind my output
I stopped treating Claude like a chatbot and started running it like an OS. Around 60 skills, 120+ workflow rules, 6 specialist subagents, 20+ MCP servers, all wired through .claude, .cursor and AGENTS.md at both global and per-repo scope. The point is simple. Make the agent read less, write less code, and write less prose, so the work compounds across sessions instead of starting over.
~60
Skills (global + custom)
120+
Workflow rules
6
Specialist subagents
20+
MCP servers
The token economy
Four reflexes that govern what the agent reads, writes and remembers. Together they are why the work compounds across sessions instead of resetting.
Headroom
Reduce what the model reads
Context economy. A graphify knowledge graph, Obsidian memory, and a grep-before-read hook let the agent load a map of the codebase instead of the whole thing. That leaves room for it to actually reason.
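A minimal sketch of the grep-before-read idea, assuming the goal is to hand the agent only the lines around a match rather than the full file. The function name, the context width and the line cap are hypothetical and are not taken from the actual hook.

```python
import re
from pathlib import Path


def grep_before_read(path, pattern, context=2, max_lines=40):
    """Return only the numbered lines around matches of `pattern`.

    Hypothetical helper: it illustrates searching first and reading a
    small window, instead of loading the whole file into context.
    """
    lines = Path(path).read_text(encoding="utf-8", errors="replace").splitlines()
    regex = re.compile(pattern)
    keep = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            # Keep the hit plus a few lines either side for orientation.
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            keep.update(range(start, end))
    # Cap the result so a very common pattern cannot flood the context.
    selected = sorted(keep)[:max_lines]
    return [f"{i + 1}: {lines[i]}" for i in selected]
```

A call such as grep_before_read("service.py", r"def handle_request", context=3) would return at most seven numbered lines around each match, whatever the size of the file.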
Ponytail
Reduce the code it writes
A lazy senior dev reflex. YAGNI first, then stdlib, then a native platform feature, then one line, then the minimum that works. Best code is the code you never write. Shortest working diff wins.
Caveman
Reduce the prose it writes
Terse output. Drop the articles, filler and hedging and you cut output tokens by about 75% with the technical content fully intact. Code and commits stay normal.
graphify
Compound what it remembers
Any input becomes a clustered knowledge graph, with HTML, JSON and an audit report. It's the backbone of a five-layer memory setup. On one project, loading that graph instead of re-feeding raw context each session cut my Claude Code token use roughly 71-fold. Knowledge sticks around instead of resetting. Method is in the writeup linked below.
The lifecycle
The ordered 8-step loop every change runs through. Gated end to end, and wrapped in Ralph so it can run on its own.
Autonomy & workflow
I kept losing half-finished work when the context window filled up mid-task. So these agents keep their state in files and git instead. Nothing dies when the window rotates.
Ralph Loop
Autonomous multi-session iteration
Custom. Progress lives in files and git. It rotates to fresh context near the limit and learns from failures through guardrails, with an 8-step enterprise workflow built in. Adapted from Huntley's Ralph technique.
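A rough sketch of the file-backed part of such a loop, assuming progress is kept in a JSON file and each task has a fixed token cost. The state layout, the budget model and the run_task callback are all hypothetical. The git commit after each step is left out to keep the example self-contained.

```python
import json
from pathlib import Path


def load_state(path):
    """Read progress from disk, or start empty on the first session."""
    p = Path(path)
    if p.exists():
        return json.loads(p.read_text(encoding="utf-8"))
    return {"done": [], "guardrails": []}


def save_state(path, state):
    Path(path).write_text(json.dumps(state, indent=2), encoding="utf-8")


def run_session(path, tasks, run_task, budget, cost_per_task):
    """Run tasks until finished or the assumed context budget is nearly spent.

    run_task(task, guardrails) is a hypothetical callback that returns
    (ok, lesson). Returns "rotate", "retry" or "complete".
    """
    state = load_state(path)
    used = 0
    for task in tasks:
        if task in state["done"]:
            continue  # finished in an earlier session
        if used + cost_per_task > budget:
            save_state(path, state)
            return "rotate"  # caller starts a fresh session from the file
        used += cost_per_task
        ok, lesson = run_task(task, state["guardrails"])
        if ok:
            state["done"].append(task)
        else:
            state["guardrails"].append(lesson)  # remembered for later sessions
        save_state(path, state)  # persist after every step, not at the end
    return "complete" if all(t in state["done"] for t in tasks) else "retry"
```

Because the state is written after every task, a session that ends early loses at most the task in flight, and the next session resumes from the file rather than from chat history.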
subagent-driven-development
Parallel agents, two-stage review
Spins up a fresh subagent for each independent task, then reviews every one for spec compliance and code quality before it lands. One subagent per concern.
Strict Plan Mode
No code without a plan
Cursor rule. Every task is gated behind a real ADO work-item ID and an explicit plan before any code gets written. No drive-by edits.
AI-Native Development
Treat AI as a long-running collaborator
Meta-rule that tracks context across sessions and saves decisions. The rest of the harness hangs off it.
Frontend
Taste and motion on top, Next.js and RSC discipline underneath.
design-taste-frontend
Anti-slop interfaces
Global. Landing pages and portfolios that don't look templated. Audit first, real design systems, a strict pre-flight check before anything ships.
emil-design-eng
The invisible polish
Global. Emil Kowalski's take on UI polish, component design, and the animation calls that make software feel good.
Static-First Product
Static-first product model
Custom Cursor skill. Applies a static-first product model across a multi-product app, covering routing, APIs, and template-driven animation parity.
Next.js Best Practices
RSC boundaries done right
Plugin. Next.js file conventions, server and client boundaries, data and metadata patterns, image and font optimization.
Backend & architecture
Secure implementation, clean layer boundaries, and deepening modules instead of piling on more code.
Code Implementation
Secure backend implementation
Dev rule. Pre-gate verification, mandatory security compliance, and strict layer boundaries before a line of backend code ships.
Improve Codebase Architecture
Find the deepening opportunities
Global. Scans for refactor and consolidation wins using the project's own domain language and recorded decisions, aiming at testable, AI-navigable code.
Production Refactor
Make code testable first
Subagent. Does the mechanical prod-code refactors a test strategy needs, like constructor injection and seams, before any tests get written.
Type Hygiene
No inline types, no slop
Cursor rule. Import discipline, shared-type reuse, no inline definitions, no redundant casts. Keeps the type surface clean.
Azure observability & RCA
After one too many 2am pages spent grepping AKS logs by hand, I wired this up. Log-first root cause analysis on AKS and Application Insights, driven by KQL.
Environment Config
Env mapping + auto-login
Dev rule. Subscription mapping and dynamic env resolution with auto-login across dev, preprod, prod and DR through the Azure MCP.
RCA Orchestrator
Trace by request / trace ID
Dev rule. Log-first root cause analysis for Kubernetes incidents. Resolve the environment, auto-login, then trace a request end to end.
App Insights KQL
ContainerLogV2 + AppInsights queries
Dev rule. Ready-made KQL entry points and a log schema reference for ContainerLogV2 and Application Insights.
GA4 Drop-off Analysis
Where users fall off
GA rule. Reads GA4 for funnel drop-offs, API failures, and tracking gaps, then suggests a roadmap to fix them.
Testing & QA
A four-agent unit test pipeline, plus disciplined debugging and E2E.
test-architect → author → critic → runner
Four agents, S-grade tests
Subagent pipeline. The architect designs the strategy with no code, the author writes JUnit 5 with Mockito and AssertJ, the critic grades against an 8-dimension rubric, and the runner executes Maven and reports the coverage delta.
Test-Driven Development
Red-green-refactor, enforced
Global. Failing test first, minimum code to pass, then refactor. Features and bug fixes both.
diagnose / systematic-debugging
No flailing on hard bugs
Global and plugin. Reproduce → minimise → hypothesise → instrument → fix → regression-test, for both bugs and performance regressions.
E2E API Verification
One command, whole pipeline
Custom. Docker Compose E2E. Build images, boot services, run integration tests, surface failures. Local and CI.
Security & SDLC gates
The non-negotiable checks between a diff and main.
Pre-PR Security Scan
Vulnerabilities before the PR
Custom. Runs an open-source dependency vulnerability scan, parses the report, and drives selective remediation by severity.
Security Checks
OWASP / ASVS checklist
Dev rule. A mandatory PASS/FAIL security checklist mapped to OWASP Top 10, ASVS, and API security. Enforced, not advisory.
Code Review Gate
AI review before commit
Dev rule. A read-only subagent review gate that runs after staging and before commit, with a bounded fix loop.
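One way a bounded fix loop can be sketched, assuming the reviewer returns a list of findings and an empty list means the gate passes. The review and fix callbacks and the max_fixes default are hypothetical stand-ins for the subagent calls.

```python
def review_gate(review, fix, max_fixes=2):
    """Review, fix, and re-review, with a hard cap on fix attempts.

    review() returns a list of findings (empty means pass).
    fix(findings) attempts to resolve them.
    Returns (passed, fixes_applied).
    """
    fixes = 0
    while True:
        findings = review()
        if not findings:
            return True, fixes
        if fixes >= max_fixes:
            # Out of attempts: fail the gate rather than loop forever.
            return False, fixes
        fix(findings)
        fixes += 1
```

The cap is what keeps the gate from burning tokens on a finding the agent cannot resolve. In this sketch, a failed gate goes back to a human.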
Commit Workflow
Staged, reviewed, traceable
Dev rule. Stage → review gate → conventional commit → push → security scan → PR → tracker update. Every step mandatory.
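A small sketch of what "every step mandatory" can mean in practice, assuming each step is a callable that returns True on success. The runner is hypothetical. Only the step names come from the rule above.

```python
def run_pipeline(steps):
    """Run (name, callable) steps in order and stop at the first failure.

    Returns (completed_names, failed_name), where failed_name is None
    when every step passed.
    """
    completed = []
    for name, step in steps:
        if not step():
            return completed, name  # later steps never run
        completed.append(name)
    return completed, None


# Step names from the commit workflow; the lambdas are placeholders.
WORKFLOW = [
    ("stage", lambda: True),
    ("review gate", lambda: True),
    ("conventional commit", lambda: True),
    ("push", lambda: True),
    ("security scan", lambda: True),
    ("PR", lambda: True),
    ("tracker update", lambda: True),
]
```

Since the runner returns on the first failing step, a failed review gate means nothing is committed or pushed.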
Knowledge & memory
How the agent remembers. This is the layer that makes everything else cheap.
graphify
Any input → knowledge graph
Global. Builds a clustered knowledge graph from code, docs, or papers, with HTML, JSON and an audit. The backbone of low-token recall.
understand-anything
Codebase → interactive graph
Plugin, 8 skills. Map the architecture, query it in chat, analyze diffs, extract domains, and generate onboarding, all from one knowledge graph.
obsidian-workflow
Decisions into a vault
Custom. Saves architecture decisions, debug post-mortems, and ADO notes to an Obsidian vault through MCP. Memory that outlives the session.
humanizer
Strip the AI tells
Global. Strips the tells of AI prose like inflated symbolism, rule-of-three, and em-dash overuse, so the writing reads like a person wrote it. This page ran through it.
The toolchain
MCP servers wired into the agent. These are the hands it reaches the outside world with.
Worth stealing
Collections and writeups I recommend pulling skills from.
A skill is just a folder with a SKILL.md inside. Drop it in ~/.claude/skills for global use, or in a repo's .claude or .cursor folder for local use, and the agent picks it up when it needs it. Everything below is real and in daily use.
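A minimal sketch of that layout, assuming a skills root whose immediate subfolders each hold one skill. The find_skills helper is hypothetical and is not how the agent itself discovers skills. It only shows that "folder plus SKILL.md" is the whole contract.

```python
from pathlib import Path


def find_skills(root):
    """Return the names of immediate subfolders of `root` that contain a SKILL.md."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if (p / "SKILL.md").is_file())
```

Pointing it at a skills folder lists what would be available there. A folder without a SKILL.md is ignored.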