AI-NATIVE ENGINEERING

The system behind my output

I stopped treating Claude like a chatbot and started running it like an OS. Around 60 skills, 120+ workflow rules, 6 specialist subagents and 20+ MCP servers, all wired through .claude, .cursor and AGENTS.md at both global and per-repo scope. The point is simple: make the agent read less, write less code, and write less prose, so its work compounds across sessions instead of starting over.

~60 Skills (global + custom)
120+ Workflow rules
6 Specialist subagents
20+ MCP servers

The token economy

Four reflexes govern what the agent reads, writes and remembers. They are why its work compounds across sessions instead of resetting.

READ

Headroom

Reduce what the model reads

Context economy. A graphify knowledge graph, Obsidian memory, and a grep-before-read hook let the agent load a map of the codebase instead of the whole thing. That leaves room for it to actually reason.
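A minimal sketch of the grep-before-read idea, assuming the hook's job is to hand the model only the lines near a match rather than the full file. The function name and the `context` parameter are hypothetical; the actual hook is not shown on this page.

```python
import re


def grep_before_read(text: str, pattern: str, context: int = 2) -> list[str]:
    """Return only the lines near a regex match, each prefixed with its line number."""
    lines = text.splitlines()
    regex = re.compile(pattern)
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            # Keep the hit plus a few lines of context on either side.
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    return [f"{i + 1}: {lines[i]}" for i in sorted(keep)]
```

On a large file with one relevant function, this returns a handful of numbered lines, and the agent can still request a full read when the excerpt is not enough.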

CODE

Ponytail

Reduce the code it writes

A lazy senior dev reflex. YAGNI first, then stdlib, then a native platform feature, then one line, then the minimum that works. Best code is the code you never write. Shortest working diff wins.

PROSE

Caveman

Reduce the prose it writes

Terse output. Drop the articles, filler and hedging and you cut output tokens by about 75% with the technical content fully intact. Code and commits stay normal.

MEMORY

graphify

Compound what it remembers

Any input becomes a clustered knowledge graph, with HTML, JSON and an audit report. It's the backbone of a five-layer memory setup. On one project, loading that graph instead of re-feeding raw context each session dropped my Claude Code token use by roughly 71×. Knowledge sticks around instead of resetting. Method is in the writeup linked below.

The lifecycle

The ordered loop every change runs through. Autonomous, gated end to end.

SDLC: Daily team use

The 8-step loop every change runs through, wrapped in Ralph so it can run on its own.

01 Requirement → 02 Plan → 03 Implement → 04 Test → 05 Security scan → 06 Review gate → 07 Commit & PR → 08 Cleanup
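The loop can be sketched as an ordered list of gated steps. This is an illustration only: the step names come from the list above, while `run_lifecycle` and the `gates` mapping are hypothetical names, not the real harness.

```python
from typing import Callable

STEPS = [
    "Requirement",
    "Plan",
    "Implement",
    "Test",
    "Security scan",
    "Review gate",
    "Commit & PR",
    "Cleanup",
]


def run_lifecycle(gates: dict[str, Callable[[], bool]]) -> list[str]:
    """Run the steps in order and stop at the first gate that fails.

    Returns the steps that completed.
    """
    done: list[str] = []
    for step in STEPS:
        gate = gates.get(step, lambda: True)  # a step with no gate passes by default
        if not gate():
            break
        done.append(step)
    return done
```

A failing test gate, for example, leaves the change at "Implement" and nothing after it runs.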

Autonomy & workflow

I kept losing half-finished work when the context window filled up mid-task. So these agents keep their state in files and git instead. Nothing dies when the window rotates.

Ralph Loop

Autonomous multi-session iteration

Custom. Progress lives in files and git. It rotates to fresh context near the limit and learns from failures through guardrails, with an 8-step enterprise workflow built in. Adapted from Huntley's Ralph technique.
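A rough sketch of the "progress lives in files" part, assuming a JSON state file and a simple per-session budget standing in for the context limit. The file layout, `run_session`, and `budget` are all hypothetical; the real loop also commits to git and records guardrails, which this omits.

```python
import json
from pathlib import Path


def load_state(path: Path) -> dict:
    """Read saved progress, or start fresh if no state file exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"done": []}


def run_session(path: Path, tasks: list[str], budget: int) -> dict:
    """Work through pending tasks until the session budget is spent."""
    state = load_state(path)
    for task in tasks:
        if task in state["done"]:
            continue  # finished in an earlier session
        if budget <= 0:
            break  # rotate: the next session reloads state from disk
        state["done"].append(task)
        # Persist after every task so a rotation never loses work.
        path.write_text(json.dumps(state, indent=2), encoding="utf-8")
        budget -= 1
    return state
```

Calling `run_session` twice with the same state file picks up where the first call stopped, which is the property that survives a window rotation.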

Custom, Autonomy, Git

subagent-driven-development

Parallel agents, two-stage review

Spins up a fresh subagent for each independent task, then reviews every one for spec compliance and code quality before it lands. One subagent per concern.
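One way to picture the fan-out and the two-stage review, using a thread pool as a stand-in for real subagents. `worker`, `meets_spec`, and `meets_quality` are hypothetical callables supplied by the caller; nothing here reflects the plugin's actual interface.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_independent_tasks(
    tasks: list[str],
    worker: Callable[[str], str],
    meets_spec: Callable[[str], bool],
    meets_quality: Callable[[str], bool],
) -> tuple[list[str], list[str]]:
    """Run one worker per task in parallel, then review each result twice."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(worker, tasks))  # result order matches `tasks`
    landed: list[str] = []
    rejected: list[str] = []
    for task, result in zip(tasks, results):
        # Stage one checks the spec, stage two checks code quality.
        if meets_spec(result) and meets_quality(result):
            landed.append(task)
        else:
            rejected.append(task)
    return landed, rejected
```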

Plugin, Agents

Strict Plan Mode

No code without a plan

Cursor rule. Every task is gated behind a real Azure DevOps (ADO) work-item ID and an explicit plan before any code gets written. No drive-by edits.

Cursor rule, SDLC

AI-Native Development

Treat AI as a long-running collaborator

Meta-rule that tracks context across sessions and saves decisions. The rest of the harness hangs off it.

Cursor rule, Meta

Frontend

Taste and motion on top, Next.js and RSC discipline underneath.

design-taste-frontend

Anti-slop interfaces

Global. Landing pages and portfolios that don't look templated. Audit first, real design systems, a strict pre-flight check before anything ships.

Global, Design

emil-design-eng

The invisible polish

Global. Emil Kowalski's take on UI polish, component design, and the animation calls that make software feel good.

Global, Motion

Static-First Product

Static-first product model

Custom Cursor skill. Applies a static-first product model across a multi-product app, covering routing, APIs, and template-driven animation parity.

Custom, Next.js

Next.js Best Practices

RSC boundaries done right

Plugin. Next.js file conventions, server and client boundaries, data and metadata patterns, image and font optimization.

Plugin, Next.js

Backend & architecture

Secure implementation, clean layer boundaries, and deepening modules instead of piling on more code.

Code Implementation

Secure backend implementation

Dev rule. Pre-gate verification, mandatory security compliance, and strict layer boundaries before a line of backend code ships.

Dev rule, Backend

Improve Codebase Architecture

Find the deepening opportunities

Global. Scans for refactor and consolidation wins using the project's own domain language and recorded decisions, aiming at testable, AI-navigable code.

Global, Refactor

Production Refactor

Make code testable first

Subagent. Does the mechanical prod-code refactors a test strategy needs, like constructor injection and seams, before any tests get written.

Subagent, Testability

Type Hygiene

No inline types, no slop

Cursor rule. Import discipline, shared-type reuse, no inline definitions, no redundant casts. Keeps the type surface clean.

Cursor rule, TypeScript

Azure observability & RCA

After one too many 2am pages spent grepping AKS logs by hand, I wired this up. Log-first root cause analysis on AKS and Application Insights, driven by KQL.

Environment Config

Env mapping + auto-login

Dev rule. Subscription mapping and dynamic env resolution with auto-login across dev, preprod, prod and DR through the Azure MCP.

Dev rule, Azure

RCA Orchestrator

Trace by request / trace ID

Dev rule. Log-first root cause analysis for Kubernetes incidents. Resolve the environment, auto-login, then trace a request end to end.

Dev rule, RCA

App Insights KQL

ContainerLogV2 + AppInsights queries

Dev rule. Ready-made KQL entry points and a log schema reference for ContainerLogV2 and Application Insights.
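As an illustration of what one such entry point might look like, here is a small helper that builds a ContainerLogV2 query for a trace ID. The helper name, the validation pattern, and the choice of projected columns are assumptions made for this sketch; the rule's own queries are not reproduced on this page.

```python
import re

# Conservative allow-list so values are safe to place inside a KQL string literal.
_SAFE = re.compile(r"^[A-Za-z0-9_.-]+$")


def container_logs_by_trace(trace_id: str, namespace: str, hours: int = 1) -> str:
    """Build a KQL query that finds container log lines mentioning a trace ID."""
    for value in (trace_id, namespace):
        if not _SAFE.match(value):
            raise ValueError(f"unsafe value for a KQL literal: {value!r}")
    return "\n".join(
        [
            "ContainerLogV2",
            f"| where TimeGenerated > ago({int(hours)}h)",
            f"| where PodNamespace == '{namespace}'",
            # LogMessage is a dynamic column, so cast it before matching text.
            f"| where tostring(LogMessage) contains '{trace_id}'",
            "| project TimeGenerated, PodName, ContainerName, LogMessage",
            "| order by TimeGenerated asc",
        ]
    )
```

Keeping the time window and namespace filters ahead of the text match narrows the scan before the expensive part runs.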

Dev rule, KQL

GA4 Drop-off Analysis

Where users fall off

GA rule. Reads GA4 for funnel drop-offs, API failures, and tracking gaps, then suggests a roadmap to fix them.

GA rule, Analytics

Testing & QA

A four-agent unit test pipeline, plus disciplined debugging and E2E.

test-architect → author → critic → runner

Four agents, S-grade tests

Subagent pipeline. The architect designs the strategy with no code, the author writes JUnit 5 with Mockito and AssertJ, the critic grades against an 8-dimension rubric, and the runner executes Maven and reports the coverage delta.

Subagents, JUnit

Test-Driven Development

Red-green-refactor, enforced

Global. Failing test first, minimum code to pass, then refactor. Features and bug fixes both.

Global, TDD

diagnose / systematic-debugging

No flailing on hard bugs

Global and plugin. Reproduce → minimise → hypothesise → instrument → fix → regression-test, for both bugs and performance regressions.

Global, Debugging

E2E API Verification

One command, whole pipeline

Custom. Docker Compose E2E. Build images, boot services, run integration tests, surface failures. Local and CI.

Custom, Docker

Security & SDLC gates

The non-negotiable checks between a diff and main.

Pre-PR Security Scan

Vulnerabilities before the PR

Custom. Runs an open-source dependency vulnerability scan, parses the report, and drives selective remediation by severity.
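A sketch of the "selective remediation by severity" step, assuming the parsed report is a list of dicts with a `severity` field. The field names and the four-level scale are assumptions; real scanners each have their own report format.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def select_for_remediation(findings: list[dict], minimum: str = "HIGH") -> list[dict]:
    """Keep findings at or above `minimum`, most severe first."""
    floor = SEVERITY_ORDER.index(minimum)
    picked = [f for f in findings if SEVERITY_ORDER.index(f["severity"]) >= floor]
    return sorted(picked, key=lambda f: SEVERITY_ORDER.index(f["severity"]), reverse=True)
```

With the default threshold, LOW and MEDIUM findings stay in the report but do not block the PR.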

Custom, Security

Security Checks

OWASP / ASVS checklist

Dev rule. A mandatory PASS/FAIL security checklist mapped to OWASP Top 10, ASVS, and API security. Enforced, not advisory.

Dev rule, OWASP

Code Review Gate

AI review before commit

Dev rule. A read-only subagent review gate that runs after staging and before commit, with a bounded fix loop.
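The bounded fix loop might look something like this. `review` and `fix` are hypothetical callables standing in for the read-only reviewer and the fixing agent, and the default of two fix attempts is an arbitrary choice for the sketch.

```python
from typing import Callable


def review_gate(
    review: Callable[[], list[str]],
    fix: Callable[[list[str]], None],
    max_fixes: int = 2,
) -> bool:
    """Return True once the review is clean, False if fixes run out first."""
    for attempt in range(max_fixes + 1):
        issues = review()  # the reviewer only reads and never edits
        if not issues:
            return True
        if attempt < max_fixes:
            fix(issues)
    return False
```

The bound matters because an unbounded loop lets two agents argue forever; after the last attempt the gate fails and a human decides.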

Dev rule, Review

Commit Workflow

Staged, reviewed, traceable

Dev rule. Stage → review gate → conventional commit → push → security scan → PR → tracker update. Every step mandatory.
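For the conventional-commit step, a check along these lines is one option. The type list follows the common Angular-style set and is an assumption, since the Conventional Commits spec itself only defines `feat` and `fix` and allows other types.

```python
import re

# type, optional (scope), optional ! for breaking changes, then ": description"
CONVENTIONAL = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([\w./-]+\))?!?: \S.*"
)


def is_conventional(subject: str) -> bool:
    """Check a commit subject line against the conventional-commit shape."""
    return CONVENTIONAL.match(subject) is not None
```

Run against the first line of a commit message only; bodies and footers follow looser rules.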

Dev rule, Git

Knowledge & memory

How the agent remembers. This is the layer that makes everything else cheap.

graphify

Any input → knowledge graph

Global. Builds a clustered knowledge graph from code, docs, or papers, with HTML, JSON and an audit. The backbone of low-token recall.

Global, Memory

understand-anything

Codebase → interactive graph

Plugin, 8 skills. Map the architecture, query it in chat, analyze diffs, extract domains, and generate onboarding, all from one knowledge graph.

Plugin, Knowledge

obsidian-workflow

Decisions into a vault

Custom. Saves architecture decisions, debug post-mortems, and ADO notes to an Obsidian vault through MCP. Memory that outlives the session.

Custom, Obsidian

humanizer

Strip the AI tells

Global. Strips the tells of AI prose like inflated symbolism, rule-of-three, and em-dash overuse, so the writing reads like a person wrote it. This page ran through it.

Global, Writing

The toolchain

MCP servers wired into the agent. These are the hands it reaches the outside world with.

Cloud & infra: Azure MCP, kubectl MCP, Azure DevOps MCP, GitHub MCP
Design: Figma MCP, Framer MCP, Excalidraw MCP
Testing: Playwright MCP, agent-browser, Quarkus JDBC MCP
Knowledge: memory MCP, mempalace, Obsidian MCP
Publishing & SEO: Medium MCP, Substack MCP, Slideshot MCP, Bing Webmaster, Search Console

Worth stealing

Collections and writeups I recommend pulling skills from.

Collection

Skills for Real Engineers

by Matt Pocock

Small, composable skills targeting four engineering failure modes: misalignment, verbosity, non-functional code, and architectural debt.

Collection

David's Agent Skills

by David

A 40+ skill public archive spanning agents, research, frontend polish, productivity, and automation.

Collection

agent-scripts

by Peter Steinberger

Shared hard rules (AGENTS.md), routing skills, dependency-light helper scripts, and validation hooks — symlinked into downstream projects.

Collection

georgeskills

by George Wang

80+ modular execution skills for a 'personal operating system' — data exports, research, product, design, workflow, and career.

Collection

Karpathy Skills

by multica-ai

Four principles from Karpathy's observations on LLM coding pitfalls: think before coding, simplicity first, surgical changes, goal-driven execution.

Skill

last30days-skill

by mvanhorn

Researches any topic across Reddit, X, YouTube, HN, and Polymarket, then ranks findings by real engagement instead of editorial SEO.

Course

AI Engineering from Scratch

by Rohit Goel

503 lessons over 20 phases (math → agents → production), each shipping a reusable prompt, skill, agent, or MCP server. MIT-licensed.

Registry

skills.sh

by Open ecosystem

A searchable registry + leaderboard of agent skills (compatible with Claude Code, Cursor, Copilot, Gemini & more). Install any with `npx skills add <owner/repo>`.

Agent

Hermes Agent

by Nous Research

A self-improving personal agent that lives on your server, creates skills from experience, and orchestrates Claude Code for coding tasks. MIT-licensed.

Writeup

71× Fewer Tokens

by Ketan Chavan

How Karpathy's LLM Wiki, Obsidian and graphify cut my Claude Code token use by roughly 71×, by treating the agent like an OS with real memory.

A skill is just a folder with a SKILL.md inside. Drop it in ~/.claude/skills for global use, or in a repo's .claude or .cursor folder for local use, and the agent picks it up when it needs it. Everything listed here is real and in daily use.
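Since a skill is a folder containing a SKILL.md, finding the installed ones takes a single glob. A small sketch, where `discover_skills` is a hypothetical helper and `root` is whichever skills directory you point it at:

```python
from pathlib import Path


def discover_skills(root: Path) -> list[str]:
    """List the folders directly under `root` that contain a SKILL.md."""
    return sorted(p.parent.name for p in root.glob("*/SKILL.md"))
```

Folders without a SKILL.md are ignored, so scratch directories sitting next to real skills do not show up.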

Send me a message

or download my resume