AI tool comparison

agent-skills vs Auto-Arch Tournament

Which one should you ship with? Here is the side-by-side panel verdict, pricing read, reviewer split, and community vote comparison.

Developer Tools

agent-skills

Production-grade engineering skills library for AI coding agents

Verdict: Ship
Panel ship: 75%
Community: no votes yet
Entry: Free

agent-skills is a structured library of 20 production-grade engineering skills for AI coding agents, published by Addy Osmani (former Google Chrome DevTools lead, author of Essential JavaScript Design Patterns). It provides a complete spec-to-ship workflow via 7 slash commands (/spec, /plan, /build, /test, /review, /code-simplify, /ship) that work across Claude Code, Cursor, Gemini CLI, Windsurf, and GitHub Copilot, that is, any agent that supports CLAUDE.md or equivalent configuration files.

The library includes three specialist personas that activate on demand: a security auditor (checks for injection vulnerabilities, hardcoded secrets, OWASP Top 10), a code reviewer (focuses on maintainability, complexity, and test coverage), and a test engineer (generates unit, integration, and edge-case tests). Four reference checklists (API design, accessibility, performance, deployment) give agents shared evaluation criteria. Each skill is written as a Markdown instruction file following the CLAUDE.md conventions popularized by the karpathy-skills library.

agent-skills accumulated 6,693 GitHub stars in its first trending week, outpacing most comparable skill collections. Osmani's framing, which treats agent skills as a first-class engineering asset rather than ad-hoc prompts, resonates with teams trying to standardize how they use AI coding tools. The library is MIT-licensed and designed to be forked and extended.

Developer Tools

Auto-Arch Tournament

An AI agent loop that redesigns your RISC-V CPU and formally proves every win

Verdict: Ship
Panel ship: 75%
Community: no votes yet
Entry: Paid

Auto-Arch Tournament is an autonomous research system where an AI agent iteratively proposes, implements, and validates microarchitectural improvements to a RISC-V CPU. Starting from a standard 5-stage pipeline, the loop runs hypotheses in parallel, each going through formal verification (53 symbolic checks), cycle-accurate simulation, multi-seed FPGA place-and-route, and CoreMark CRC validation. Only hypotheses that beat the current champion get merged; everything else gets discarded. Starting from 301 iterations/second, the system hit 577 iter/s (+92%) across 73 attempts in 9.8 hours, producing a design 26% faster and 40% smaller in LUTs than the baseline.

The insight the author drives home is that the real innovation isn't the AI agent; it's the verifier. The orchestrator is hardcoded to prevent agents from manipulating their own evaluation gates, a simple but critical design constraint that turns a creative process into a trustworthy one. Without a rigorous verification harness, agent-driven optimization becomes a confidence trick.

This is early but fascinating proof that AI-driven hardware design loops can produce commercially meaningful gains. The repo uses Claude Code or Codex as the coding agent, SystemVerilog for the RTL, and standard open-source EDA tooling (Yosys, nextpnr, Verilator). It's a compelling template for anyone building agentic optimization loops where correctness matters.
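The champion/challenger rule and the fixed evaluation gates described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the repo's actual orchestrator: the `Candidate` fields, the `GATES` tuple, and `run_tournament` are hypothetical names, the two boolean gates stand in for the real formal, simulation, place-and-route, and CRC checks, the loop is sequential where the real system runs hypotheses in parallel, and every score except the 301 and 577 iter/s figures is made up.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """A proposed design, reduced to the facts the orchestrator needs (hypothetical)."""
    name: str
    score: float         # benchmark score, higher is better
    passes_formal: bool  # stand-in for the formal verification gate
    crc_ok: bool         # stand-in for the benchmark CRC validation gate


# The gates are owned by the orchestrator and stored in an immutable tuple.
# Candidates are plain frozen data, so a proposal has no handle to edit them.
GATES: Tuple[Callable[[Candidate], bool], ...] = (
    lambda c: c.passes_formal,
    lambda c: c.crc_ok,
)


def run_tournament(
    baseline: Candidate, proposals: Iterable[Candidate]
) -> Tuple[Candidate, List[str]]:
    """Return the final champion and the names of the proposals that were merged."""
    champion = baseline
    merged: List[str] = []
    for cand in proposals:
        # A proposal that fails any gate is discarded, however fast it claims to be.
        if not all(gate(cand) for gate in GATES):
            continue
        # Only a verified proposal that strictly beats the champion is merged.
        if cand.score > champion.score:
            champion = cand
            merged.append(cand.name)
    return champion, merged


if __name__ == "__main__":
    baseline = Candidate("baseline", 301.0, True, True)
    proposals = [
        Candidate("fast-but-unproven", 900.0, False, True),  # fails the formal gate
        Candidate("slower", 280.0, True, True),              # verified, but no win
        Candidate("better", 350.0, True, True),              # merged
        Candidate("bad-crc", 600.0, True, False),            # fails the CRC gate
        Candidate("best", 577.0, True, True),                # merged
    ]
    champion, merged = run_tournament(baseline, proposals)
    print(champion.name, merged)
```

The design choice worth noting is that the score comparison only ever runs on candidates that have already cleared every gate, so an unverified design can never become the champion.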

Decision

Panel verdict
agent-skills: Ship · 3 ship / 1 skip
Auto-Arch Tournament: Ship · 3 ship / 1 skip

Community
agent-skills: No community votes yet
Auto-Arch Tournament: No community votes yet

Pricing
agent-skills: Free / Open Source
Auto-Arch Tournament: Open Source

Best for
agent-skills: Production-grade engineering skills library for AI coding agents
Auto-Arch Tournament: An AI agent loop that redesigns your RISC-V CPU and formally proves every win

Category
agent-skills: Developer Tools
Auto-Arch Tournament: Developer Tools

Reviewer scorecard

Builder
agent-skills · 80/100 · ship

Having security audits, test generation, and spec creation as first-class slash commands changes how you think about agent-assisted development. The cross-tool compatibility (Claude, Cursor, Gemini) means you can standardize across a team with mixed tool preferences. Fork it, customize the checklists, and you have a company playbook.

Auto-Arch Tournament · 80/100 · ship

The hardcoded orchestrator pattern is the real take-home here. Building AI loops that can't game their own eval is a solved problem when you just... don't give the agent write access to the evaluator. Obvious in hindsight, rarely implemented.

Skeptic
agent-skills · 45/100 · skip

This is well-packaged prompt engineering, not a fundamentally new capability. The value depends entirely on the underlying agent following instructions reliably — which varies wildly across tools and models. Teams that haven't established basic code review processes will use this as a crutch rather than building genuine engineering discipline.

Auto-Arch Tournament · 45/100 · skip

63 of the 73 proposals failed. That's an 86% failure rate and heavy use of API credits on a narrow RISC-V benchmark. Impressive for a demo, but the economics don't work yet for serious chip design at scale.
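The two headline figures quoted on this page (the +92% throughput gain and the 86% failure rate) can be checked with a few lines of arithmetic. The helper names below are hypothetical and exist only for this check; the inputs are the numbers stated above.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def failure_rate(failed: int, total: int) -> float:
    """Share of attempts that failed, in percent."""
    return failed / total * 100.0


# 301 iter/s -> 577 iter/s, as quoted in the tool description.
print(round(percent_change(301, 577)))  # 92

# 63 failed proposals out of 73 attempts, as quoted by the skeptic.
print(round(failure_rate(63, 73)))  # 86
```

Both figures hold up, and the second also implies that 10 of the 73 proposals were accepted.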

Futurist
agent-skills · 80/100 · ship

The real innovation here is treating agent behavior as versionable, shareable code. The next step is organizations maintaining their own agent-skills forks as living engineering standards — the CLAUDE.md pattern is becoming a de facto org-level configuration layer for how teams interact with AI.

Auto-Arch Tournament · 80/100 · ship

AI-driven hardware design is going to collapse the chip design cycle from years to weeks. This is a primitive ancestor of the tools that will design the next generation of AI accelerators.

Creator
agent-skills · 80/100 · ship

The /spec and /plan commands are genuinely useful for non-engineers who need to communicate feature requirements to an AI agent. Clear structured specs reduce the back-and-forth of vague prompts — this could be the bridge between product thinking and implementation.

Auto-Arch Tournament · 80/100 · ship

The blog post that comes with this repo is one of the best pieces of technical writing I've seen in months. The transparency about failure rates and the verifier insight make it genuinely educational.
