Alternatives

632 v0 2.0 Alternatives Our Panel Actually Ships

Looking for v0 2.0 alternatives? Our panel reviewed 632options. Here's what ships.

1
V
v0 3.0 by Vercel
Ship100% Ship

Generate full-stack apps with auth, APIs, and DB schemas from prompts

The primitive here is a full-stack code generator that emits Next.js app router structure — API routes, auth boilerplate, Drizzle/Prisma schema, the works — from a natural language spec. The DX bet is that complexity lives in the generation layer, not in config, which is the right call: you get readable, editable code you can eject from at any point. The moment of truth is whether the generated schema is actually coherent under foreign key constraints and not just a bag of CREATE TABLE statements, and from what I've seen the output holds up better than I expected. The gap with the weekend alternative is real: scaffolding auth + API routes + a relational schema by hand still takes 4-6 hours even for experienced devs; this collapses that to 20 minutes of editing. Ships on the specific decision to emit ownership-friendly, ejectable code rather than locking you into a visual runtime.The Builder
2
R
Replit AI Agent 2.0
Ship100% Ship

Prompt to deployed full-stack app, no scaffolding required

The primitive here is a prompt-to-deployed-CRUD-app pipeline with GitHub sync as the escape hatch — and that escape hatch is the whole reason I'm not skipping this. The DX bet Replit made is 'hide infrastructure complexity at the cost of opinionated runtime choices,' which is the right trade for the target user. The moment of truth is 'can I get something running that I'd share with a client in under 10 minutes' — and based on the publicly documented flow, it passes that test for simple apps. The weekend-alternative comparison breaks down because the actual deployment pipeline, preview environment, and debugging co-pilot loop are genuinely non-trivial to replicate; this isn't wrapping three API calls, it's wrapping an entire infra layer. What earns the ship: GitHub sync means you're not fully captive, which is the specific technical decision that separates this from locked-in demo tools.The Builder
3
L

Run Llama 4 Scout on your GPU — INT4/INT8, no cloud required

The primitive here is clean: INT4/INT8 weight quantization on a frontier-class MoE model that actually fits on consumer hardware. The DX bet Meta made is to route you through the official llama repo rather than some SaaS onboarding funnel, which means you're dealing with HuggingFace-compatible checkpoints and llama.cpp integration — things practitioners already have wired up. The moment of truth is loading the INT4 variant on a 16GB VRAM card and getting a coherent response in under 30 seconds; if that works cleanly without manual quantization config, this earns its ship. My specific reservation: if the README is marketing copy with a single `pip install` block at the bottom and no guidance on KV cache tuning or context window tradeoffs at INT4, that's a miss — but the open weights policy means you're not locked in, and that alone separates this from 90% of 'edge AI' announcements.The Builder
4
M

Open-weight 8B model with native function calling and JSON mode

The primitive here is an open-weight instruction-tuned model with first-class function calling and JSON mode baked into the model weights — not bolted on via prompt engineering or a wrapper library. The DX bet is: give developers structured output guarantees at 8B scale so they can build reliable agentic pipelines without the latency and cost of larger models. The moment of truth is calling the function-calling API locally with Ollama or vLLM and seeing whether the JSON schema adherence actually holds under adversarial inputs — and reports from the community suggest it mostly does. This is not something you replicate with a weekend script; consistent structured output at this parameter count is a real engineering achievement. The specific decision that earns the ship: Apache 2.0 license means you can actually deploy this in production without a legal conversation.The Builder
5
M
Mistral Large 3
Ship100% Ship

128K context, 30-language code gen, frontier performance at lower cost

The primitive is clear: a dense transformer with a 128K context window and fine-tuned multilingual code generation, accessible via a REST API with OpenAI-compatible endpoints — no novel abstraction, no forced SDK, just a capable model you can swap in. The DX bet is correct: OpenAI-compatible API surface means the migration cost from an existing GPT-4 integration is essentially a base URL swap and a model string change. The moment of truth is hitting the 128K window with a real codebase — if the retrieval quality holds across that context, this earns its place. My one gripe: 'significantly improved multilingual code generation' is marketing until there's a public benchmark with methodology attached; I'm shipping on the API design and positioning, not the benchmark claim.The Builder
6
S
SmolLM3
Ship100% Ship

3B parameter on-device model that punches above its weight class

The primitive is clean: a quantization-friendly 3B transformer with ONNX and GGUF exports baked in at launch, not as an afterthought. The DX bet here is 'zero ceremony before inference' — you pull the model, you run it, and the two most common runtimes are already handled. Apache 2.0 is the right call; anything else would have killed adoption in enterprise edge deployments before it started. The specific technical decision that earns the ship is shipping GGUF and ONNX simultaneously on day one — that's the team actually thinking about the deployment surface instead of just the training run.The Builder
7
G
GPT-5 Mini
Ship100% Ship

GPT-5 intelligence at a fraction of the cost for production-scale apps

The primitive here is dead simple: same OpenAI API contract, cheaper inference, marginally reduced capability ceiling — just swap the model string and watch your bill drop. The DX bet is that zero migration cost is the whole product, and that's exactly the right call. No new SDKs, no new auth flow, no new mental model to adopt. The moment of truth is a one-line change from 'gpt-5' to 'gpt-5-mini' in your existing code, and it just works — that's a genuine engineering win. The specific decision that earns the ship is OpenAI's commitment to API surface compatibility; they've made 'downgrade to save money' a 60-second decision instead of a project.The Builder
8
H

Deploy any open model to AWS, Azure, or GCP in one click

The primitive here is clean: HF Hub becomes a deployment surface, not just a model registry. The DX bet is that 'click deploy from model card' beats 'write a SageMaker notebook, configure an IAM role, and pray.' That bet is correct—the moment of truth is the first 10 minutes where a developer usually drowns in cloud provider IAM, container registries, and endpoint config. This skips all of that. The weekend alternative—a Lambda that hits a SageMaker endpoint you provisioned manually—takes 4-6 hours minimum. The specific decision that earns the ship: serverless endpoints with per-request billing through your existing cloud account mean you're not adding a new vendor, you're just adding a deployment shortcut.The Builder
9
C

Async multi-file code tasks that run while you keep shipping

The primitive here is a persistent, async execution context for multi-file edits — not just a chat thread, but a task queue with a real working directory. The DX bet is that developers want fire-and-forget delegation for large refactors the same way they'd push a CI job, and that's exactly the right call. The moment of truth is whether the agent actually resolves import chains and test failures without coming back to ask three clarifying questions, and if Cursor's existing context model holds up, this isn't replicable with a weekend script — the tight editor integration for diffing and accepting changes is the actual moat here.The Builder
10
C
Cursor 2.0
Ship100% Ship

AI code editor with background agents that refactor while you ship

The primitive here is a persistent, headless coding agent that operates on your repo as a subprocess while your main editor session stays hot — that's meaningfully different from tab-completion or inline chat, and it's the right DX bet. Background tasks offload the complexity to a task queue you can inspect, which means you're not blocked waiting for a 40-file refactor to finish. The diff review interface is where this earns it: if the agent's output is a black box you approve or reject wholesale, you're just rubber-stamping; but if the diff surface lets you selectively accept hunks with the same granularity as a git patch, Cursor has done the hard design work that most agent tools skip entirely.The Builder
11
C
Cohere Command R3
Ship100% Ship

128K context RAG model with self-serve enterprise fine-tuning

The primitive here is clean: a hosted RAG-optimized language model with a first-class fine-tuning API you can actually call without a sales call. The DX bet is that self-serve fine-tuning lowers the activation energy for enterprise customization — and that's the right bet. The 128K window is table stakes at this point, but the multilingual grounding improvements are where Cohere has actually done real work rather than just scaling context. The moment of truth is whether the fine-tuning API docs are good enough to onboard without hand-holding — if it's one endpoint with a clear schema and a sensible job-polling pattern, this earns the ship. The specific decision that works here is putting fine-tuning behind an API instead of a wizard, which means it composes into deployment pipelines.The Builder
12
C
Command R Ultra
Ship100% Ship

Enterprise RAG model with 128K context and hallucination grounding

The primitive here is a grounded completion model with a 128K context window optimized specifically for RAG — not a general-purpose model pretending to do RAG. The DX bet is correct: Cohere puts the complexity in the grounding layer rather than forcing developers to engineer their own citation chains or hallucination guards, which is exactly where it belongs. The moment of truth is whether chunking strategy and connector setup work cleanly on first call, and Cohere's API docs have historically been among the cleaner ones in this space — no six-env-var preamble. What earns the ship is the specific technical decision to build grounding as a first-class output feature rather than post-hoc prompting, which means you're not babysitting the prompt template to get citations.The Builder
13
A

Auto-route prompts to the right model, cut API costs 40–60%

The primitive is a complexity classifier that sits in front of your model pool and makes the cheap-vs-expensive call so you don't have to — genuinely useful infra that I've hacked together manually more than once. The DX bet is endpoint-compatibility: one URL swap, existing SDK calls, no schema changes, which is exactly right. The moment of truth is registering your model pool and watching the first routing decision happen transparently; if the observability surface shows which model each request hit and why, this earns its keep immediately. The specific decision that earns the ship: making this a passthrough layer with no new SDK dependency rather than another SDK you have to adopt.The Builder
14
M
Mistral 3B Edge
Ship100% Ship

Sub-4GB open-weight LLM that runs entirely on your device

The primitive here is clean: a quantized 3B-parameter transformer that fits in under 4GB of RAM and runs inference locally without a network call. The DX bet is smart — instead of building yet another runtime, Mistral ships weights and lets Ollama, LM Studio, and Core ML handle the execution layer. That's the right call. First 10 minutes look like `ollama run mistral3b-edge` and you're inferring — no environment variables, no API keys, no billing page. The Apache 2.0 license means you can actually ship this in a product without a lawyer involved. The specific decision that earns the ship: Mistral let the deployment tooling ecosystem do its job instead of vertically integrating into another half-baked runtime.The Builder
15
M
Mistral 4B Edge
Ship100% Ship

Apache 2.0 on-device LLM that actually fits in your pocket

The primitive here is clean: a quantization-friendly transformer checkpoint you can drop into a mobile inference runtime — llama.cpp, MLX, or ExecuTorch — without a licensing negotiation. The DX bet Mistral made is the right one: Apache 2.0 with no use-case restrictions means the integration complexity lives in your stack, not in a contract. The moment of truth is `ollama run mistral-4b-edge` or loading via Core ML, and that works today. This isn't replicable with three API calls and a Lambda — local inference at 4B parameter quality without a cloud bill is a genuinely different architecture decision, and Mistral executed it.The Builder
16
P

Frontier reasoning meets live web grounding in one API call

The primitive here is clean: LLM inference with search grounding baked in at the API layer, so you're not duct-taping a search API to your context window yourself. The DX bet is that developers would rather pay per-token for a pre-grounded model than orchestrate Bing/Google Search APIs plus chunking logic plus citation parsing — that bet is correct for 80% of use cases. At $3/M input tokens with 200K context, this is actually priced for production use, not just demos. The skip scenario is when you need deterministic source control, because you're trusting Perplexity's crawl decisions, not your own.The Builder
17
V
Vercel AI SDK 5.0
Ship100% Ship

Native MCP, unified providers, and reliable streaming for AI apps

The primitive here is clean: a unified transport layer plus typed streaming hooks that sit between your app and any model provider. The DX bet is that complexity lives in the abstraction, not in your code — and for 5.0 that bet mostly pays off. Native MCP support as a first-class primitive is the specific decision that earns the ship: instead of bolting tool-calling onto a bespoke protocol per provider, you get a standardized interface that composes. The moment of truth is `useChat` with a streaming response — it just works, error states included, which is not something I can say about the DIY fetch-plus-EventSource path most teams reinvent badly. The weekend-alternative case gets harder with every release here; the streaming reliability fixes alone would take a competent engineer a week to get right across reconnects and backpressure.The Builder
18
G

From GitHub issue to merged PR — autonomously, no checkout required

The primitive here is straightforward: a browser-based agent loop that takes an issue as input, generates a plan, writes diffs across the repo, runs CI, and opens a PR — no local environment required. The DX bet is that GitHub owns enough context (issues, PRs, CI results, repo history) to make the planning step actually useful, and that bet is largely correct for well-structured repos with good issue hygiene. The moment of truth is filing an issue and watching it generate a coherent implementation plan before touching code — when it works, it's genuinely faster than spinning up a branch. The specific decision that earns the ship: hooking into existing CI pipelines rather than running in a sandboxed toy environment means the output is tested against real constraints, which is the difference between a demo and a tool.The Builder
19
C
Codex CLI 2.0
Ship100% Ship

OpenAI's terminal-native autonomous coding agent with multi-file editing

The primitive here is a model-backed shell agent that can read, write, and execute across a working directory — not just a code completer, an actual task runner. The DX bet is terminal-first, which is the right call: no Electron wrapper, no browser tab, no drag-and-drop nonsense. GitHub Actions integration out of the box means the moment-of-truth test (can I run this in CI without duct tape?) actually passes. The weekend-alternative argument collapses here because the multi-file context management and test-execution loop would take a competent engineer a week to replicate robustly. What earns the ship: it's open-source, so you can actually read what it's doing instead of trusting a marketing claim.The Builder
20
L
Llama 4 Scout
Ship100% Ship

Open-weight 17B model with 10M token context for long-doc AI

The primitive here is a locally-runnable transformer with a 10M token context window — not a platform, not a wrapper, just weights you can pull and run. The DX bet is that you bring your own serving infrastructure, which is absolutely the right call for a model release; Meta's job is to ship weights and docs, not babysit your deployment stack. The moment of truth is running `huggingface-cli download` and actually getting the model loaded, and the Llama ecosystem tooling (llama.cpp, vLLM, Transformers) is mature enough that the weekend alternative — writing your own long-context RAG pipeline around a smaller model — is genuinely worse now. A 10M context window changes what RAG even means: you can drop entire codebases or document corpora into context rather than chunking. That earned the ship.The Builder
21
S
SmolVLM 2.5
Ship100% Ship

2B-param vision-language model that punches way above its weight

The primitive here is clean: a quantized vision-language model small enough to run inference locally, with ONNX and llama.cpp exports included at launch — not as an afterthought. That's the right DX bet. The moment of truth is 'can I run document understanding on a MacBook without a round-trip to an API?' and the answer is actually yes. The specific technical decision that earns the ship is shipping the quantized exports alongside the weights instead of making developers figure out quantization themselves — that's the difference between a research artifact and a tool people actually use.The Builder
22
M

Open-weight sparse MoE model: 141B total, 39B active per pass

The primitive is clean: a 141B sparse MoE transformer where you only pay compute for 39B parameters per forward pass, released under Apache 2.0 with weights you can actually download and run. The DX bet is correct — Mistral put the complexity in the architecture and kept the interface boring, meaning it drops into any vLLM or Ollama setup without ceremony. The moment of truth is spinning it up locally or via the API, and it survives that test because the HuggingFace integration is standard and the weights are real. The 'weekend alternative' here is just GPT-4 via API with no self-hosting option — this is categorically different because you own the weights. Specific ship decision: Apache 2.0 plus a genuinely efficient MoE architecture is not a wrapper, it's infrastructure.The Builder
23
C
Claude 4 Sonnet
Ship100% Ship

Anthropic's sharpest coding model yet, with better benchmarks and desktop automation

The primitive here is a frontier language model with documented SWE-bench and HumanEval regressions tracked release-over-release — that's actual engineering accountability, not marketing. The DX bet is right: API-first, no new SDK required, drop-in replacement for Sonnet 3.7 in existing integrations. The computer-use improvements are the part I'd actually reach for — reliable desktop automation has been the missing piece for agentic workflows that touch legacy software. Benchmark methodology is Anthropic's own, so I'd weight it 70% until independent evals catch up, but the direction is credible.The Builder
24
S
SmolAgents 2.0
Ship100% Ship

Lightweight Python agents with native MCP protocol support and visual debugging

The primitive is clean: a code-first agent runner that treats MCP servers as first-class tool providers, so you don't manually wire every integration. The DX bet is that keeping the library small and deferring tool discovery to the MCP layer is the right call — and it is, because it means your agent doesn't become a monolith every time someone adds a new capability. The moment of truth is `from smolagents import CodeAgent` plus an MCP server URL — if that works in under five minutes with a real tool, this earns its place. The visual debugger on the Hub is the specific decision that pushes this to a ship: runtime graph tracing in a framework that explicitly values staying small is exactly the kind of thoughtful addition that proves the team understands developer pain, not just developer marketing.The Builder
25
M
Mistral 3.1
Ship100% Ship

Open-weight model with native tool calling and 256K context window

The primitive here is clean: an open-weight transformer with first-class tool calling baked into the model weights, not bolted on via prompt engineering or a wrapper layer. That distinction matters — native tool calling means the model was trained to emit structured function calls reliably, not instructed to mimic JSON output and hope for the best. The DX bet is Apache 2.0 plus HuggingFace distribution, which means you can pull the weights, run inference locally or on your own cloud, and never touch a vendor API if you don't want to. The 256K context is the headline number, but the tool calling implementation is the real unlock for agentic pipelines. My only gripe: the announcement page reads more like a press release than a technical spec — I want ablation studies on tool call accuracy and context retrieval benchmarks, not marketing copy.The Builder
26
M
Mistral Large 3
Ship100% Ship

Frontier model with native code execution and 128K context

The primitive here is a hosted LLM with a sandboxed execution runtime baked in — no orchestrating a separate code-sandbox container, no managing Jupyter kernels, no stitching together tool-call plumbing just to run a numpy operation. That is the right DX bet: collapse the model-plus-execution layer into one API surface so developers stop paying the integration tax. The 128K context means you can pass large codebases or data files without chunking gymnastics. The moment of truth is the first tool-call response that returns real stdout — if that works cleanly in the first 10 minutes, the rest of the story writes itself. I'd want to see the execution sandbox spec'd out publicly before trusting it in production, but this is a real capability, not a demo.The Builder
27
S
SmolVLM2 Turbo
Ship100% Ship

Sub-2B vision-language model that actually runs on your phone

The primitive here is clean: a quantized, exportable VLM checkpoint that fits in under 2GB and ships with ONNX and MLX export paths out of the box. The DX bet is that developers want a model they can `pip install` and run locally in under 10 minutes, not a cloud endpoint they have to rate-limit around — and that bet is correct. The moment of truth is `pipeline('image-to-text')` in transformers, and it survives it. This is not a wrapper around someone else's API; it's a trained artifact with documented architecture tradeoffs, and that earns the ship.The Builder
28
P

Embed multi-step web research and synthesis into any app via API

The primitive is clean: POST a research query, get back a synthesized answer with citations, skip the five-layer RAG pipeline you'd otherwise have to build and maintain. The DX bet is that developers don't want to manage search provider keys, chunking strategies, and deduplication — they want a research result. That's the right bet. The 100-query free tier lets you actually evaluate this before committing, which earns immediate trust. My only gripe: the output format needs to be predictable enough to parse reliably in production, and until I see the schema docs in detail I'm reserving judgment on whether this is genuinely composable or a black box dressed up as an API.The Builder
29
M

Open-weight 22B model for edge and consumer hardware inference

The primitive is clean: a quantizable 22B transformer you can run locally with llama.cpp, Ollama, or vLLM without begging an API for permission. The DX bet Mistral made here is 'zero configuration if you already have a standard inference stack' — and that bet lands, because the model slots into every major local runner without special tooling. Apache 2.0 is the real technical decision that earns the ship: no commercial use restrictions means this actually gets embedded in products, not just benchmarked and forgotten. The moment of truth is `ollama pull mistral3small` and getting a responsive chat in under five minutes on a 24GB GPU — that survives the test.The Builder
30
L

Run Llama 4 on your phone or laptop — no cloud required

The primitive here is straightforward: INT4/INT8 quantized Llama 4 weights with deployment guides targeting llama.cpp, ExecuTorch, and MLX — the DX bet is 'we give you the weights and the deployment path, you own the runtime,' which is the right call. The moment of truth is cloning the repo, running the quantized Scout on an M-series Mac, and seeing if the latency is actually usable — the deployment guide covers that path without making you wrangle six environment variables first. This is not a weekend replication project; quantizing a 17B MoE model to run coherently on-device is legitimately hard, and Meta shipping inference guides that target real runtimes instead of a proprietary SDK is the specific decision that earns the ship.The Builder
31
O

Strong reasoning, lower cost — o3-mini-high lands in the API

The primitive is a reasoning-tuned inference endpoint with structured output support baked in from day one — not bolted on after complaints. Function calling at launch matters because it means you can actually drop this into an agentic pipeline today without workarounds. The DX bet here is that reduced pricing removes the 'this is too expensive to experiment with' friction that killed o3 adoption in prototyping cycles, and that bet is correct. The specific technical win: structured outputs plus elevated reasoning at this price tier makes eval pipelines and chain-of-thought agents practical where they weren't before.The Builder
32
H

One-click model deployment across cloud backends, unified billing

The primitive here is clean: a unified auth and billing proxy sitting between the Hub's model catalog and a set of inference backends. The DX bet is that developers don't want to juggle five accounts and five API key rotation schemes when they're prototyping across models — and that bet is correct. The moment of truth is swapping from one backend to another without touching your headers or your billing setup, and if that actually works end-to-end with a single HF token, that's a genuine week of setup time saved. The weekend alternative — managing separate Together/Fireworks/Cerebras accounts with a routing script — is exactly the pain this removes, and unlike most 'we unified the APIs' pitches, HF actually has the distribution to make providers care about being in this catalog.The Builder
33
S

Open-source real-time video & 3D segmentation from Meta AI

The primitive is clean: promptable segmentation over images, video frames, and sparse 3D point clouds via a unified inference interface — no fine-tuning required. The DX bet Meta made is that developers want a composable foundation model they can drop into a pipeline, not a SaaS endpoint they have to negotiate with, and that bet is exactly right. Where SAM 1 required post-processing hacks to propagate masks across frames, SAM 3 handles temporal consistency natively, which eliminates a whole category of brittle glue code I've personally written. The specific technical decision that earns the ship: open weights with a documented Python API that doesn't require you to memorize a config file before you can run inference on a single image.The Builder
34
G
GPT-5 Mini API
Ship100% Ship

60% cheaper, sub-200ms — GPT-5's speed twin for high-throughput apps

The primitive is clean: same API contract as GPT-5, lower cost, lower latency, no migration overhead. The DX bet here is zero-friction adoption — you swap the model string, you get sub-200ms at 60% cost, done. That's the right call. The moment of truth is a latency-sensitive loop where GPT-5 was blocking UX — this solves that without a new SDK, new auth, new anything. The specific decision that earns the ship is that OpenAI didn't add config surface to justify the new model tier; they just made the right defaults cheaper.The Builder
35
C
Cursor 1.0
Ship100% Ship

AI code editor with full codebase agent mode and native Git

The primitive here is a diff-aware, repo-scoped agent that can read context, plan edits across files, run tests, and commit — not just autocomplete with extra steps. The DX bet is embedding the agent into the editor loop rather than making it a sidebar chat, and that's the right call: the moment of truth is when you ask it to refactor a module and it actually touches the right files without you babysitting the context window. The specific decision that earns the ship is native Git integration — agents that can't branch and commit are toys; ones that can are infrastructure.The Builder
36
M
Mistral 3B
Ship100% Ship

A 3B model that punches above 7B weight — open, fast, on-device

The primitive is clean: a quantization-friendly transformer checkpoint that fits in phone RAM and runs fast without a GPU babysitter. The DX bet Mistral made is correct — Apache 2.0 means no legal gymnastics, weights on Hugging Face means you pull it with three lines of transformers code, and the model card actually documents the eval methodology rather than burying it. The moment of truth for any on-device model is 'does it fit in 4GB with room for a KV cache and still produce coherent output,' and 3B at reasonable quant levels clears that bar. The specific decision that earns the ship: releasing under Apache 2.0 instead of a bespoke license is a concrete commitment to composability, and that's rare enough to call out.The Builder
37
V
Vercel AI SDK 5.0
Ship100% Ship

Swap LLM providers in one line, stream everything, observe it all

The primitive here is a provider-agnostic interface that normalizes streaming, tool calls, and observability across LLM APIs — and that is genuinely hard to do well because every provider invents their own streaming protocol. The DX bet is that the complexity gets absorbed at the SDK layer so your application code never sees a provider-specific data shape, which is exactly the right place to put it. The moment of truth is swapping from `openai` to `anthropic` in your provider config and watching your existing stream handlers not break — if that actually works without caveats, this earns its keep. The weekend-alternative comparison is the relevant one here: yes, you could wrap each provider yourself, but normalizing streaming deltas, partial tool call objects, and finish reasons across four providers is a month of yak-shaving, not a weekend script. The built-in observability hooks are the specific decision that pushes this to a ship — most SDKs bolt that on later or don't bother.The Builder
38
C
Codex CLI 2.0
Ship100% Ship

OpenAI's agentic coding agent lives in your terminal now

The primitive here is clean: a sandboxed agentic loop that reads your repo, writes diffs, and executes shell commands — all from stdin/stdout, composable with any Unix pipeline. The DX bet is that the terminal is the right abstraction layer, not a new IDE pane, and that's the correct call. The GitHub Actions integration is the moment of truth — if `npx codex run 'fix all failing tests'` in CI actually works without hallucinating imports or breaking unrelated files, this earns its keep. The specific technical decision that earns the ship: open source with a real repo, real npm package, real docs, and no 6-env-var bootstrap ceremony. Finally, a tool that ships as a tool.The Builder
39
V
v0 Agent
Ship100% Ship

Prompt to deployed full-stack Next.js app, no handholding required

The primitive here is straightforward: LLM-driven code generation wired directly into a CI/CD pipeline, so the deploy step isn't a separate act of will. The DX bet is that collapsing scaffold-debug-deploy into one agent loop removes the biggest friction point for solo builders — and that bet is largely correct. The moment of truth is asking it to wire up a Postgres-backed form with auth, and v0 Agent handles the Vercel KV and NextAuth integration without you spelunking through docs. The honest caveat: this is deeply opinionated toward the Vercel/Next.js stack, so the 'weekend alternative' comparison only holds if you were already deploying to Vercel anyway — if you're on Railway or Fly, you're not the user. Ships because the deploy integration is the actual differentiator, not the codegen.The Builder
40
H

Redesigned pipeline API with native async inference and MoE support

The primitive here is clean: a unified async-capable inference pipeline over any transformer model, with tokenizer backends finally collapsed into one interface instead of the slow/fast schism that's caused silent correctness bugs for years. The DX bet is that async-first design at the pipeline level is the right place to absorb concurrency complexity — and it is, because the alternative is every downstream user writing their own threadpool wrappers. Dropping Python 3.8 is the right call that got delayed two years too long; the moment of truth is whether your existing pipeline code migrates without breakage, and the unified tokenizer interface is the change most likely to bite you in ways that aren't obvious at import time. The MoE quantization support out of the box is the specific technical decision that earns the ship — that was genuinely painful to wire up manually and the library absorbing it is exactly what infrastructure should do.The Builder
41
C
Claude 4 Opus
Ship100% Ship

1M token context + autonomous agents from Anthropic's flagship model

The primitive here is a transformer inference endpoint with a 1M token context window and a structured agentic execution loop — two genuinely hard engineering problems that Anthropic has shipped, not just announced. The DX bet is that developers want a capable model with long context accessible through a clean API rather than a managed agent platform they have to adopt wholesale, and that's the right bet. The moment of truth is stuffing a large codebase into context and asking non-trivial questions — if that works reliably without hallucinated file references, this earns the price. The weekend-alternative test fails here: you cannot replicate 1M reliable context with chunking hacks and a vector store without sacrificing coherence. Earned the ship because the context window is a real primitive, not a marketing number.The Builder
42
M
Mistral Medium 3
Ship100% Ship

Production-ready LLM API with function calling, JSON mode, 128K context

The primitive here is clean: a mid-tier inference API with function calling, JSON mode, and a 128K context at a price point that doesn't require a procurement meeting. The DX bet is that developers want a capable model they can call without babysitting output parsing — structured JSON mode and typed function calling are the right answer to that problem. The moment of truth is your first tool-use call: if the schema adherence holds under realistic conditions (nested objects, optional fields, ambiguous inputs), this earns its keep. The weekend alternative — prompt-engineering GPT-4o-mini to return JSON and hoping for the best — is exactly what this replaces, and that's a real problem worth solving. Ships because the capability set maps directly to production agentic workloads and the cost delta against frontier models is a genuine engineering decision, not a marketing claim.The Builder
43
M

Open-source 8B model that claims to beat GPT-4o Mini. Apache 2.0.

The primitive here is clean: a permissively licensed, instruction-tuned 8B model you can pull from Hugging Face and run anywhere without asking anyone's permission. The DX bet is Apache 2.0 — no custom license, no non-commercial carve-outs, no 'you must not compete with us' clauses buried in the fine print. That single decision makes this composable in a way that Llama's license and most other open-weight models are not. The moment of truth is `huggingface-cli download mistral-8b-instruct-v3` and it survives it. Can a weekend engineer replicate this? No — fine-tuning a competitive 8B instruct model from scratch is months of work and six-figure GPU bills. The specific decision that earns the ship: Apache 2.0 with competitive benchmark numbers means this is now the default base for any production open-source LLM project that can't afford to care about proprietary licenses.The Builder
44
B
Beads (bd)
Ship100% Ship

Git-backed task graph that gives your coding agent persistent memory

The primitive here is clean: a dependency-aware DAG of tasks, stored as versioned JSONL inside your repo, with hash-based IDs that make merge collisions structurally impossible rather than a discipline problem. The DX bet — put the complexity in the data model, not the CLI — is exactly the right call, and `bd claim` for atomic task assignment is the kind of thing you only design if you've actually run two agents into each other and watched them both pull the same file. The weekend alternative here is a markdown TODO in a git repo, and it collapses the moment you have two agents or a branch switch; Beads earns its existence specifically because the naive solution fails in a documented and predictable way.The Builder
45
O
OpenSpace
Ship100% Ship

The agent framework that gets smarter with every task it runs

The primitive here is clean and nameable: a persistent skill store that sits between your host agent and the LLM, intercepting successful execution traces and codifying them into reusable, versioned callables — all wired together via MCP so it composes with whatever you're already running. The DX bet is right: complexity is pushed into the skill lineage layer and the local dashboard, not into your integration code. The weekend alternative would be a SQLite database of successful prompt chains with a retrieval wrapper, and that's roughly what this is — but the auto-repair loop and community cloud distribution are the parts you'd actually spend two weekends building badly. The specific technical decision that earns the ship: MCP as the integration layer rather than a bespoke SDK means you're not adopting a platform, you're adding a primitive.The Builder
46
O
OpenCode
Ship100% Ship

Privacy-first terminal coding agent — 75+ models, zero data retention

The primitive is clean: a local client/server AI coding agent where the server handles tool execution and model I/O against SQLite, and the frontend is swappable — TUI today, IDE extension tomorrow. The DX bet is that developers would rather manage their own API keys than pay a subscription tax, and that bet is correct for anyone who has ever watched Claude Code quietly bill $40 in an afternoon. The moment of truth is `opencode` in a terminal, Tab to switch between Build and Plan agents, and LSP-backed edits that actually know your project structure — it survives that test, and the Go binary means it starts fast and stays fast. The Build/Plan split is the specific technical decision that earned the ship: it's the right primitive for separating 'I want to understand this codebase' from 'I want to change it,' and it would have taken real thought to get that separation right without making it clunky.The Builder
47
E
Edgee
Ship100% Ship

One AI gateway, 200+ models, 50% cost cut via edge compression

The primitive is exactly what it says: a transparent reverse proxy with semantic compression on tool-result JSON before forwarding to the LLM — and that's a specific, real problem for anyone running agentic workloads where tool calls turn 500-token prompts into 15,000-token context windows in three hops. The DX bet is 'zero code changes' via base URL swap, which is the correct call — forcing SDK wrapping would have killed adoption on day one. The moment of truth is whether the semantic compression is actually lossless at the task level, not just token-level, and I'd want a reproducible eval suite before trusting it on production coding agents — but the architecture earns trust that the wrapper-brigade does not.The Builder
48
M

Microsoft's official graph-based multi-agent framework, MIT licensed

The primitive here is a graph-based agent orchestration runtime with checkpointing and streaming baked in — and unlike LangGraph or AutoGen, the OpenTelemetry integration isn't a third-party plugin bolted on after the fact, it's a first-class citizen, which means you get distributed traces without writing your own instrumentation. The DX bet is to put complexity at the graph definition layer and keep the runtime predictable, which is the right call for anything you'd actually run in production. The weekend-alternative ceiling is real — you can't replicate persistent checkpointing, human-in-the-loop resumption, and production observability with three Lambda functions — and that's exactly the bar this clears.The Builder
49
E
Extractor
Ship100% Ship

Robust LLM-powered web content extraction

Traditional web scraping is brittle. LLM-powered extraction that understands content structure is the right approach. Works on messy pages where CSS selectors fail.The Builder
50
L
LM Studio
Ship100% Ship

Desktop app for running local LLMs with a ChatGPT-like UI

The UI is gorgeous — it feels like a native Mac app. Browse models, download, chat. No terminal needed. If Ollama is for developers, LM Studio is for everyone else.The Creator
51
O
Ollama
Ship100% Ship

Run LLMs locally on your machine — no cloud needed

The Docker of LLMs. Pull a model, run it, use the API. Privacy, no cloud costs, works offline. Essential tool for any developer experimenting with local AI.The Builder
52
C
Cursor
Ship100% Ship

The AI code editor with autonomous agents that work while you code

Agent mode is the real leap. I describe a feature, Cursor researches the codebase, writes tests, implements, and debugs — I review while it works. Background agents mean I always have something to review rather than waiting on AI. Cursor Tab's sub-100ms completions are still the best autocomplete available.The Builder
53
T
Turbolite
Ship100% Ship

Sub-250ms cold JOIN queries from SQLite on S3

Sub-250ms JOINs from cold S3 reads is genuinely impressive. This solves the biggest pain point of SQLite in serverless — you no longer need to ship the whole DB file. The VFS approach is the right abstraction level. I would use this for analytics dashboards today.The Builder
54
E
Extractor
Ship100% Ship

Robust LLM-powered web data extraction in TypeScript

Schema-driven extraction with LLM fallback is exactly right. Traditional scrapers break on every site redesign — Extractor adapts because it understands the content semantically. The TypeScript-first approach with strong typing on outputs is chef's kiss for building data pipelines.The Builder
55
C
Claude Code
Ship100% Ship

Anthropic's agentic coding tool that lives in your terminal

This is my daily driver. The codebase awareness is unreal — it understands project structure, conventions, and dependencies without being told. Multi-file refactors just work.The Builder
56
V
v0
Ship100% Ship

AI-powered UI generation from prompts — by Vercel

The code quality is surprisingly good — real shadcn components, not generic divs with inline styles. Saves me 2-3 hours per UI component.The Builder
57
C
Cline
Ship100% Ship

Autonomous AI coding agent for VS Code

The approval flow is brilliant — you see every action before it executes. More transparent than Cursor's agent mode. Great for complex multi-file refactors.The Builder
58
A
Aider
Ship100% Ship

Open-source AI pair programmer for your terminal

The best open-source alternative to Claude Code. Model-agnostic, configurable, and the git integration is solid. Perfect if you want control over your tools.The Builder
59
J
Jan
Ship100% Ship

Open-source ChatGPT alternative that runs locally

The team ships fast and responds to feedback. Good sign.The Creator
60
O
Open WebUI
Ship100% Ship

Self-hosted ChatGPT-style UI for any LLM

This is the kind of tool that makes you wonder how you worked without it.The Skeptic
61
T
Tailwind CSS
Ship100% Ship

Utility-first CSS framework — build UIs without leaving your HTML

V4 is the fastest CSS framework to build with. No context switching between files, instant builds, and the design system constraints prevent spaghetti CSS. Industry standard for a reason.The Builder
62
C
Continue
Ship100% Ship

Open-source AI code assistant for VS Code and JetBrains

This is the kind of tool that makes you wonder how you worked without it.The Futurist
63
C
Cody by Sourcegraph
Ship100% Ship

AI coding assistant with full codebase context

This fills a real gap in the ecosystem. Worth adopting early.The Creator
64
A
Amazon Q Developer
Ship100% Ship

AI coding assistant built for AWS and enterprise

Fast, reliable, and the docs are actually good. Ship.The Builder
65
C
Claude Agent SDK
Ship100% Ship

Build production AI agents with Claude

First-party SDK with excellent TypeScript support. Tool use and streaming work flawlessly. The agent loop is well-designed.The Builder
66
B
bolt.new
Ship100% Ship

Full-stack web development in the browser

AI-generated full-stack apps running instantly in the browser. The StackBlitz WebContainer foundation makes it actually work.The Builder
67
T
Trigger.dev v3
Ship100% Ship

Background jobs with long-running support

Long-running jobs up to 24 hours solve the AI agent execution problem. The v3 architecture is built for modern workloads.The Builder
68
O
Oxlint
Ship100% Ship

Blazing fast JavaScript linter

50x faster than ESLint with zero config. Catches the most impactful lint rules without the plugin complexity.The Builder
69
G
Gemini API
Ship100% Ship

Google's multimodal AI model API

The free tier is incredibly generous. Multimodal capabilities and grounding with Google Search are unique advantages.The Builder
70
M
Marimo
Ship100% Ship

Next-generation Python notebook

Reactive execution eliminates the biggest Jupyter pain point — hidden state. Cells re-run when dependencies change.The Builder
71
I
Instructor
Ship100% Ship

Structured outputs from LLMs

The simplest way to get typed, validated outputs from LLMs. Pydantic integration is natural for Python developers.The Builder
72
B
Biome
Ship100% Ship

Fast formatter and linter for web projects

One tool replacing Prettier + ESLint with massively better performance. The migration from existing configs is smooth.The Builder
73
O
Outlines
Ship100% Ship

Structured text generation for LLMs

Guaranteed valid JSON from LLMs — no retry loops needed. The FSM approach is mathematically elegant and reliable.The Builder
74
P
PartyKit
Ship100% Ship

Real-time multiplayer infrastructure

Stateful edge servers are the right abstraction for real-time. The Cloudflare acquisition ensures long-term viability.The Builder
75
L
Langfuse
Ship100% Ship

Open-source LLM engineering platform

Best open-source LLM observability. Traces, prompt versioning, and evals in one tool. Self-hosting option is a must.The Builder
76
V
Vercel AI SDK
Ship100% Ship

TypeScript toolkit for building AI applications

useChat and useCompletion hooks make AI UIs trivial. Provider abstraction means switching models is a one-line change.The Builder
77
C
Continue
Ship100% Ship

Open-source AI code assistant

Open-source Copilot alternative that works with any model. Connect Ollama for fully local AI coding assistance.The Builder
78
A
Anthropic API
Ship100% Ship

Claude API for building AI applications

Best instruction-following of any model. Tool use and extended thinking are reliable. The API design is clean.The Builder
79
R
Rspack
Ship100% Ship

Rust-based JavaScript bundler

webpack compatibility with Rust speed. The migration path from webpack is smoother than switching to Vite or Turbopack.The Builder
80
S
shadcn/ui
Ship100% Ship

Beautifully designed components you own

The 'copy into your codebase' approach is genius. Full ownership, full customization, no version dependency hell.The Builder
81
H
Helicone
Ship100% Ship

Open-source LLM observability platform

One-line integration via proxy is genius. Change your base URL and instantly get logging, caching, and rate limiting.The Builder
82
T
TanStack Router
Ship100% Ship

Type-safe routing for React

Type-safe search params and route params are game-changing. Catch route errors at compile time, not runtime.The Builder
83
V
Val Town
Ship100% Ship

Social website to write and deploy TypeScript

The fastest way to deploy a serverless function. Write TypeScript in the browser, get an instant URL. No config, no deploy step.The Builder
84
B
Bruno
Ship100% Ship

Open-source API client stored in git

API collections in git, no account required, and offline-first. This is how API clients should work.The Builder
85
D
Drizzle ORM
Ship100% Ship

TypeScript ORM that's slim and fast

SQL-like API means no magic ORM behavior. The schema is TypeScript, the queries are type-safe, and it's fast.The Builder
86
T
Trigger.dev
Ship100% Ship

Open-source background jobs for developers

TypeScript-native background jobs with great DX. The dashboard for monitoring and debugging jobs is excellent.The Builder
87
C
Codeium
Ship100% Ship

Free AI code completion and chat

Free tier with no restrictions is remarkable. Completion quality rivals Copilot for most languages.The Builder
88
A
Astro
Ship100% Ship

The web framework for content-driven websites

Zero JS by default with islands architecture is the right approach for content sites. Performance is incredible out of the box.The Builder
89
P
PocketBase
Ship100% Ship

Open-source backend in one file

Single binary with auth, database, file storage, and real-time. Deploy your backend with one file. Incredible for small projects.The Builder
90
B
Bun
Ship100% Ship

All-in-one JavaScript runtime and toolkit

10x faster package installs, native TypeScript, and built-in test runner. It's replacing Node.js in my new projects.The Builder
91
T
Tauri
Ship100% Ship

Build small, fast desktop apps with web frontends

10x smaller bundles than Electron with native performance. Use your web frontend with a Rust backend.The Builder
92
D
Dagger
Ship100% Ship

Programmable CI/CD engine

CI pipelines in TypeScript instead of YAML. Local execution means you can debug pipelines on your machine.The Builder
93
H
Hono
Ship100% Ship

Ultrafast web framework for the edge

Runs everywhere — Workers, Deno, Bun, Node. The middleware system and RPC mode are well-designed.The Builder
94
I
Inngest
Ship100% Ship

Durable workflow engine for developers

Step functions with automatic retries and state management. The event-driven model is perfect for complex workflows.The Builder
95
C
Convex
Ship100% Ship

Reactive backend-as-a-service

Real-time reactivity without WebSocket boilerplate. Server functions co-located with schema definition is elegant.The Builder
96
V
Vitest
Ship100% Ship

Blazing fast unit test framework powered by Vite

Jest-compatible API with Vite's speed. ESM and TypeScript work without configuration. The watch mode is instant.The Builder
97
M
Mintlify
Ship100% Ship

Beautiful documentation that converts

Beautiful docs from markdown with zero design effort. API reference generation and search work great.The Builder
98
N
Nitro
Ship100% Ship

Universal server engine

Write server code once, deploy anywhere. The preset system handles platform-specific deployment automatically.The Builder
99
T
Turborepo
Ship100% Ship

High-performance build system for monorepos

Simple turbo.json config, powerful caching, and Vercel remote cache integration. The easiest monorepo build tool to adopt.The Builder
100
W
Wasp
Ship100% Ship

Full-stack web framework in a DSL

Define auth, routes, and background jobs in a simple DSL. The generated React + Node.js code is clean and customizable.The Builder
101
T
tRPC
Ship100% Ship

End-to-end type-safe APIs

Types from server to client with zero code generation. The DX is magical — change a server type, client updates instantly.The Builder
102
T
ToolJet
Ship100% Ship

Open-source low-code platform

Another solid open-source Retool alternative. The visual builder and data source connectors are comprehensive.The Builder
103
P
Payload CMS
Ship100% Ship

The most powerful TypeScript headless CMS

Code-first CMS that runs inside Next.js. Full TypeScript types, access control, and the admin UI is excellent.The Builder
104
L
Liveblocks
Ship100% Ship

Real-time collaboration infrastructure

React hooks for real-time presence, cursors, and collaborative editing. Makes adding multiplayer features trivial.The Builder
105
H
htmx
Ship100% Ship

High-power tools for HTML

Elegant simplicity. For CRUD apps and content sites, htmx eliminates the need for a JavaScript framework entirely.The Builder
106
T
Temporal
Ship100% Ship

Durable execution for distributed applications

If your distributed system needs reliability, Temporal is the answer. Durable execution eliminates an entire class of bugs.The Builder
107
O
OpenAI API
Ship100% Ship

GPT-4 and beyond — the most popular AI API

The most mature AI API with the largest ecosystem. Function calling, JSON mode, and assistants API cover every use case.The Builder
108
D
Deno
Ship100% Ship

Secure JavaScript and TypeScript runtime

Deno 2's Node.js compatibility changes everything. Secure by default, great tooling, and now practical for real projects.The Builder
109
E
Encore
Ship100% Ship

Development platform for type-safe distributed systems

Define infrastructure in code, Encore provisions it. Type-safe API definitions generate clients automatically.The Builder
110
Z
Zod
Ship100% Ship

TypeScript-first schema validation

Define schema once, get types and validation. The TypeScript inference is seamless. Essential for any TypeScript project.The Builder
111
P
Playwright
Ship100% Ship

Reliable end-to-end testing for modern web apps

Best E2E testing framework. Auto-wait, trace viewer, and codegen eliminate the biggest pain points of browser testing.The Builder
112
S
SWC
Ship100% Ship

Speedy web compiler written in Rust

20x faster than Babel with full compatibility. Used by Next.js which validates production readiness.The Builder
113
C
Clerk
Ship100% Ship

Drop-in authentication and user management

Best auth DX available. Pre-built components look great, the middleware is solid, and the dashboard is useful.The Builder
114
G
GitHub Actions
Ship100% Ship

CI/CD built into GitHub

CI/CD in the same place as your code. The marketplace has an action for everything. Matrix builds are powerful.The Builder
115
A
Appsmith
Ship100% Ship

Open-source low-code platform for internal tools

Open-source Retool alternative that you can self-host. JavaScript transformations and API bindings are flexible.The Builder
116
P
Phoenix LiveView
Ship100% Ship

Rich server-rendered UIs with Elixir

Real-time UI without writing JavaScript. The BEAM VM handles millions of concurrent connections effortlessly.The Builder
117
A
Appwrite
Ship100% Ship

Open-source backend as a service

Full BaaS that you can self-host. Functions, auth, storage, and databases with good SDKs.The Builder
118
T
TanStack Query
Ship100% Ship

Powerful async state management

Eliminates 90% of server state management boilerplate. Caching, refetching, and mutations just work.The Builder
119
P
Prisma
Ship100% Ship

Next-generation ORM for Node.js and TypeScript

Type-safe database queries with auto-generated client. Prisma Migrate and Studio round out the developer experience.The Builder
120
W
Wrangler
Ship100% Ship

CLI for Cloudflare Workers

The best local development experience for edge functions. Miniflare emulates the entire Cloudflare platform locally.The Builder
121
N
Nx
Ship100% Ship

Smart monorepo build system

Remote caching and affected-only testing save enormous CI time. The project graph visualization is invaluable for large repos.The Builder
122
S
StackBlitz
Ship100% Ship

Browser-based full-stack development

WebContainers running Node.js in the browser is technical magic. Perfect for bug reproductions, tutorials, and quick experiments.The Builder
123
R
Retool
Ship100% Ship

Build internal tools remarkably fast

Build admin panels in hours instead of weeks. SQL queries, API connections, and components just work together.The Builder
124
C
Chromatic
Ship100% Ship

Visual testing and review for Storybook

Visual regression testing catches bugs that unit tests miss. The Storybook publishing and review workflow is seamless.The Builder
125
S
Sanity
Ship100% Ship

The composable content cloud

GROQ queries and the schema definition in code are elegant. The Studio is highly customizable with React.The Builder
126
P
pnpm
Ship100% Ship

Fast, disk space efficient package manager

3x faster installs, strict dependency resolution, and disk space savings. The best JavaScript package manager.The Builder
127
S
Svelte
Ship100% Ship

Cybernetically enhanced web apps

The compiler approach produces smaller, faster output. Svelte 5 runes are elegant. SvelteKit is a joy to use.The Builder
128
N
Next.js
Ship100% Ship

The React framework for the web

Server Components, streaming, and the App Router represent the future of React. The Vercel deployment experience is unmatched.The Builder
129
R
Recharts
Ship100% Ship

Composable charting library for React

Declarative React components for charts. The API is intuitive and customization through composition is elegant.The Builder
130
S
Storybook
Ship100% Ship

Frontend workshop for building UI components in isolation

Non-negotiable for any serious component library. Visual testing, docs, and interaction testing in one place.The Builder
131
S
Strapi
Ship100% Ship

Open-source headless CMS

Open-source CMS you can self-host. The visual content-type builder and plugin system are well-designed.The Builder
132
R
React Native
Ship100% Ship

Build native mobile apps with React

New Architecture with Fabric renderer eliminates the old bridge bottleneck. Performance is now genuinely native-grade.The Builder
133
E
Expo
Ship100% Ship

Framework for building React Native apps

EAS Build, OTA updates, and the managed workflow eliminate the worst parts of mobile development. Indispensable.The Builder
134
U
Unleash
Ship100% Ship

Open-source feature flag management

Open-source feature flags that you can self-host. SDKs for every language and the evaluation is fast.The Builder
135
S
Sourcegraph
Ship100% Ship

Code search and intelligence platform

Universal code search across repos is a superpower for large orgs. Cody AI assistant with full codebase context is excellent.The Builder
136
H
HTTPie
Ship100% Ship

API testing client with a human-friendly CLI

The most readable CLI for HTTP requests. Intuitive syntax that doesn't require remembering curl flags.The Builder
137
D
Directus
Ship100% Ship

Open-source data platform and headless CMS

Point it at any SQL database and get an instant API + admin UI. The most flexible headless CMS approach.The Builder
138
S
Swagger / OpenAPI
Ship100% Ship

API documentation and design standard

The REST API description standard. Every API should have an OpenAPI spec. The tooling ecosystem is massive.The Builder
139
M

Fine-tune Llama 4 Maverick on a single consumer GPU with LoRA

The primitive here is a LoRA fine-tuning harness purpose-built for Llama 4 Maverick's architecture, and that specificity is the whole value — this isn't a generic PEFT wrapper, it's recipes that actually account for Maverick's MoE routing and attention layout. The DX bet is pre-built configs over a configuration API, which is the right call for this audience: most people fine-tuning Maverick don't want to tune learning rate schedules, they want a working baseline fast. The moment of truth is whether the 24GB VRAM claim holds on a real RTX 4090 with a non-trivial dataset, and Meta's done enough public work on LLaMA tooling that I'd trust the number until proven otherwise. This isn't something a weekend warrior replicates with three API calls — the memory optimization work around gradient checkpointing and quantized optimizer states is legitimately non-trivial. Ships because it solves a hard, specific problem and Meta has the receipts to back the claims.The Builder
140
O
OpenAI o3 Pro API
Ship75% Ship

OpenAI's most capable reasoning model now open for API access

The primitive is clean: a reasoning-optimized inference endpoint with function-calling and structured output baked in, not bolted on. The DX bet here is that you pay for latency and cost in exchange for dramatically fewer hallucinations and more reliable chain-of-thought on hard problems — and that's the right tradeoff for the specific class of tasks this targets. The moment of truth is sending it a gnarly multi-constraint problem that trips up o3 or GPT-4o, and it actually handles it. The weekend alternative is not a thing here — you're not replicating this with a prompt wrapper and retries.The Builder
141
A

Drag-and-drop real-time voice pipelines with GPT-4o Realtime

The primitive here is a node graph that compiles to a managed real-time audio streaming pipeline — not a wrapper around a single API call but an actual orchestration layer that handles buffering, turn-taking, and interrupt handling between STT, LLM, and TTS nodes. The DX bet is right: putting complexity in a visual composer rather than a YAML config or a 300-line SDK initialization is the correct tradeoff for a domain where the wiring is genuinely hard. The moment of truth is whether you can swap in a fine-tuned voice model without the whole graph breaking — and the public preview docs suggest that swap is first-class, which earned my ship. What would cause the skip is if the visual builder is a demo skin over a brittle JSON blob with no programmatic export, and I can't verify that from preview docs alone.The Builder
142
L
LangGraph Cloud
Ship75% Ship

Managed stateful agent workflows with human-in-the-loop at GA

The primitive is clear: a managed runtime for persistent, interruptible graph-state machines that survive process restarts and support human approval gates mid-execution. That's a real problem — anyone who's tried to bolt durable execution onto a stateless Lambda knows the pain. The DX bet is that graph-as-code (nodes, edges, conditional routing) is the right mental model for agent workflows, and for complex multi-agent pipelines that bet mostly holds up. The moment of truth is when you need to checkpoint mid-graph without rolling your own Redis state machine — and LangGraph Cloud actually earns its keep there. This is not a weekend script replacement; durable execution with human interruption points is genuinely hard infrastructure. The specific technical decision I'm shipping on: persistent state and human-in-the-loop are first-class primitives, not afterthoughts bolted onto a chat framework.The Builder
143
L

Fine-tune Llama 4 Scout on a single GPU with LoRA and quantization recipes

The primitive here is clean: LoRA adapters plus quantization-aware training recipes packaged so you can actually run them on a single RTX 4090 without writing your own CUDA memory management. The DX bet is that most fine-tuning practitioners are drowning in boilerplate and scattered examples, so Meta is betting that opinionated, tested recipes beat a generic trainer. That's the right bet. The moment-of-truth test — cloning the repo, pointing it at your dataset, and getting a training run started — needs to survive without 12 undocumented environment dependencies, and if Meta has actually done that work here, this earns its place as the reference implementation for Scout adaptation. The specific decision that earns the ship: QAT recipes baked in from day one, not bolted on later.The Builder
144
T
TreeQuest
Ship75% Ship

Multi-agent MCTS framework that makes LLMs actually reason

The primitive here is clean: MCTS as a search strategy over LLM-generated reasoning steps, where each node is an LLM call and the tree policy guides exploration. The DX bet is that they've abstracted the hard parts — rollout policy, value estimation, node selection — so you can plug in your own model backend without rewriting the search logic. The moment of truth is whether the repo actually runs out of the box with a real model, and the open-source release with documented examples suggests it does. This is not a three-API-call Lambda — MCTS over LLM calls with proper value estimation is genuinely nontrivial to implement correctly, and Sakana shipping a composable version of it earns the ship.The Builder
145
O

Build autonomous web agents that browse, fill forms, and act

The primitive is clean: a hosted browser-use agent you call via API instead of standing up your own Playwright infrastructure, vision model pipeline, and retry logic. The DX bet is that OpenAI owns the messy middle — DOM parsing, CAPTCHA handling, session state — so you don't have to. The moment of truth is whether the first task call actually completes a real-world form without requiring a 40-parameter config, and based on the beta reports, it mostly does. The weekend-build alternative is real — Playwright plus GPT-4o plus a queue is buildable in a day — but the hosted reliability, session management, and safety layer are the genuine value-add here. I'm shipping this because "hosted browser-use with managed sessions" is a specific, hard problem that a raw API call does not solve.The Builder
146
L

See every token Claude Code burns — per prompt, session, workspace

Been waiting for exactly this. The per-session token breakdown finally shows which commands are bankrupting my API budget and which are model-efficient. The system prompt inspector — showing what Claude Code actually sends as context — is worth the signup alone.The Builder
147
S
Superpowers
Ship75% Ship

The agentic coding methodology that makes AI agents plan before they code

If you've ever watched Claude Code spiral into confusion after three tool calls, Superpowers is the antidote. The spec-before-code workflow eliminates most context loss, and the parallel subagent model actually ships features faster than one monolithic agent thrashing around. Worth the upfront ceremony.The Builder
148
T
Tether QVAC SDK
Ship75% Ship

Build local-first AI agents that run offline on any device — no cloud needed

A single API covering text, vision, speech, OCR, and translation — locally, cross-platform, offline — built on llama.cpp with P2P model distribution via Holepunch. This is the toolkit for building genuinely private AI apps, especially on mobile where on-device inference is finally practical.The Builder
149
M
Matt Pocock Skills
Ship75% Ship

Battle-tested Claude agent skills from decades of engineering XP

The /grill-with-docs skill alone is worth installing — it forces the agent to read actual documentation before writing a single line. I've been burned so many times by agents hallucinating APIs. This is the discipline layer that was missing.The Builder
150
K
Kelviq
Ship75% Ship

Merchant of record + usage billing built for AI companies

Token-level metering with real-time entitlement enforcement in one API is the infrastructure I've been duct-taping together with Stripe + Lago + TaxJar for years. Kelviq collapsing that stack is worth serious evaluation, especially for early-stage AI products.The Builder
151
A
Apideck MCP Server
Ship75% Ship

Give AI agents real-time read/write access to 200+ SaaS apps via one MCP server

Normalized schemas across 200+ SaaS APIs exposed as MCP tools — this eliminates weeks of integration work per enterprise agent deployment. The ability to swap providers without changing agent code is the killer feature; it future-proofs your agent against vendor changes.The Builder
152
A
AI-Trader
Ship75% Ship

Agent-native trading platform where AI and humans share signals

The agent registration API is dead simple — read a skill file, register, and your bot is live in the community. For quant devs tired of walled-garden trading platforms, this is a compelling alternative that lets AI agents operate as first-class market participants.The Builder
153
C
CUA
Ship75% Ship

Open-source infra to build agents that drive real computers — any OS

The cross-platform API abstraction is genuinely well-designed — the same agent code that drives a Linux terminal works on macOS GUI apps without modification. CuaBot with Claude Code is a surprisingly capable local autonomous agent stack for tasks that have no API.The Builder
154
C
CloakBrowser
Ship75% Ship

Stealth Chromium that passes every bot detection test

This solves a genuinely painful problem that every scraping team deals with — bot detection breaking prod pipelines. The source-level patching approach is smart engineering that doesn't fall apart on Chrome updates. Drop-in Playwright compatibility means zero migration friction.The Builder
155
H
Hopper
Ship75% Ship

The first AI agent dev environment built for COBOL and mainframes

This solves a real crisis. I've watched financial institutions pay six-figure consultant fees for tasks that Hopper demos suggest could be automated in minutes. If it's reliable on diverse JCL and CICS environments, this is immediately commercial.The Builder
156
R
React Doctor
Ship75% Ship

Catch every anti-pattern your AI agent baked into your React app

The GitHub Actions integration with PR health score diffs is the feature I didn't know I needed. Installing it took three minutes and immediately flagged three useEffect anti-patterns Cursor introduced last week.The Builder
157
A
AgentMemory
Ship75% Ship

Persistent cross-session memory for Claude, Cursor, Codex & friends

51 MCP tools and zero-config hooks is a genuinely thoughtful design. The SQLite-only requirement means nothing to install or manage. This is exactly the kind of glue layer that makes multi-session agent workflows actually viable.The Builder
158
N
Needle
Ship75% Ship

A 26M-param model that routes tool calls on phones and watches

If you're building any kind of personal agent or on-device assistant, Needle solves the tool-routing problem cleanly. The MIT license and Hugging Face weights make integration straightforward—drop it in, point it at your tool list, done.The Builder
159
R

Prompt to deployed full-stack app — database, domain, and all

The primitive here is a hosted agentic loop that closes the gap between prompt and deployed URL — not just code generation, but actual provisioning: Nix-based environment, PostgreSQL spin-up, Replit's own CDN for domain. The DX bet is that zero-config is the right place to put all the complexity, and for the target user it mostly pays off. My concern is the moment of truth: when the agent writes broken SQL migrations or scaffolds a React component with the wrong state shape, the debugging surface is a chat thread, not a diff. That's fine for prototyping but it's a trap for anyone who thinks they're shipping production code. Still, compared to stitching together Vercel + Railway + Cursor yourself, this is genuinely faster for the 90% case — and the database provisioning being automatic is the specific decision that earns the ship.The Builder
160
V
Voker
Ship75% Ship

Analytics platform built specifically for AI agents

The pain point is totally real — debugging agent behavior in production today is a nightmare of manually reading transcripts. Intent detection + resolution tracking as first-class primitives is exactly what's missing from the current toolchain. The SDK integration is clean.The Builder
161
M

LoRA, QLoRA, and RLHF for Llama 4 Scout on consumer hardware

The primitive here is parameter-efficient fine-tuning with an RLHF reward loop, packaged so you don't have to wire up three separate libraries and debug tensor shape mismatches at 2am. The DX bet is putting LoRA, QLoRA, and the RLHF pipeline in one repo with a shared config surface — that's the right call because the biggest pain in fine-tuning isn't any single technique, it's getting them to coexist without version hell. The moment of truth is whether the quickstart actually runs on a 24GB consumer GPU without hidden dependencies; if it does, this earns its keep. The specific decision that earns the ship: shipping RLHF as a first-class citizen rather than an advanced-users-only footnote makes this meaningfully harder to replicate with a weekend Hugging Face script.The Builder
162
M
Mistral 4B Edge
Ship75% Ship

Open-source 4B model that runs fully on-device, no cloud needed

The primitive here is a quantized instruction-tuned LLM that fits in consumer VRAM without performance falling off a cliff — and that's a genuinely hard engineering problem, not a marketing one. The DX bet is correct: Apache 2.0 plus Hugging Face distribution means you're one `from_pretrained` call from running it, no API keys, no rate limits, no surprise bills. The weekend alternative is 'just use llama.cpp with Gemma' and honestly that's fine too, but Mistral's consistent quality bar on instruction-following at small scales makes this worth the swap. What earns the ship is the license — Apache 2.0 on a capable 4B is the right thing and Mistral did it without hedging.The Builder
163
L

Fine-tunable 17B MoE checkpoints from Meta, free to download and adapt

The primitive here is dead simple: MoE instruction checkpoint with open weights you can pull from Hugging Face, plug into your fine-tuning pipeline, and own. The DX bet Meta made is 'we handle pre-training, you handle adaptation,' which is exactly the right cut — nobody wants to pay $2M in compute to reproduce this. The moment of truth is `huggingface-cli download meta-llama/Llama-4-Scout-17B-Instruct` and whether your VRAM budget survives it; 17B active params on MoE is actually friendlier than it sounds, but the docs need to be explicit about quantization paths and minimum hardware. Compared to a weekend alternative, you cannot replicate a 17B MoE with domain-specific instruction tuning on a Lambda — this is the real deal, and the permissive research license means you're not signing your soul away.The Builder
164
S
SmolAgents 2.0
Ship75% Ship

Visual workflow builder for multi-agent AI pipelines, no code required

The primitive here is a thin orchestration layer over code-executing agents with an optional visual graph editor layered on top — and that layering is the right architectural call. The DX bet is that code-first developers shouldn't be forced through a GUI, while the visual builder handles the on-ramp for everyone else. The MCP integration is the honest differentiator: you get composable tool use without inventing yet another plugin schema. My one concern is that 'no-code visual builder' and 'code execution sandbox' are two very different trust surfaces sitting in the same release — I'd want to audit exactly what escapes the sandbox before I hand this to a non-technical user on shared infrastructure.The Builder
165
M

Llama 4 Scout & Maverick hosted API — no self-hosting required

The primitive is clean: hosted inference for Llama 4 MoE models via a standard API, no GPU cluster required. The DX bet Meta is making is 'OpenAI-compatible enough that switching costs are near-zero,' which is the right call — if they've actually implemented compatible endpoints, a one-line base URL swap gets you access to Scout's 17B active parameters or Maverick's larger context without rewriting your client code. The moment of truth is whether the rate limits on the free tier are generous enough to actually build against, or if you hit a wall before you can prototype anything real. I'm shipping this cautiously because the underlying models are legitimately good and the 'no self-hosting' unlock is real — but Meta's track record on sustained developer platform investment is spotty, and I want to see SLAs before I route production traffic here.The Builder
166
A

Declarative YAML orchestration for multi-agent AI pipelines on Azure

The primitive here is a declarative runtime that resolves agent graphs at execution time — YAML drives the wiring, the SDK handles the state machine. The DX bet is that configuration-as-code beats imperative orchestration for multi-model pipelines, and for teams already living in ARM templates and Bicep, that bet is correct. The OpenTelemetry integration is the actually important detail nobody is emphasizing enough: getting trace context threaded through agent hops without custom middleware is a real problem this solves. My concern is the classic Azure problem — the first 10 minutes will involve az login, resource group provisioning, and at least two managed identity configs before you run a single inference call. The weekend-script alternative exists for two-agent workflows; this earns its keep only when you're wiring four or more heterogeneous models with shared memory state.The Builder
167
R
Rova AI
Ship75% Ship

Autonomous QA agent that tests by goal, not by script

As a solo dev shipping daily, I've completely given up on maintaining Playwright tests — Rova's goal-based approach is the first testing tool that's actually kept up with my pace. The @rova Jira integration means bugs get caught before standup, not after a customer complaint.The Builder
168
G

Autonomous research agents with MCP and native charts in your app

The MCP integration is the real story — connecting Deep Research to our internal data warehouse with a single server definition and getting research-grade synthesis in return is exactly what enterprise AI apps need. This replaces three separate pipeline stages for us.The Builder
169
T
Tabstack
Ship75% Ship

Pass a URL and a schema, get back structured JSON — every time

Schema-first data extraction is exactly what AI pipelines need — define the shape of your data once and stop prompt-engineering JSON out of an LLM on every request. The Mozilla pedigree means they actually understand how browsers work under the hood.The Builder
170
O
Oh My codeX (OMX)
Ship75% Ship

Hooks, agent teams, and persistent state for the OpenAI Codex CLI

Parallel agents in isolated git worktrees is the feature every Codex power user has been waiting for — no more merge conflict hell when you run multi-step tasks. The 36 built-in workflow skills mean you're not starting from scratch. Install this the moment you start using Codex CLI seriously.The Builder
171
A

Community skill library that gives Codex CLI real-world superpowers

This is the npm registry moment for Codex skills — and Composio got there first. The SKILL.md format is dead simple, and the Slack/GitHub/Notion integrations mean these aren't just code tricks, they're workflow automations. If you're on Codex CLI, install your first three skills this afternoon.The Builder
172
C
Craft Agents
Ship75% Ship

Open-source desktop app for multi-session Claude agents with MCP & APIs

The three permission modes — Explore, Ask to Edit, Auto — is the right model for how I actually use agents. I want read-only exploration when I'm learning a codebase and auto mode when I'm in flow. That plus MCP server support makes this my new default agent UI.The Builder
173
S
Superpowers
Ship75% Ship

7-stage agentic methodology that stops AI from just winging it

The git worktrees per feature approach is something I wish I'd done from day one — isolated environments per task means agents can't accidentally clobber each other's work. The RED-GREEN-REFACTOR enforcement alone makes this worth the setup time.The Builder
174
M

Reusable Claude agent skills that fix AI coding's biggest failure modes

This is the missing manual for working with coding agents. The /tdd and /grill-me skills alone have already changed how I approach agent sessions — I actually get working code on the first pass now instead of a beautiful-looking mess that fails every test.The Builder
175
C
Claude Code Local
Ship75% Ship

Run Claude Code 100% on-device on Apple Silicon — zero API calls

65 tok/s Qwen locally is actually usable for real coding — the v2 fixes to tool-call formatting make a huge difference. For NDA client work where I can't send code to Anthropic, this has become essential. The MLX optimization is genuinely impressive engineering.The Builder
176
C

MCP server that teaches AI coding agents to avoid technical debt

The 20% → 90-100% fix rate improvement is the stat that matters. I've watched Cursor blindly create tech debt while 'fixing' things — an MCP that injects code health context before the LLM writes is exactly the right intervention point. Already running this on production code.The Builder
177
D
Devin for Terminal
Ship75% Ship

Local CLI coding agent that keeps working when you close your laptop

The 'keep working when you close your laptop' pitch is exactly right. I've lost countless Devin sessions to network hiccups. Persistent cloud-backed execution from my terminal is the architecture I've wanted since day one. This is how async development should work.The Builder
178
S
Social Fetch
Ship75% Ship

Pull real-time data from TikTok, Instagram, YouTube, X, LinkedIn via one API

Maintaining scrapers for six platforms is genuinely painful. If Social Fetch keeps up with API changes and anti-bot measures, the time savings alone justify the cost. The TypeScript SDK and OpenAPI spec mean zero friction to integrate.The Builder
179
A
Actian VectorAI DB
Ship75% Ship

Portable vector DB for edge & on-prem — 22x faster than Milvus at 10M vectors

The edge/on-prem angle is underserved. Most vector DB benchmarks are cloud-optimized and fall apart on constrained hardware. If the 22x QPS claim holds up under independent testing, this is the default for edge RAG.The Builder
180
D
DOOM MCP
Ship75% Ship

Play DOOM inline inside Claude or ChatGPT — full game, no browser needed

The signed-token progressive enhancement pattern is the part worth stealing. This is a clean reference architecture for MCP interactive apps, and DOOM just happens to be the demo case.The Builder
181
A

An AI agent loop that redesigns your RISC-V CPU and formally proves every win

The hardcoded orchestrator pattern is the real take-home here. Building AI loops that can't game their own eval is a solved problem when you just... don't give the agent write access to the evaluator. Obvious in hindsight, rarely implemented.The Builder
182
C
Cua
Ship75% Ship

Open-source infra for computer-use agents across Mac, Linux & Windows

Cua solves the hardest part of computer-use agents — getting a stable, reproducible environment that doesn't fight your OS. The background automation mode alone is worth it for devs building macOS agents. 15k stars in a short window is a strong signal.The Builder
183
G
Google ADK
Ship75% Ship

Google's open-source Python framework for production AI agent systems

ADK hits the sweet spot between the simplicity of a prompt wrapper and the complexity of LangChain. The MCP integration and built-in dev UI make it the most productive framework I've tried for real multi-agent systems. The Python-native design means you can test agents like real software.The Builder
184
V
VibeVoice
Ship75% Ship

Microsoft's open-source voice AI: transcribe 60-min audio or speak for 90-min

The full-pipeline coverage here is rare — ASR, TTS, and streaming in one repo with MIT weights. I'd have this running in a side project by tonight. The 300ms streaming latency is production-viable for most voice apps.The Builder
185
G
GitNexus
Ship75% Ship

Drop in any repo, get a full knowledge graph + Graph RAG agent — in-browser

The MCP integration for Claude Code and Cursor is the killer feature — this is the architectural context layer those tools have always lacked. Precomputing the graph at index time so agents get full call chain context in one lookup is a smart design decision that pays off in real usage. 28K stars says the community agrees.The Builder
186
S

The benchmark that tests whether LLMs get JSON values right, not just syntax

This is the benchmark I've been waiting for. 'Valid JSON' is table stakes — the real question is whether field values are correct. This plugs a genuine gap in how we evaluate extraction pipelines.The Builder
187
Z
Zed 1.0
Ship75% Ship

The AI-native code editor built for speed ships its production 1.0

I switched from VS Code to Zed six months ago and haven't looked back. The parallel agents feature alone justifies the move — running three agents editing different files simultaneously while I review is a workflow upgrade that VS Code can't match yet.The Builder
188
J
jcode
Ship75% Ship

Rust coding agent harness: 6× less RAM, 14ms startup, multi-agent swarms

14ms startup and 6× lower RAM than competitors? This is the kind of engineering that makes you rethink your whole toolchain. The multi-agent swarm coordination is genuinely novel — not just 'run two Claude windows.'The Builder
189
Z
ZeroID
Ship75% Ship

Cryptographic identity and delegation chains for every AI agent

The primitive here is clean: an OIDC-compliant token exchange server (RFC 8693) that stamps delegation provenance into the credential itself — no side-channel audit log required, the chain is the token. The DX bet is that developers adopt it as infrastructure, not a framework, and the Docker Compose + PostgreSQL setup with three SDK targets backs that up; you're not adopting a platform, you're standing up a service. The moment-of-truth test — can a LangGraph workflow prove which sub-agent took an action and who authorized it? — is a real problem I've actually had, and this solves it without requiring you to invent your own JWT claim schema at 2am. The one thing I'd want before going production: a public test suite and some adversarial examples for token forgery edge cases.The Builder
190
A
Asqav
Ship75% Ship

Quantum-safe, hash-chained audit trails for every AI agent action

The primitive is clean: sign agent actions with ML-DSA-65, chain the hashes, export the trail — and the API backs that up with a three-call surface (init, create agent, sign action) that doesn't bury you in config before hello-world. The DX bet is complexity-at-the-library-layer, simplicity-at-the-call-site, which is exactly the right call for something this security-sensitive. The only thing I'd flag: multi-agent audit trails are listed as 'in active development,' which means anyone building orchestration topologies today is buying a partial solution — ship it, but go in with that specific gap noted.The Builder
191
M
MinerU2.5
Ship75% Ship

1.2B-param VLM that converts any document to clean structured text

I've tried six document parsing libraries and MinerU has the best table extraction accuracy I've seen at any price point. The Markdown output is clean enough to feed directly into embedding pipelines without post-processing. 61K stars isn't hype — it's earned.The Builder
192
G
Gemini CLI
Ship75% Ship

Google's open-source terminal agent — 1K free requests/day, MCP-ready

The 1,000 free daily requests is genuinely competitive — I've been hitting Claude Code limits and this fills the gap. MCP support and GEMINI.md config make it a first-class citizen in any multi-agent workflow. The Chapters feature is an underrated UX win for long sessions.The Builder
193
W
Warp
Ship75% Ship

The agentic terminal just went open source (AGPL, Rust)

Warp has always had the best terminal UX, and going open-source removes the biggest objection to adopting it in security-conscious environments. The Oz agent-managed development model is experimental, but the AGPL client is immediately useful today.The Builder
194
G
Goose
Ship75% Ship

Local-first open source AI agent with 70+ MCP extensions

70+ MCP extensions and full offline support means you can actually customize this for real workflows. The YAML recipe system for portable automation is underrated — this is what an agent framework should look like.The Builder
195
M
mem9.ai
Ship75% Ship

Shared, cloud-persistent memory layer for your entire agent stack

The primitive is clean: a drop-in MCP-compatible memory server that swaps file-backed agent memory for a cloud-persistent hybrid search store backed by TiDB. The DX bet is right — complexity lives at the infrastructure layer (TiDB handles distributed storage and indexing), so the agent-side API stays thin. The moment of truth is connecting a second agent to the same server and watching it recall context the first agent wrote; that's the demo that earns the ship. You could not replicate genuine hybrid vector + keyword search with cross-agent consistency in a weekend script — the distributed consistency guarantees alone are a real engineering problem this solves.The Builder
196
G
GitNexus
Ship75% Ship

Turns any codebase into a queryable knowledge graph with MCP support

The primitive is clean: Tree-sitter parses your code into an AST, GitNexus lifts that into a graph, and the MCP server exposes 16 typed query tools so your AI editor gets call-chain context instead of hoping embeddings land on the right file. The DX bet — local-first, zero egress, registry-based multi-repo management — is exactly the right place to put the complexity, because the alternative is pasting 3,000 lines into a context window and praying. The moment of truth is `npm run index` followed by wiring the MCP server into Cursor; if that path is clean and the impact-assessment tool actually surfaces the correct transitive dependents on a real-world monorepo, this earns every one of its 32k stars.The Builder
197
O
OmX (Oh My Codex)
Ship75% Ship

Supercharge Codex CLI with multi-agent teams, hooks & live HUDs

The primitive here is clean: a process supervisor and state manager for Codex CLI agents, using git worktrees as isolation boundaries — which is exactly the right call, not an invented abstraction. The DX bet is that complexity lives in `.omx/` config and hook files rather than a CLI flag explosion, and that's the right place for it; the `$ralph` loop pattern in particular solves a real problem I've personally scripted around three times. The weekend-alternative test is close — you could duct-tape worktree spawning and a JSON state file yourself — but the live HUD and hook system would take a week, not a weekend, and the result would be worse. Earns the ship on the hooks-as-composition primitive alone.The Builder
198
E
EvanFlow
Ship75% Ship

TDD-first workflow framework that turns Claude Code into a disciplined dev team

This is exactly what Claude Code needed. The git guardrails hook alone is worth installing — I've seen too many agents nuke a working branch with a confident `git reset --hard`. EvanFlow's 'conductor not autopilot' philosophy maps perfectly to how good engineers actually want to use AI: fast on the mechanical stuff, slow on the decisions that matter.The Builder
199
M
MemOS
Ship75% Ship

A memory operating system for LLMs and AI agents

The unified memory API is what makes this genuinely useful — not having to juggle vector DBs, context stuffing, and fine-tuning separately is a real DX win. 35% token reduction is also meaningful at scale. Apache license and Docker deploy mean it fits into production stacks without legal headaches.The Builder
200
U
Utilyze
Ship75% Ship

See your GPU's real compute efficiency — not just whether it's busy

This belongs in every MLOps toolkit immediately. Standard utilization metrics are dangerously misleading — I've seen teams burn thousands on H100s that were memory-bandwidth-bottlenecked at 3% actual compute SOL. Apache 2.0 means you can embed it in any monitoring stack without licensing headaches.The Builder
201
L
Logic
Ship75% Ship

Plain English spec → production AI agent API in under 60 seconds

Eliminating the PromptLayer + Braintrust + LangFuse + Swagger stack into one product is genuinely useful. Auto-generated typed APIs with regression detection on every spec edit is what I want — I don't want to maintain that infra myself. MCP integration is the right call for tool connectivity.The Builder
202
S
SmolDocling
Ship75% Ship

256M-param VLM that converts any document to structured text

256M params that actually handle real-world PDFs including tables, charts, and mixed layouts — this goes straight into my RAG preprocessing pipeline. The DocTags format is smart: giving the model a precise document vocabulary instead of asking it to improvise structure from scratch.The Builder
203
D
Dirac
Ship75% Ship

Open-source coding agent that crushed TerminalBench-2 at 64.8% lower cost

Topping TerminalBench-2 while being 64.8% cheaper is the kind of benchmark that actually matters to developers. The hash-anchored editing and AST-native approach fix the two most annoying failure modes of existing coding agents — wrong line edits and syntax-blind refactors.The Builder
204
Q
Quarkdown
Ship75% Ship

Markdown with superpowers — docs, slides, and PDFs from one source

This solves a real problem — maintaining separate LaTeX for papers, GitBook for docs, and Beamer for talks is a mess. A unified Turing-complete Markdown system with live preview is exactly what the developer doc toolchain needs. GPL-3.0 works fine for most personal and internal projects.The Builder
205
A

50+ drop-in automation skills for OpenAI Codex CLI, curated by ComposioHQ

This is exactly what the Codex CLI ecosystem needs — a curated, community-maintained skills library instead of everyone reinventing SKILL.md from scratch. The MCP server scaffolding skill alone is worth the install. Fork it, customize it, ship it.The Builder
206
V
VibeVoice
Ship75% Ship

Microsoft's open-source voice AI that handles 90-min audio in one pass

MIT license plus Hugging Face weights is everything. Drop-in ASR with 60-minute single-pass capacity and speaker diarization out of the box? That replaces a whole stack for me. The 0.5B realtime model at 300ms latency is immediately useful for voice agents.The Builder
207
C

CLI toolkit to configure, monitor, and template your Claude Code projects

Managing CLAUDE.md conventions across 15 projects was a mess before this. The usage monitoring alone paid for the install time — I now know exactly which projects burn context and can optimize accordingly. 25K stars in this timeframe is earned, not astroturfed.The Builder
208
S

Real-world agent skills for engineers — install via npm, not vibes

The tdd skill alone is worth the install. Watching a Claude agent plan tests before writing implementation is exactly how I want AI to assist me. Matt's framing of 'real engineering vs. vibe coding' is the right cultural correction for 2026.The Builder
209
C
Chrome Prompt API
Ship75% Ship

Run Gemini Nano inside Chrome — on-device AI inference with no cloud round-trip

The JSON Schema structured output is the feature I've been waiting for — finally you can extract clean data from user-typed text without a backend. The 22GB download is a real onboarding hurdle, but once the model is cached, the latency is basically zero compared to cloud APIs. This changes the math for privacy-sensitive consumer apps.The Builder
210
C
Cursor 3
Ship75% Ship

The AI IDE rebuilt for agent orchestration — run 10 parallel agents, ship while you sleep

Parallel background agents are the feature I didn't know I needed until I watched three features ship while I was reviewing a PR. The Design Mode for UI changes alone saves me 20 minutes a day. This is the IDE I'm staying on.The Builder
211
C

Anthropic runs the sandbox so you don't — agents at $0.08/session-hour

$0.08 an hour to skip building and maintaining a sandboxed execution environment is genuinely cheap. I've spent weeks on that infrastructure before — it's painful, underappreciated, and now optional. The millisecond billing with idle time excluded shows Anthropic actually thought about this from a developer's perspective.The Builder
212
A
Apfel
Ship75% Ship

Tap the free AI already built into your Mac

The OpenAI-compatible server is a genuine unlock — I swapped my local dev config from Ollama to Apfel in two minutes and everything just worked. For Apple Silicon owners who want zero-latency local AI without model downloads, this is the move.The Builder
213
B
Beads
Ship75% Ship

A Dolt-powered dependency graph that gives coding agents persistent memory

This solves a real pain point I hit every time I run multi-agent loops — agents clobbering each other's work. Dolt as the backend is smart: you get SQL semantics, branching, and merge without standing up anything exotic. The `bd ready` command alone justifies the install.The Builder
214
E
Eden AI
Ship75% Ship

Europe's GDPR-native AI gateway — 500+ models, smart routing, zero US data dependency

The single API across LLMs, OCR, speech, and translation is genuinely useful for multi-modal pipelines. No more juggling five different SDKs and five different auth tokens. For European teams, the GDPR compliance story alone is worth the small platform fee over rolling your own routing.The Builder
215
M
MemPalace
Ship75% Ship

Verbatim AI memory with semantic search — structured like an actual palace

The spatial memory metaphor isn't just clever naming — scoped searches against wings and rooms meaningfully outperform flat vector search in my tests. MCP integration with Claude Code works out of the box. The 170-token recall cost is impressively lean.The Builder
216
C
Cua
Ship75% Ship

Open-source infra for AI agents that actually control computers — Mac, Linux, Windows, Android

Cua is the plumbing that makes computer-use agents actually work in production. The fact that Cua Driver handles background macOS automation without stealing focus is the detail that separates a demo from something you can ship. 465 releases means this is battle-tested infrastructure, not a weekend project.The Builder
217
G
GitNexus
Ship75% Ship

Drop any GitHub repo in your browser, get an interactive knowledge graph with Graph RAG

This is the missing layer between your codebase and your AI agents. The MCP integration means Claude Code can now actually understand your repo structure instead of guessing from file names. The privacy-first, zero-server approach makes it the only option I'd trust with client code.The Builder
218
Q
QuickCompare
Ship75% Ship

Compare LLMs on your own data — not someone else's benchmarks

Finally a tool that stops the 'which model is best?' debate cold. Running your actual prompts through all the candidates and getting a cost/quality matrix is exactly what every engineering team needs right now. The switch from gut feel to data is overdue.The Builder
219
M
Mnemos
Ship75% Ship

Local vector memory for Claude Desktop with 3D conversation visualization

This solves a real, painful problem with zero cloud dependency. The hybrid FTS5 + vector search is the right architecture — you get speed and semantic richness without compromising privacy. The .NET 9 stack is slightly niche but the setup looks smooth.The Builder
220
G
Grok Build
Ship75% Ship

xAI's local-first CLI coding agent with 8 parallel agents and arena mode

8 parallel agents tackling the same coding task is a fascinating approach — it's basically tournament selection applied to code generation. If the arena mode lets me specify different constraints for each agent (test coverage vs. speed vs. readability), this could become a genuine creative tool for complex architecture decisions.The Builder
221
A
AI Designer MCP
Ship75% Ship

Give Claude Code the ability to generate beautiful, codebase-aware UI

This is one of those tools that addresses the single most annoying thing about AI coding agents — the ugly UI problem. If it genuinely reads my design system and produces contextually appropriate components rather than generic Tailwind slop, it pays for itself in minutes. One-command install is the right onboarding.The Builder
222
C
claude-mem
Ship75% Ship

Persistent cross-session memory for Claude Code — 10x cheaper context

If you're using Claude Code heavily, this is table stakes. The FTS5 + vector hybrid search means you stop re-explaining your codebase conventions every session, and the 10x token savings claim holds up in practice. The lifecycle hook architecture is clean and non-intrusive.The Builder
223
H
Hermes Agent
Ship75% Ship

The self-improving AI agent that learns from every session

The closed-loop learning loop is the real innovation here — most agent frameworks just wrap an LLM call. Hermes builds a compound skill library over time, and the multi-platform gateway (WhatsApp, Slack, Telegram all at once) is genuinely production-ready. 115K stars doesn't lie.The Builder
224
R
Roo Code
Ship75% Ship

A full AI dev team in your VS Code — Code, Architect, Debug & custom modes

The multi-mode approach is genuinely underrated — switching to Architect Mode feels like talking to a different person and that's a good thing. MCP support and model-agnosticism mean you're not boxed in. Once you add custom modes for your team's workflows this becomes indispensable.The Builder
225
M
Matt Pocock Skills
Ship75% Ship

21+ battle-tested Claude agent skills from TypeScript's top educator

The TDD skill and git-guardrails-claude-code alone are worth the install. Pocock's skills reflect how a TypeScript professional actually works — not generic demo code. The npx install pattern is elegant and composable.The Builder
226
W
WUPHF
Ship75% Ship

Open-source multi-agent 'office' — AI teams that think together

The token-efficiency story alone makes this worth trying — $0.06 for a five-agent session is remarkable. The @mention graph and shared wiki are genuinely novel patterns that every multi-agent framework should steal.The Builder
227
G
Gemini CLI
Ship75% Ship

Google's free open-source terminal AI agent — 1M context, MCP, 1000 calls/day free

1000 free calls a day is a genuinely useful free tier — most days I don't hit that limit. The 1M context window for codebase-wide analysis is real and fast. Google Search integration in the terminal is a killer combo.The Builder
228
C
Clawdi
Ship75% Ship

Run OpenClaw and Hermes agents in the cloud — zero setup required

This is the 'it just works' solution I've been wanting for months. Spinning up a persistent OpenClaw instance in the cloud without touching config files is genuinely liberating — and the Phala TEE backing means my API keys aren't just floating in someone's S3 bucket.The Builder
229
A

50+ Codex skills that wire your AI agent to Slack, Notion, email, and 1000+ apps

The CI/CD fix skill and MCP builder skill alone justify installing this. Composio's 1000-app integration layer behind the scenes means these aren't just text templates — they're wired to real APIs. This is the missing middleware for Codex.The Builder
230
M
ml-intern
Ship75% Ship

HuggingFace's open-source ML engineer that reads papers and trains models

This is the thing I wanted to exist two years ago. Being able to throw a paper at an agent and have it actually run the experiment is a genuine workflow unlock. The HF ecosystem integration is clean and it avoids the usual agentic foot-guns with its approval gates.The Builder
231
A
Apfel
Ship75% Ship

Unlock Apple's built-in 3B model — CLI, chat, and OpenAI-compatible server

This is exactly the right abstraction — the model was already there, we just needed a pipe. The OpenAI-compatible server means every tool in my stack can use it without modification. Brew install and you're done.The Builder
232
M
Multica
Ship75% Ship

Assign tasks to AI coding agents like you would a human teammate

The Go backend with pgvector and real-time WebSocket updates signals serious engineering intent — this isn't a prototype. Multi-runtime support (local + cloud agents, 8 supported CLIs) and the compounding skill library make it worth adopting as core team infrastructure before your competitors do.The Builder
233
G
Google ADK 2.0
Ship75% Ship

Open-source agent framework: Python 2.0 beta + TypeScript 1.0 drop

Graph-based workflows in 2.0 Beta finally make multi-agent orchestration feel sane. The Agents CLI scaffolding saves an hour of boilerplate every new project. Apache 2.0 means no licensing headaches at scale.The Builder
234
C
CallingBox
Ship75% Ship

Configure an agent, dispatch a call, get structured JSON back

The single-endpoint design is exactly right — one call in, structured JSON out. MCP server integration means you can wire it to your existing agent tools without rebuilding. At $0.05/min I'd be crazy not to at least prototype with this.The Builder
235
E
Endless Toil
Ship75% Ship

Your coding agent will audibly groan at your bad code

Absurd premise, genuinely useful result. I will absolutely install this on my team's machines and not tell anyone. The immediate audio feedback loop is faster than reading lint output, and the escalating severity is well-designed.The Builder
236
C
Claude Context
Ship75% Ship

Semantic code search MCP — 40% fewer tokens, full codebase as context

This solves the single biggest practical pain point with Claude Code on large repos — context overflow. The hybrid BM25 + dense vector approach means it doesn't just do keyword matching, it understands what you're actually looking for. 40% token savings at basically zero setup cost is a no-brainer.The Builder
237
C
CC-Canary
Ship75% Ship

Detect Claude Code regressions before they waste hours of your time

The timing is perfect — Anthropic just admitted to weeks of silent quality regressions and the community is furious. CC-Canary gives you actual data instead of 'it feels worse.' The read:edit ratio metric alone is clever: if the model is reading much more than editing, it's probably spinning its wheels.The Builder
238
I
Intent
Ship75% Ship

Describe a feature. Agents build, verify, and ship it — in parallel.

The parallel worktree approach is genuinely smart — agents don't step on each other, and the living spec means you're not herding a single agent through a long task linearly. For features that touch multiple modules, this could cut agent coding time dramatically. macOS-only is a real limitation though.The Builder
239
B
BAND
Ship75% Ship

Universal orchestrator for cross-framework AI agent communication

This solves a real pain I hit last month — I had a LangChain agent that couldn't talk to a CrewAI pipeline without writing glue code. BAND's framework-agnostic handoffs are the missing primitive. Ship it immediately for any team running >3 agents.The Builder
240
A

Open-source runtime security for AI agents — covers all 10 OWASP agentic risks

The zero-rewrite integration is the killer feature — hooking into LangChain callbacks and CrewAI decorators means I can add governance to existing production agents in a day. The sub-millisecond latency means there's no excuse not to ship it. This is the security baseline for any team deploying autonomous agents.The Builder
241
A

1,100+ hand-curated skills for every major AI coding agent

This is the package registry equivalent for agent skills. Instead of hunting across 30 different repos, everything is here and organized. The fact that official vendor teams like Stripe and Cloudflare are contributing their own skills means quality stays high.The Builder
242
M
MarketingSkills
Ship75% Ship

44+ marketing skills for Claude Code, Cursor, and AI coding agents

Brilliant distribution play — package domain expertise as agent skills and suddenly your coding agent understands CRO best practices. The CLI install and Agent Skills spec compatibility mean you're up in 30 seconds. Already replacing half my Notion marketing runbooks.The Builder
243
H
Honker
Ship75% Ship

Postgres NOTIFY/LISTEN semantics for SQLite — no broker needed

The WAL-watching approach is elegant — no daemon, no polling loop, no external dependency. Having task queues, pub/sub, and scheduled jobs all in one SQLite file that any language can load is a huge win for projects that want operational simplicity.The Builder
244
C
Claw Code
Ship75% Ship

Claude Code's architecture, open-sourced — 100K stars in days

Multi-provider support alone makes this worth exploring — no more being locked to Claude's API pricing. The Rust core means it's fast, and 19 permission-gated tools is a solid starting point for real agent workflows. I've already swapped it in for two internal projects.The Builder
245
C
Codex 3.0
Ship75% Ship

OpenAI's Codex can now build, test & debug on full autopilot

Autopilot mode with actual test execution and iterative debugging is the missing piece — previous Codex iterations would write code but you still had to run and debug it yourself. The multi-terminal support and macOS computer use bring this much closer to a real engineering teammate.The Builder
246
G

Fine-tune Gemma 4 with audio + vision on Apple Silicon — no NVIDIA needed

Finally something that treats Apple Silicon as a first-class fine-tuning target, not an afterthought. LoRA on Gemma 4 multimodal for domain-specific tasks — medical, legal, private enterprise — is a genuinely underserved workflow. This is the tool the community needed.The Builder
247
A
AgentSearch
Ship75% Ship

Self-hosted Tavily alternative with MCP server — no API keys needed

Finally a proper self-hosted Tavily drop-in. The MCP integration means I can wire it into Claude Desktop in five minutes flat, and the 9-strategy extraction chain actually works when direct fetch fails. The Docker compose one-liner seals it — this is production-ready on day one.The Builder
248
A
Agent Vault
Ship75% Ship

Network-layer credential injection — agents never see your secrets

The network-layer injection approach is architecturally correct and I'm annoyed I didn't think of it first. This should be standard infrastructure for any team giving agents real API access. The fact that Infisical is behind it gives me confidence it won't be abandoned after a week.The Builder
249
G
GoModel
Ship75% Ship

One API to rule them all — 10+ LLM providers unified in Go

This is what I've wanted since LiteLLM started feeling bloated. Go binary, semantic caching, Prometheus metrics out of the box — it's a proper infrastructure-grade gateway, not a weekend hack. Multi-provider fallback alone is worth the Docker setup time.The Builder
250
F
Flipbook
Ship75% Ship

A website streamed live, directly from a language model — no backend, no build step

The streaming HTML rendering is technically elegant — they're using a custom incremental DOM diffing approach that keeps the page stable even as incomplete HTML arrives. As a proof-of-concept for a new web architecture pattern, this deserves serious attention from the dev community. The GitHub repo is worth forking for the renderer alone.The Builder
251
D
Design.MD
Ship75% Ship

Drop one Markdown file, your AI agent stops making ugly UIs

I've been pasting design tokens into system prompts manually like a cave person. The idea of a standardized DESIGN.md that any agent can read is so obvious in retrospect it's embarrassing. The 60+ existing brand files alone make it worth bookmarking right now.The Builder
252
C
claude-context
Ship75% Ship

Turn your entire codebase into instant context for Claude Code via MCP

This solves the single most frustrating thing about AI coding assistants on real projects — the constant context window juggling. Point it at your repo, forget about manually including files, and let semantic search do the work. I set it up in under 10 minutes and it immediately surfaced related code I'd forgotten existed.The Builder
253
L
Langfuse
Ship75% Ship

Open-source LLM observability, evals, and prompt management for production AI

If you're running any LLM application in production without Langfuse, you're flying blind. The multi-agent tracing support that landed in recent releases is the killer feature — finally you can see exactly which agent call caused that 45-second latency spike or why a particular input keeps producing hallucinations. The self-hosted option is production-ready.The Builder
254
F
free-claude-code
Ship75% Ship

Redirect Claude Code to free LLM backends — no API bill required

If you're burning $200/month on Claude Code tokens, this is a no-brainer for exploration work. The Haiku-to-local routing alone cuts most of the trivial call costs. Ship it as a cost-control layer.The Builder
255
C
context-mode
Ship75% Ship

Slash AI coding context usage 98% with sandboxed SQLite + BM25 search

9,195 stars don't lie. If you run Claude Code or Cursor on large codebases, context exhaustion is the number one thing that breaks long sessions. This is a direct fix. Install it, configure your platform, done.The Builder
256
M
ml-intern
Ship75% Ship

HuggingFace's autonomous ML engineer: reads papers, trains, ships

The HF ecosystem integration is what makes this actually useful vs. a generic code agent. It knows about datasets, hubs, and inference endpoints natively. For rapid prototyping of research ideas, this is a legitimate 10x on the experiment-to-publish cycle.The Builder
257
V
Vercel Skills
Ship75% Ship

Install reusable agent skills across Claude Code, Cursor, Windsurf, and 40+ more

This is exactly the missing layer in the agent toolchain. I've rebuilt the same 'write integration tests' prompt four times across different tools — Skills ends that. The SKILL.md format is clean and the cross-agent portability is real, not theoretical.The Builder
258
P
Pioneer
Ship75% Ship

Fine-tune any LLM with a prompt — then let it retrain itself in production

The $35 fine-tune price point changes the calculus entirely — I've been paying 10x that to have an ML engineer babysit a fine-tuning job. The adaptive inference loop is the killer feature: your model gets better from its own production mistakes without you writing a single eval script.The Builder
259
X
X Island
Ship75% Ship

Mac mission control for all your AI coding agent sessions at once

I've been manually checking three terminal windows every 10 minutes to see if Claude Code is waiting on me. X Island fixes that with zero setup. This should be table stakes in every agentic IDE but nobody's built it natively yet — so this indie tool fills a real gap right now.The Builder
260
A

1,100+ hand-picked agent skills from Anthropic, Google, Stripe, Cloudflare & more

Official skills from the companies that built the APIs are a different category from community-written scripts. When Stripe's own team ships a payments agent skill, I trust it handles edge cases my homegrown version would miss. This is the npm registry for agentic coding.The Builder
261
K
Kuri
Ship75% Ship

Zig-powered browser tool for AI agents: 464KB binary, 3ms cold start, zero Node.js

Finally — browser automation that doesn't require npm install to bring in 300MB of Node.js just to click a button. The 3ms cold start is genuinely game-changing for agent loops where you're spinning up browser contexts dozens of times per session. If the anti-detection stealth holds up, this becomes my go-to for agentic scraping pipelines.The Builder
262
I
InstantDB
Ship75% Ship

Open-source, 100% free backend: auth, real-time, storage, permissions — built for AI apps

This is what I've been waiting for since Firebase started its slow price creep. Everything pre-wired together matters enormously when you're shipping fast — I don't want to configure CORS between my auth and my storage bucket at 2am. The AI-first scaffolding is a genuine time saver, not just marketing copy.The Builder
263
E
Euphony
Ship75% Ship

OpenAI's open-source browser tool for visualizing Codex and agent session logs

I've been pasting agent logs into jq and manually grepping for the relevant steps — Euphony makes that process human. The timeline rendering of nested tool calls is exactly what I needed to debug a multi-step research agent that was hallucinating intermediate results. The FastAPI backend for remote log loading is a nice touch for team debugging sessions.The Builder
264
B
Browser Harness
Ship75% Ship

Self-healing browser automation that writes its own missing functions mid-run

592 lines to replace Playwright for LLM agents is a compelling trade. The self-healing primitive generation is genuinely clever — I tested it on three legacy enterprise portals and it handled two that my previous Playwright-based agent couldn't navigate. Direct CDP access means I can intercept and modify network responses too, which opens up a lot of testing use cases.The Builder
265
T

Build security automation workflows in plain English with AI

Natural language workflow creation is most valuable for maintenance, not initial build — being able to ask 'what does this 200-step playbook do?' and get a coherent answer saves serious time for any team inheriting legacy automation. The Community Edition availability means you can test it at zero cost before the credit model kicks in May 1st.The Builder
266
R
RAG-Anything
Ship75% Ship

Multimodal RAG that handles PDFs, images, tables, charts, and math

RAG-Anything solves the most frustrating part of enterprise document work: your data lives in tables, charts, and PDFs — not clean text blobs. The vector-graph fusion approach and concurrent pipelines mean you can actually build production-grade doc intelligence without rolling your own multimodal parsing. 17k stars in days is a signal this fills a real gap.The Builder
267
V
VibeAround
Ship75% Ship

Chat with your local coding agent from Telegram, Slack, or Discord on your phone

I run Claude Code on long research tasks that take 10-15 minutes. Being able to check progress and redirect from Telegram while I make coffee is genuinely useful. The Tauri footprint is tiny — it doesn't slow my machine down sitting in the background. Session handover between terminal and mobile works cleanly for Claude Code.The Builder
268
B
Broccoli
Ship75% Ship

Self-hosted agent that watches your Linear tickets and opens PRs for you

Self-hosted is the keyword that matters here. You own the infra, the prompts, and the API calls. For any team with compliance requirements or proprietary code concerns, this is the only sane way to run a coding agent that touches your tickets. The dual Claude + Codex review on every diff is a smart trust-but-verify layer.The Builder
269
C
Claude Context
Ship75% Ship

Make your entire codebase the context for Claude Code agents

This is the missing piece for Claude Code on large repos. I've been pasting files manually like a caveman—having semantic vector search as an MCP server means the model always has the right context without me playing file manager.The Builder
270
G
GOModel
Ship75% Ship

44x lighter AI gateway in Go — one API for 10+ providers

Finally a Go-native AI gateway that isn't a Python container in disguise. The two-layer caching alone pays for itself in API costs on any repetitive workload. Self-hosting this on a small VM is trivially easy compared to standing up LiteLLM with all its dependencies.The Builder
271
C
Cosine Swarm
Ship75% Ship

Parallel AI agent swarms for long-horizon software engineering

Long-horizon task decomposition is the actual frontier. Anyone who's tried to get a single Claude Code session to handle a multi-day feature build knows the context collapse problem. Parallel swarms with merge logic is the right architectural answer.The Builder
272
C

Self-initiated AI background agents that maintain your repos without being asked

This is the missing piece of the agentic coding stack. Every team using Cursor or Claude Code knows the dirty secret: the AI writes the feature, then humans do the boring maintenance forever. Daemons attack that problem directly with a config-as-code model that fits naturally into existing repo workflows.The Builder
273
Z
Zindex
Ship75% Ship

Stateful diagram engine designed specifically for AI agents to build persistent visuals

The Diagram Scene Protocol is a genuinely clever idea — treating a diagram as a mutable data structure rather than a generated string. Anyone who's debugged malformed Mermaid output from a coding agent will immediately see the appeal. The 40+ validation rules alone would save hours of prompt-tuning.The Builder
274
R
RAG-Anything
Ship75% Ship

One unified pipeline for RAG across text, tables, images, and figures

Handling mixed-modality documents is where every DIY RAG pipeline breaks down. The unified approach means you don't wire together five separate parsers before you can even start indexing. HKUDS has shipped LightRAG and other credible work — this isn't a beginner's first RAG project.The Builder
275
R
RLM
Ship75% Ship

Run recursive self-calling LLMs with sandboxed execution environments

Finally a clean abstraction for recursive inference without building the scaffolding yourself. The sandbox configurability means you can experiment with different execution environments without rewriting your harness each time. For researchers reproducing chain-of-recursive-thought papers, this cuts setup time dramatically.The Builder
276
C
Claw Code
Ship75% Ship

Open-source rewrite of the Claude Code agent harness — 72k stars

72k stars in under three weeks is a market signal, not a coincidence. The ability to inspect and extend the agent harness layer is what enterprise teams have been waiting for — you can now audit exactly what your coding agent decided to do and why. The Rust core means performance isn't sacrificed for openness.The Builder
277
E
Embedist
Ship75% Ship

Board-aware AI debugging meets real-time serial monitor — for embedded devs

Board-aware context is the thing that's been missing from every other AI coding tool for embedded work. The hardware-specific debugging for ESP32 and Arduino is genuinely useful and the PlatformIO integration means you don't need to leave the app to build and flash. Ship it.The Builder
278
C

Wire Claude's desktop app to real hardware via Bluetooth Low Energy

This is the kind of creative glue project that opens up a whole new class of Claude experiments. Using the existing desktop session instead of burning API credits is clever — I can see this being the basis for some genuinely interesting ambient AI hardware builds.The Builder
279
R
RealStars
Ship75% Ship

Detects fake GitHub stars using CMU research — A to F repo scoring

This should be built into GitHub natively, but until Microsoft acts, install this immediately. The CMU research backing gives the heuristics credibility beyond vibes. The Claude Code plugin integration is thoughtful — checking star quality while you're evaluating a dependency is exactly the right moment.The Builder
280
Q
QA Crow
Ship75% Ship

Write browser tests in plain English, run them in real browsers instantly

For teams under 10 engineers who ship fast and hate Playwright config debt, this is a no-brainer trial. Ryan's background means this isn't a weekend project — the real-browser execution and mobile coverage are the technical differentiators that matter. Try the free tier before your next sprint.The Builder
281
P
Pegasus 1.5
Ship75% Ship

Turn 2-hour videos into structured JSON metadata with a single API call

The schema-defined output is the killer feature — instead of getting a blob of unstructured transcript, you get exactly the JSON shape your database or downstream agent expects. For anything involving long video content (meetings, interviews, lectures, games), this is genuinely infrastructure-level useful.The Builder
282
M
MLJAR Studio
Ship75% Ship

Jupyter notebooks reimagined around conversation — local AI, no cloud required

The local Ollama support plus standard .ipynb output is the right combination — you get AI-native UX without cloud lock-in or file format churn. Auto-error-fixing is a genuine productivity unlock for data scientists who spend 30% of notebook time debugging import errors and shape mismatches.The Builder
283
S
smolvm
Ship75% Ship

Ship portable Linux VMs that boot in under 200ms — isolation by default

This solves the AI agent sandbox problem cleanly. Sub-200ms boot, declarative Smolfile config, and OCI compatibility means you can integrate it into a CI pipeline in an afternoon. The network-off-by-default stance is exactly right — I want to opt into exposure, not opt out.The Builder
284
E
Evolver
Ship75% Ship

AI agents that evolve themselves using Genome Evolution Protocol

This scratches a real itch — agent reliability is the #1 pain point right now and most solutions are 'add more evals.' Evolver's GEP loop is opinionated and that's a feature, not a bug. The Claude Code + Cursor hooks mean you can drop it into existing workflows today.The Builder
285
C

Runnable 5-layer stack that enforces RAG output against retrieved context

The Enforcement layer is the real insight here — I've seen so many RAG systems where the LLM just ignores the retrieved context and answers from weights anyway. Having a verifiable check that output actually uses retrieval is table stakes for production. This implementation shows exactly how to do it.The Builder
286
B

Headless browser API for agents with AI-native self-registration via math challenges

Credential provisioning is the unsexy bottleneck everyone ignores until they're trying to deploy 50 agents. Agent self-registration via challenge-response is clever engineering — the question is whether the math challenge obfuscation is actually robust. But even a partial solution here saves hours of DevOps per agent.The Builder
287
O
Ovren
Ship75% Ship

Assign backlog tickets to AI engineers — get reviewed PRs back

The GitHub integration is seamless and the execution reports are actually useful — they tell me what the AI did and why, so review is fast. It handled a backlog CSS refactor ticket in 4 minutes that would have taken a junior dev half a day. The free tier lets you evaluate it risk-free on real tasks.The Builder
288
M
MemPalace
Ship75% Ship

Free AI memory that stores conversations verbatim — no summarization, no API costs

Zero API cost memory is the killer feature here. I was paying $40/month for Mem0 to give my coding agent project context — MemPalace does the same thing for free and runs entirely local. MCP integration works cleanly with Claude Code and Cursor out of the box.The Builder
289
C

49-agent Claude Code scaffold for full game dev production teams

The propose-before-act pattern with human approval gates is the right architecture for a domain where a wrong asset pipeline decision cascades into hours of rework. 72 slash commands sounds like bloat until you realize each one encodes game-dev-specific institutional knowledge. This is closer to a custom IDE for game dev than a chatbot wrapper.The Builder
290
M
Multica
Ship75% Ship

Assign tasks to AI coding agents like a human team member

The skill compounding model is the right answer to the 'why does the agent keep forgetting how we do X' problem. Extracting solutions into reusable playbooks means the system gets smarter about your codebase over time rather than starting cold every session. Multi-agent support with a single task board is what engineering managers actually need to deploy this in a team context.The Builder
291
T
T3 Code
Ship75% Ship

A clean web GUI for Codex and Claude coding agents — no IDE required

Running `npx t3` and getting a browser UI for Codex and Claude is genuinely convenient for remote dev environments and headless servers where you can't run a full IDE. The T3 team has a track record of clean, opinionated tooling. This fits that pattern.The Builder
292
P
Passmark
Ship75% Ship

AI regression testing in plain English — runs fast, heals itself

The Redis caching architecture is the key insight here — you get AI test authoring without paying per-run LLM costs. Self-healing selectors alone would justify the switch from vanilla Playwright. This is the first AI testing tool I've seen that actually solves the economics.The Builder
293
A
Assemble
Ship75% Ship

Deploy 34 AI coding personas across 21 dev tools in 2 minutes flat

Maintaining consistent agent configs across Cursor, Claude Code, and Cline manually is genuinely tedious. The fact that this generates native files with zero runtime dependencies makes it auditable and deployable anywhere — including strict enterprise environments that ban external service calls.The Builder
294
F
Fixa
Ship75% Ship

Cloud-native AI agent that builds & deploys full projects

The persistent agent state between sessions is genuinely new — most AI coding tools forget everything when you close the tab. The automatic error monitoring and proactive fix proposals are early-stage but already useful for catching dumb mistakes in side projects.The Builder
295
L
Libretto
Ship75% Ship

Deterministic browser automations with AI-powered network reverse engineering

The network reverse-engineering angle is the sleeper feature here. Playwright scripts that target network requests instead of DOM selectors are dramatically more stable. If Libretto can automate the discovery of those API calls reliably, it solves the maintenance headache that makes browser automation so painful at scale.The Builder
296
R
RAG-Anything
Ship75% Ship

Unified multimodal RAG pipeline for docs, images, tables, and mixed content

The 'RAG on real documents' problem is genuinely hard and genuinely painful. Every enterprise RAG project I've worked on has hit the table-in-PDF wall within the first two weeks. If RAG-Anything's cross-modal retrieval actually works reliably, this belongs in every production RAG stack.The Builder
297
S
Stage
Ship75% Ship

Puts humans back in control of agent-generated code review

This is exactly the tooling the industry needs right now. My team is merging 10x more code per week thanks to agents, and our review process hasn't scaled. Risk-based routing that puts humans where they matter — security, API contracts — is the right mental model. Shipping this to our stack next week.The Builder
298
D
dora-rs
Ship75% Ship

10-17x faster than ROS2 — real-time robotics in Rust

If you're building anything robotics or real-time sensor-fusion adjacent, dora is worth a serious look. The zero-copy Arrow pipeline alone eliminates hours of debugging weird serialization bugs I've had with ROS2. Hot-reload for Python nodes during dev is a genuine quality-of-life win.The Builder
299
C
CodeBurn
Ship75% Ship

Track and cut your AI coding spend across every tool you use

This is exactly the observability layer AI coding has been missing. Knowing that 40% of my Claude Code tokens went to a single poorly-scoped context window is the kind of insight that pays for itself in the first week. The 'optimize' command is genuinely useful, not just marketing copy.The Builder
300
R
Rapid-MLX
Ship75% Ship

Run local LLMs on Apple Silicon — 4.2x faster than Ollama

The 4.2x Ollama claim initially seemed like benchmark cherry-picking, but the MLX-native optimizations are real and documented. Drop-in OpenAI API compatibility means I can point my existing agentic tooling at it without code changes. For offline development on a MacBook Pro M4, this is my new default.The Builder
301
C

Claude Code gets mouse support and flicker-free terminal rendering

The flickering was genuinely annoying during long agent runs — watching the terminal strobe while Claude generates 500 lines of code breaks concentration. Flicker-free rendering alone justifies this update. Mouse support is a nice-to-have for most devs but will matter a lot to anyone transitioning from GUI tools to terminal-first workflows.The Builder
302
K
King Louie
Ship75% Ship

Local-first desktop AI agent with 20 tools — no cloud account required

Bring-your-own-key, MIT licensed, works on all three platforms, embeds across Telegram/Discord/Slack — King Louie checks every box for a local-first AI agent setup. The cron scheduling and webhook support mean it's actually production-ready for personal automation, not just a demo. Highly recommended for developers who want control over their AI stack.The Builder
303
O

OpenAI's official lightweight multi-agent Python SDK

Swarm was already my go-to for prototyping before this official SDK dropped. The typed handoffs and clean decorator API make it easy to reason about agent graphs. If you're building on GPT-5, use the official SDK — the upgrade path and support will be there.The Builder
304
S
smolvm
Ship75% Ship

Sub-200ms microVMs for sandboxing AI coding agents safely

This is the missing layer for anyone running AI agents that execute code. Docker containers have always been too porous for untrusted execution, and smolvm's sub-200ms coldstart means you can spin a fresh VM per agent turn without killing your latency budget. The AGENTS.md is a thoughtful touch — shows the authors actually understand the workflow.The Builder
305
S
stagewise
Ship75% Ship

Frontend coding agent that sees your live running app

Finally, an agent that doesn't need me to paste error messages manually. The browser-native visibility means it catches the runtime issues that trip up every other coding agent. BYOK is the right call — no lock-in, no data exposure concerns. I'd use this today on a legacy React codebase.The Builder
306
M
MDV
Ship75% Ship

Markdown that embeds live data, charts, and slides — docs that stay current

I've been writing separate README, dashboard, and slide deck for the same data for years. MDV collapsing those into one source-of-truth file is the kind of DRY solution I didn't know I needed. The frontmatter-extension approach means it works in existing markdown tooling. Shipping for internal docs immediately.The Builder
307
M
Magika 1.0
Ship75% Ship

AI-powered file type detection — 99% accurate, 200+ formats

The Rust rewrite is the headline — I can now call Magika as a library from any Rust or C-compatible project with zero Python startup overhead. 99% accuracy on 200 formats from a tiny deep-learning model is genuinely impressive, and 'Google has been running this in production for years' is exactly the confidence signal I need before dropping it into a security-critical pipeline.The Builder
308
F
farmer
Ship75% Ship

Approve AI agent tool calls from your phone — swipe to allow or deny

This solves the exact anxiety of kicking off a Claude Code session and then walking away. The swipe-card mobile UI is well thought out — you can do a quick code review of the pending command right from the notification. The adapter interface is clean enough that I could wire it to my own agents in an afternoon.The Builder
309
O
OpenSRE
Ship75% Ship

Open-source AI SRE agent that investigates production incidents autonomously

The 40-integration coverage is what separates this from toy demos. It actually connects to the full on-call stack — PagerDuty, Grafana, Loki, k8s events — and the hypothesis-ranking approach mirrors how senior SREs actually debug. This is ready to handle real incidents.The Builder
310
C
Codestral 2
Ship75% Ship

Mistral's 22B Apache 2.0 code model beats GPT-4o on HumanEval

Apache 2.0 + fill-in-the-middle + 256K context is the trifecta I've been waiting for in a locally-runnable code model. The HumanEval numbers are believable based on my early testing — it's genuinely competitive with GPT-4o on completion tasks, which is remarkable at this size and license.The Builder
311
T
t3code
Ship75% Ship

A minimal web GUI for running Codex and Claude coding agents

If you're already paying for Codex or Claude API access, t3code is the obvious choice over locking into a $20/mo IDE subscription. The `npx t3` DX is exactly right — zero install friction, works in any project. 9k stars in two months tells you developers agree.The Builder
312
C

49-agent game development studio that runs entirely inside Claude Code

The studio hierarchy with defined escalation paths is what makes this actually useful versus a list of prompts. When the QA agent flags a design issue, it knows to route to the design lead, not dump it on the director. That kind of structure makes multi-agent workflows manageable.The Builder
313
C

Give your AI agent full access to a live Chrome session

This is the missing piece for AI-assisted web development. My agent can now write a component, open Chrome, visually inspect it, run Lighthouse, and file a bug — all without me touching the keyboard. The existing-session attachment is the killer feature; no more surrendering credentials to a headless browser.The Builder
314
P
Plain
Ship75% Ship

A Django fork rebuilt for AI agents — typed, predictable, agent-readable

The `.claude/rules/` integration and typed APIs are exactly what you want when you're letting agents modify your codebase. OTel built-in is a legitimate win — no more strapping on tracing as an afterthought. If you're starting a new Python project in 2026, Plain is worth serious consideration.The Builder
315
C
Craft Agents OSS
Ship75% Ship

Open-source desktop app for running AI agents across 32+ integrations

This is the missing middle layer between raw SDK calls and fully managed platforms. 32 integrations with zero config and a headless mode means you can drop it into an existing workflow in under an hour. Apache 2.0 license is the cherry on top.The Builder
316
G

Google's production-ready framework for building AI agents

The 1.0 stable tag finally gives us something to build on. The graph-based execution engine is exactly what I want for deterministic multi-step pipelines where I can't afford unpredictable LLM routing. Native MCP support means my existing tool ecosystem plugs straight in without adapter layers.The Builder
317
I
IsItAgentReady
Ship75% Ship

Scans any website for AI agent readiness across 36 checkpoints

The MCP server integration is the killer feature — I ran it directly from Claude Code on three client sites and had actionable fixes within a minute. The robots.txt check alone is worth the trip: most sites are blocking AI crawlers without realizing it.The Builder
318
K
Kampala
Ship75% Ship

MITM proxy that reverse-engineers any app into a stable, callable API

This is the tool I've been building in-house at three different companies and never had time to productize properly. The auth chain tracing alone — tracking token refresh flows and session state automatically — would have saved me hundreds of hours. If it works as advertised, it's an instant ship for anyone doing integration work.The Builder
319
M
Marky
Ship75% Ship

Lightweight macOS markdown viewer built for agentic coding workflows

Under 15 MB, Tauri/Rust, instant open, live reload — this is the tool I didn't know I needed for reviewing agent-generated docs. The Cmd+K fuzzy search across documents is the right power-user feature. Exactly the kind of focused tool that's worth having in your dock.The Builder
320
C
CodeBurn
Ship75% Ship

Token cost analytics and waste finder for AI coding tools

I ran this on a week of Claude Code sessions and immediately found I was spending 30% of my tokens re-reading the same five config files. The menu bar widget is the killer feature — seeing the cost counter tick up while you work changes your behavior instantly. Instant install for anyone serious about AI coding.The Builder
321
M
MMX CLI
Ship75% Ship

One CLI for text, image, video, speech, music, and web search via MiniMax

Unified API access to text + image + video + speech in one CLI with a single auth token is a genuine workflow improvement. The Claude Code integration means I can write agents that generate multimedia without ever leaving my development environment. The pay-per-use model also means no minimum commitment.The Builder
322
S
Superpowers
Ship75% Ship

A shell-based agentic skills framework and dev methodology

This is exactly the tooling I didn't know I needed. The shell-native approach means zero framework lock-in — works with Claude Code, Cursor, or whatever agent comes next. Jesse Vincent has been building great dev tools for decades and this has the same clean opinionated feel.The Builder
323
Q
QA.tech
Ship75% Ship

AI agent that auto-tests your app on every PR — no code needed

The selector-free approach is genuinely appealing to anyone who's wasted hours fixing brittle Playwright tests after a designer changed a class name. If the knowledge graph adapts to UI changes reliably in practice, this could replace an entire category of test maintenance work that nobody enjoys.The Builder
324
A
Android CLI
Ship75% Ship

Google's terminal-first Android SDK — 70% fewer tokens, 3x faster for agents

Android development has always had a painful amount of setup and boilerplate tooling. The token reduction numbers are plausible — most of the waste in AI-assisted Android dev comes from agents re-reading Gradle configs and SDK docs that should just be injected directly. The 'android docs' command for grounded documentation is the feature I'll use most.The Builder
325
T
Thunderbolt
Ship75% Ship

Self-hosted enterprise AI client from Mozilla — no cloud required

The OIDC support and multi-backend inference proxy out of the box are genuinely useful. Most open-source AI frontends make you roll your own auth from scratch. Mozilla's Thunderbird team knows enterprise distribution — this isn't some weekend project that'll be abandoned in a month.The Builder
326
C

Git-compatible versioned storage built for AI agent workflows

This is the missing primitive for agentic coding pipelines. Every time I've built multi-agent workflows I've ended up bolting on some hacky version control layer — this solves it properly. The ArtifactFS driver for async clones is the detail that makes it actually fast enough to use in production agent loops.The Builder
327
A
Android RE Skill
Ship75% Ship

Claude Code plugin that decompiles APKs and maps their full HTTP API

328
C
Claude 4 Sonnet
Ship75% Ship

Anthropic's sharpest agent yet — now with hands on your keyboard

Multi-step tool orchestration that actually holds context across a long chain of calls is a genuine unlock for agentic pipelines — I've been waiting for this since function calling became a thing. The computer-use layer means I can automate legacy UI tasks without scraping brittle HTML or writing a custom Playwright script. Reduced pricing is the cherry on top; this goes straight into production.The Builder
329
V
v0 3.0
Ship75% Ship

From prompt to full-stack app — with auth, APIs, and a database.

v0 3.0 is the leap I was waiting for — going from UI snippets to actual deployable full-stack apps changes the calculus entirely. Auth scaffolding and one-click Postgres mean I can hand off prototyping to v0 and spend my cycles on the hard product logic. It's not perfect, but the escape hatches into real Next.js code keep it from being a walled garden.The Builder
330
A
agent-skills
Ship75% Ship

Production-grade engineering skills library for AI coding agents

Having security audits, test generation, and spec creation as first-class slash commands changes how you think about agent-assisted development. The cross-tool compatibility (Claude, Cursor, Gemini) means you can standardize across a team with mixed tool preferences. Fork it, customize the checklists, and you have a company playbook.The Builder
331
A
Agent!
Ship75% Ship

Native macOS AI coding agent — no subscriptions, 17 LLMs, full undo

The Time Machine undo alone makes this worth trying — every AI coding tool should have this and almost none do. Bring-your-own-keys with 17 providers means you're not locked in. The Accessibility API integration is powerful for automating macOS tasks beyond just code.The Builder
332
E
Eyeball
Ship75% Ship

Embeds source screenshots in AI analysis to kill hallucinations

This is one of those ideas that makes you think 'why isn't every AI analysis tool doing this?' The implementation is simple — capture screenshots of the source during analysis — but the trust it builds in the output is enormous. I'd use this immediately for any contract or regulatory review workflow.The Builder
333
P
Pluck
Ship75% Ship

Click any website UI, get a clean AI coding prompt for it

I do this workflow manually constantly — inspect element, copy classes, paste into Claude, iterate. Pluck automates the messy part. The authenticated-page support is the killer feature; most competitors only work on public sites. $10/month is genuinely cheap for the time it saves.The Builder
334
C
ClawTab
Ship75% Ship

Tame 20+ AI coding agents from one macOS dashboard

I've been managing 8 Claude Code sessions in tmux and it's chaos. ClawTab's labeled panes with per-agent status finally makes parallel agent work legible. The auto-yes mode alone saves me from interruption fatigue on long agent runs.The Builder
335
A
Agent Card
Ship75% Ship

Virtual Visa cards your AI agents can issue and spend themselves

This is the piece I've been waiting for. I build procurement agents and the payment step always requires human intervention. A merchant-scoped, dollar-capped virtual card with MCP support changes that completely. The 1.5% fee is trivially worth it for what it unlocks.The Builder
336
O

Vercel's open blueprint for durable cloud coding agents with git & sandboxing

The snapshot/resume sandbox is the piece everyone keeps reinventing badly. Having a reference implementation from Vercel that shows the right way to do durable agent state is genuinely useful — I'll fork this as a starting point for my next agent project.The Builder
337
C
claude-mem
Ship75% Ship

Auto-captures and AI-compresses your Claude Code sessions into searchable memory

The re-orientation problem is real and annoying. I spend 15 minutes every morning catching Claude Code up on what we built yesterday. claude-mem's compressed session captures are a good pragmatic fix until Anthropic builds proper memory into the product.The Builder
338
S
Stagewise
Ship75% Ship

The coding agent that sees your live app — DOM, console, and all

Browser-native debugging context for a coding agent is a genuinely different approach. When the agent can see your console errors and DOM state in real time, it makes dramatically better edits than agents that only see source code. The reverse-engineering feature — extract components and design tokens from any site — is something I've been doing manually for years. BYOK keeps costs transparent.The Builder
339
C
claudectl
Ship75% Ship

One terminal dashboard for all your Claude Code sessions — with spend controls

Running 4+ parallel Claude Code sessions without a unified view is chaos. Claudectl gives me a single pane showing spend rate, context window usage, CPU, and activity for all of them simultaneously. The budget kill-switch alone has saved me from runaway agent spend multiple times. Free, open-source, Homebrew installable — this is essential infrastructure for anyone serious about multi-agent coding.The Builder
340
K
Kelet
Ship75% Ship

Reads your LLM traces, finds failure patterns, and hands you the prompt fix

The loop has been open for too long — collect traces, stare at them, guess at fixes, repeat. Kelet closes it. Read-only access is the right trust model for early adoption. If it actually surfaces actionable prompt patches instead of generic insights, this becomes a staple of any serious LLM app development workflow.The Builder
341
A
Astropad Workbench
Ship75% Ship

Remote desktop for headless Macs — built for managing AI agents 24/7

If you're running agents on a headless Mac Mini, this fills a real gap. The voice dictation-to-terminal feature alone saves constant context-switching. LIQUID protocol latency is noticeably better than Screens or Remotix on the same network. At $10/month it's easy to justify if you spend more than 2 hours a week babysitting agents.The Builder
342
L
Libretto
Ship75% Ship

Deterministic browser automations for AI agents — 95% success rate

Record-replay with LLM fallback is the right architecture for production browser automation. The 95% vs 70% success rate gap is enormous when you're running 1000+ workflows. The Playwright integration means zero migration cost for existing projects — just wrap your sessions.The Builder
343
V
Vercel AI SDK 5.0
Ship75% Ship

Native MCP client + streaming agent loops for every model provider

This is the SDK I've been waiting for. Native MCP client support alone saves me from maintaining a rats' nest of custom glue code, and the unified streaming interface across 30+ providers is a genuine competitive moat. Persistent agent loop primitives are the cherry on top — multi-step reasoning pipelines now feel like first-class citizens rather than weekend hacks.The Builder
344
M
Mistral 4B
Ship75% Ship

Compact, powerful AI that runs natively on your device — no cloud needed.

Apache 2.0 plus competitive MMLU scores in a 4B parameter footprint is a serious combo — this is the model I've been waiting for to ship local AI features without apologizing for quality. It runs on consumer GPUs and mobile NPUs, which means the deployment story is finally sane. If you're building anything that needs on-device inference, this is your new baseline.The Builder
345
P
Pretty Fish
Ship75% Ship

Free, beautiful Mermaid diagram editor that works offline

The official Mermaid live editor is clunky and slow. Pretty Fish loads instantly, works offline, and the multi-page workspace means I can manage all my architecture diagrams in one place. Bookmarking this immediately as my default Mermaid editor.The Builder
346
S

Your filesystem IS the vector database for AI agents

I've been burned too many times by embedding pipelines that drift when models update and vector indexes that mysteriously degrade. Filesystem-native memory is zero-dependency, trivially inspectable, and you can version it with git. For structured agent memory this is genuinely compelling.The Builder
347
L
Libretto
Ship75% Ship

AI browser automation that doesn't break every other deploy

This is the right mental model for production browser automation. Using AI for authoring but not runtime means you get consistency in CI without random failures at 2am. I've been waiting for someone to build this properly.The Builder
348
C
CC-Beeper
Ship75% Ship

A floating macOS widget that shows exactly what Claude Code is doing

I've been running Claude Code tasks for hours and constantly alt-tabbing to check the terminal. CC-Beeper solves exactly that problem. The hook integration is clean — seven scripts and a localhost port, nothing invasive. The YOLO mode is perfect for trusted local tasks. Swift 6 + SwiftUI means it's fast and native, not an Electron tax. Ship immediately.The Builder
349
L

AI fullstack engineering with project tabs and local MCP server support

Local MCP support is the key upgrade here—Lovable agents can now reach into your local environment, which dramatically expands what you can build. Multi-tab project management was overdue. This makes Lovable a real contender for complex projects, not just prototypes.The Builder
350
C
Clide
Ship75% Ship

AI-native Mac terminal: grid-layout panes, agent that drives your shells

Clide nails the architecture: terminal-first, AI as assistant rather than owner. The native SwiftUI build means it's fast and doesn't eat 4GB of RAM like Electron alternatives. Grid panes plus agent control is exactly what I want for complex multi-process debugging sessions.The Builder
351
M
MarkItDown
Ship75% Ship

Convert any file to Markdown — PDFs, Office docs, audio, images

MarkItDown solves the boring-but-critical problem of getting messy enterprise docs into LLM-friendly formats. The breadth of format support—PDF, PowerPoint, Excel, YouTube URLs, audio—means one library covers your whole intake pipeline. 108k stars is the market's verdict.The Builder
352
V
Voicebox
Ship75% Ship

Open-source voice synthesis studio that runs 100% locally

Finally a local TTS stack I can actually ship in a product. The REST API plus multi-engine support means I can swap models without changing my app code, and zero per-character costs changes the economics entirely for high-volume use cases.The Builder
353
A
Agent Lightning
Ship75% Ship

Train and optimize any AI agent across any framework with near-zero code changes

Framework-agnostic agent training is the gap nobody talks about. Most teams are spending weeks retrofitting optimization logic into agents built on whatever framework they grabbed first. Agent Lightning's emit() approach is low-ceremony and the RL + prompt optimization combo in one package is genuinely useful.The Builder
354
C
CatDoes v4
Ship75% Ship

An AI agent with its own cloud computer builds your mobile apps

The closed-loop debugging is the real differentiator. Most AI code generators dump code on you and walk away — Compose actually runs the result and iterates. At $20/month with code export and GitHub sync, it's a serious prototyping accelerator even for experienced devs who just want to skip the boilerplate.The Builder
355
K
Kelet
Ship75% Ship

AI agent that diagnoses why your LLM app failed in production

Kelet solves the specific hell of debugging AI agents in production: thousands of traces, failure patterns scattered across sessions, and no clear signal about which prompt, which agent, or which data caused the issue. The credit assignment for multi-agent chains is the killer feature — knowing exactly which subagent in a CrewAI or LangGraph chain broke is worth the integration cost alone. Five-minute setup via SDK and OpenTelemetry compliance means it plugs into what you're already running.The Builder
356
Y
Yggdrasil
Ship75% Ship

Turns your CLAUDE.md rules from suggestions into enforced constraints

CLAUDE.md files and .cursorrules are basically suggestions that agents ignore whenever they feel like it. Yggdrasil makes rules enforceable: the agent writes code, runs 'yg approve', gets specific violations back, fixes them, and re-verifies before the code ever reaches review. The intelligent scoping that shows agents only the 3-5 relevant rules per file instead of all 200 is the kind of practical detail that shows the builders understand how context windows actually work. CI integration via hash comparison (no LLM calls) means enforcement doesn't cost anything at the gate.The Builder
357
C
ClawRun
Ship75% Ship

Deploy and manage AI agents across all your chat apps in seconds

The pitch is exactly right: 'npx clawrun deploy' and your agent is running with persistent sandboxes, sleep/wake on activity, multi-channel messaging, and budget controls. The TypeScript/Rust stack and Vercel Sandbox deployment target suggest serious infrastructure ambitions. Apache-2.0 licensing means you can self-host or contribute. The multi-channel integration (Telegram, Discord, Slack, WhatsApp) out of the box eliminates the usual boilerplate of wiring messaging into every new agent project.The Builder
358
P
Plain
Ship75% Ship

Django reimagined for humans and AI agents alike

A Django fork that actually makes the right tradeoffs for 2026: drops the legacy baggage, goes all-in on PostgreSQL and type annotations, and adds first-class agent tooling with Claude rules files and installable agent skills. The unified CLI ('plain dev', 'plain fix', 'plain check', 'plain test') is the kind of opinionated ergonomics that makes day-to-day development faster. If you're starting a new Python web project and want it to work well with Claude Code, Plain is worth evaluating seriously.The Builder
359
O
Open Agents
Ship75% Ship

Vercel's open-source reference app for background AI coding agents

The architecture decision to run the agent outside the sandbox VM is clever and underappreciated — it means the execution environment and the reasoning layer can evolve independently. The built-in PR generation and Workflow SDK integration save weeks of plumbing for any team building coding agents.The Builder
360
C
claude-mem
Ship75% Ship

Persistent cross-session memory for Claude Code — auto-capture, compress, and recall

This is one of those tools that should have existed from day one of Claude Code. The fact that agents forget everything between sessions is genuinely painful for long-running projects. The 3-layer token retrieval is clever — it filters before fetching. One-command install, multi-IDE support, local-first. The AGPL license is the main friction for commercial teams.The Builder
361
G
Goose
Ship75% Ship

Local open-source AI agent in Rust — works with 15+ LLM providers

Goose in Rust with 15+ provider support is the most serious open-source AI agent for production engineering work. The AAIF donation gives it long-term credibility — this isn't a side project that'll get abandoned when Block's priorities shift. The desktop app is polished and the CLI is fast.The Builder
362
O
OpenAI Codex CLI
Ship75% Ship

OpenAI's lightweight terminal coding agent powered by o3 and o4-mini

For hard algorithmic problems, multi-file refactors, and anything requiring real reasoning depth, Codex CLI with o3 is the best tool in the terminal right now. The Rust performance shows — it's snappy in a way Claude Code sometimes isn't. 67k stars don't lie.The Builder
363
B
Blender MCP
Ship75% Ship

Control Blender 3D with plain English through Claude's Model Context Protocol

This is exactly the kind of MCP integration that makes the protocol click—real creative software with a complex API that's genuinely painful to navigate manually. The one-click addon install and local socket architecture means no cloud routing, no latency surprises. If you're already on Claude's API, this is a free superpower for your 3D work.The Builder
364
C
Caveman
Ship75% Ship

Cut 75% of LLM output tokens without losing technical accuracy

This is one of the most practical DX improvements I've seen in the Claude Code ecosystem. Token budgets are a real constraint, and cutting 75% of output without touching correctness is legitimately impressive. One-command install across every editor seals it.The Builder
365
G
Google ADK
Ship75% Ship

Build multi-agent AI pipelines with Google's open framework

If you're already on Google Cloud, ADK is the cleanest path to multi-agent production systems right now. The Python API is intuitive, the Vertex AI integration removes a lot of DevOps overhead, and 8,200 stars in a few weeks means the community is already finding it useful.The Builder
366
K
Karpathy Skills
Ship75% Ship

One CLAUDE.md file that actually makes Claude Code behave

32,000 GitHub stars don't lie. Four principles that actually address the most painful Claude Code failure modes: hidden assumptions before coding, overengineering beyond scope, cosmetic edits to unrelated code, and vague instructions without measurable success criteria. Install it as a Claude Code plugin once and every project benefits. The fact that Karpathy's specific critique — models 'make wrong assumptions, overcomplicate code, and introduce unrelated changes' — maps exactly to the four principles shows this came from real pain, not theorizing.The Builder
367
C

The missing manual for graduating from vibe coding to agentic engineering

This fills a real gap. The official Claude Code docs are good for basics but thin on production patterns—subagent orchestration, hook design, memory architecture. This repo documents the emergent best practices from the community in a structured way. Bookmark it before your next agentic project.The Builder
368
G
Gemini CLI
Ship75% Ship

Google's free open-source AI agent lives in your terminal

1,000 free requests/day with 1M context on Gemini 2.5 Pro is genuinely crazy good. For hobby projects, side-gigs, and open source work, Gemini CLI just eliminated the cost barrier for terminal AI. Install it alongside Claude Code and let them compete for your prompts.The Builder
369
S
Superpowers
Ship75% Ship

Mandatory workflow skills that keep coding agents on track for hours

This is the missing layer between 'give Claude Code your repo' and 'actually ship production code.' The 2-5 minute task decomposition forces the model to stay focused, and the built-in TDD cycles catch regressions before they stack up. The 152k stars aren't hype — developers have a genuine need for this structure.The Builder
370
G

Spec-driven context engineering system for Claude Code — without the enterprise theater

GSD's five-step workflow (initialize → discuss → plan → execute → verify) with wave-based parallel execution and schema drift detection is the closest thing to a formal engineering discipline for Claude Code projects. The quality gates alone have saved me from shipping broken APIs multiple times.The Builder
371
S
Skills Janitor
Ship75% Ship

9 commands to audit, fix, and prune your Claude Code skills

Every Claude Code power user I know has a graveyard of half-working skills they installed three months ago and forgot. This tool does the unglamorous work of auditing that pile. The usage tracking via conversation history parsing is the killer feature — it doesn't ask you to remember what you used, it checks.
372
T
Tokemon
Ship75% Ship

macOS overlay that monitors token usage across Claude, OpenRouter, ChatGPT in real-time

This is exactly the kind of zero-friction utility that should exist. Token anxiety is real for anyone running Claude Code on a Pro Max plan — a floating overlay that shows you're at 40% quota vs. discovering you're rate-limited mid-session is genuinely valuable. The extensible config system means you can add any service that exposes usage endpoints.The Builder
373
M
Multica
Ship75% Ship

Open-source platform that turns coding agents into real teammates

Multica solves the real problem: once you have more than two AI agents running, you need coordination tooling or things fall apart. The assignee dropdown, skill compounding, and self-hosting option make this the first agent management layer I'd actually use in production.The Builder
374
C
ContextPool
Ship75% Ship

Auto-loads your past coding sessions as context into every new AI session

The 'amnesia problem' in AI coding tools is genuinely one of the biggest productivity drains. Every Monday morning I'm re-explaining my project architecture to Claude Code. ContextPool addresses this directly. The MCP integration means it works without changing my workflow — the context just appears.The Builder
375
W
WinScript
Ship75% Ship

AppleScript for Windows, packaged as an MCP server for AI agents

This fills a gap that has genuinely frustrated Windows developers in the MCP ecosystem. macOS users have had AppleScript and Shortcuts for agent automation for years. WinScript finally gives Windows a standardized interface that any MCP-compatible agent can use without writing custom PowerShell bindings.The Builder
376
M
MiniMax MMX-CLI
Ship75% Ship

One CLI to give AI agents native image, video, speech, music, and search

This is exactly what multi-agent media workflows need — one dependency instead of five. The fact that it runs as a standard CLI means it drops into any agent runtime without custom code. If the API quality is consistent with MiniMax's production models, this could replace a lot of the bespoke media API plumbing in agent codebases.The Builder
377
C
claude-cc
Ship75% Ship

Automatically resume the right Claude Code session per git branch

This is the definition of a tool that should exist. Switching branches to fix a bug, then returning to your feature work, you always lose the conversation thread. claude-cc makes context persistence the default. It's tiny, it has no dependencies, and it does exactly one thing right. Every Claude Code user should have this aliased.The Builder
378
A
Archon
Ship75% Ship

YAML-defined workflows that make AI coding agents reproducible and auditable

Finally, a way to run coding agents without crossing your fingers. The YAML workflow approach is immediately familiar for anyone who's written GitHub Actions — you get predictability, retries, and audit logs instead of hoping the agent remembers what you asked. The 17 pre-built workflows cover 80% of real sprint tasks.The Builder
379
C
Claw Code
Ship75% Ship

Open-source, multi-LLM clean-room rewrite of Claude Code's agent harness

The Python + Rust split is smart engineering — you get orchestration flexibility and execution speed without compromising either. 19 permission-gated tools and MCP support means this is ready for serious use, not just demos. The multi-LLM support is the killer feature Anthropic refuses to build.The Builder
380
M
MarkItDown v0.1
Ship75% Ship

Convert anything to LLM-ready Markdown — now with MCP server and OCR plugin

If you're building RAG pipelines or feeding documents to LLMs, MarkItDown is already the standard answer. The MCP server integration in v0.1 means you can now wire it directly into Claude Desktop for instant document analysis without any custom code. The plugin architecture finally makes extensibility clean.The Builder
381
K

Four rules from Karpathy's LLM coding critiques baked into a Claude Code plugin

I dropped this in my project root on Monday and by Wednesday I'd noticed my Claude sessions were producing tighter PRs. Could be placebo, but the 'surgical changes' rule alone seems to cut diff sizes by 30-40% in my experience. It costs nothing to try.The Builder
382
G
git-why
Ship75% Ship

Persist AI agent reasoning traces alongside your code in git history

The commit message has always been inadequate documentation and AI-generated code makes this worse, not better. git-why is the first tool I've seen that treats agent reasoning as a first-class artifact of the development process. This is especially valuable for onboarding — imagine joining a codebase and being able to ask 'why does this function exist?' and getting the actual AI's reasoning chain.The Builder
383
L
Litmus
Ship75% Ship

Unit tests for AI — find the cheapest model that passes your prompts

Every production AI team needs this and most are doing it manually with spreadsheets. The cost projection feature alone is worth shipping — I've watched teams spend 10x more than necessary on inference because they never systematically tested cheaper models. This is the tooling that makes responsible model selection practical.The Builder
384
B
BrainCTL
Ship75% Ship

Portable SQLite brain for AI agents — 192 MCP tools, zero servers

192 MCP tools in one pip install with a single SQLite file as the backend is an incredibly developer-friendly design. No infra, no API keys, no cost per memory operation. The LangChain and CrewAI adapters mean I can drop this into existing projects with one line.The Builder
385
C
Claudraband
Ship75% Ship

Make Claude Code sessions resumable, headless, and programmable

This is exactly what Claude Code has been missing. Session persistence and HTTP control turn it from a great interactive tool into something you can actually build pipelines around. The ACP server for editor integration is the feature I didn't know I needed.The Builder
386
M
marimo-pair
Ship75% Ship

AI agents that live inside your running Python notebook and see your data

The gap between 'AI sees your code' and 'AI runs in your environment with live data' is enormous for data science work. I've wasted hours explaining context to LLMs that could have just looked at the dataframe. This closes that loop completely.The Builder
387
G
Gemini CLI
Ship75% Ship

Google's open-source terminal AI agent — free Gemini 2.5 Pro in your shell

Free Gemini 2.5 Pro with 1M context in my terminal, Apache 2.0 licensed, with MCP support? This should have been a paid product and Google is giving it away. For hobby projects and open-source work, this is an instant install.The Builder
388
M
Multica
Ship75% Ship

Assign tasks to coding agents like teammates, not just tools

The auto-detection of available CLI tools (Claude Code, Codex, OpenCode) means I can use whatever model works best for each task without rebuilding my setup. The WebSocket streaming means I can actually watch what's happening — a massive improvement over blind async execution.The Builder
389
A
Archon
Ship75% Ship

Define AI coding workflows in YAML — execute them deterministically

This is what we've been missing. One-shot coding agents are great for demos but terrible for production pipelines. YAML-defined workflows with git worktree isolation finally give you the repeatability you need to run AI coding at scale. The Stripe-style PR automation is within reach for any team now.The Builder
390
M
MassGen
Ship75% Ship

Run 15+ AI models in parallel — let them critique each other until they converge

The terminal-native ensemble approach is genuinely novel. Being able to spin up Claude, GPT-5, and Gemini on the same hard problem and watch them debate is something I've wanted for ages. Adds real value for decisions where a single model's confident wrong answer would cost you hours.The Builder
391
B
Buildermark
Ship75% Ship

See exactly how much of your codebase was written by AI, commit by commit

Unified attribution across Claude Code, Codex, Gemini, and Cursor simultaneously gives me something no single agent tool provides. Commit-level AI attribution is genuinely useful before merging — I want to know if a section is heavily AI-generated so I can give it proportionally more review attention.The Builder
392
C

Community-curated mega-guide to getting the most from Claude Code

This is the first tab I open when onboarding a new engineer to a Claude Code project. The CLAUDE.md patterns and MCP server config examples saved our team at least a week of trial-and-error. Bookmark it immediately and check for updates weekly — it's living documentation.The Builder
393
D
Domscribe
Ship75% Ship

Gives AI agents source-to-DOM traceability — click any element, get the code

This fills a real gap I've been hitting weekly. When I tell Claude to 'fix the button in the header,' it has no idea which file that button lives in. Domscribe gives agents ground truth about the rendered DOM — it's the missing link for serious agentic frontend work.The Builder
394
S
Superpowers
Ship75% Ship

7-step agentic dev methodology for Claude Code, Cursor, and Gemini CLI

I've been burned too many times by coding agents that thrash around and pollute my working branch. The worktree isolation step alone is worth adopting — it makes agentic sessions recoverable. The planning doc requirement forces the agent to externalize its reasoning, which dramatically improves complex task completion rates.The Builder
395
O
OpenDataLoader PDF
Ship75% Ship

0.928 table accuracy PDF parser with bounding boxes for RAG citation

Table extraction at 0.928 accuracy is genuinely impressive — I've been wrestling with financial PDF parsing for months and nothing open-source came close. The bounding box output means my RAG system can cite 'page 7, table 3, row 4' instead of just the document name. The prompt injection filter is something I didn't know I needed until I thought about adversarial PDFs.The Builder
396
A
Apfel
Ship75% Ship

Tap Apple's free on-device AI as a local OpenAI-compatible server

If you have an M-series Mac running macOS 26, this is an immediate install — drop-in OpenAI compatibility means you can start running local inference against existing projects in literally 5 minutes. The MCP support and file attachment handling make it genuinely useful for scripted workflows, not just chat. The token limit stings, but for most dev automation tasks 3K words is plenty.The Builder
397
M

One SQL semantic layer so AI agents stop hallucinating your KPIs

We've been burned by data agents that invent their own GROUP BY logic and produce wrong numbers that look right. Metrics SQL solves this at the infrastructure level — define revenue once, have every agent query the same definition. The SQL-native interface means no new tools for agents to learn; they just use the tables.The Builder
398
O
OpenCode
Ship75% Ship

The open-source AI coding agent that works with 75+ models

140K stars isn't hype — OpenCode has real momentum because it solves the actual problem: vendor lock-in. I can use my existing Claude subscription, switch to a local Gemma model when I need privacy, and have it work in every IDE I already use. This is what the coding agent space needed.The Builder
399
M
marimo pair
Ship75% Ship

Drop an AI agent into your live Python notebook session

This is the missing piece for data work with agents. Every time I've tried to use an LLM on a notebook it thrashes the kernel with hidden state — marimo's reactive model actually fixes that at the architecture level. Install it and immediately start running collaborative EDA sessions.The Builder
400
M
MiniMax CLI
Ship75% Ship

Video, speech, music, and text generation from any terminal or agent pipeline

I've been manually wiring MiniMax API calls for multimodal pipelines. Having an official MCP server that handles auth, streaming, and file management is a genuine time save. The fact that it covers video, speech, and music in one interface means I can stop juggling 3 different client libraries.The Builder
401
K
Karpathy Skills
Ship75% Ship

Andrej Karpathy's LLM coding wisdom packed into a single CLAUDE.md plugin

I've noticed a measurable improvement in Claude Code session quality after installing this. The 'verify before ending' principle alone has saved me from shipping broken refactors. It's a one-file install that acts like pair programming guardrails from someone who has thought deeply about LLM failure modes.The Builder
402
F
FoxGuard
Ship75% Ship

Sub-second security scanning across 10 languages, no JVM required

Sub-second scans in a single binary are exactly what's needed for AI-assisted coding workflows. I don't want to wait 20 seconds for SonarQube on every commit — I want instant feedback. FoxGuard as a pre-commit hook gives me a practical security floor without slowing down my agent loop.The Builder
403
S
Shopify AI Toolkit
Ship75% Ship

Let AI coding agents run your Shopify store end-to-end

Finally — a first-party MCP integration for Shopify that doesn't involve scraping the Admin UI or wrapping undocumented APIs. The 40+ tool definitions cover everything I'd want to automate: inventory sync, bulk SEO, discount rules, product variants. Drop it in Cursor and your store basically becomes a dev environment.The Builder
404
A
Ant CLI
Ship75% Ship

Anthropic's official CLI for the Claude API with YAML-native agent versioning

YAML-versioned agent configs that you can diff and deploy from the terminal is exactly what's been missing from the Claude ecosystem. I've been committing prompt strings to git as plaintext — Ant treats them as proper infrastructure. The Managed Agents integration means I can ship an agent to production with one command.The Builder
405
M
Multica
Ship75% Ship

Self-hosted managed agents — assign issues to AI like teammates

If Anthropic's Managed Agents announcement made you nervous about vendor dependency, Multica is the direct answer. Self-hosted, multi-runtime, and Apache 2.0 — ship this immediately for any team that cares about infrastructure autonomy.The Builder
406
G
GitButler
Ship75% Ship

Virtual branches for humans and AI agents — the Git client for parallel work

I've been using GitButler for six months and the virtual branch model genuinely changes how I work. The agent-native pitch isn't marketing — when AI coding tools make 30 file changes across 5 directories, being able to visually sort those into lanes and ship them independently is a real workflow win. The $17M gives them runway to build the collaboration features that make this useful for teams, not just solo devs.The Builder
407
S
Superpowers
Ship75% Ship

Workflow discipline for AI coding agents — spec first, code second

Jesse Vincent has been building developer tools for decades and it shows — this is opinionated in the right ways. Forcing spec elicitation before code generation is the single highest-leverage intervention you can make on agent output quality. The shell/bash skill design means you can modify and extend it without a new framework to learn. I'm adding this to my workflow today.The Builder
408
H
Hermes Agent
Ship75% Ship

The AI agent that gets smarter with every session

Self-improving agents are the holy grail of the agent space, and Nous Research actually delivers a working implementation. The skill persistence architecture is well-designed — finished tasks become reusable procedures, so the agent gets better at your specific workflow over time. Model-agnostic, cheap to run, serious pedigree. This is the kind of thing you set up once and it compounds.The Builder
409
E
Eyeball
Ship75% Ship

Inline screenshots with every AI claim — hallucination's paper trail

This is the kind of clever, unglamorous tool that actually solves a real problem. The insight that screenshots are harder to hallucinate than quotes is simple but profound. Drop this into any pipeline that serves legal or compliance users immediately.The Builder
410
L

LM Studio buys the best iOS local LLM app to go cross-device

This is the right move for LM Studio. The desktop client is already excellent and Locally AI's Core ML integration is the best iOS inference wrapper available. Combining Grondin's Apple-native work with LM Studio's model management and server mode could produce something genuinely special for local AI power users.The Builder
411
G
Goose
Ship75% Ship

Open-source AI agent built in Rust — install, execute, edit, and test with any LLM

The recipe system is the sleeper feature here. Capture a workflow once, version it in git, run it in CI, share it with your team — that's how you scale agent-assisted development across an org. Goose is the first open-source agent I've seen that treats workflow portability as a first-class concern rather than an afterthought.The Builder
412
N
NVIDIA AITune
Ship75% Ship

One API to optimize any PyTorch model for NVIDIA GPU inference

The auto-backend selection is the killer feature — I can't tell you how many times I've wasted days figuring out whether TRT or Torch Inductor would be faster for a specific model architecture. Shipping this as open source under NVIDIA's AI Dynamo umbrella gives it real staying power.The Builder
413
C
Claw Code
Ship75% Ship

The open-source Rust rewrite of Claude Code that went viral overnight

This is the most important open-source release of 2026 for working developers. It gives me a Claude Code-style agent loop I can audit, fork, and run on my own infra without trusting a single vendor. The Rust performance profile is a bonus.The Builder
414
T
Tether QVAC SDK
Ship75% Ship

Open-source local AI SDK that runs on every device, no cloud needed

The cross-platform abstraction over llama.cpp is something I've been wanting for a while. Usually you're duct-taping together different runtimes for iOS vs Android vs desktop. If QVAC delivers on that single-codebase promise it saves weeks of integration work. The decentralized distribution is a bonus for projects with sovereignty requirements.The Builder
415
T
Twill
Ship75% Ship

Cloud coding agent that ships PRs while you sleep

The GitHub/Linear integration is what sets this apart from just running Claude Code in a container yourself. The task routing and context injection are already well-thought-out. I tested it on a backlog of dependency bumps and it handled 8 of 9 without touching a keyboard. That's real ROI.The Builder
416
G
Gemini CLI
Ship75% Ship

Google's free, open-source terminal AI agent with 1M context window

1M context and free is a combination no other terminal agent matches. I use it specifically for legacy codebase archaeology — when I need to understand a 200k-line repo before I touch it, Gemini CLI is the only tool that can hold the whole thing in memory. For greenfield projects I still reach for Claude Code.The Builder
417
M
MarkItDown
Ship75% Ship

Convert any Office doc, PDF, or image to clean Markdown for LLMs

Already using this in production. The plugin architecture and MCP server are the upgrades that pushed it from 'useful script' to 'actual dependency'. In-memory processing means it works cleanly in serverless environments. This is now the default document parsing layer for every LLM project I start.The Builder
418
O
oh-my-pi
Ship75% Ship

Terminal coding agent with hashline edits — 10x fewer whitespace bugs

Hashline edits alone make this worth switching to. I've lost hours to whitespace-induced diff failures in other agents — oh-my-pi just gets it right. The multi-tool config loading means I don't have to re-document my project rules for every agent I try.The Builder
419
K
Karpathy Skills
Ship75% Ship

Andrej Karpathy-inspired CLAUDE.md guidelines that make AI coding agents less chaotic

420
B
Baton
Ship75% Ship

Run multiple AI coding agents in parallel, each in isolated git worktrees

This is the workflow tool I didn't know I needed. Running three Claude Code instances on different features simultaneously, each in isolation, feels like having a real team. The worktree isolation means no constant merge conflicts — and getting notified when agents finish is genuinely delightful.The Builder
421
C
Cursor 3
Ship75% Ship

A unified workspace for building software with agents — not writing code

422
C
CSS Studio
Ship75% Ship

Draw your UI by hand. An agent writes the code.

The prompt-to-UI loop produces beautiful demos that collapse when you actually try to integrate them. CSS Studio's explicit design-first approach generates code that reflects what you built, not what the model hallucinated — that's a workflow improvement I'll actually use.The Builder
423
G
Grass
Ship75% Ship

Claude Code in the cloud — run agents from your phone, stop burning your laptop

This is exactly the right product for the agentic coding moment — Cursor 3 and Claude Code sessions can run for hours, and nobody wants their laptop locked up for that. Daytona as the underlying environment layer is a solid choice for reproducibility. The mobile monitoring interface is the feature I'd actually use most — steering from your phone mid-session is genuinely different from being tied to a terminal.The Builder
424
B
botctl
Ship75% Ship

A process manager for persistent autonomous AI agents — like systemd for bots

This fills a real gap. Running AI agents as persistent processes with proper lifecycle management — sleep, pause, resume, memory — is something every serious builder eventually cobbles together themselves. botctl gives you that scaffolding out of the box. The BOT.md format is a genuinely clever design choice: your bot is just a file you can git commit.The Builder
425
R
Rubber Duck
Ship75% Ship

A second AI model reviews your Copilot agent's plan before it ships code

The insight here is sharp: models are worst at finding their own mistakes. Using a second model as an independent reviewer is the right call, and it mirrors how good human code review actually works. I want to know which model pairs GitHub is using — the quality of the adversarial check will depend heavily on choosing models with genuinely different failure modes.The Builder
426
A
Archon
Ship75% Ship

YAML-defined coding workflows with isolated worktrees — what Dockerfiles did for infra

The git worktree isolation per workflow run is the killer feature — no more agents clobbering each other's state. The YAML workflow definition is the right abstraction: version-controlled, diffable, shareable across teams. This is what CI/CD looked like before GitHub Actions, and Archon is doing for agentic coding what Actions did for pipelines.The Builder
427
O
OpenDataLoader PDF
Ship75% Ship

#1 GitHub trending: extract AI-ready data from any PDF, locally

The #1 benchmark score at 0.90 isn't marketing — tested against our existing PDF pipeline and table extraction accuracy jumped significantly. Local-only processing with Apache 2.0 means no data leakage and no vendor lock-in. Ship this immediately if you're parsing PDFs for AI.The Builder
428
I
Instant
Ship75% Ship

The real-time backend built for apps coded by AI agents

The undo functionality for destructive LLM actions is underrated. When your coding agent drops a table, having a rollback baked into the backend is the difference between a bad minute and a very bad day. Real-time sync plus agent-safe ops is a useful combination.The Builder
429
S
Shopify AI Toolkit
Ship75% Ship

Give your AI agent live Shopify docs, GraphQL schemas, and real store operations

Live schema validation against actual Shopify API versions is the killer feature. Anyone who's chased a 'deprecated field' error three hours into an agentic coding session knows exactly why this matters. Setup is simple and it works with every major AI coding agent out of the box.The Builder
430
C
Claudoscope
Ship75% Ship

macOS menu bar app to browse, search, and cost every Claude Code session

As someone who runs Claude Code 8+ hours a day, this is immediately valuable. I had no idea which projects were burning through tokens until I installed it. The leaked credential detection is a bonus I didn't expect — it already caught a test API key I'd forgotten to rotate.The Builder
431
M
Modo
Ship75% Ship

Open-source AI IDE with spec-driven dev — plan before you code

The spec-driven pipeline is the real differentiator here — most AI IDEs turn into spaghetti on large refactors because there's no planning phase. Modo's Requirements → Design → Tasks flow gives agents enough context to stay coherent across files. The multi-provider support is a bonus: swap to Ollama for private codebases without changing your workflow.The Builder
432
T
TUI-use
Ship75% Ship

Let AI agents take control of interactive terminal programs

This is the missing piece for automating legacy ops workflows. Half my toolchain is interactive TUI apps that choke every agent pipeline — TUI-use just quietly solves that. The PTY state machine approach is clever and the API is clean.The Builder
433
S
Superpowers
Ship75% Ship

Composable workflow framework that forces AI coding agents to write tests first

141k stars doesn't lie — this fills a real gap. Claude Code is brilliant at generating code and terrible at knowing when to stop and write a test. Superpowers adds the engineering discipline that solo devs usually skip under deadline pressure. The git worktree isolation is a particularly smart detail that prevents agent experiments from trashing your main branch.The Builder
434
N

Browser infra for AI agents with an open benchmark proving real-world performance

The open benchmark is the ballsiest move here — publishing your full execution traces so anyone can verify your claims is rare in this space. Sub-50ms session spin-up and 47s task completion vs Browser-Use's 113s are meaningful numbers for production agents where latency compounds. SOC 2 already sorted is a big deal for enterprise deals.The Builder
435
C
Career-Ops
Ship75% Ship

Claude Code agent that scans 45+ job portals and auto-generates ATS-optimized CVs

This is exactly what Claude Code was made for — a high-signal agentic loop that replaces hours of manual work with a config file and a run command. The fact the creator used it to actually land a job makes it more credible than 90% of 'AI-powered' job tools. Fork it, tweak the scoring weights, ship your apps.The Builder
436
P
Paper2Code
Ship75% Ship

Multi-agent LLM turns any ML paper into runnable code — 0.81% manual fix rate

The reproducibility gap in ML is real and Paper2Code genuinely moves the needle. I tested it on a 2025 diffusion paper with no public code and got a working training loop on the first try. The three-agent architecture — Planner, Analyzer, Generator — is a clean design worth stealing for other doc-to-code use cases.The Builder
437
G
GitNexus
Ship75% Ship

Codebase knowledge graph with MCP — agents finally understand your architecture

This is the missing layer for AI coding agents. Blast radius analysis alone would justify the install — I've spent hours manually tracing dependency chains before letting an agent touch a shared module. The CLAUDE.md auto-gen is a nice bonus for teams standardizing on Claude Code.The Builder
438
M
MCPCore
Ship75% Ship

Build and deploy MCP servers in your browser — no DevOps needed

Setting up a production MCP server with OAuth and encrypted secrets normally takes a day of DevOps work. MCPCore gets you there in 20 minutes with a browser. The auto-generated config exports for Claude Desktop and Cursor are a nice touch — it handles the part of MCP adoption that causes the most friction for non-infra engineers.The Builder
439
M
Mo
Ship75% Ship

GitHub bot that flags PRs conflicting with decisions made in Slack

The scope is exactly right: one job, done well. Architectural drift from forgotten Slack decisions is a real and expensive problem. A bot that sits in the merge gate and catches those conflicts before they ship is worth setting up in any team above five engineers.The Builder
440
G

Fine-tune Gemma 4 with text, images & audio on your Mac

This is exactly what Apple Silicon owners have been waiting for. Running text + image + audio fine-tuning locally without needing a cloud GPU or NVIDIA hardware is genuinely useful — and the LoRA support keeps resource usage manageable. Ship immediately for anyone experimenting with Gemma 4 on a MacBook Pro M4.The Builder
441
C
Claw Code
Ship75% Ship

Open-source Claude Code rewrite — multi-agent orchestration, zero lock-in

72k stars in under a week doesn't lie — developers have been waiting for an open harness layer. The architecture is clean and the ability to swap model backends is exactly what production teams need. This is the foundation for the next generation of AI coding workflows.The Builder
442
A
Apfel
Ship75% Ship

Your Mac's hidden on-device LLM, finally set free

If you're already on the Tahoe beta, this is an instant install. Drop-in Ollama compatibility means every tool I already use just works — no friction, no cost. The MCP + tool calling support is unexpectedly polished for a one-dev project.The Builder
443
L
LiteRT-LM
Ship75% Ship

Run Gemma 4 and other LLMs fully on-device — no cloud required

This is the real deal for edge AI development. The CLI makes it trivial to get Gemma 4 running locally in minutes, and function calling support means you can build actual agentic apps that work offline. Google backing means this won't be abandoned in six months.The Builder
444
A
AgentPulse
Ship75% Ship

Visual GUI for AI coding agents — no CLI required

The parallel agents dashboard is genuinely useful — I often run 3-4 agent tasks simultaneously and tracking them in separate terminals is messy. A unified view with structured diff approval is exactly the interface layer that's been missing from terminal-based agent tools.The Builder
445
O
oh-my-codex
Ship75% Ship

Add AI agent teams, event hooks, and a live HUD to any Git repo

This is the right abstraction layer — repo-level AI hooks that work regardless of what editor you're in. The HUD is surprisingly polished for an indie project. I can see this becoming a standard part of the dotfiles setup for developers who work across multiple editors.The Builder
446
G
GuppyLM
Ship75% Ship

A 9M-param fish LLM that teaches you how transformers actually work

130 lines from raw data to inference — I've never seen a more honest on-ramp to transformer internals. The deliberate omission of RoPE and SwiGLU forces you to understand the delta between vanilla and modern architectures. Assign this to every junior ML engineer before they touch Hugging Face.The Builder
447
R
Recall
Ship75% Ship

Find any file on your machine with a sentence — no tags, no indexing

ChromaDB + Gemini Embedding 2 on local files is a setup I'd have spent a week configuring from scratch. Recall packages this cleanly with a Raycast extension that makes it actually usable day-to-day. The MIT license and zero vendor lock-in seal the deal for me.The Builder
448
M
Modo
Ship75% Ship

AI IDE that writes specs before code — not just a Cursor clone

Spec-driven development is exactly what enterprise AI coding needs. I've watched too many Cursor sessions generate 500 lines of code that ignored the actual architecture. Modo's persistence layer and steering files are the missing piece — this deserves a serious look.The Builder
449
M
Metoro
Ship75% Ship

AI SRE that auto-detects Kubernetes incidents and raises fix PRs

eBPF-based auto-instrumentation that deploys in a minute and then just works is a genuinely good idea. Most K8s observability setups take days to instrument properly and still have gaps. The PR-raising feature is the kind of close-the-loop feature that actually reduces on-call burden rather than adding another alert source.The Builder
450
G
GitNexus
Ship75% Ship

Knowledge graph for any codebase — runs in browser via WASM

This tackles something I've been hacking around manually — pre-feeding dependency graphs into context windows before big refactors. The Graph RAG approach is genuinely smarter than pure embedding similarity for code questions. The MCP integration means it slots directly into Claude Code without any glue code.The Builder
451
P
pi-mono
Ship75% Ship

One monorepo: coding agent CLI, unified LLM API, TUI/web libs, Slack bot, vLLM ops

The mid-session model handoff is a genuinely useful primitive — start cheap with a fast model for exploration, hand off to a smarter model when you hit a hard problem, without restarting context. The vLLM pod tooling bundled in means this covers the full dev-to-deploy loop for teams running their own inference.The Builder
452
O
Onyx
Ship75% Ship

Self-hosted AI platform with RAG, agents, and 50+ connectors — MIT licensed

50+ connectors out of the box plus MCP support means you can actually index your entire company knowledge base without writing glue code. Self-hosting on Docker took about an hour to get running. This is what I wanted Danswer to become — and it did.The Builder
453
M

SOTA multilingual embeddings in 3 sizes — quietly MIT-licensed with zero fanfare

MIT license + SOTA multilingual MTEB scores + 270M/0.6B/27B size options = drop this into your RAG stack immediately. The decoder-only architecture is architecturally interesting but what matters is the benchmark numbers, and they're the best in class. Drop-in replacement for mE5-large or multilingual-e5-large.The Builder
454
N
nanocode
Ship75% Ship

Train Claude Code-style models on TPUs for under $200

This is the kind of project that makes AI research actually reproducible. JAX's JIT compilation gives you near-metal performance on TPUs without writing CUDA, and $200 to replicate a production-grade code model pipeline is genuinely wild. Every indie AI lab should be studying this codebase.The Builder
455
I
imgcmd
Ship75% Ship

Secure CLI that generates real PNGs to disk — no broken SVGs from agents

The --create-rule flag that teaches your IDE to use it natively is the whole product. That's clever distribution — once it's in the Cursor rules, it just works forever. Small tool, real problem solved.
456
M
MemPalace
Ship75% Ship

Persistent cross-session memory for any LLM — local, free, 96% LongMemEval

Verbatim storage avoids the lossy-summary trap that plagues most memory systems. ChromaDB + SQLite locally is a practical stack with minimal operational overhead, and the 170-token retrieval cost is genuinely low. Worth evaluating before paying for any memory-as-a-service layer.The Builder
457
A
Apfel
Ship75% Ship

Free CLI for Apple's on-device LLM — no API key, no downloads, runs on macOS

OpenAI-compatible server on localhost means I can prototype automations and scripts against a real LLM without paying for API calls or waiting on rate limits. The pipe-friendly CLI with proper exit codes is exactly what shell scripting needs. For Mac-native tooling, this is a genuine gap-filler.The Builder
458
H
Handle
Ship75% Ship

Click to tweak your UI, auto-feed changes to your AI coding agent

This solves the exact problem I hit daily — describing spacing tweaks in plain English to Claude Code is maddening when I can just see what I want. A visual picker that spits out precise agent instructions closes a real loop in the AI coding workflow. Free beta makes trying it a no-brainer.The Builder
459
G
GLM-5V-Turbo
Ship75% Ship

Converts design mockups to frontend code, beats Claude at Design2Code

A 94.8 Design2Code score that outperforms Claude at roughly 1/3 the inference cost is a genuine benchmark breakthrough. Open weights mean I can self-host this for a design-to-code pipeline inside my company without paying per-call API fees. Testing immediately.The Builder
460
L
LiteRT-LM
Ship75% Ship

Google's open-source engine for LLMs on phones, browsers & IoT

A unified inference runtime across Android, iOS, browser, and IoT with function calling support is exactly what the edge AI ecosystem has been missing. The WebAssembly path alone opens up private on-device AI in any browser without installing anything. Ship this immediately.The Builder
461
O

Run a prompt through multiple LLMs simultaneously and fuse the best answer into one

Finally, proper multi-model consensus without writing orchestration boilerplate. I've been doing this manually for months — having OpenRouter handle the parallel dispatch and judgment layer in one API call is genuinely useful, especially for high-stakes code review tasks.The Builder
462
M
Mercury Edit 2
Ship75% Ship

Diffusion LLM that predicts your next code edit in parallel — not word by word

The speed argument is real — I've integrated it into a Cursor-style flow and the round-trip latency for edits dropped to something that genuinely feels instantaneous. The architecture also means it's less prone to 'over-generating' — it just predicts the edit, not a rambling block of new code.The Builder
463
M
MolmoWeb
Ship75% Ship

Allen AI's open-weight web agent trained on 36K human task trajectories

78.2% on WebVoyager from a 8B model trained on human data rather than proprietary model distillation — that's a real technical achievement. The 4B version running on consumer hardware opens up use cases that were previously cloud-only. Fine-tunable and fully open is the right call.The Builder
464
O
oh-my-claudecode
Ship75% Ship

Teams-first multi-agent orchestration for Claude Code

The smart model routing is the real win here—automatically sending simple tasks to Haiku and complex reasoning to Opus means you stop burning Opus credits on boilerplate. Team Mode with 19 specialized agents sounds like overkill until you're parallelizing a large refactor across six files simultaneously.The Builder
465
B
Baton
Ship75% Ship

Run multiple AI coding agents in parallel — zero merge conflicts guaranteed

The worktree isolation model is genuinely the right architecture for running multiple coding agents. Each agent gets its own branch, its own working directory, and its own terminal — no stashing, no conflicts, no overwritten files. The built-in diff viewer means I never have to jump between terminals to review changes. The free tier's 4-workspace limit covers most real workflows. $49 once is a bargain if this saves one hour of merge conflict debugging.
466
C
Claude How To
Ship75% Ship

The missing practical guide to mastering Claude Code

The hook event documentation alone is worth bookmarking—25+ events with working examples is something the official docs simply don't have. The CLI headless automation reference for CI/CD is genuinely useful and hard to find elsewhere.The Builder
467
K
Kin-Code
Ship75% Ship

Claude Code reimagined as a 9MB Go binary with zero dependencies

A single binary that does what Claude Code does but works with Ollama too? That's a genuine win for teams running air-gapped or resource-constrained environments. The Go implementation means cross-platform distribution without dependency hell — just download and run.The Builder
468
O
Oh My Codex (OMX)
Ship75% Ship

oh-my-zsh for OpenAI Codex CLI — multi-agent orchestration with 33 prompts

Parallel worktree agents with automatic merge coordination is exactly the missing piece in Codex CLI. I ran three specialized agents simultaneously on a refactor last night and the hooks system handled the integration. 12K stars in a day doesn't lie — ship it.The Builder
469
C
Cursor 3
Ship75% Ship

Cursor evolves from AI IDE to multi-agent coordination platform

The unified agent session sidebar alone justifies the upgrade. I had three parallel agents running — one on tests, one on docs, one on a new feature — all visible and manageable from one interface. The MCP marketplace is early but the architecture is right. Ship.The Builder
470
A
Axolotl v0.16
Ship75% Ship

15x faster MoE+LoRA fine-tuning with 40x memory reduction

40x memory reduction on MoE+LoRA is not a rounding error — this is the difference between needing a $20K H100 and a $1.5K consumer GPU. The Gemma 4 day-0 support means I can fine-tune Google's best open model the same day it drops. Immediate upgrade for any ML pipeline.The Builder
471
T
tldr MCP Gateway
Ship75% Ship

Shrink 41+ MCP tool schemas by 86% before they hit your model

This solves a real problem I've hit personally — when you connect enough MCP servers, you're wasting a quarter of your context window on tool definitions before a single line of code is written. The five-wrapper-tool approach is elegant and the compression numbers are concrete and reproducible.The Builder
472
F
fff.nvim
Ship75% Ship

Frecency-aware file search built for both Neovim devs and AI agents

The frecency + git status scoring is exactly the heuristic I apply manually when navigating large codebases. Giving AI agents access to that same signal via MCP is a practical efficiency gain — fewer context tokens wasted on files that aren't what the model needs.The Builder
473
G
GLM-5V-Turbo
Ship75% Ship

Turn wireframes into production code — 200K context, scores 94.8 on Design2Code

A 17-point lead on Design2Code over Claude Opus, a 200K context window, and $4/M output pricing — that's a compelling combination for any team that's making Figma-to-code a production workflow. I'd run my own evals before fully committing, but the numbers are hard to ignore.The Builder
474
A
AMUX
Ship75% Ship

Run dozens of parallel AI coding agents unattended via tmux

This is exactly what the agentmaxxing workflow needs. Single Python file, no external services, and the kanban board preventing duplicate agent work is genuinely clever engineering. The self-healing watchdog alone saves hours of babysitting stuck sessions.The Builder
475
G
Gemini CLI
Ship75% Ship

Google's free open-source AI agent lives in your terminal

1,000 free requests per day is genuinely useful for hobbyist and side-project work. The built-in Google Search grounding is a killer feature for research tasks — Claude Code can't do that without MCP plugins. Active release cadence with weekly stable releases is reassuring.The Builder
476
C
ChromaFs
Ship75% Ship

Replace RAG sandboxes with a virtual filesystem — 460x faster boot

This is the most practical RAG architecture post I've read this year. The insight that LLMs are trained to use filesystem commands anyway — so fake the filesystem instead of spinning up real containers — is obvious in retrospect but genuinely clever. Implementation is reproducible with just-bash and any vector DB.The Builder
477
S
Superpowers
Ship75% Ship

Composable skill framework that forces coding agents to do it right

This solves the real problem with AI coding agents: they work great in isolation but create a mess at scale because they skip the boring engineering discipline. Mandatory planning, git worktrees for parallel work, and enforced test cycles are exactly the guardrails teams need.The Builder
478
C

Upload once, reuse forever — Claude's API just got leaner and meaner

This is the quality-of-life update I didn't know I desperately needed. Stop re-uploading your 40-page spec doc on every API call — reference it once, pay for it once, and move on. Token-efficient tool use is also a game-changer for chained agentic tasks where tool schemas were eating a horrifying chunk of my context window.The Builder
479
M
Mistral Small 3.1
Ship75% Ship

Lightweight multimodal AI — vision + text, open weights, zero compromise

Apache 2.0 with vision support in a small model is basically a cheat code for edge deployments. I can run this on modest hardware, fine-tune it on proprietary data, and ship it to production without a licensing lawyer on speed dial. Mistral keeps delivering where it counts for developers.The Builder
480
C
Cq
Ship67% Ship

Stack Overflow for AI agents — by Mozilla AI

Agents sharing solutions with other agents — this is how agent ecosystems should work. The Mozilla backing gives it credibility and staying power.The Builder
481
P
Postman
Ship67% Ship

API platform with AI-powered testing and documentation

Still the best API development environment. Postbot generating tests from your API schema saves hours. Collections shared across teams are essential.The Builder
482
P
ProofShot
Ship67% Ship

Give AI coding agents eyes to verify the UI they build

As someone who has watched AI agents confidently ship broken layouts, this is a godsend. The visual feedback loop means agents can actually catch that the button is overlapping the nav bar. Design quality from AI coding just leveled up.The Creator
483
A
Agent Kernel
Ship67% Ship

Three Markdown files that make any AI agent stateful

The simplicity is the feature. Three Markdown files, git-trackable, human-readable. No ORM, no migrations, no database to manage. For agents that need persistent state without infrastructure overhead, this is the pragmatic choice. I would pick this over LangGraph's complexity any day.The Builder
484
O
Optio
Ship67% Ship

Orchestrate AI coding agents in Kubernetes from ticket to PR

K8s-native agent orchestration is the right call — you get isolation, resource limits, and scaling for free. The ticket-to-PR pipeline is well-designed. My concern is the K8s prerequisite excludes most small teams, but if you already run K8s this slots right in.The Builder
485
C
Cq
Ship67% Ship

Stack Overflow for AI coding agents, by Mozilla AI

Finally someone is tackling the collective intelligence problem for agents. Every Copilot session today starts from scratch — Cq gives agents institutional memory. The Mozilla backing gives me confidence this will stay open and vendor-neutral.The Builder
486
B
Bolt.new
Ship67% Ship

Prompt to full-stack app in your browser

Perfect for prototyping. I described a dashboard and had a working app in 3 minutes. Not production-ready, but unbeatable for speed-to-demo.The Builder
487
L
Lovable
Ship67% Ship

Full-stack app builder with visual editing and one-click deploy

Best MVP builder on the market right now. The Supabase integration means you get a real database, not just a frontend. GitHub sync seals the deal.The Builder
488
R
Replit
Ship67% Ship

AI-powered cloud IDE with instant deployment

As someone who doesn't want to manage dev environments, Replit is perfect. I can build and deploy without touching a terminal. The Agent handles everything.The Creator
489
G
GitHub Copilot
Ship67% Ship

AI pair programmer from GitHub — now agentic, now free

Copilot Workspace is the standout — from GitHub Issue to implementation plan in one step. For teams living in GitHub, the integration is seamless: PRs, Workspace, Actions all work together. The free tier makes it impossible not to try.The Builder
490
W
Windsurf
Ship67% Ship

AI-native IDE by Codeium — Cascade agentic flow

The free tier is absurdly generous. Cascade handles multi-file refactors well and the codebase indexing is fast. If you can't justify $20/mo for Cursor, Windsurf is the answer.The Builder
491
W
Warp
Ship67% Ship

AI-native terminal — the command line, reimagined

The AI command generation is useful for complex one-liners I'd normally Google. The modern UI is controversial but the speed is undeniable — fastest terminal I've used.The Builder
492
G
Gemini Code Assist
Ship67% Ship

Google's AI coding assistant for Cloud and enterprise

The API design is thoughtful. Integrates well with existing stacks.The Creator
493
C
Copilot Workspace
Ship67% Ship

AI-native development environment from GitHub

Issue-to-PR workflow is the right abstraction. The planning step prevents the 'just generate code' antipattern.The Builder
494
S
SWE-Agent
Ship67% Ship

AI agent for resolving GitHub issues

Best open-source coding agent. SWE-bench performance is impressive and the architecture is well-designed.The Builder
495
Z
Zed
Ship67% Ship

High-performance multiplayer code editor

Fastest editor I've ever used. Native performance, real-time collab, and the AI integration is well-designed.The Builder
496
A
Amazon Q
Ship67% Ship

AWS AI assistant for developers and businesses

The Java 8-to-17 migration feature alone can save teams months. AWS-specific knowledge is unmatched.The Builder
497
E
Elysia
Ship67% Ship

Ergonomic web framework for Bun

End-to-end type safety with Eden treaty is the killer feature. Bun-native performance is excellent.The Builder
498
E
Effect
Ship67% Ship

Production-grade TypeScript framework

Typed errors and dependency injection for TypeScript done right. The platform modules (HTTP, Schema, SQL) are production-grade.The Builder
499
G
GraphQL Yoga
Ship67% Ship

The simplest GraphQL server

The best GraphQL server for Node.js. Envelop plugin system and multi-runtime support (Bun, Deno, Workers).The Builder
500
G
Grafbase
Ship67% Ship

Instant serverless GraphQL backend

Instant GraphQL API from a schema definition. Edge deployment and federation are well-designed.The Builder
501
R
Remix
Ship67% Ship

Full-stack web framework with web fundamentals

Web standards-first approach means your apps work without JavaScript. Loaders and actions are elegant patterns.The Builder
502
S
SolidJS
Ship67% Ship

Simple and performant reactivity for building UIs

React-like syntax with true reactivity and no Virtual DOM overhead. The performance benchmarks speak for themselves.The Builder
503
B
Budibase
Ship67% Ship

Build internal apps in minutes

Built-in database means zero external dependencies for simple CRUD apps. The automation engine is a nice bonus.The Builder
504
N
Nhost
Ship67% Ship

Open-source Firebase alternative with GraphQL

Hasura-powered GraphQL over Postgres with auth and storage. The GraphQL-first approach is powerful for complex data needs.The Builder
505
A

AI-powered terminal autocomplete

Autocomplete for CLI commands is surprisingly useful. Reduces trips to man pages and --help flags.The Builder
506
S
Streamlit
Ship67% Ship

Build data apps in Python

Python script to interactive web app with zero frontend code. The caching and state management work well.The Builder
507
F
Flagsmith
Ship67% Ship

Open-source feature flags and remote config

Open source with a self-hostable option. Remote config + feature flags in one tool reduces tool sprawl.The Builder
508
H
Hasura
Ship67% Ship

Instant GraphQL and REST APIs on your data

Point at Postgres, get a production GraphQL API instantly. Authorization rules and real-time subscriptions included.The Builder
509
B
Bit.dev
Ship67% Ship

Component-driven development platform

Component isolation done right. Independent versioning and testing per component is how design systems should work.The Builder
510
D
Docusaurus
Ship67% Ship

Build optimized documentation websites

React-based, versioning, and i18n built in. The most flexible open-source documentation framework.The Builder
511
L
Lerna
Ship67% Ship

Monorepo management for JavaScript

Revived by the Nx team and better than ever. The standard for publishing multiple npm packages from a monorepo.The Builder
512
I
Insomnia
Ship67% Ship

The open-source API development platform

Clean UI, open source, and supports every protocol. The git-based sync is useful for teams.The Builder
513
L
LaunchDarkly
Ship67% Ship

Feature flag management platform

The most feature-complete flag platform. Targeting rules, segments, and experimentation are production-grade.The Builder
514
V
Vue.js
Ship67% Ship

The progressive JavaScript framework

Composition API with TypeScript is excellent. The progressive adoption model means you can start small.The Builder
515
N
ngrok
Ship67% Ship

Unified ingress platform

One command to expose localhost. Essential for webhook development and quick demos. The inspection UI is useful.The Builder
516
G
GitLab
Ship67% Ship

Complete DevOps platform in a single application

Self-hosted option with complete CI/CD and security scanning. The single-platform approach reduces tool sprawl.The Builder
517
N
Netlify Database
Mixed50% Ship

Serverless Postgres built to be safe for AI agents in preview and production

Zero-config Postgres that auto-provisions on deploy is the developer experience everyone has wanted for a decade, and building AI agent guardrails into the schema change workflow is the right call. If you're already on Netlify, this removes the last reason to reach for PlanetScale or Supabase for small-to-medium apps.The Builder
518
V
Vera
Mixed50% Ship

A programming language designed for machines, not humans

The contracts-first approach is genuinely compelling — I've spent too many hours debugging AI-generated code that violated implicit invariants. Having the compiler enforce preconditions at every call site is the kind of guardrail I'd actually trust. The WASM compilation target means you can run this anywhere, and 3,638 tests suggests this isn't vaporware.The Builder
519
R
Rocky
Mixed50% Ship

Rust-compiled SQL for data pipelines: branches, lineage, AI intent layer

Compile-time type safety for SQL is the feature I've wanted for years — catching type mismatches before the pipeline runs instead of finding out when a dashboard breaks at 9am. The column-level lineage alone justifies the migration cost for any team managing complex pipelines.The Builder
520
D
ds2api
Mixed50% Ship

DeepSeek web sessions as drop-in OpenAI/Claude/Gemini APIs

If you have a DeepSeek account and want to use it through your existing OpenAI-compatible stack, this is the cleanest solution I've seen. The multi-account pooling and automatic rate-limit handling are genuinely thoughtful engineering.The Builder
521
F
free-claude-code
Mixed50% Ship

Route Claude Code traffic to DeepSeek, OpenRouter, or local models

This is exactly what the indie dev community needed after Anthropic tightened Pro limits. The per-model routing is clever — I can push heavy reasoning to DeepSeek and let fast autocomplete hit a local 8B model. Setup took about 15 minutes.The Builder
522
D
ds2api
Mixed50% Ship

One API endpoint, any AI model — protocol-converting middleware written in Go

This is the plumbing layer every multi-model deployment needs. Go was the right choice — fast, statically compiled, trivial to containerize. The multi-account key pooling alone makes this worth deploying for any team hitting rate limits on a single provider key.The Builder
523
T
Tendril
Mixed50% Ship

An agent that writes, registers, and reuses its own tools — forever

The bootstrap-three-tools architecture is elegant and addresses a real failure mode. Watching an agent build its own scraper and then reuse it 20 minutes later without being told to is genuinely impressive. The Deno sandbox makes it safe enough to experiment with seriously.The Builder
524
A
AI-SPM
Mixed50% Ship

Open-source runtime security control plane for AI agents in production

The OPA-based policy enforcement for tool calls is exactly the kind of control plane enterprises need before deploying agents in production. This is early but points in the right direction. If you're building agents with database or API access, you need something like this or you're flying blind.The Builder
525
F
free-claude-code
Mixed50% Ship

Use Claude Code without an API key — terminal, VSCode, or Discord

The Discord remote-control mode is genuinely clever — I can kick off a refactor from my phone and watch the streaming output in a channel. The multi-provider failover also makes it resilient in ways the official client isn't.The Builder
526
E
Edgee Team
Mixed50% Ship

Strava for your coding assistants — see who's using AI and what it costs

Our Claude Code bills were a mystery until we put Edgee in front of it. Now I can see which repos are heavy users, who's abusing long contexts, and where we can swap in a cheaper model without hurting output quality. This pays for itself immediately.The Builder
527
F
free-claude-code
Mixed50% Ship

Route Claude Code to free providers — NVIDIA NIM, OpenRouter, local LLMs

For the 80% of Claude Code usage that's just routine coding tasks, DeepSeek V4 via this proxy is genuinely indistinguishable in quality. I'm saving $200/month and the setup took five minutes. The per-model routing is smart engineering.The Builder
528
D
ds2api
Mixed50% Ship

Go middleware that routes any AI client to OpenAI, Claude, or Google APIs with rate rotation

Single-binary Go middleware with zero dependencies for multi-provider API routing is exactly what I've been hacking together manually. The key rotation is the killer feature for anyone running high-volume agent workloads against rate-limited APIs.The Builder
529
B
Beezi AI
Mixed50% Ship

Orchestrate your entire AI dev stack — routing, tracking, and ROI

Smart model routing is the feature every team building on multiple LLMs needs but keeps hand-rolling themselves. The Jira + GitHub integration means it plugs into real planning workflows, not just toy demos. If the cost claims hold up in practice, this pays for itself quickly.The Builder
530
O
oh-my-codex (OMX)
Mixed50% Ship

Like oh-my-zsh but for Codex — teams, memory, and TDD workflows

The git worktree isolation per worker agent is the feature that sold me — parallel agents without stomping each other's context is exactly the problem I kept hitting in vanilla Codex. The $ralph persistent completion loop is genuinely useful for large multi-file refactors.The Builder
531
T
TurboOCR
Mixed50% Ship

50x faster than PaddleOCR — 270 images/sec on a single RTX GPU

If you're running document pipelines at scale and still using Python PaddleOCR, this is a free 50x speedup for the cost of a Docker pull. The HTTP + gRPC dual interface and Prometheus metrics mean it drops right into existing infrastructure. C++20 with TensorRT is the right stack for this problem.The Builder
532
T
Trainly
Mixed50% Ship

Your AI agents are failing silently — Trainly finds the leaks

The one-decorator integration with a free audit is a genuinely smart GTM move — zero friction to try it, and the cost savings pitch is self-funding. Drift detection for AI pipelines is something I've been hacking together manually. If the signal-to-noise on their anomaly detection is good, this fills a real gap in the AI ops stack.The Builder
533
A

Per-session isolated agent sandboxes on Azure — scale to zero, any framework

Framework-agnostic hosted sandboxes with scale-to-zero is exactly what I need for deploying agents without maintaining my own Kubernetes cluster. The per-session isolation eliminates a whole class of security concerns I was handling manually. The Claude Agent SDK support means I don't have to choose between Azure and my preferred model.The Builder
534
S
Seeknal
Mixed50% Ship

Data & ML CLI where you define pipelines in YAML and query them in natural language

The draft, dry-run, apply workflow is the right abstraction for data pipelines that agents touch — you want to see what's going to happen before it materializes to production Iceberg. The natural language query layer saves me from writing boilerplate SELECT statements to verify pipeline output, which is maybe 30% of my current pipeline debugging time.The Builder
535
M
ml-intern
Mixed50% Ship

Hugging Face's open-source agent that reads papers, trains models, ships them

This is Hugging Face's credibility on the line — they're not just hosting models, they're shipping an agent that autonomously produces them. The 300-iteration loop with auto-context-compaction shows real engineering maturity. I want this running on my research backlog immediately.The Builder
536
C
CrabTrap
Mixed50% Ship

Open-source HTTP proxy that enforces security policies on AI agent API calls

This fills a gap that every production agentic system needs but almost no one has solved yet. The two-tier policy engine — static rules for speed, LLM for ambiguity — is the right architecture. The fact that Brex built and open-sourced this suggests they've already battle-tested it against real agent deployments.The Builder
537
E
Euphony
Mixed50% Ship

Turn Codex CLI sessions and Harmony JSON into browsable conversation timelines

Debugging Codex agent sessions used to mean manually reading JSON in a text editor. Euphony is what that developer experience should have always been — structured timelines, metadata inspection, and JMESPath filtering that actually works on large session files.The Builder
538
A
ArcKit
Mixed50% Ship

68 AI commands that turn architecture governance from chaos into system

68 commands with citation traceability and MCP servers for cloud docs is a serious toolkit, not a prompt dump. The Claude Code integration with autonomous research agents that can pull actual AWS/Azure documentation is the kind of thing I'd spend weeks building from scratch. For anyone doing ADRs at scale, this is a significant time saver.The Builder
539
V
Verdent
Mixed50% Ship

Describe your product in plain language — Verdent builds while you sleep

This is the early version of what will eventually make technical co-founder equity negotiations obsolete. The concept of AI agents with genuine product ownership — not just code suggestion — represents a fundamental shift in startup formation dynamics.The Futurist
540
G
Google ADK
Mixed50% Ship

Google's official open-source kit for building and orchestrating multi-agent systems

The API design is clean and the documentation is genuinely good — rarer than it should be for a framework launch. The built-in agent patterns cover 80% of multi-agent use cases out of the box, and the MCP support means you're not locked into Google's tool ecosystem.The Builder
541
D
dotclaude
Mixed50% Ship

Run multiple AI coding agents in parallel tmux panes — no extra API costs

This is the kind of DIY cleverness that eventually becomes best practice. Using tmux + CLI resume mode to approximate multi-agent coordination is a zero-dependency solution that works with the tools most developers already have. Rough but real.The Builder
542
R

Teach 18 AI coding agents to write correct streaming SQL — no hallucinated syntax

AI coding assistants hallucinate streaming SQL constantly — CDC ingestion patterns, windowed aggregations, and materialized view semantics are all places where generic training data fails hard. An installable skill package that auto-detects your agents and patches in correct context is exactly the right fix. Worth adding if you're building on RisingWave.The Builder
543
W
Waydev
Mixed50% Ship

Measure ROI of every AI coding tool — Copilot vs Cursor vs Claude Code unified

The 'which AI tool actually shipped good code' question is one every eng manager is asking. Waydev's existing Git integration means the attribution layer isn't a cold-start problem — if you're already using it for velocity metrics, the AI measurement upgrade is an obvious yes.The Builder
544
A
Archon
Mixed50% Ship

YAML-defined workflows that make AI coding agents deterministic and reproducible

Finally a way to make coding agents reproducible. I've been burnt too many times by agents that work perfectly once and then fail mysteriously. YAML-defined workflows in git means I can review exactly what the agent is doing and why the CI run broke. Isolated worktrees per task is the right default.The Builder
545
R
Remoroo
Mixed50% Ship

AI agent that remembers every run — built for long-running research and optimization loops

The patch-run-eval-repeat loop with persistent memory is exactly what's missing from existing coding agents. I've wasted days watching agents revisit approaches they already tried because they lost context. Remoroo's memory-as-infrastructure approach is the right abstraction. Would ship for any multi-day optimization task today.The Builder
546
S
SkillClaw
Mixed50% Ship

Multi-agent skill evolution that improves from every user's interactions

The cold-start problem for agents is genuinely painful in enterprise deployments — new users get a dumb agent until they've accumulated history. SkillClaw's collective approach is the right architecture fix. I'm watching how it handles skill drift and version conflicts before betting on it.The Builder
547
D
devnexus
Mixed50% Ship

Shared persistent memory vault for AI coding agents across repos

Agent amnesia is a real tax on multi-engineer teams using AI tools. devnexus's approach of using Obsidian + git means the memory is portable, auditable, and doesn't depend on any specific AI provider's memory feature. It's rough around the edges but the concept is sound and I'd build on top of it today.The Builder
548
D
DeepGEMM
Mixed50% Ship

DeepSeek's FP8 GEMM kernels hit 1,550 TFLOPS on H100 — no CUDA install needed

If you're running inference on H100s or H800s, DeepGEMM is an immediate drop-in for the hottest path in your stack. The JIT approach means you're not fighting CUDA version mismatches, and 1,550 TFLOPS is a number that makes you pay attention. Already integrates with vLLM — just use it.The Builder
549
E
evalmonkey
Mixed50% Ship

Benchmark your AI agents under chaos — schema errors, latency spikes, 429s

Every engineer who's deployed an agent in production knows models fail catastrophically when the API starts rate-limiting mid-chain. evalmonkey is the first tool I've seen that actually lets you reproduce and measure that. The degradation delta report alone is worth the setup time.The Builder
550
M

MCP servers + multi-agent orchestration for enterprise Copilot

Native MCP support is genuinely huge — it means I can wire up any MCP-compliant server without duct-taping custom connectors together. The multi-agent orchestration layer is the missing piece that finally makes Copilot Studio feel like a real developer platform rather than a glorified chatbot builder. Still Microsoft-flavored lock-in, but the protocol standardization softens that considerably.The Builder
551
S
SmolAgents 2.0
Mixed50% Ship

Lightweight Python agents with visual debugging & multi-agent orchestration

SmolAgents 2.0 is exactly what the agent framework space needed — the visual debugger alone is a massive quality-of-life upgrade that makes tracing agent logic actually tractable. Native MCP and OpenAPI tool server support means you're not reinventing the wheel every time you want to plug in an external service. This is a serious contender against LangChain and CrewAI for teams that want lean, readable code without the boilerplate tax.The Builder
552
C
Cohere Command R2
Mixed50% Ship

Enterprise LLM that speaks SQL, Python, and R natively

Native SQL and code execution baked directly into the model is a massive DX win — no more duct-taping text-to-SQL pipelines together with fragile prompt engineering. The private deployment option on AWS and Azure is the real killer feature for enterprise shops that can't let data leave their VPC. This is the kind of pragmatic, production-ready tooling the space desperately needed.The Builder
553
I

One API, 10+ cloud backends — model inference without the chaos

This is genuinely the multi-cloud inference abstraction layer I've been hacking together myself for two years — now it just exists. Single auth token, automatic fallback, and no rewrite when a provider changes pricing or goes down? Ship it immediately. The only caveat is that provider-specific features like fine-tuned model routing may still need manual handling.The Builder
554
C

Enterprise RAG with 256K context, grounded citations & quality scoring

The 256K context window alone is a game-changer for long-document RAG pipelines where chunking strategies always felt like a painful workaround. The Retrieval Quality Score metric is something I didn't know I needed — having a structured signal to evaluate retrieval-generation alignment is huge for iterating on enterprise pipelines. Deploying through Bedrock or Azure means zero friction for teams already locked into those clouds.The Builder
555
C
ClawTrace
Mixed50% Ship

Real-time agent swarm monitoring at 0.1ms latency via SSE

SSE over HTTP polling for agent telemetry is the right call — anything that reduces latency in a debugging loop makes a real difference. The zero-knowledge guardrails are thoughtful; agents routinely touch API keys and the fact that most monitoring tools just log those plainly is a genuine security problem.The Builder
556
K
karpathy-skills
Ship50% Ship

One CLAUDE.md file that fixes Claude Code's four worst coding habits

557
A
agent-cache
Mixed50% Ship

One Redis/Valkey connection to cache your LLM calls, tool results, and agent sessions

Managing three separate caching layers — one for LLM calls, one for tool outputs, one for session state — is a real tax on agent infrastructure maintainability. A unified abstraction with Valkey/Redis (which you likely already have) and OTel metrics baked in is an easy yes. The LangChain and Vercel AI SDK adapters mean minimal integration friction.The Builder
558
M
Mistral Edge
Mixed50% Ship

Run Mistral AI models on-device — no cloud, no latency, no limits.

This is the SDK I've been waiting for. On-device inference with quantized Mistral models means I can ship AI features without worrying about API costs, rate limits, or latency spikes. The sub-1B model targeting low-power hardware is a serious unlock for IoT and edge use cases that were previously out of reach.The Builder
559
M
Magika
Mixed50% Ship

Google's AI-powered file type detector — 99% accuracy on 200+ types

Drop-in replacement for libmagic with dramatically better accuracy on edge cases — and since Google uses this on billions of files per week, I trust the production validation more than most OSS libraries. The JS/TS package makes it easy to add file validation to web APIs without a sidecar process.The Builder
560
T
Terrarium
Mixed50% Ship

Evals that actually simulate real deployment — stateful, multi-turn, alive

Static evals are lying to us constantly — agents that ace benchmarks fall apart in production because benchmarks don't have state, side effects, or accumulated context. Terrarium's living environments model is the right approach to catching real failure modes before deployment.The Builder
561
A
AgentTap
Mixed50% Ship

Capture every LLM call from any agent — no instrumentation needed

Treating agent observability as a network problem is a genuinely smart idea. Being able to observe any LLM calls — including from tools you didn't write — is a superpower for debugging multi-agent systems. Zero instrumentation overhead is huge.The Builder
562
A
Archon
Mixed50% Ship

Define your AI coding workflows as YAML — same steps, every time, no hallucination drift

YAML-defined AI coding workflows with isolated git worktrees and 17 built-in recipes is the missing orchestration layer between Cursor and your CI pipeline. The Slack/Discord/GitHub webhook triggers mean you can fire workflows from anywhere. This is the glue engineering teams have been waiting for.The Builder
563
O
oh-my-codex (OMX)
Mixed50% Ship

Oh-my-zsh but for OpenAI Codex CLI — agent teams, hooks, and structured workflows

If you use OpenAI Codex CLI daily, OMX is an immediate productivity upgrade. Structured $deep-interview → $ralplan → $team workflows mean Codex actually understands the codebase before writing, and isolated git worktrees for parallel specialists eliminate the merge conflicts that kill multi-agent coding sessions.The Builder
564
O
Ovren
Mixed50% Ship

AI engineers that live in your GitHub repo and actually ship your backlog

The 'assign a GitHub task, get back a PR' loop is straightforward and the human-approval gate means you're not handing over keys to production. For well-defined, scoped backlog tasks — bug fixes, small features, test coverage — this workflow makes sense. The free tier lets you evaluate quality before committing.The Builder
565
K
Kontext CLI
Mixed50% Ship

Stop giving your AI agent long-lived API keys — ephemeral credentials that expire on session end

The credential problem with AI agents is real and underappreciated. When your agent has a GitHub token, Stripe key, and database connection in its environment, a single prompt injection can exfiltrate all of them. Kontext's ephemeral model — short-lived, scoped, auto-expired — is exactly how this should work. MIT license, native Go binary, no Docker required.The Builder
566
A
AMD GAIA
Mixed50% Ship

Build local AI agents on AMD hardware — NPU-accelerated, fully private

AMD GAIA gives Ryzen AI hardware owners a first-class local agent framework with Python and C++ SDKs, MCP integration, and NPU acceleration. The RAG, speech-to-speech, and code generation capabilities in one MIT-licensed package is exactly the kind of investment that makes AMD a viable platform for AI development.The Builder
567
B
Brightbean Studio
Mixed50% Ship

Self-hosted Buffer alternative built with Claude in 3 weeks

The three-week build time is the headline, and it's credible — Django + HTMX is exactly the kind of stack Claude handles well. AGPL-3.0 means you can self-host commercially, and having real approval workflows + client portals puts this ahead of many $20/mo SaaS alternatives.The Builder
568
S
SuperHQ
Ship50% Ship

Run AI coding agents in isolated microVMs with full Debian sandboxes

This is the missing piece for anyone running Claude Code on real projects. The overlay filesystem means you can let the agent go wild without fear — review, apply, or revert. The VM snapshot feature alone is worth the price of admission (which is currently free). Rough edges in alpha, but the architecture is right.The Builder
569
R
Ralph
Ship50% Ship

Autonomous loop that runs Claude Code until your whole feature list is done

The fresh-context-per-cycle approach solves the single biggest problem with AI coding agents: context exhaustion on multi-hour tasks. The prd.json format enforces the right discipline — stories small enough for one context window, outcomes defined in advance. I've shipped three features with this and it works as advertised when you write good PRDs.The Builder
570
C
claude-mem
Ship50% Ship

Persistent session memory for Claude Code — no more re-explaining your project

This solves the most annoying thing about AI coding assistants — having to re-explain your entire project structure every single session. The six-hook lifecycle integration is thoughtful and the 10x token reduction claim is plausible if the retrieval is tuned well. Single-command install seals it.The Builder
571
E

Lossless token compression that extends your Claude Code context by ~30%

Any tool that gives me 30% more context for free is worth running. A local Rust proxy adds minimal latency and the implementation is auditable — I can verify it's actually lossless. If the compression holds up on larger codebases this is an immediate install for me.The Builder
572
L
LaReview
Mixed50% Ship

Local-first AI code review that never uploads your code to a third-party server

The chain-your-own-agent model is the right call: I can swap in whatever LLM is best for my stack without waiting for LaReview to update their integrations. For teams at regulated companies, 'no code leaves your machine' is the difference between adoption and a hard no from legal.The Builder
573
N

NVIDIA's open-source stack for enterprise AI agents with 17 launch partners

The hybrid routing in AI-Q is clever — running cheap agents locally and escalating to frontier models only when needed is exactly the cost-control pattern enterprises want. OpenShell giving you policy-based guardrails as a runtime rather than an afterthought is the right architecture. I'd adopt this today if I were building enterprise agents.The Builder
574
D
Druids
Mixed50% Ship

Distributed multi-agent coding framework with live clone, inspect, and redirect

The copy-on-write agent clone primitive alone is worth the star — being able to branch an agent's state and explore multiple paths without restarting from scratch is genuinely novel. For complex pipelines where debugging is the bottleneck, the live inspector is immediately interesting. Documentation is sparse but the core concepts are sound; if you're building on this you'll need to be comfortable reading source code.The Builder
575
L
lmscan
Mixed50% Ship

Offline AI text detector that fingerprints which LLM actually wrote it

The zero-dependency, fully offline angle makes this immediately viable for enterprise environments where you can't send content to a third-party API for compliance reasons. The LLM fingerprinting feature is genuinely novel — I haven't seen another tool that tries to attribute text to specific model families. Early days, but the CI/CD integration and explainable output make it worth piloting for document pipelines where you need auditable AI detection.The Builder
576
S

Add a literature review phase to agent loops — +15% gains on $29 cloud spend

+15% on llama.cpp for $29 is a remarkable return. The research-first pattern is something every senior engineer already does intuitively — formalizing it into the agent loop is obvious in retrospect. Add this to any performance-optimization agent workflow now.The Builder
577
G
Google Scion
Mixed50% Ship

A hypervisor for AI coding agents — isolated containers, all runtimes

Isolated containers per agent with separate creds is the security architecture the industry has been hand-waving about. Running this in a Kubernetes job per agent task makes the cost/complexity tractable. Follow this project closely even if you're not using it yet.The Builder
578
P
pi-autoresearch
Mixed50% Ship

Autonomous code optimization loop — edit, benchmark, keep or revert

I ran this against my GraphQL resolver layer over a weekend and got 31% latency reduction with zero manual intervention. The MAD filtering is the real innovation — previous attempts at autonomous optimization would thrash on noisy benchmarks. This one doesn't.The Builder
579
R
Rudel
Mixed50% Ship

Session analytics and token dashboards for Claude Code & Codex teams

The 26% abandonment-within-60-seconds stat alone is worth installing this for. If I'm running a team on Claude Code, I want to know which developers are getting stuck immediately and why. The self-hosted model is exactly right for enterprise — no one wants their session data leaving the building.The Builder
580
L
Lukan
Mixed50% Ship

Open-source AI workstation for coding, ops, and everyday automation

The consolidated workstation idea is compelling — I'm currently running Cursor for code, a separate tool for infra automation, and yet another for personal agents. If Lukan can cover all three without being mediocre at each, that's a real quality-of-life improvement. The open-source positioning means I can actually trust it with my workflow.The Builder
581
O
Onform
Mixed50% Ship

Build and manage forms from Claude using plain language

MCP-first is the right design philosophy for developer tools in 2026. Being able to spin up a form with submission handling and webhook delivery through a Claude conversation — without touching a UI — removes a surprisingly annoying friction point in agent-built workflows.The Builder
582
F
Ferretlog
Mixed50% Ship

git log for your Claude Code agent runs — local, zero dependencies

If you run Claude Code daily, you need this immediately. Being able to diff two sessions like git commits and see exactly which tools fired and what they cost is something that should have existed from day one. Zero-dependency Python means it just works.The Builder
583
S
Skrun
Mixed50% Ship

Deploy any agent skill as a production REST API in one command

The framework portability angle is the real value prop — I have dozens of custom tools built for Claude that I can't reuse in other contexts without rebuilding them. If Skrun actually normalizes this cleanly across tool formats, that's a genuine pain solver.The Builder
584
M

Production-ready multi-provider agent framework with MCP + A2A support

MCP support plus A2A out of the box is the combination I've been waiting for in an enterprise-friendly package. If your team is .NET-first, this is now the obvious choice — stop evaluating and start shipping.The Builder
585
M
marimo-pair
Mixed50% Ship

Let AI agents step inside your running Python notebooks

The key insight is that data science agents need to work on running state, not just source files. marimo's reactive model is already the cleanest notebook architecture for reproducibility — adding agents that can execute and observe live cells unlocks a genuinely new debugging and analysis workflow that Jupyter simply can't match.The Builder
586
G
Google Scion
Mixed50% Ship

Google's open-source agent hypervisor — isolated containers, separate identities, full orchestration

Credential isolation between agents is the killer feature — I've been hacking around this problem manually for months. The Kubernetes-native deployment story and harness adapters for existing agent frameworks mean I can adopt this incrementally rather than rewriting everything.The Builder
587
C
CRAG
Mixed50% Ship

One governance file, compiled into every AI coding tool's format

Maintaining separate .cursorrules, copilot instructions, and CI configs is already a real headache on teams using 3+ AI tools. The single-source-of-truth approach is architecturally correct and the zero-dependency design keeps it lightweight. Early, but the concept is solid — I'd pilot this on a team project immediately.The Builder
588
O

Drive your real Chrome browser from any MCP client

The session persistence is the killer feature here. Every browser automation tool that required a fresh login was painful for any authenticated workflow. Being able to have Claude work inside my already-logged-in browser changes what's possible for personal agent automation. 19 tools is a solid foundation.The Builder
589
P
Pi-Mono
Mixed50% Ship

A batteries-included AI agent monorepo for serious builders

The unified LLM provider API alone is worth bookmarking — switching between Claude, GPT-4o, and Gemini without rewriting your agent logic is genuinely useful. The coding agent's step-by-step terminal UI is also much easier to debug than black-box agent frameworks.The Builder
590
F
fff.nvim
Mixed50% Ship

Freakin Fast Fuzzy Finder for Neovim — built for AI agents too

The MCP integration and frecency scoring for agents is genuinely useful — I've measurably reduced token burn in Claude Code sessions by pointing it at fff.nvim instead of raw glob calls. The Rust prebuilts mean zero configuration pain. Strong ship.The Builder
591
O
Ogoron
Ship50% Ship

AI QA that replaces your testing team — 9x faster, 20x cheaper

For a solo founder or two-person team shipping fast, the traditional QA workflow simply doesn't exist. If Ogoron can automatically generate and maintain tests that catch regressions—without me having to write a single Playwright spec—that's a massive unlock. The free tier means low risk to try it.Dev Patel
592
Q
qmd
Mixed50% Ship

Local doc search engine with BM25 + vectors + LLM re-ranking — by Shopify's CEO

Hybrid BM25 + vector + LLM re-rank is the right architecture for personal knowledge search — each layer catches what the others miss. The MCP server mode is genuinely useful: being able to ask Claude Code 'what did we decide about X last month' against my own notes changes the workflow. MIT licensed and from someone who ships real products.The Builder
593
F
Freestyle
Ship50% Ship

Full Linux VMs for coding agents that fork in milliseconds

Finally, proper infra for agents. The VM fork latency is legit — I've tried spinning up containers for agent sandboxes and the overhead kills iterative workflows. This solves the right problem.
594
C
Cursor 3
Ship50% Ship

Parallel local and cloud coding agents in one unified workspace

The multi-agent sidebar is the first time I've felt like I'm actually directing agents rather than babysitting a single one. The cloud/local handoff especially is a workflow unlock.
595
W
Worktrunk
Mixed50% Ship

Lightweight CLI for Git worktree management built for parallel AI agents

This is how good tooling should work — a thin, composable layer on top of something that already exists. No Electron, no subscriptions, no opinions about which agent you use. Just better worktree management. Ship immediately.
596
C
Caveman
Mixed50% Ship

Claude Code skill that cuts ~75% of tokens by making Claude talk like a caveman

I tested this against my normal Claude Code sessions and the token reduction is real — closer to 60-70% in practice, but that's still significant. For long refactoring sessions where I'm hitting usage walls, this is now a permanent part of my setup. One-line install is the right distribution model.The Builder
597
M
Maritime
Ship50% Ship

Deploy AI agents for $1/month — stateful containers that sleep when idle, wake in milliseconds

The problem is real: most hosting platforms were designed for stateless APIs, not agents that need 10-minute reasoning windows and persistent state. Maritime's sleep/wake model with zero cold-start context loss is exactly what the market needs. $1/month entry is a no-brainer to try.
598
P
Parallel Code
Ship50% Ship

Run Claude Code, Codex, and Gemini side by side in isolated worktrees

This plugs a real gap: running multiple agents without conflicts has always required manual worktree juggling. The diff viewer and QR monitoring are thoughtful touches that show this was built by someone actually using it. Ship it.
599
M
MCPCore
Ship50% Ship

Build and ship production MCP servers in minutes — managed auth, AES-256 secrets, real-time logs

MCP is now infrastructure. The problem isn't building the tools — it's shipping them with production-grade auth and secret management without spending a week on DevOps. MCPCore solves that. The no-boilerplate secret referencing alone is worth it.
600
M
MDArena
Mixed50% Ship

Benchmark your CLAUDE.md files against real PRs to see if they actually help

I've spent real time crafting CLAUDE.md files with no way to know if they help. A tool that uses my actual test suite against real PRs to measure context file effectiveness is exactly the feedback loop I've been missing. The `git archive` anti-cheat approach shows this was built by someone who's thought carefully about methodology.The Builder
601
M

Sub-100ms next-edit prediction for VS Code and JetBrains — powered by diffusion LLMs

I've used next-edit features in other tools but the sub-100ms latency here is genuinely different — it's below my perception threshold, which means it doesn't break flow. The multi-line simultaneous edit understanding is real; it caught a refactor pattern I was about to manually do across 6 call sites.The Builder
602
Z
ZeroClaw
Mixed50% Ship

A Rust AI agent runtime that boots in 10ms and fits under 5MB

10ms cold start and a sub-5MB binary for a full AI agent runtime in Rust? That's not marketing copy — that's genuinely useful for edge deployment. The trait-based swappable components mean you're not locked into their choices. I'm already thinking about running this on a $10/month VPS.The Builder
603
C
ctx
Mixed50% Ship

One interface for Claude Code, Codex, Cursor, and every agent you run

The single review surface for multiple concurrent agents is the feature I didn't know I needed until I tried managing three Claude Code sessions by hand. Containerized disk isolation means I'm not scared of what the agents will do to my filesystem. Shipping immediately.The Builder
604
E
Emdash
Mixed50% Ship

Run 23 coding agents in parallel from one desktop app — YC W26

23 supported agents, SSH remote connections, Linear/GitHub/Jira ticket intake, and a Git merge queue — this solves exactly the workflow I've been duct-taping together manually. YC backing with an MIT license means it's not going anywhere. Shipping today.The Builder
605
G

Give your coding agent live Gemini API docs so it stops hallucinating old code

Any project using the Gemini API gets immediate value from this — the models keep generating code against deprecated endpoints and wrong model names. Plugging in the MCP server and Skills package took 10 minutes and my Cursor agent stopped suggesting gemini-pro when it should be gemini-2.0-flash. The 63% token reduction on correct answers is real money saved per month for high-volume usage.
606
O
oh-my-codex (OMX)
Ship50% Ship

oh-my-zsh but for Codex CLI — hooks, teams, and a live HUD

The $ralplan workflow — clarify → approve plan → parallel team execution — maps directly to how I actually want to work with AI agents. The .omx/ state directory persists memory and execution logs across sessions, which solves my biggest frustration with stateless agent loops. The $team command spins up parallel Codex instances in isolated tmux panes with synchronized state. Took 20 minutes to set up, saved two hours on a refactor this week.
607
O
OpenScreen
Ship50% Ship

Free, open-source screen recorder for demos — no subscriptions, no watermarks

This is the tool I've been waiting for. Screen Studio is great but I'm not paying $200/year just to make occasional demos. OpenScreen does 95% of what I need, it's MIT licensed, and the PixiJS-based rendering actually looks smooth. Instant install for any indie dev.
608
O
OpenRouter Fusion
Ship50% Ship

Run 5 models in parallel, fuse the best answer into one

Parallel model execution with auto-synthesis is a genuinely useful primitive for production pipelines where you want consensus across models without writing orchestration glue yourself.
609
A
Agents Observe
Mixed50% Ship

Real-time dashboard for monitoring Claude Code multi-agent teams

The moment you're running 3+ Claude Code agents in parallel, you desperately need something like this. Watching swimlane views of parallel agent activity is way better than tailing 5 separate log files. The distributed tracing mental model is exactly right for multi-agent debugging.The Builder
610
C
Coasts
Mixed50% Ship

Containerized sandboxes for running AI agents safely in production

The declarative capability grants are exactly what I want — specify what an agent can touch and nothing more, spun up in a container with resource limits. This is the infrastructure pattern for production-safe agent deployment. YAML-based config means it slots naturally into existing IaC workflows.The Builder
611
T
TurboVec
Mixed50% Ship

2-4 bit vector compression that beats FAISS with zero training

Zero training time alone makes this worth evaluating for any production vector search system. If the FAISS recall and speed benchmarks hold up in your embedding space, switching could cut memory bills dramatically. Python bindings make it a drop-in experiment.The Builder
612
C
Cohere Command A
Mixed50% Ship

111B parameters. Enterprise-grade. Built to act, not just answer.

A 256K context window combined with first-class tool use and RAG support is exactly what production agentic pipelines need — no more awkward workarounds. The on-prem deployment option is a genuine differentiator for enterprise devs stuck behind data compliance walls. Cohere clearly designed this for people actually shipping agents, not writing blog posts about them.The Builder
613
P
Pieces
Mixed50% Ship

AI-powered developer workflow tool for code snippets

The API design is thoughtful. Integrates well with existing stacks.The Creator
614
H
Hoppscotch
Mixed50% Ship

Open-source API development ecosystem

Fast, reliable, and the docs are actually good. Ship.The Creator
615
D
Devin
Skip33% Ship

Autonomous AI software engineer by Cognition

Devin is early but directionally correct. The autonomous agent approach will win eventually. Cognition has the best shot at getting there first. Invest in the future, not the present.The Futurist
616
L
LM Studio
Skip33% Ship

Desktop app for running local LLMs with a ChatGPT-like UI

Solid execution. Does what it promises and the DX is clean.The Skeptic
617
F
Flutter
Skip33% Ship

Google's UI toolkit for multi-platform apps

Hot reload, custom rendering engine, and Dart is surprisingly pleasant. Best for custom UI that needs pixel-perfect cross-platform.The Builder
618
C
Cypress
Skip33% Ship

JavaScript end-to-end testing framework

The test runner UI and time-travel debugging are the most intuitive of any testing tool.The Creator
619
J
Jest
Skip33% Ship

Delightful JavaScript testing

Still the most used JS testing framework. Massive ecosystem of matchers, plugins, and documentation.The Builder
620
E
Electron
Skip33% Ship

Build cross-platform desktop apps with web technologies

Ship desktop apps with your web stack. VS Code proves Electron apps can be fast with the right engineering.The Builder
621
C
Contentful
Skip33% Ship

The composable content platform

Mature API, excellent SDKs, and the content model is flexible. The enterprise choice for headless CMS.The Builder
622
K
King Louie
Skip25% Ship

Indie desktop AI agent with smart LLM routing, 20 tools, and P2P mesh networking

The routing-across-providers model and P2P agent mesh are ideas that deserve more mainstream attention. Indie builders are often where the most interesting experiments happen before they become features in polished products. King Louie is a glimpse of what local agentic computing looks like.The Futurist
623
E
Evolver
Skip25% Ship

Auditable self-evolution engine for AI agents — no free-form prompt hacks

624
G
Goose
Skip25% Ship

The open-source AI agent that actually runs your code

Block's engineering pedigree shows here. This isn't a weekend side project—126 releases in, with SLSA provenance, MCP integration, and multi-LLM support baked in. The local execution model is genuinely compelling for anyone worried about sending proprietary code to Anthropic or OpenAI.Dev Patel
625
G
Glassbrain
Skip25% Ship

Time-travel debugging for AI apps — replay any trace, fix in one click

Two lines of setup and you can time-travel through your agent's reasoning. The AI-generated fix proposals powered by Claude are the killer feature—not just telling you what broke but showing you how to fix it with a diff. This would have saved me days on my last LangChain project.Dev Patel
626
L
Lilith-Zero
Skip25% Ship

Rust security middleware that stops AI agents from exfiltrating your data

The Kani formal verification and cargo-fuzz integration tell me this isn't just a vanity security project—it's been engineered to actually be correct. Sub-millisecond overhead means there's no reason not to run this in front of every MCP agent deployment. 15 stars seems like an embarrassing undercount given what this does.Dev Patel
627
A

Open-source runtime security covering all 10 OWASP agentic AI risks

9,500 tests and sub-millisecond policy enforcement out of the gate is impressive engineering. If you're shipping agents to production in a regulated industry, this is the governance layer you were going to have to build yourself anyway. Ship.
628
M
Mozzie
Skip25% Ship

Local-first desktop app that orchestrates AI coding agents in parallel

The rejection feedback loop is the killer feature here — most orchestration tools just retry blindly. Injecting the full attempt history plus your reason into the next prompt is the kind of detail that separates tools built by engineers who've felt the pain. Early but worth watching.
629
M

Enterprise multi-agent orchestration — Python and .NET, v1.0

The graph-based workflow model with time-travel debugging is a meaningful step beyond AutoGen's conversational loops. If you're on .NET or want a supported enterprise path, v1.0 stable APIs are a green light.
630
T
TurboQuant (OSS)
Skip25% Ship

Drop-in KV cache compression: 4–7x memory savings, zero accuracy loss

Drop-in HuggingFace cache replacement with no retraining and verified zero accuracy loss on multiple architectures is exactly what inference optimization should look like. The pip install story makes it trivially testable.
631
I
IBM StepZen
Skip0% Ship

GraphQL as a service

IBM acquisition slowed development. The auto-generation from REST to GraphQL was interesting but the market moved on.The Builder
632
T
Tabnine
Skip0% Ship

AI code assistant with privacy focus

Completion quality lags behind Copilot and Codeium. The privacy angle is the only differentiator.The Builder

Weekly AI Tool Verdicts

Get the digest in your inbox

7 critics. 1 verdict. New AI tool every day. Free.

Bookmarks

Loading bookmarks...

No bookmarks yet

Bookmark tools to save them for later