AI tool comparison
Claude Files API & Token-Efficient Tool Use vs qmd
Which one should you ship with? Here is the side-by-side panel verdict, pricing read, reviewer split, and community vote comparison.
Developer Tools
Claude Files API & Token-Efficient Tool Use
Upload once, reuse forever — Claude's API just got leaner and meaner
75%
Panel ship
—
Community
Paid
Entry
Anthropic's Files API lets developers upload documents once and reference them across multiple Claude API calls, slashing redundant token usage and reducing latency at scale. Paired with new token-efficient tool use patterns, the update targets agentic and multi-step workflows where repeated context injection was previously a costly bottleneck. Together, these additions make building production-grade Claude integrations meaningfully cheaper and faster.
Developer Tools
qmd
Local doc search engine with BM25 + vectors + LLM re-ranking — by Shopify's CEO
50%
Panel ship
—
Community
Free
Entry
qmd is a lightweight local search engine built by Tobi Luetke, CEO of Shopify, for indexing and querying personal knowledge bases, documentation, and meeting notes — entirely offline. It combines three retrieval approaches in a single pipeline: BM25 full-text search for exact keyword matches, vector semantic search via ONNX-based embeddings, and LLM re-ranking using GGUF models through node-llama-cpp. All three stages run locally with no cloud dependency. The tool ships in multiple deployment modes: a CLI for ad-hoc queries, a Node.js library for programmatic use, an HTTP service for local API access, and — most useful for AI workflows — a native MCP server that lets Claude Code, Cursor, and similar editors query your local knowledge base directly during coding sessions. The hybrid retrieval approach means it handles both "find the exact error message from last week's standup notes" and "what was our decision about the auth architecture" equally well. What makes this notable beyond its technical approach is provenance: Luetke shipped it as a personal tool he actually uses, not a startup product. The GitHub history shows active iteration and he's been talking about it on X. It's a credible signal of where pragmatic AI-augmented knowledge management is heading for technical users who prefer local-first tools.
Reviewer scorecard
“This is the quality-of-life update I didn't know I desperately needed. Stop re-uploading your 40-page spec doc on every API call — reference it once, pay for it once, and move on. Token-efficient tool use is also a game-changer for chained agentic tasks where tool schemas were eating a horrifying chunk of my context window.”
“Hybrid BM25 + vector + LLM re-rank is the right architecture for personal knowledge search — each layer catches what the others miss. The MCP server mode is genuinely useful: being able to ask Claude Code 'what did we decide about X last month' against my own notes changes the workflow. MIT licensed and from someone who ships real products.”
“Color me cautiously impressed — this is a real, practical improvement rather than vaporware capability bragging. My only side-eye is toward file storage management, retention policies, and what happens when your uploaded doc goes stale mid-workflow. Still, hard to argue against paying fewer tokens for the same result.”
“This is a well-executed weekend project, not a production tool. It requires GGUF models and manual embedding setup — a meaningful friction barrier for non-technical users. The 'built by a CEO' narrative drives GitHub stars more than the technical differentiation. Obsidian with a local AI plugin gets you here with better UX.”
“Honestly, this one's not for me — it's API plumbing aimed squarely at developers building on top of Claude, not creatives using it directly. If you're not writing integration code, there's nothing to interact with here. I'll check back when this shows up as a feature inside actual creative tools.”
“I manage a lot of notes, references, and creative briefs, but the setup friction here — GGUF models, CLI configuration — makes this inaccessible for most creators. The concept is great; the UX needs a front-end before it reaches beyond developers.”
“This is the infrastructure layer that makes truly persistent AI agents viable — shared document memory across calls is a foundational primitive, not a minor patch. When you combine Files API with efficient tool chaining, you're starting to see the scaffolding for autonomous, long-horizon AI workflows emerge. Anthropic is quietly building the rails for the agentic era.”
“The pattern here — local hybrid retrieval as an MCP server feeding into AI coding agents — will be ubiquitous in two years. Today it's a technical power-user tool; tomorrow it's how everyone's AI assistant knows the institutional context behind the code. qmd is an early, clean implementation of that pattern.”
Weekly AI Tool Verdicts
Get the next comparison in your inbox
New AI tools ship daily. We compare them before you waste an afternoon.