AI tool comparison

Gemma Tuner Multimodal vs qmd

Which one should you ship with? Here is a side-by-side comparison of the panel verdicts, pricing, reviewer splits, and community votes.

Developer Tools

Gemma Tuner Multimodal

Fine-tune Gemma 4 with audio + vision on Apple Silicon — no NVIDIA needed

Verdict: Ship · Panel ship: 75% · Community: no votes yet · Entry price: Free

Gemma Tuner Multimodal is an open-source fine-tuning toolkit for Google's Gemma 4 and Gemma 3n models that runs entirely on Apple Silicon through PyTorch's Metal Performance Shaders (MPS) backend, with no NVIDIA GPU or cloud infrastructure required. It supports LoRA training on audio, image, and text inputs together, reading training data from local CSV files or streaming it from Google Cloud Storage or BigQuery. The tool targets developers who own M-series Macs but have been locked out of fine-tuning workflows that assume CUDA is available. Gemma 4's architecture suits this use case: its 4B multimodal variant, designed for on-device deployment, fits within the unified memory of M3 Max and M4 Pro hardware during training. Primary use cases include medical transcription fine-tuning (audio → text with clinical terminology), visual QA systems (image + text → structured response), and private on-device pipelines where compliance requirements prohibit cloud API calls. The project fills a niche that Google's own fine-tuning documentation covers poorly for Apple hardware.
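To give a rough sense of why LoRA makes fine-tuning feasible inside a laptop's unified memory, the sketch below counts the trainable parameters a LoRA adapter adds to a single weight matrix. The layer size and rank are hypothetical values chosen for illustration; they are not Gemma's real dimensions and not Gemma Tuner Multimodal's defaults.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters added by one LoRA adapter on a d_out x d_in weight matrix.

    LoRA keeps the original matrix W frozen and learns a low-rank update
    B @ A, where B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_out + d_in)


def dense_params(d_out: int, d_in: int) -> int:
    """Parameters in the full (frozen) weight matrix."""
    return d_out * d_in


# Hypothetical layer size and rank, for illustration only.
D_OUT, D_IN, RANK = 2048, 2048, 8

adapter = lora_trainable_params(D_OUT, D_IN, RANK)
dense = dense_params(D_OUT, D_IN)
fraction = adapter / dense

print(f"adapter params: {adapter:,}")
print(f"dense params:   {dense:,}")
print(f"trainable fraction: {fraction:.2%}")
```

Only the adapter parameters need gradients and optimizer state, which is where most of the memory saving comes from; the frozen base weights still have to fit in memory for the forward pass.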

Developer Tools

qmd

Local doc search engine with BM25 + vectors + LLM re-ranking — by Shopify's CEO

Verdict: Mixed · Panel ship: 50% · Community: no votes yet · Entry price: Free

qmd is a lightweight local search engine built by Tobi Luetke, CEO of Shopify, for indexing and querying personal knowledge bases, documentation, and meeting notes entirely offline. It combines three retrieval stages in a single pipeline: BM25 full-text search for exact keyword matches, vector semantic search via ONNX-based embeddings, and LLM re-ranking using GGUF models through node-llama-cpp. All three stages run locally with no cloud dependency. The tool ships in several deployment modes: a CLI for ad-hoc queries, a Node.js library for programmatic use, an HTTP service for local API access, and, most useful for AI workflows, a native MCP server that lets Claude Code, Cursor, and similar editors query your local knowledge base directly during coding sessions. The hybrid approach is meant to handle both exact lookups ("find the exact error message from last week's standup notes") and fuzzy recall ("what was our decision about the auth architecture"). Beyond the technical approach, the provenance is notable: Luetke shipped it as a personal tool he actually uses, not as a startup product. The GitHub history shows active iteration, and he has been talking about it on X. It is a credible signal of where pragmatic AI-augmented knowledge management is heading for technical users who prefer local-first tools.
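The three-stage idea (keyword scoring, vector similarity, then a re-ranker over the merged candidates) can be sketched in a few lines. This is an illustration under stated assumptions, not qmd's code or API: the documents and three-dimensional vectors are hand-written stand-ins for real notes and ONNX embeddings, reciprocal rank fusion is a common merging technique swapped in here because the text does not say how qmd combines its lists, and the `rerank` callable is a hypothetical placeholder for the GGUF model.

```python
import math
from collections import Counter

# Toy corpus and hand-written "embeddings" (assumptions, not real model output).
DOCS = {
    "standup": "deploy failed with error ECONNRESET in the auth service",
    "decision": "we decided to move auth to short lived tokens",
    "lunch": "team lunch is moved to friday",
}
DOC_VECS = {
    "standup": [0.9, 0.1, 0.0],
    "decision": [0.2, 0.9, 0.1],
    "lunch": [0.0, 0.1, 0.9],
}


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n
    df = Counter(term for t in tokens.values() for term in set(t))
    scores = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def ranked(scores, keep_zero=True):
    """Document ids sorted by descending score, optionally dropping zero scores."""
    items = [(d, s) for d, s in scores.items() if keep_zero or s > 0]
    return [d for d, _ in sorted(items, key=lambda pair: pair[1], reverse=True)]


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each list contributes 1 / (k + rank) per document."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return ranked(fused)


def search(query, query_vec, rerank=None, top_k=2):
    keyword = ranked(bm25_scores(query, DOCS), keep_zero=False)
    semantic = ranked({d: cosine(query_vec, v) for d, v in DOC_VECS.items()})
    candidates = reciprocal_rank_fusion([keyword, semantic])[:top_k]
    # The re-ranker (an LLM in the real pipeline) only sees the short list.
    return rerank(query, candidates) if rerank else candidates


# Exact-match query: BM25 and the vector stage agree.
print(search("ECONNRESET error", [0.8, 0.3, 0.0]))
# No shared keywords: only the vector stage contributes candidates.
print(search("login design choice", [0.2, 0.9, 0.0]))
```

The second query shares no terms with any document, so BM25 returns nothing and the vector ranking carries the result, which is the sense in which each layer covers what another misses.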

| Decision | Gemma Tuner Multimodal | qmd |
| --- | --- | --- |
| Panel verdict | Ship · 3 ship / 1 skip | Mixed · 2 ship / 2 skip |
| Community | No community votes yet | No community votes yet |
| Pricing | Open Source / Free | Free, open source (MIT) |
| Best for | Fine-tune Gemma 4 with audio + vision on Apple Silicon, no NVIDIA needed | Local doc search engine with BM25 + vectors + LLM re-ranking, by Shopify's CEO |
| Category | Developer Tools | Developer Tools |

Reviewer scorecard

Builder
Gemma Tuner Multimodal: 80/100 · ship

Finally something that treats Apple Silicon as a first-class fine-tuning target, not an afterthought. LoRA on Gemma 4 multimodal for domain-specific tasks — medical, legal, private enterprise — is a genuinely underserved workflow. This is the tool the community needed.

qmd: 80/100 · ship

Hybrid BM25 + vector + LLM re-rank is the right architecture for personal knowledge search — each layer catches what the others miss. The MCP server mode is genuinely useful: being able to ask Claude Code 'what did we decide about X last month' against my own notes changes the workflow. MIT licensed and from someone who ships real products.

Skeptic
Gemma Tuner Multimodal: 45/100 · skip

MPS backend for fine-tuning is still meaningfully slower than CUDA for most workloads, and Gemma 4's multimodal capabilities are weaker than the top closed models. For production use cases, you'll still want a cloud GPU for the training run even if you deploy locally after.

qmd: 45/100 · skip

This is a well-executed weekend project, not a production tool. It requires GGUF models and manual embedding setup — a meaningful friction barrier for non-technical users. The 'built by a CEO' narrative drives GitHub stars more than the technical differentiation. Obsidian with a local AI plugin gets you here with better UX.

Futurist
Gemma Tuner Multimodal: 80/100 · ship

The laptop-as-AI-training-cluster future is closer than most think. If Apple Silicon GPU throughput and unified memory keep growing with each chip generation, fine-tuning runs that take overnight on today's M4 Pro should get steadily shorter on its successors.

qmd: 80/100 · ship

The pattern here — local hybrid retrieval as an MCP server feeding into AI coding agents — will be ubiquitous in two years. Today it's a technical power-user tool; tomorrow it's how everyone's AI assistant knows the institutional context behind the code. qmd is an early, clean implementation of that pattern.

Creator
Gemma Tuner Multimodal: 80/100 · ship

Being able to fine-tune a model on my own creative portfolio and voice without sending my work to a cloud provider is a privacy game-changer. Custom style models trained locally, owned fully — this is the future of personalized creative AI.

qmd: 45/100 · skip

I manage a lot of notes, references, and creative briefs, but the setup friction here — GGUF models, CLI configuration — makes this inaccessible for most creators. The concept is great; the UX needs a front-end before it reaches beyond developers.
