Question 1

Which is better: CatDoes v4 or Gemma 4 Multimodal Fine-Tuner?

Accepted Answer

Both tools received a panel verdict of Ship. Our expert panel gives CatDoes v4 the stronger verdict, with a 75% Ship rate.

Question 2

Is CatDoes v4 free?

Accepted Answer

CatDoes v4 has a free plan that includes 25 credits. Paid plans start at $20/month.

Question 3

Is Gemma 4 Multimodal Fine-Tuner free?

Accepted Answer

Gemma 4 Multimodal Fine-Tuner is open source.

Question 4

What do experts say about CatDoes v4 vs Gemma 4 Multimodal Fine-Tuner?

Accepted Answer

CatDoes v4: CatDoes v4 ships with Compose — an autonomous AI agent that runs on its own cloud computer to build mobile apps, websites, and internal tools from plain text descriptions. You describe what you want, Compose plans the work, writes code, runs tests, fixes its own errors, and deploys — even after you close the browser tab.

Every project comes pre-wired with a full backend stack: database, authentication, storage, edge functions, and real-time events. The v4 release focuses on higher reliability and GitHub integration for developers who want to export and own their codebase. Free plans start at 25 credits; paid plans begin at $20/month with more projects and higher cloud limits.

What distinguishes CatDoes from the crowded AI app builder space is the "own computer" framing. The agent doesn't just generate code for you to paste; it has an execution environment where it can actually run and debug the app, catching errors before you see them. Whether that closed-loop debugging holds up in practice for complex apps is the open question.

Gemma 4 Multimodal Fine-Tuner: Gemma 4 Multimodal Fine-Tuner is an open-source toolkit that lets developers fine-tune Google's Gemma 4 and 3n models across all three modalities (text, images, and audio) using only Apple Silicon hardware. It runs natively on PyTorch with Metal Performance Shaders (MPS), bypassing the NVIDIA requirement that has historically blocked Mac users from serious local fine-tuning work.
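For readers unfamiliar with the MPS path, the sketch below shows how a PyTorch script typically chooses between Apple's Metal backend and the CPU. It is an illustration, not code from the toolkit: `pick_device` is a hypothetical helper name, and the import is wrapped so the snippet still runs on a machine without PyTorch installed.

```python
def pick_device() -> str:
    """Return "mps" when PyTorch can use Apple's Metal backend, else "cpu".

    Hypothetical helper for illustration; not part of the toolkit.
    """
    try:
        import torch  # optional here so the sketch runs without PyTorch
    except ImportError:
        return "cpu"

    # Older PyTorch builds have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(f"training device: {pick_device()}")
```

In a real training script the returned string would be passed to `torch.device(...)` and the model and batches moved onto it with `.to(device)`.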

The toolkit handles the full training pipeline including dataset prep, LoRA adapters, and multi-modal data collation. It ships with working example notebooks, a validation suite, and clean abstractions that don't require deep familiarity with the underlying MPS stack. Apple Silicon's unified memory architecture actually helps here — large multimodal batches fit in memory that would otherwise require GPU VRAM splitting on CUDA setups.
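To see why LoRA adapters matter on memory-limited local hardware, it helps to count parameters. LoRA freezes a weight matrix of shape d_out x d_in and trains two small factors of rank r instead, so the trainable count drops from d_in * d_out to r * (d_in + d_out). The sketch below works through that arithmetic; the layer size and rank are illustrative assumptions, not Gemma 4's actual dimensions or the toolkit's defaults.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one matrix.

    Two low-rank factors are trained: A with shape (rank, d_in)
    and B with shape (d_out, rank). The original matrix stays frozen.
    """
    return rank * d_in + d_out * rank


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from Gemma 4).
    d_in, d_out, rank = 2048, 2048, 8

    full = full_params(d_in, d_out)
    lora = lora_trainable_params(d_in, d_out, rank)

    print(f"full fine-tune: {full:,} trainable parameters")
    print(f"LoRA (r={rank}): {lora:,} trainable parameters")
    print(f"ratio: {lora / full:.2%}")
```

With these assumed sizes, the adapter trains 32,768 parameters against 4,194,304 for the full matrix, under 1% of the original count for that layer.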

Posted to Hacker News on April 7 as a Show HN, it pulled 109 upvotes and 165 GitHub stars within hours. The timing is sharp: Gemma 4 just dropped days ago with new multimodal capabilities, and the community immediately wanted local fine-tuning. This fills that gap faster than Google's own tooling.
