Question 1

Which is better: Apfel or Gemini 2.5 Flash (Stable) with Thinking Mode?

Accepted Answer

Based on our expert panel, Gemini 2.5 Flash (Stable) with Thinking Mode has a stronger verdict with a 100% Ship rate. Apfel received a panel verdict of Ship and Gemini 2.5 Flash (Stable) with Thinking Mode received Ship.

Question 2

Is Apfel free?

Accepted Answer

Apfel pricing: Free / Open Source (MIT)

Question 3

Is Gemini 2.5 Flash (Stable) with Thinking Mode free?

Accepted Answer

Gemini 2.5 Flash (Stable) with Thinking Mode pricing: Free tier (Google AI Studio) / Pay-as-you-go via Gemini API: ~$0.15/1M input tokens (non-thinking), ~$3.50/1M input tokens (thinking mode)

Question 4

What do experts say about Apfel vs Gemini 2.5 Flash (Stable) with Thinking Mode?

Accepted Answer

Apfel: Every Apple Silicon Mac running macOS 26 Tahoe already has a ~3B parameter LLM installed — the same model powering Siri and Apple Intelligence. Apple just doesn't expose it to developers. Apfel is a MIT-licensed Swift CLI that unlocks it: run it as a pipe-friendly command, an interactive chat session, or a local HTTP server at localhost:11434 that's fully OpenAI SDK-compatible. Any existing codebase using the OpenAI client can point at it with a one-line config change and start using free, private, offline inference with zero API keys, zero cloud, and zero subscriptions.

The feature set is surprisingly complete for a developer side project. Apfel supports MCP tool/function calling, streaming JSON output, file attachments, five context-trimming strategies for the 4,096-token window, and a companion ecosystem of apps (apfel-chat, apfel-clip, apfel-gui). With 4,138 GitHub stars in under three weeks — fueled by a 513-point Hacker News thread — it's clearly filling a real gap that Apple intentionally left.

The constraints are real: macOS 26 Tahoe required, context window capped at ~3,000 words, and the model is not going to replace GPT-4 for complex reasoning. But as a privacy-preserving local LLM for scripts, quick queries, code reviews, and offline workflows, it's genuinely compelling. The underlying model is already sitting on tens of millions of machines. Apfel is just the key to the door Apple forgot to install. Gemini 2.5 Flash (Stable) with Thinking Mode: Google DeepMind has promoted Gemini 2.5 Flash to stable status, making its 'thinking mode' generally available via the Gemini API and Google AI Studio. The model delivers chain-of-thought reasoning at significantly lower latency and cost than Gemini 2.5 Pro, making it a practical choice for production reasoning workloads. Thinking mode can be toggled on or off per request, giving developers granular control over the cost-quality tradeoff.

Apfel vs Gemini 2.5 Flash (Stable) with Thinking Mode

Apfel

Gemini 2.5 Flash (Stable) with Thinking Mode

Bookmarks