Parlor

Real-time voice + vision AI that runs 100% on your local machine

Price: Open Source (MIT) · Reviewed: 2026-04-06

Expert verdict

Ship

3 Ships · 1 Skip
Visit github.com

The Panel's Take

Parlor is an open-source Python/FastAPI app that gives you a fully local, real-time multimodal AI assistant: you speak to it and show it your camera, and it responds with synthesized voice, all on-device. It uses Gemma 4 for vision and language understanding and Kokoro for text-to-speech, with end-to-end latency of around 2.5-3 seconds on an Apple M3 Pro and no calls to any cloud API. Parlor stands out for two things: barge-in support, which lets you interrupt the AI mid-sentence as you would in a real conversation, and cross-platform inference, with MLX on macOS for GPU acceleration and ONNX on Linux. The creator benchmarked 83 tokens/second on an M3 Pro and provided reproducible setup instructions in under ten lines of shell. It surfaced on Hacker News as a 'Show HN' post and quickly accumulated over 50 upvotes, with developers praising the honest latency numbers and the fact that the entire stack, from audio capture to TTS playback, is open source and self-hostable with no API key required.
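
Barge-in comes down to making speech playback cancellable. The sketch below is a minimal illustration in Python using asyncio, not Parlor's actual code: `speak`, `converse`, `interrupt_after`, and `chunk_seconds` are hypothetical names, the timed sleep stands in for real audio output, and the timeout stands in for a voice-activity detector noticing that the user has started talking.

```python
import asyncio


async def speak(chunks, played, chunk_seconds):
    """Hypothetical TTS playback: 'plays' one audio chunk at a time."""
    for chunk in chunks:
        played.append(chunk)                # stand-in for writing audio to the speaker
        await asyncio.sleep(chunk_seconds)  # stand-in for the chunk's playing time


async def converse(chunks, interrupt_after=None, chunk_seconds=0.2):
    """Play a reply; stop early if the user starts talking.

    interrupt_after simulates a voice-activity detector firing after that
    many seconds. None means the user never interrupts.
    """
    played = []
    playback = asyncio.create_task(speak(chunks, played, chunk_seconds))
    # Wait until playback finishes or the simulated interruption arrives.
    _, pending = await asyncio.wait({playback}, timeout=interrupt_after)
    interrupted = bool(pending)
    if interrupted:
        playback.cancel()  # barge-in: cut the reply off mid-sentence
        try:
            await playback
        except asyncio.CancelledError:
            pass  # expected: the playback task was cancelled on purpose
    return played, interrupted


if __name__ == "__main__":
    reply = ["Sure,", "the layout", "looks a bit", "crowded."]
    print(asyncio.run(converse(reply, interrupt_after=0.3)))  # user cuts in
    print(asyncio.run(converse(reply)))                       # user stays quiet
```

Running playback as its own task is what keeps the assistant listening while it talks. A real pipeline would replace the timeout with a detector watching the microphone and would also need to stop any in-flight generation, which this sketch leaves out.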

Full review: shiporskip.io/tool/parlor-on-device-realtime-voice-vision-ai-local-gemma

Ship · 7.5/10

The reviews

Finally a local voice+vision stack that actually benchmarks its own latency instead of hiding behind vague demos. The MLX path on Apple Silicon is fast, barge-in works, and the codebase is small enough to fork and own. This is the foundation I'd build a personal assistant on.

2.5-3 second latency is fine for demos but painfully slow for natural conversation — real barge-in at that speed still feels robotic. And Gemma 4 as the vision model is a step behind GPT-4V or Claude in accuracy. Until latency drops to sub-second, this is a weekend project, not a daily driver.

The local-first AI assistant with eyes and ears is the endgame for ambient computing. Parlor is the earliest working prototype of a future where your laptop has a persistent, private AI companion that sees what you see. Get familiar with this architecture now — it will be mainstream in 18 months.

Being able to point my camera at a draft design and ask "what's wrong with this layout?" while talking out loud, all offline, is genuinely useful. The voice output quality from Kokoro is surprisingly good. I'd use this during creative sessions where I don't want to type.
