P

PersonaPlex

NVIDIA's 7B voice model that talks and listens simultaneously — 70ms latency

PriceOpen model weights (research/non-commercial license)Reviewed2026-04-06

Expert verdict

Ship

3-1
3 Ships1 Skips
Visit github.com

The Panel's Take

PersonaPlex is NVIDIA's open research model for full-duplex voice conversation — meaning it processes incoming speech and generates its spoken response at the same time, enabling real interruptions, barge-ins, and natural conversational overlap. Current voice AI pipelines are walkie-talkie style: the AI waits for you to stop, processes, then responds. PersonaPlex eliminates that turn-taking constraint. The 7B-parameter model achieves ~70ms end-to-end response latency and handles persona and voice control through two mechanisms: a text prompt that describes the persona's personality and speaking style, and an optional audio sample for voice cloning. The duplex architecture means it can detect mid-sentence whether you're interrupting (and stop gracefully) versus just clearing your throat (and continue). It ships with inference code, persona configuration examples, and a demo server. PersonaPlex was released in January 2026 as open research and is gaining significant traction this week (295 new stars today) as developers building voice agents discover it. The open model weights make it deployable on NVIDIA hardware without API dependencies, and the 7B scale means it runs comfortably on a single A100 or H100. The primary constraint is that full-duplex requires low-latency streaming infrastructure — it's not a drop-in for existing HTTP-based voice pipelines.

Share this verdict

PersonaPlex verdict: SHIP 🚀

3 ships · 1 skip from the expert panel

Full review: shiporskip.io/tool/nvidia-personaplex-full-duplex-voice-ai-7b-interruption

Weekly AI Tool Verdicts

Get the next verdict in your inbox

7 critics review a new AI tool every day. Weekly digest — free.

Embed this verdict

Tool makers can add a live ShipOrSkip badge to their site. Badge loads track impressions; clicks route back to this review.

Ship · 7.5/10
HTML badge
<a href="https://shiporskip.io/api/badge-click/nvidia-personaplex-full-duplex-voice-ai-7b-interruption" target="_blank" rel="noopener"><img src="https://shiporskip.io/api/badge/nvidia-personaplex-full-duplex-voice-ai-7b-interruption" alt="PersonaPlex Ship verdict on ShipOrSkip" width="360" height="90" /></a>
Markdown badge
[![PersonaPlex Ship verdict on ShipOrSkip](https://shiporskip.io/api/badge/nvidia-personaplex-full-duplex-voice-ai-7b-interruption)](https://shiporskip.io/api/badge-click/nvidia-personaplex-full-duplex-voice-ai-7b-interruption)
Iframe widget
<iframe src="https://shiporskip.io/embed/nvidia-personaplex-full-duplex-voice-ai-7b-interruption" title="PersonaPlex ShipOrSkip verdict" width="360" height="260" style="border:0;border-radius:16px;max-width:100%;" loading="lazy"></iframe>

The reviews

70ms with real interruption handling is a leap over anything I've built with pipeline-based approaches. The persona control via text prompt is flexible enough to cover most use cases. The main engineering challenge is the streaming infrastructure — this isn't plug-and-play, you need WebSocket or WebRTC plumbing — but for serious voice agent work, that's worth the investment.

Helpful?

Full-duplex in a research model doesn't mean production-ready full-duplex. The non-commercial research license blocks most commercial deployments, and NVIDIA-specific optimization creates hardware lock-in. OpenAI and ElevenLabs already have managed full-duplex APIs; wait for a commercial-licensed version before building on this.

Helpful?

Full-duplex voice AI removes the last major uncanny valley in AI conversation — the awkward pause while the model waits. Once this pattern is widespread, conversations with AI agents will feel phonically indistinguishable from human calls. PersonaPlex is the open-source reference architecture for that future; competitors will ship commercial versions within months.

Helpful?

The voice persona control is compelling for content creators building AI hosts or characters — you describe the personality and voice in text, provide an audio sample, and you get a consistent character. For podcasters and interactive content, this is a meaningful creative tool once it reaches more accessible hardware.

Helpful?

Bookmarks

Loading bookmarks...

No bookmarks yet

Bookmark tools to save them for later