Question 1

Which is better: Tencent Hy3 Preview or VoxCPM2?

Accepted Answer

Based on our expert panel, Tencent Hy3 Preview has a stronger verdict with a 75% Ship rate. Tencent Hy3 Preview received a panel verdict of Ship and VoxCPM2 received Ship.

Question 2

Is Tencent Hy3 Preview free?

Accepted Answer

Tencent Hy3 Preview pricing: Open Weights (Tencent Hy Community License); API from RMB 1.2/M tokens

Question 3

Is VoxCPM2 free?

Accepted Answer

VoxCPM2 pricing: Free / Open Source

Question 4

What do experts say about Tencent Hy3 Preview vs VoxCPM2?

Accepted Answer

Tencent Hy3 Preview: Tencent open-sourced Hy3 Preview on April 23, 2026 — the first model to emerge from the company's rebuilt AI infrastructure, and its most credible challenge to frontier closed models to date. With 295 billion total parameters but only 21 billion active at inference time (plus 3.8B MTP layer parameters), it's a Mixture-of-Experts architecture that punches far above its compute weight. The model supports up to 256K context and is available via Hugging Face, ModelScope, and GitCode under the Tencent Hy Community License.

On coding benchmarks, Hy3 scores 74.4% on SWE-bench Verified, 54.4% on Terminal-Bench 2.0, and 67.1% on BrowseComp — placing it firmly in the same tier as top models from Anthropic and OpenAI. Tencent claims a 40% efficiency improvement over its predecessor Hunyuan models, and pricing through Tencent Cloud TokenHub is aggressive: RMB 1.2 per million input tokens. A free two-week window at launch via OpenRouter made it widely accessible immediately.

The model was led by a team that includes former OpenAI researchers and has already been deployed across Tencent's core products — WeChat, Yuanbao, and QQ. That production integration is a meaningful signal: this isn't a benchmark vanity release. For developers who need a powerful, cost-efficient reasoning and agentic model with actual open weights, Hy3 Preview is one of the most interesting drops of April 2026. VoxCPM2: VoxCPM2 is a 2-billion-parameter text-to-speech model from OpenBMB that scraps discrete tokenization entirely, working directly in continuous latent space via a diffusion autoregressive architecture. Unlike dominant TTS approaches (VALL-E, Tortoise, XTTS), it never converts audio to discrete tokens — diffusion handles the full generation pipeline, resulting in 48kHz studio-quality output.

It supports 30 languages without requiring language tags, zero-shot voice cloning from reference audio, and — most distinctly — voice design from pure natural-language descriptions. You can prompt "a warm, slightly raspy woman in her 40s who sounds like a news anchor" and get a consistent new voice without providing any reference audio. Trained on 2M+ hours of multilingual data.

Released under Apache 2.0, making it commercially usable. The architecture diverges meaningfully from existing open-source TTS options and introduces a novel UX primitive (describe a voice, get a voice) that could reshape how developers approach voice synthesis in products.

Tencent Hy3 Preview vs VoxCPM2

Tencent Hy3 Preview

VoxCPM2

Bookmarks