Question 1

Which is better: Speechmatics or VoxCPM2?

Accepted Answer

Both products received a panel verdict of Ship. Our expert panel gave VoxCPM2 the stronger verdict, with a 75% Ship rate. Note that the two address different tasks: Speechmatics is a speech recognition service, while VoxCPM2 is a text-to-speech model.

Question 2

Is Speechmatics free?

Accepted Answer

Speechmatics is listed with enterprise pricing.

Question 3

Is VoxCPM2 free?

Accepted Answer

Yes. VoxCPM2 is free and open source, released under the Apache 2.0 license.

Question 4

What do experts say about Speechmatics vs VoxCPM2?

Accepted Answer

Speechmatics: Speechmatics offers high-accuracy speech recognition in 50+ languages, with on-premises deployment and enterprise security. It is a strong fit for regulated industries.

VoxCPM2: VoxCPM2 is a 2B-parameter text-to-speech system from OpenBMB, the team behind MiniCPM, built around a tokenizer-free, diffusion-autoregressive architecture. Most TTS systems first convert text to discrete audio tokens, then decode those tokens to a waveform. VoxCPM2 skips the tokenization step entirely and operates in a continuous latent space. The result is 48kHz output with smoother prosody and finer pitch control than token-based systems.
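To make the discrete-token versus continuous-latent distinction concrete, here is a toy sketch. It is not VoxCPM2 code and does not reflect its actual architecture; the latent values, the five-entry codebook, and the `quantize` helper are all hypothetical. It only shows why snapping values to a fixed codebook can lose fine detail that a continuous representation keeps.

```python
# Toy illustration only: NOT VoxCPM2 code. It contrasts snapping a latent
# value to a small codebook (discrete tokens) with keeping the value as-is
# (continuous latents).

def quantize(value: float, codebook: list[float]) -> tuple[int, float]:
    """Return (token_id, reconstructed_value) for the nearest codebook entry."""
    token_id = min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))
    return token_id, codebook[token_id]

# Hypothetical 1-D "latents" standing in for per-frame acoustic features.
latents = [0.12, 0.47, 0.51, 0.88]

# Hypothetical 5-entry codebook; real audio codecs use far larger ones.
codebook = [0.0, 0.25, 0.5, 0.75, 1.0]

# Discrete path: each latent becomes a token id, then is reconstructed
# from the codebook entry that id points to.
token_ids = []
reconstructed = []
for value in latents:
    token_id, approx = quantize(value, codebook)
    token_ids.append(token_id)
    reconstructed.append(approx)

# Largest gap between an original latent and its reconstruction.
max_error = max(abs(a - b) for a, b in zip(latents, reconstructed))

# Continuous path: values are passed through unchanged, so nothing is lost
# at this step.
continuous = list(latents)

print("token ids:", token_ids)
print("max quantization error:", round(max_error, 2))
print("continuous path unchanged:", continuous == latents)
```

In this toy case, the latents 0.47 and 0.51 collapse to the same token, which is the kind of fine distinction a token-based pipeline can blur. Real systems mitigate this with large or multi-level codebooks, so treat the sketch as intuition rather than a benchmark.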

The headline feature is "Voice Design": you describe a voice in natural language (for example, "a confident male voice, mid-Atlantic accent, slightly gravelly, deliberate pacing") and VoxCPM2 synthesizes a brand-new voice from that description without any reference audio sample. This is architecturally different from voice cloning (which requires samples) and voice selection (which picks from a catalog). VoxCPM2 supports 30 languages with automatic detection, so no language tags are required.

The model runs on consumer hardware (~8GB VRAM), integrates with the MiniCPM-4 language model backbone, and is released under Apache 2.0. For developers building multilingual voice products or researchers exploring generative voice control, VoxCPM2 represents a meaningful step beyond current open TTS leaders like F5-TTS and CosyVoice.
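As a rough sanity check on the ~8GB VRAM figure, the memory needed just to hold 2B parameters can be estimated from the bytes stored per parameter. The precision VoxCPM2 actually uses is not stated here, so the 16-bit and 32-bit cases below are assumptions, and the estimate ignores activations, caches, and framework overhead. `weight_memory_gib` is a hypothetical helper written for this example.

```python
# Back-of-the-envelope estimate of weight memory only. The numeric precision
# is an assumption; activations and runtime overhead are not included.

def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to store the weights, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30

NUM_PARAMS = 2e9  # "2B-parameter" from the description above

fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # assumed 16-bit weights
fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # assumed 32-bit weights

print(f"16-bit weights: {fp16_gib:.2f} GiB")
print(f"32-bit weights: {fp32_gib:.2f} GiB")
```

Under these assumptions, 16-bit weights take a bit under 4 GiB, which would leave headroom within an 8GB card for activations, while 32-bit weights alone would nearly fill it.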
