Question 1

Which is better: Qwen3.6-Plus or Qwen3.5-Omni?

Accepted Answer

Both models received a panel verdict of Ship. Our expert panel gave Qwen3.6-Plus the stronger result, with a 75% Ship rate.

Question 2

Is Qwen3.6-Plus free?

Accepted Answer

Qwen3.6-Plus pricing: Free (preview) / Paid API

Question 3

Is Qwen3.5-Omni free?

Accepted Answer

Qwen3.5-Omni pricing: Proprietary / API (Alibaba Cloud)

Question 4

What do experts say about Qwen3.6-Plus vs Qwen3.5-Omni?

Accepted Answer

Qwen3.6-Plus: Qwen3.6-Plus is Alibaba's latest frontier model, built specifically for agentic real-world tasks with a particular emphasis on software engineering. Released in preview on OpenRouter with a free tier, it scores 61.6 on Terminal-Bench 2.0, edging past Claude Opus 4.5 (59.3) by 2.3 points while running at roughly 3x the speed. It supports a 1M-token context window with up to 65K output tokens, larger than most competitors.

Under the hood, Qwen3.6-Plus uses a sparse mixture-of-experts architecture, activating only a fraction of its parameters per forward pass for efficiency. It accepts both text and multimodal inputs, and the API supports tool use natively, which makes it well suited to agent loops. The free preview is positioned as a direct challenge to OpenAI and Anthropic in the agentic coding space.
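To make "activating a fraction of its parameters per forward pass" concrete, here is a minimal sketch of generic top-k expert routing in Python. It is not Qwen3.6-Plus's actual router, which the description above does not detail. The function names, the toy scalar experts, and the hard-coded gate scores are all assumptions for illustration; in a real model the gate scores would come from a learned router applied to each token.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_forward(x, experts, gate_scores, k=2):
    """Evaluate only the k selected experts and mix their outputs."""
    routes = top_k_route(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routes)
    return output, [i for i, _ in routes]

# Hypothetical toy setup: four "experts", each a simple scalar function.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, -1.0, 1.5]  # assumed values; normally produced by a router

y, active = moe_forward(3.0, experts, gate_scores, k=2)
print(active)  # indices of the experts that actually ran: [1, 3]
```

Only two of the four experts are evaluated for this input, which is the source of the efficiency gain: compute scales with k rather than with the total number of experts.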

The timing is notable: released the same week as Google Gemma 4 and Cursor 3, signaling an industry-wide pivot from autocomplete to full autonomous agents. With free preview access already expiring, Alibaba is clearly using the buzz from benchmark dominance to drive early adoption at the API tier.

Qwen3.5-Omni: Qwen3.5-Omni is Alibaba's most advanced multimodal model yet: a native Thinker-Talker architecture that processes and generates text, audio, and video in a single unified system. Released in three variants (Plus, Flash, Light), it supports a 256k context window, 10+ hours of audio, and 400 seconds of 720p video at 1 FPS, with speech recognition across 113 languages and dialects.
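As a quick back-of-the-envelope check on the Qwen3.5-Omni input limits quoted above, the short Python sketch below converts them into frame and second counts. The constant and function names are made up for illustration, and "10+ hours" is treated as a lower bound of exactly 10 hours.

```python
def frame_count(duration_seconds, fps):
    """Number of frames sampled from a clip of the given length."""
    return int(duration_seconds * fps)

VIDEO_SECONDS = 400   # maximum video length quoted above
VIDEO_FPS = 1         # sampling rate quoted above
AUDIO_HOURS = 10      # "10+ hours", taken as a lower bound

video_frames = frame_count(VIDEO_SECONDS, VIDEO_FPS)
video_minutes, video_remainder = divmod(VIDEO_SECONDS, 60)
audio_seconds = AUDIO_HOURS * 3600

print(video_frames)                    # 400
print(video_minutes, video_remainder)  # 6 40
print(audio_seconds)                   # 36000
```

In other words, the video limit amounts to 400 sampled frames covering 6 minutes 40 seconds, while the audio limit covers at least 36,000 seconds.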

The headline capability is what Alibaba calls "Audio-Visual Vibe Coding": an emergent behavior where the model writes functional code based solely on watching a video and listening to spoken instructions. In demos, it takes a hand-drawn sketch held up to a camera and converts it into a working React webpage in real time. This wasn't an explicitly trained capability; it emerged from the model's unified multimodal architecture.

The model uses semantic interruption and turn-taking intent recognition for real-time interaction, and TMRoPE for temporal multimodal position encoding. The catch: Alibaba broke from its open-source streak and kept Qwen3.5-Omni proprietary, accessible only through its chatbot interface and Alibaba Cloud. The open-source community has noticed, and is not pleased.
