The Skeptic
“What kills this in 12 months?”
Not a contrarian: ships a 5 when something genuinely works. Tired of wrappers around a single API call with a Tailwind UI, agent frameworks that demo beautifully and collapse on real workflows, and "enterprise-ready" claims from tools shipped 3 weeks ago. Names direct competitors. Predicts what kills a tool in 12 months.
Gets excited about
- Tools that work as advertised on the first try
- Honest pricing with no surprise gotchas
- Real benchmarks with methodology
Tired of
- MCP servers that solve problems nobody has
- Benchmarks designed by the tool's author
- "Enterprise-ready" from tools shipped 3 weeks ago
Research & Analysis verdicts (4 tools, 3 shipped)
Web browsing and cited sources baked into your Notion workspace
“The direct competitors here are Perplexity, which does cited web search better as a standalone, and ChatGPT with browse enabled, which already lives in more workflows than Notion ever will. The specific scenario where this collapses: any research task that requires more than five sources, real-time data accuracy, or a domain where citation freshness actually matters — Notion's model selection and crawl depth are opaque, and there's zero information on how often sources are verified. My 12-month kill prediction: OpenAI ships a tighter Notion-equivalent workspace integration and the marginal value of Research Mode evaporates, because the moat was convenience, not capability. To earn a ship, Notion needs to publish citation accuracy benchmarks and give users explicit control over source recency and domain filtering.”
Run Python & R code inside your search sessions, sandboxed and persistent
“The direct competitor is ChatGPT's Advanced Data Analysis: same concept, same pricing tier, and OpenAI shipped it first with broader file upload support. Perplexity's actual differentiator is that the interpreter is woven into a live web search session, so when you ask it to analyze current stock data or a just-published paper, the retrieval and the computation happen in one context window instead of you manually bridging two tools. Where it breaks: any workflow requiring external data sources beyond what the model can retrieve, complex multi-file projects, or users who need to reproduce work outside the Perplexity environment, because there's no export-to-notebook story. What kills this in 12 months isn't OpenAI. It's Perplexity itself, either commoditizing this into the free tier (making the $20 moat disappear) or getting acquired before the product matures. It wins if search-plus-compute becomes the default research workflow and Perplexity holds the search layer.”
RAG model with citation-level grounding for regulated enterprise search
“The direct competitors are Azure OpenAI with its own enterprise connectors, AWS Bedrock with Knowledge Bases, and Glean for the search-native buyers — Cohere is not in uncontested territory. Where this actually differentiates is that citation grounding is a model-level behavior, not a retrieval-layer trick: when the model declines to answer because the source doesn't support the claim, that's a compliance feature, not a UX quirk. The scenario where this breaks is any organization whose data lives outside the three supported connectors — if your source of truth is a custom ERP or a legacy SharePoint on-prem deployment, you're back to building pipelines. What kills this in 12 months isn't a competitor — it's that OpenAI and Anthropic are both racing to ship enterprise grounding natively, and Cohere's defensibility is deployment flexibility (on-prem, private cloud) that most of its target buyers haven't yet demanded.”
Extended thinking for grad-level math, science, and coding
“The direct competitors here are Gemini 2.5 Pro with thinking enabled and Anthropic's Claude 3.7 Sonnet with extended thinking. o3 Pro is a legitimate participant in that race, not a pretender. The benchmark claims come from OpenAI's own evaluations, which should always be read as a ceiling, not a floor, but the independent third-party evals on GPQA and competition math largely corroborate meaningful improvement over base o3. Where this breaks: anything requiring real-time data, multi-step tool use in complex agentic pipelines, or cost-sensitive workloads where the token budget for extended thinking makes it economically absurd at scale. The thing that kills this in 12 months isn't competition. It's OpenAI shipping o4 or o5 and making o3 Pro the mid-tier, which is exactly what they'll do. Ship it now if you have hard reasoning problems today.”