First benchmark · voice AI platforms

Are AI agents building with your tool?

When AI coding agents build real software, they pick the SDKs, APIs, and MCP servers. AgentRank measures which ones get installed, shipped, and kept — not just whether a chatbot mentions you.

Email hello@agentrankhq.com and we scope the run, share the raw methodology, and return your funnel in about 3 weeks.

agentrank · build a Node.js voice app · 50 runs / agent (sample view)
Recommended: 82%
Installed: 61%
Build passed: 78%
Retained: 70%
First-install rate · across leading agents (illustrative)
01 your tool (you): 64%
02 vendor · b: 48%
03 vendor · c: 31%
04 self-hosted / DIY: 19%
First category benchmarked
Voice AI platforms
Run June 2026 · across frontier coding agents

The platform agents recommended most wasn't the one they actually built with — they often assembled a DIY stack instead. That recommend→build gap is the whole game.

Which platform? That's in the briefing.

How it's measured
  • Deterministic detection — no AI grading AI
  • Wilson 95% intervals · reported with sample size
  • 30-run reportability floor
Methodology v0.2 →
Directional pilot — sample sizes labeled in the methodology. No vendor named publicly, ever.
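
As a rough illustration of the Wilson 95% intervals and the 30-run reportability floor listed above, here is a minimal Python sketch. The function names, the way the floor is enforced, and the example counts are assumptions made for illustration (32 of 50 corresponds to the illustrative 64% first-install figure at 50 runs per agent); this is not AgentRank's published implementation.

```python
import math

# The 30-run floor comes from the page; enforcing it this way is an assumption.
REPORTABILITY_FLOOR = 30


def wilson_interval(successes: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 <= successes <= runs:
        raise ValueError("successes must be between 0 and runs")
    p = successes / runs
    z2 = z * z
    denom = 1 + z2 / runs
    center = (p + z2 / (2 * runs)) / denom
    half_width = z * math.sqrt(p * (1 - p) / runs + z2 / (4 * runs * runs)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


def report_rate(successes: int, runs: int) -> str:
    """Format a rate with its interval and sample size, or refuse below the floor."""
    if runs < REPORTABILITY_FLOOR:
        return f"not reportable (n={runs} < {REPORTABILITY_FLOOR})"
    low, high = wilson_interval(successes, runs)
    return f"{successes / runs:.0%} (95% CI {low:.0%} to {high:.0%}, n={runs})"


# Hypothetical counts: 32 installs in 50 runs, and a sample below the floor.
print(report_rate(32, 50))
print(report_rate(12, 20))
```

The Wilson interval is a common choice for small samples because, unlike the simple normal approximation, it stays inside 0 to 1 and behaves sensibly when a rate is near either end.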
  • Deterministic detection, no AI grading AI
  • Public, versioned methodology
  • No sponsored placement, ever
  • Synthetic benchmarks, always labeled

Which tools agents reach for is a black box.
AgentRank turns it into a number you can move.

What the benchmark shows

The whole funnel — and where you're losing it.

Four numbers, each backed by a chart that says exactly where you stand. The charts below illustrate the funnel shape; your real numbers come from the run.

Stage 1 · Agent Reach

Most tools are invisible to the agents your customers use.

How often agents propose you when asked to build — tracked at every model release, because preferences reshuffle the day a new model ships.

Agent reach over model releases
Legend: your tool · category average · next best
Stage 2 · First-Install Rate

Recommended isn't installed.

Agents name plenty of tools, then install one. The gap between what they say and what they do is where adoption quietly leaks.

Recommended → installed
Recommended: 82%
Installed: 61%
21-point drop — the recommend→install leak
Stage 3 · Build Success

A broken build sends the agent to someone else.

An agent installs you, then the build breaks on a stale quickstart — and it quietly retries with a competitor. We run the build and record the result.

Build outcome when installed
passed: 78% · broke on install: 22%
Stage 4 · Refactor Retention

The tools agents keep through a refactor are the ones that win.

Agents refactor codebases. Retention measures whether your integration survives a "modernize this" pass — or gets silently swapped for a rival or a self-hosted stack.

Kept after a refactor pass
retained: 70% · migrated away: 30%
How it works

A real build, scored by what happened in the repo.

Never an LLM grading an LLM. The install either happened or it didn't; the build either passed or it failed.

01

Define the task

A real thing a developer asks an agent to build — across founder, cost, enterprise, and open-source framings. Prompts are public.

02

Run real agents

The same coding agents your customers use — driven against clean repos, greenfield and brownfield, with and without incumbents.

03

Score the actions

Dependency diffs and build results — installed or not, builds or not, kept or ripped out. Deterministic, never an LLM.
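
One way to picture scoring from dependency diffs and build results, sketched in Python under stated assumptions: the repo is a Node.js project with a `package.json`, a tool counts as installed when its package newly appears in a dependency section, and the build passes when its exit code is 0. The package name `example-voice-sdk` and every function name here are hypothetical, not AgentRank's actual detector.

```python
import json
from typing import Optional

# package.json sections that declare dependencies.
DEP_SECTIONS = ("dependencies", "devDependencies", "optionalDependencies")


def declared_packages(package_json_text: str) -> set[str]:
    """Collect every package name declared in a package.json document."""
    manifest = json.loads(package_json_text)
    names: set[str] = set()
    for section in DEP_SECTIONS:
        names.update(manifest.get(section) or {})
    return names


def dependency_diff(before: str, after: str) -> dict[str, list[str]]:
    """Packages added and removed between two package.json snapshots."""
    old, new = declared_packages(before), declared_packages(after)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


def score_run(before: str, after: str, tracked: str, build_exit_code: int) -> dict[str, bool]:
    """Deterministic outcome of one run: was the tracked package installed, did the build pass."""
    diff = dependency_diff(before, after)
    return {
        "installed": tracked in diff["added"],
        "build_passed": build_exit_code == 0,
    }


def retained(before_refactor: str, after_refactor: str, tracked: str) -> Optional[bool]:
    """True if the package survived the refactor, False if removed, None if never present."""
    if tracked not in declared_packages(before_refactor):
        return None
    return tracked in declared_packages(after_refactor)


# Hypothetical snapshots of one run.
clean = '{"dependencies": {"express": "^4.19.0"}}'
built = '{"dependencies": {"express": "^4.19.0", "example-voice-sdk": "^1.0.0"}}'
print(score_run(clean, built, "example-voice-sdk", build_exit_code=0))
print(retained(built, clean, "example-voice-sdk"))
```

Because every outcome is a set comparison or an exit code, two people scoring the same repo snapshots get the same answer, which is the point of keeping an LLM out of the grading step.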

04

Deliver the report

Your funnel with sample sizes and confidence intervals, re-run at every model release. Standing is a moving target.

Agent preferences reshuffle the day a new model ships. The teams who know their First-Install Rate this quarter compound a lead on the ones who find out next year.

Agents are already picking winners.
You don't have to guess which side you're on.

Email hello@agentrankhq.com