
When Agents Argue: Adversarial Multi-Agent Code Auditing Is Here

Microsoft's 100-agent security system found 16 real Windows vulnerabilities by making AI agents disagree with each other.

Evyatar Bluzer
3 min read

Finding bugs is not a consensus problem. It is an adversarial one. Microsoft just proved that at scale.

What Shipped

Microsoft's Autonomous Code Security team released MDASH (multi-model agentic scanning harness) on May 12. It orchestrates over 100 specialized AI agents across an ensemble of frontier and distilled models to discover, validate, and prove exploitable vulnerabilities end-to-end.

The result: 16 new vulnerabilities in the Windows networking and authentication stack. Four are critical remote code execution flaws in the TCP/IP stack and IKEv2 service. All shipped as fixes in May's Patch Tuesday.

On benchmarks, MDASH hit 96% recall against five years of confirmed MSRC vulnerabilities in clfs.sys and 100% recall in tcpip.sys, and scored 88.45% on the CyberGym leaderboard - five points clear of every other system.

The Architecture That Matters

MDASH runs a five-stage pipeline:

  1. Prepare - ingests source, builds language-aware indices, maps attack surface from commit history
  2. Scan - specialized auditor agents examine candidate code paths and emit findings with evidence
  3. Validate - debater agents argue for and against each finding's exploitability
  4. Dedup - consolidates semantically equivalent findings
  5. Prove - generates triggering inputs and executes them to confirm the bug is real
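The five stages compose like any staged filter. Here is a minimal sketch of that shape in Python - the `Finding` type, the memcpy heuristic, and every function body are my own illustrative stand-ins, not MDASH's implementation:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    path: str
    claim: str
    evidence: list

def prepare(source_files):
    # Stage 1: build a trivial per-file line index (stand-in for language-aware indices).
    return {path: src.splitlines() for path, src in source_files.items()}

def scan(index):
    # Stage 2: a toy auditor that flags memcpy call sites as candidate findings.
    return [Finding(path, line.strip(), [f"{path}:{n}"])
            for path, lines in index.items()
            for n, line in enumerate(lines, 1)
            if "memcpy" in line]

def validate(findings):
    # Stage 3: stand-in for the debate; the real system argues exploitability here.
    return [f for f in findings if f.evidence]

def dedup(findings):
    # Stage 4: collapse semantically equivalent findings (here: same file + claim).
    unique = {}
    for f in findings:
        unique.setdefault((f.path, f.claim), f)
    return list(unique.values())

def prove(findings):
    # Stage 5: the real system generates and executes triggering inputs; stubbed here.
    return [(f, "needs-proof") for f in findings]

def run_pipeline(source_files):
    return prove(dedup(validate(scan(prepare(source_files)))))
```

The point of the shape is that each stage only narrows what the previous stage emitted, so cheap stages can be generous and expensive stages stay small.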

Stage three is the breakthrough. Two agents look at the same candidate vulnerability. One argues it is exploitable. The other argues it is a false positive. Disagreement is the signal. When agents cannot converge, the finding gets escalated. When they agree it is noise, it gets dropped.
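That three-way outcome - confirm, drop, or escalate - can be sketched as a bounded loop. The function below is my own reading of the mechanism; the agent interfaces and verdict strings are assumptions, not MDASH's API:

```python
def debate(finding, advocate, skeptic, max_rounds=3):
    """Bounded argument between two agents; disagreement is the signal."""
    for _ in range(max_rounds):
        pro = advocate(finding)  # each agent returns a verdict string,
        con = skeptic(finding)   # e.g. "exploitable" or "false_positive"
        if pro == con == "exploitable":
            return "confirmed"
        if pro == con == "false_positive":
            return "dropped"
    # The agents never converged, so route the finding to heavier review.
    return "escalated"
```

Note that agreement terminates early in either direction; only sustained disagreement pays the escalation cost.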

Different model classes serve different stages - heavyweight reasoning models for the debate, distilled models for cost-effective scanning, independent counterpoint models for validation. Domain-specific plugins inject context about kernel conventions and system invariants that no foundation model carries on its own.
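One way to express that stage-to-model split is a plain routing table. The model names below are placeholders I invented to illustrate the idea, not the models MDASH actually uses:

```python
# Hypothetical routing table: each pipeline stage gets the cheapest
# model class that is still adequate for its job.
STAGE_MODELS = {
    "scan":     "distilled-fast",     # high volume, cost-effective triage
    "validate": "frontier-reasoner",  # heavyweight reasoning for the debate
    "counter":  "independent-model",  # uncorrelated second opinion
}

def pick_model(stage):
    return STAGE_MODELS[stage]
```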

What Changes for Builders

If you build multi-agent systems - and I do - this validates a design pattern worth internalizing. Instead of routing tasks to the single best agent, route them to multiple agents with opposing perspectives and let structured disagreement surface what consensus would miss.

Most multi-agent orchestration today optimizes for agreement. MDASH optimizes for productive friction. The debate mechanism turns disagreement into evidence quality. That is a fundamentally different coordination primitive.
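The difference between the two primitives fits in a few lines. A sketch under my own naming, contrasting a majority vote with a disagreement-as-signal rule:

```python
def consensus(verdicts):
    # Typical orchestration: majority vote; disagreement is averaged away.
    return max(set(verdicts), key=verdicts.count)

def adversarial(verdicts):
    # Friction-based primitive: any split is itself the signal and escalates.
    return verdicts[0] if len(set(verdicts)) == 1 else "escalate"
```

Under consensus, one dissenting agent is noise to be outvoted; under the adversarial rule, that same dissent is exactly what routes the finding to deeper review.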

Where This Heads

Adversarial multi-agent architectures will move beyond security. Code review, compliance validation, spec verification - any domain where false positives are expensive and false negatives are dangerous. The agents that find the hardest bugs will not be the smartest ones. They will be the ones most willing to argue.
