AI Detector Leaderboard
Ranking based on 12 community battles
| Rank | Detector | Elo Score | Accuracy | W / L / T | Battles |
|---|---|---|---|---|---|
| 🥇 | | 1058 | 100% | 4 / 0 / 4 | 8 |
| 🥈 | | 1016 | 100% | 1 / 0 / 0 | 1 |
| 🥉 | | 1000 | 0% | 0 / 0 / 0 | 0 |
| 4 | | 1000 | 0% | 0 / 0 / 0 | 0 |
| 5 | | 1000 | 0% | 0 / 0 / 0 | 0 |
| 6 | | 1000 | 0% | 0 / 0 / 0 | 0 |
| 7 | | 1000 | 0% | 0 / 0 / 0 | 0 |
| 8 | | 998 | 80% | 1 / 1 / 3 | 5 |
| 9 | | 984 | 33% | 1 / 2 / 0 | 3 |
| 10 | | 972 | 50% | 0 / 2 / 2 | 4 |
| 11 | | 972 | 33% | 0 / 2 / 1 | 3 |
How the Elo Leaderboard Works
The Arena
In the Arena, users are shown an image — either AI-generated or a real photograph — along with the verdicts from two randomly selected detectors. The user votes for whichever detector gave the more accurate answer. This single vote becomes a "battle" that updates both detectors' Elo ratings.
Elo Rating System
The Elo system was originally developed for chess and is now widely used in competitive ranking. Every detector starts at a base rating of 1000. When a detector wins a battle against a higher-rated opponent, it gains more points than it would against a lower-rated opponent. This means the rankings self-correct over time — consistently accurate detectors rise, while inconsistent ones fall.
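The update described above can be sketched with the standard Elo formulas. This is a minimal illustration, not the site's actual implementation: the function names are hypothetical, the K-factor of 32 and the 400-point scale are common defaults assumed here, and scoring a tie as 0.5 is the usual chess convention rather than something the page states.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expectation that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return the new (rating_a, rating_b) after one battle.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a tie (assumed).
    k = 32 is an assumed K-factor; the source does not state one.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two detectors at the base rating of 1000; A wins.
print(update_elo(1000, 1000, 1.0))  # (1016.0, 984.0)

# Beating a higher-rated opponent earns more than beating a lower-rated one.
gain_vs_strong = update_elo(1000, 1100, 1.0)[0] - 1000
gain_vs_weak = update_elo(1000, 900, 1.0)[0] - 1000
print(gain_vs_strong > gain_vs_weak)  # True
```

Because each update is zero-sum, the average rating across all detectors stays at the base value, which is why untested detectors remain at exactly 1000.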
Leaderboard vs Benchmark
The Leaderboard and the Benchmark measure detector quality differently. The Benchmark runs automated tests on a curated dataset and calculates accuracy, false positive rate, and false negative rate. The Leaderboard reflects community judgment through head-to-head comparisons. A detector can rank differently on each — for example, a detector with high benchmark accuracy might lose Arena battles on edge cases that another detector handles better.
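For the Benchmark side, the three metrics named above can be computed from confusion-matrix counts. The sketch below assumes that "positive" means the detector labeled an image as AI-generated; the function name and the example counts are hypothetical and are not taken from the site's benchmark.

```python
def benchmark_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute accuracy, false positive rate, and false negative rate.

    Assumption: "positive" = image labeled AI-generated.
      tp: AI images flagged as AI      fn: AI images passed as real
      tn: real photos passed as real   fp: real photos flagged as AI
    """
    total = tp + fp + tn + fn
    real_images = fp + tn
    ai_images = tp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / real_images if real_images else 0.0,
        "false_negative_rate": fn / ai_images if ai_images else 0.0,
    }


# Hypothetical counts for a 100-image dataset (50 AI, 50 real).
metrics = benchmark_metrics(tp=45, fp=5, tn=45, fn=5)
print(metrics)
```

Unlike an Elo rating, these numbers depend only on the detector's own answers, not on which opponent it happened to be paired with.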
Win Rate vs Elo
Win rate (the Accuracy column) is the raw percentage of a detector's battles that it won or tied; a detector with no battles yet shows 0%. Elo rating also accounts for opponent strength: beating a strong detector is worth more than beating a weak one. Two detectors with the same win rate can therefore have different Elo scores, depending on which opponents they beat.
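The Accuracy figures in the table match a simple "wins plus ties over total battles" calculation. The sketch below uses a hypothetical function name, and returning 0 for a detector with no battles is an assumption chosen to match how the table displays untested detectors.

```python
def win_rate(wins: int, losses: int, ties: int) -> float:
    """Percentage of battles won or tied, as in the Accuracy column."""
    battles = wins + losses + ties
    if battles == 0:
        # Assumption: untested detectors are shown as 0%, as in the table.
        return 0.0
    return 100.0 * (wins + ties) / battles


# W / L / T values taken from the leaderboard rows above.
print(round(win_rate(4, 0, 4)))  # 100
print(round(win_rate(1, 1, 3)))  # 80
print(round(win_rate(1, 2, 0)))  # 33
print(round(win_rate(0, 2, 2)))  # 50
```

Opponent ratings never appear in this calculation, which is why win rate and Elo can disagree.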
