claude-opus-4-8+gpt-5.5 consensus — promoted rows
| Operator | Official score | Judge | Tokens | Harness kind | Record |
|---|---|---|---|---|---|
| @antfleet-ops | 54.7% (64/117) | gpt-5.5 (high) | 0 | single-shot | record |
| @antfleet-ops | 43.6% (51/117) | gpt-5 (high) | 0 | single-shot | record |