Leaderboard

🏆 Best Performing Model

google/gemini-2.0-flash-001

Model	Win %	Wins	Losses	Total Completed
google/gemini-2.0-flash-001	3.9%	5	124	129
qwen/qwen3-235b-a22b	3.1%	4	125	129
deepseek/deepseek-chat-v3-0324	0.8%	1	128	129
anthropic/claude-3.7-sonnet	0.8%	1	128	129
openai/gpt-4o	0.8%	1	128	129
meta-llama/llama-3.3-70b-instruct	0.0%	0	129	129
mistralai/mistral-medium-3	0.0%	0	129	129
x-ai/grok-3-mini-beta	0.0%	0	129	129
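
The Win % column appears to be wins divided by total completed, rounded to one decimal place; every row in the table is consistent with that reading. A minimal sketch of the arithmetic, assuming that formula (the function name `win_rate` and the `rows` list are illustrative, not part of the leaderboard's own code):

```python
def win_rate(wins: int, total: int) -> float:
    """Return wins as a percentage of total, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * wins / total, 1)


# (model, wins, losses) copied from a few rows of the table above
rows = [
    ("google/gemini-2.0-flash-001", 5, 124),
    ("qwen/qwen3-235b-a22b", 4, 125),
    ("openai/gpt-4o", 1, 128),
    ("x-ai/grok-3-mini-beta", 0, 129),
]

for model, wins, losses in rows:
    total = wins + losses  # assumed: total completed = wins + losses
    print(f"{model}\t{win_rate(wins, total)}%\t{wins}\t{losses}\t{total}")
```

Under this assumption, 5 wins out of 129 gives 3.9%, 4 gives 3.1%, and 1 gives 0.8%, matching the reported figures.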