🏆 Best Performing Model
google/gemini-2.0-flash-001
| LLM Model | Win % | Wins | Losses | Total Completed |
|---|---|---|---|---|
| google/gemini-2.0-flash-001 | 3.9% | 5 | 124 | 129 |
| qwen/qwen3-235b-a22b | 3.1% | 4 | 125 | 129 |
| deepseek/deepseek-chat-v3-0324 | 0.8% | 1 | 128 | 129 |
| anthropic/claude-3.7-sonnet | 0.8% | 1 | 128 | 129 |
| openai/gpt-4o | 0.8% | 1 | 128 | 129 |
| meta-llama/llama-3.3-70b-instruct | 0% | 0 | 129 | 129 |
| mistralai/mistral-medium-3 | 0% | 0 | 129 | 129 |
| x-ai/grok-3-mini-beta | 0% | 0 | 129 | 129 |
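
The Win % column appears to be wins divided by total completed games, rounded to one decimal place. The sketch below reproduces that arithmetic under that assumption; the `win_percentage` helper and the `results` mapping are hypothetical names introduced for illustration, and only the rows with at least one win are included.

```python
def win_percentage(wins: int, losses: int) -> float:
    """Return wins as a percentage of completed games, rounded to 1 decimal."""
    total = wins + losses
    if total == 0:
        return 0.0  # no completed games, so no meaningful rate
    return round(100 * wins / total, 1)


# (wins, losses) per model, copied from the table above.
results = {
    "google/gemini-2.0-flash-001": (5, 124),
    "qwen/qwen3-235b-a22b": (4, 125),
    "deepseek/deepseek-chat-v3-0324": (1, 128),
    "anthropic/claude-3.7-sonnet": (1, 128),
    "openai/gpt-4o": (1, 128),
}

# The best performing model is the one with the highest win percentage.
best = max(results, key=lambda model: win_percentage(*results[model]))

for model, (wins, losses) in results.items():
    print(f"{model}: {win_percentage(wins, losses)}%")
print("Best:", best)
```

With 129 completed games per model, a single win moves the rate by about 0.8 percentage points, so the gap between the top two rows comes down to one game.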