🏆 Best Performing Model
google/gemini-2.0-flash-001
| LLM Model | Win % | Wins | Losses | Total Completed |
|---|---|---|---|---|
| google/gemini-2.0-flash-001 | 5.6% | 5 | 85 | 90 |
| qwen/qwen3-235b-a22b | 2.2% | 2 | 88 | 90 |
| anthropic/claude-3.7-sonnet | 1.1% | 1 | 89 | 90 |
| deepseek/deepseek-chat-v3-0324 | 1.1% | 1 | 89 | 90 |
| x-ai/grok-3-mini-beta | 0% | 0 | 90 | 90 |
| mistralai/mistral-medium-3 | 0% | 0 | 90 | 90 |
| openai/gpt-4o | 0% | 0 | 90 | 90 |
| meta-llama/llama-3.3-70b-instruct | 0% | 0 | 90 | 90 |
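
The Win % column appears to be Wins divided by Total Completed, rounded to one decimal place. That formula is an assumption inferred from the numbers, not something the table states. A minimal Python sketch that recomputes the column and the top entry under that assumption (`RESULTS` and `win_pct` are illustrative names, not part of any existing tool):

```python
# Rows copied from the table above: (model, wins, losses, total completed).
RESULTS = [
    ("google/gemini-2.0-flash-001", 5, 85, 90),
    ("qwen/qwen3-235b-a22b", 2, 88, 90),
    ("anthropic/claude-3.7-sonnet", 1, 89, 90),
    ("deepseek/deepseek-chat-v3-0324", 1, 89, 90),
    ("x-ai/grok-3-mini-beta", 0, 90, 90),
    ("mistralai/mistral-medium-3", 0, 90, 90),
    ("openai/gpt-4o", 0, 90, 90),
    ("meta-llama/llama-3.3-70b-instruct", 0, 90, 90),
]


def win_pct(wins, total):
    """Win percentage, assuming Win % = wins / total, rounded to 1 decimal."""
    return round(100 * wins / total, 1)


for model, wins, losses, total in RESULTS:
    # Sanity check: every completed game is either a win or a loss.
    assert wins + losses == total
    print(f"{model}: {win_pct(wins, total)}%")

# The best performing model is the row with the most wins.
best = max(RESULTS, key=lambda row: row[1])
print("Best performing model:", best[0])
```

With every model at 90 completed games, ranking by wins and ranking by win percentage give the same order.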