Agents

The AI agent reasoning and tool selection benchmark tests a model's planning, tool choice, and self-diagnosis. These skills are crucial for autonomous ticket triage systems and complex agent workflows, where a model must decide which tools to use, plan multi-step processes, and diagnose failures. These are among the most challenging tasks because they require open-ended reasoning and decision-making.

Model: fireworks/qwen3-coder-480b-a35b-instruct
Score: 67
Average duration: 6s
Average tokens: 2175
Average cost: $0.00

Sample                    Score  Duration  Tokens
opper_agents_sample_01      100        5s    1168
opper_agents_sample_02      100        7s    1476
opper_agents_sample_03       50        4s    1256
opper_agents_sample_04      100        8s    1790
opper_agents_sample_05       50        4s    1486
opper_agents_sample_06       50        5s    1574
opper_agents_sample_07       50        4s    1675
opper_agents_sample_08      100        6s    1885
opper_agents_sample_09      100        5s    3074
opper_agents_sample_10      100        4s    2956
opper_agents_sample_11      100        5s    2864
opper_agents_sample_12        0        8s    1308
opper_agents_sample_13      100        6s    1436
opper_agents_sample_14      100        4s    1840
opper_agents_sample_15        0        6s    1573
opper_agents_sample_16      100        4s    1309
opper_agents_sample_17        0        7s    1561
opper_agents_sample_18      100        5s    1585
opper_agents_sample_19      100        5s    3618
opper_agents_sample_20      100        5s    3568
opper_agents_sample_21      100       13s    4772
opper_agents_sample_22        0        7s    2988
opper_agents_sample_23        0        7s    3834
opper_agents_sample_24        0        4s    1610
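
The headline figures appear to be consistent with plain averages of the 24 per-sample values. The sketch below checks that arithmetic. It assumes an unweighted mean rounded to the nearest integer; the page does not state how the summary is actually computed, and the variable names are illustrative only.

```python
# Per-sample (score, duration in seconds, tokens), copied in order from
# opper_agents_sample_01 through opper_agents_sample_24.
samples = [
    (100, 5, 1168), (100, 7, 1476), (50, 4, 1256), (100, 8, 1790),
    (50, 4, 1486), (50, 5, 1574), (50, 4, 1675), (100, 6, 1885),
    (100, 5, 3074), (100, 4, 2956), (100, 5, 2864), (0, 8, 1308),
    (100, 6, 1436), (100, 4, 1840), (0, 6, 1573), (100, 4, 1309),
    (0, 7, 1561), (100, 5, 1585), (100, 5, 3618), (100, 5, 3568),
    (100, 13, 4772), (0, 7, 2988), (0, 7, 3834), (0, 4, 1610),
]


def mean(values):
    """Unweighted arithmetic mean (assumed aggregation, not documented)."""
    return sum(values) / len(values)


# Split the rows into three parallel columns.
scores, durations, tokens = zip(*samples)

# Round to the nearest integer, matching the precision shown in the summary.
avg_score = round(mean(scores))        # 1600 / 24 = 66.67
avg_duration = round(mean(durations))  # 138 / 24 = 5.75
avg_tokens = round(mean(tokens))       # 52206 / 24 = 2175.25

print(avg_score, avg_duration, avg_tokens)  # prints: 67 6 2175
```

Under these assumptions the results match the reported score of 67, average duration of 6s, and average of 2175 tokens. Of the 24 samples, 14 scored 100, 4 scored 50, and 6 scored 0.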