Agents

Agent reasoning and tool selection tasks test planning, tool choice, and self-diagnosis. These skills are crucial for autonomous ticket triage systems and complex agent workflows, where a model must decide which tools to use, plan multi-step processes, and diagnose failures when something goes wrong. These are among the most challenging tasks because they require open-ended reasoning and decision-making.

Model: cerebras/qwen-3-32b
Overall score: 75
Average duration: 4s
Average tokens: 2874
Average cost: $0.00
Sample                    Score   Duration   Tokens
opper_agents_sample_01    100     6s         1516
opper_agents_sample_02    100     3s         1612
opper_agents_sample_03    50      3s         1595
opper_agents_sample_04    100     3s         1615
opper_agents_sample_05    100     4s         2030
opper_agents_sample_06    100     4s         2141
opper_agents_sample_07    100     3s         2149
opper_agents_sample_08    100     4s         2904
opper_agents_sample_09    100     4s         3247
opper_agents_sample_10    0       4s         3361
opper_agents_sample_11    100     3s         3469
opper_agents_sample_12    100     3s         1584
opper_agents_sample_13    0       4s         3309
opper_agents_sample_14    100     3s         2020
opper_agents_sample_15    100     4s         2440
opper_agents_sample_16    100     4s         2014
opper_agents_sample_17    50      4s         2251
opper_agents_sample_18    100     4s         2550
opper_agents_sample_19    100     3s         3652
opper_agents_sample_20    100     3s         3827
opper_agents_sample_21    50      6s         6898
opper_agents_sample_22    50      6s         6344
opper_agents_sample_23    0       4s         4161
opper_agents_sample_24    0       10s        2284
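The summary figures appear to be plain means over the 24 per-sample results above. The sketch below recomputes them in Python. It assumes that the first number listed for each sample is its score (0, 50, or 100), which the page does not label explicitly, and that the displayed averages are rounded to whole numbers. The names `ROWS` and `summarize` are illustrative only and are not part of any benchmark tooling.

```python
# (sample id, score, duration in seconds, tokens), copied from the results above.
# Reading the first number of each sample as its score is an assumption.
ROWS = [
    ("opper_agents_sample_01", 100, 6, 1516),
    ("opper_agents_sample_02", 100, 3, 1612),
    ("opper_agents_sample_03", 50, 3, 1595),
    ("opper_agents_sample_04", 100, 3, 1615),
    ("opper_agents_sample_05", 100, 4, 2030),
    ("opper_agents_sample_06", 100, 4, 2141),
    ("opper_agents_sample_07", 100, 3, 2149),
    ("opper_agents_sample_08", 100, 4, 2904),
    ("opper_agents_sample_09", 100, 4, 3247),
    ("opper_agents_sample_10", 0, 4, 3361),
    ("opper_agents_sample_11", 100, 3, 3469),
    ("opper_agents_sample_12", 100, 3, 1584),
    ("opper_agents_sample_13", 0, 4, 3309),
    ("opper_agents_sample_14", 100, 3, 2020),
    ("opper_agents_sample_15", 100, 4, 2440),
    ("opper_agents_sample_16", 100, 4, 2014),
    ("opper_agents_sample_17", 50, 4, 2251),
    ("opper_agents_sample_18", 100, 4, 2550),
    ("opper_agents_sample_19", 100, 3, 3652),
    ("opper_agents_sample_20", 100, 3, 3827),
    ("opper_agents_sample_21", 50, 6, 6898),
    ("opper_agents_sample_22", 50, 6, 6344),
    ("opper_agents_sample_23", 0, 4, 4161),
    ("opper_agents_sample_24", 0, 10, 2284),
]


def summarize(rows):
    """Return the unrounded mean score, duration, and token count."""
    n = len(rows)
    return {
        "score": sum(r[1] for r in rows) / n,
        "avg_duration_s": sum(r[2] for r in rows) / n,
        "avg_tokens": sum(r[3] for r in rows) / n,
    }


summary = summarize(ROWS)
print(summary)
# {'score': 75.0, 'avg_duration_s': 4.125, 'avg_tokens': 2873.875}
```

Rounded to whole numbers, these means give 75, 4s, and 2874, matching the reported summary. The four samples scored 0 (10, 13, 23, 24) and the four scored 50 (3, 17, 21, 22) account for the entire gap to 100.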