Agents

The AI agent reasoning and tool selection category tests planning, tool choice, and self-diagnosis. These skills are crucial for autonomous ticket triage systems and complex agent workflows, where models must decide which tools to use, plan multi-step processes, and diagnose failures. These are among the most challenging tasks because they require open-ended reasoning and decision-making.

Model: cerebras/llama-4-maverick-17b-128e-instruct
Score: 60
Average duration: 3s
Average tokens: 2037
Average cost: $0.00
Score  Duration  Tokens  Sample
100    4s        1107    opper_agents_sample_01
100    3s        1317    opper_agents_sample_02
50     3s        1281    opper_agents_sample_03
100    3s        1515    opper_agents_sample_04
100    3s        1450    opper_agents_sample_05
50     3s        1619    opper_agents_sample_06
50     3s        1696    opper_agents_sample_07
50     3s        1733    opper_agents_sample_08
0      3s        2937    opper_agents_sample_09
0      3s        2844    opper_agents_sample_10
50     3s        2712    opper_agents_sample_11
100    3s        1333    opper_agents_sample_12
100    3s        1401    opper_agents_sample_13
100    3s        1750    opper_agents_sample_14
0      4s        1458    opper_agents_sample_15
100    3s        1303    opper_agents_sample_16
0      3s        1402    opper_agents_sample_17
100    3s        1469    opper_agents_sample_18
100    3s        3501    opper_agents_sample_19
100    3s        3506    opper_agents_sample_20
100    3s        3617    opper_agents_sample_21
0      3s        2687    opper_agents_sample_22
0      3s        3667    opper_agents_sample_23
0      3s        1571    opper_agents_sample_24
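The headline figures can be checked against the per-sample results. The sketch below assumes, although the page does not say so, that the overall score and the average token count are plain arithmetic means of the per-sample values, rounded half up for display. The names `samples` and `round_half_up` are illustrative only.

```python
import math

# (score, tokens) for opper_agents_sample_01 through opper_agents_sample_24,
# copied in order from the per-sample results.
samples = [
    (100, 1107), (100, 1317), (50, 1281), (100, 1515),
    (100, 1450), (50, 1619), (50, 1696), (50, 1733),
    (0, 2937), (0, 2844), (50, 2712), (100, 1333),
    (100, 1401), (100, 1750), (0, 1458), (100, 1303),
    (0, 1402), (100, 1469), (100, 3501), (100, 3506),
    (100, 3617), (0, 2687), (0, 3667), (0, 1571),
]


def round_half_up(x: float) -> int:
    # Assumption: displayed values round halves upward. Python's built-in
    # round() rounds halves to the nearest even integer instead.
    return math.floor(x + 0.5)


scores = [score for score, _ in samples]
tokens = [tok for _, tok in samples]

mean_score = sum(scores) / len(scores)    # 1450 / 24
mean_tokens = sum(tokens) / len(tokens)   # 48876 / 24 = 2036.5

# How many samples landed on each score value.
score_counts = {value: scores.count(value) for value in (100, 50, 0)}

print(round_half_up(mean_score), round_half_up(mean_tokens))  # prints: 60 2037
print(score_counts)  # prints: {100: 12, 50: 5, 0: 7}
```

Under these assumptions the means reproduce the reported score of 60 and average of 2037 tokens. Half of the 24 samples score 100, five score 50, and seven score 0.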