Context Reasoning

Context understanding and reasoning tasks test whether a model can answer accurately using only the context it is given. This capability is essential for knowledge-base support bots, policy lookup systems, and internal knowledge Q&A applications. Models are scored on whether their answers are properly grounded in the provided context rather than hallucinated.
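As a rough illustration of how such grounded answers might be graded, here is a minimal sketch of a fact-overlap grader that snaps to the 0/50/100 scores seen in the results below. This is an assumption for illustration only: the function and field names (`grade_answer`, `reference_facts`) are hypothetical, and the actual benchmark likely uses a more sophisticated judge.

```python
# Minimal sketch of a grounded-context grader (illustrative only).
# Assumption: each sample carries a list of reference facts that a
# correct, grounded answer must mention; scores snap to 0 / 50 / 100.

def grade_answer(answer: str, reference_facts: list[str]) -> int:
    """Score 0-100 by the fraction of reference facts the answer covers."""
    if not reference_facts:
        return 0
    hits = sum(1 for fact in reference_facts if fact.lower() in answer.lower())
    ratio = hits / len(reference_facts)
    if ratio == 1:
        return 100
    if ratio >= 0.5:
        return 50
    return 0

# Hypothetical sample in the style of a context-grounded Q&A task.
sample = {
    "context": "Refund requests must be filed within 30 days of purchase.",
    "question": "How long do customers have to request a refund?",
    "reference_facts": ["30 days"],
}
answer = "Customers may request a refund within 30 days of purchase."
print(grade_answer(answer, sample["reference_facts"]))  # 100
```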

**Model:** cerebras/qwen-3-32b

| Overall score | Average duration | Average tokens | Average cost |
|---------------|------------------|----------------|--------------|
| 68            | 4s               | 4120           | $0.00        |

| Sample                 | Score | Duration | Tokens |
|------------------------|-------|----------|--------|
| opper_context_sample_01 | 0   | 6s  | 7338 |
| opper_context_sample_02 | 100 | 4s  | 3538 |
| opper_context_sample_03 | 100 | 3s  | 3498 |
| opper_context_sample_04 | 0   | 5s  | 7981 |
| opper_context_sample_05 | 50  | 3s  | 3826 |
| opper_context_sample_06 | 100 | 5s  | 3507 |
| opper_context_sample_07 | 50  | 6s  | 4050 |
| opper_context_sample_08 | 100 | 3s  | 3466 |
| opper_context_sample_09 | 100 | 3s  | 3567 |
| opper_context_sample_10 | 100 | 3s  | 3484 |
| opper_context_sample_11 | 100 | 4s  | 4273 |
| opper_context_sample_12 | 100 | 5s  | 3992 |
| opper_context_sample_13 | 100 | 8s  | 3943 |
| opper_context_sample_14 | 100 | 3s  | 3917 |
| opper_context_sample_15 | 100 | 7s  | 3790 |
| opper_context_sample_16 | 100 | 3s  | 3942 |
| opper_context_sample_17 | 100 | 2s  | 3975 |
| opper_context_sample_18 | 100 | 3s  | 7074 |
| opper_context_sample_19 | 100 | 3s  | 7202 |
| opper_context_sample_20 | 100 | 3s  | 7112 |
| opper_context_sample_21 | 50  | 3s  | 7494 |
| opper_context_sample_22 | 100 | 3s  | 7156 |
| opper_context_sample_23 | 100 | 14s | 7423 |
| opper_context_sample_24 | 100 | 7s  | 8065 |
| opper_context_sample_25 | 0   | N/A | 0    |
| opper_context_sample_26 | 0   | N/A | 0    |
| opper_context_sample_27 | 0   | N/A | 0    |
| opper_context_sample_28 | 0   | N/A | 0    |
| opper_context_sample_29 | 0   | N/A | 0    |
| opper_context_sample_30 | 0   | N/A | 0    |
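The headline averages are consistent with the per-sample rows: a quick check, sketched below, reproduces the overall score of 68 and the average of 4120 tokens (samples 25–30, which produced no output, count as 0).

```python
# Sketch: recompute the headline averages from the per-sample data above.
# Scores for samples 01-30, in order (samples 08-20 all scored 100,
# samples 25-30 all scored 0 with no tokens produced).
scores = [0, 100, 100, 0, 50, 100, 50] + [100] * 13 + [50, 100, 100, 100] + [0] * 6
tokens = [7338, 3538, 3498, 7981, 3826, 3507, 4050, 3466, 3567, 3484,
          4273, 3992, 3943, 3917, 3790, 3942, 3975, 7074, 7202, 7112,
          7494, 7156, 7423, 8065] + [0] * 6

assert len(scores) == len(tokens) == 30
print(round(sum(scores) / len(scores)))  # 68
print(round(sum(tokens) / len(tokens)))  # 4120
```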