Context Reasoning

Context understanding and reasoning tasks test whether a model can answer questions accurately using only the provided context, without hallucinating information that is not there. This capability is essential for knowledge-base support bots, policy lookup systems, and internal knowledge Q&A applications.
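
The grounding criterion described above can be illustrated with a minimal sketch. This is not the actual evaluation harness used here; it is a hypothetical heuristic that treats an answer sentence as grounded only if all of its content words appear in the supplied context, a rough proxy for detecting hallucinated claims.

```python
import string

def _content_words(text: str) -> list[str]:
    # Lowercase, strip punctuation, and drop short function words.
    words = [w.strip(string.punctuation).lower() for w in text.split()]
    return [w for w in words if len(w) > 3]

def grounding_score(context: str, answer: str) -> float:
    """Fraction of answer sentences whose content words all appear in the context."""
    vocab = set(_content_words(context))
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0
    grounded = sum(
        1 for s in sentences
        if (ws := _content_words(s)) and all(w in vocab for w in ws)
    )
    return grounded / len(sentences)

context = "The refund policy allows returns within thirty days of purchase."
print(grounding_score(context, "The policy allows returns within thirty days."))  # 1.0
print(grounding_score(context, "Refunds take ninety business days."))             # 0.0
```

Real graders typically use an LLM judge or entailment model rather than word overlap, but the pass/fail shape is the same: an answer scores well only when its claims are supported by the given context.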

Model: cerebras/qwen-3-32b

| Metric           | Value |
|------------------|-------|
| Score            | 0     |
| Average duration | 5s    |
| Average tokens   | 3912  |
| Average cost     | $0.00 |
| Sample                  | Score | Duration | Tokens |
|-------------------------|-------|----------|--------|
| opper_context_sample_01 | 0     | 41s      | 3865   |
| opper_context_sample_02 | 0     | 2s       | 3557   |
| opper_context_sample_03 | 0     | 22s      | 3519   |
| opper_context_sample_04 | 0     | 11s      | 6471   |
| opper_context_sample_05 | 0     | 14s      | 3711   |
| opper_context_sample_06 | 0     | 8s       | 3408   |
| opper_context_sample_07 | 0     | 5s       | 3425   |
| opper_context_sample_08 | 0     | 2s       | 3604   |
| opper_context_sample_09 | 0     | 2s       | 4078   |
| opper_context_sample_10 | 0     | 6s       | 3488   |
| opper_context_sample_11 | 0     | 2s       | 3986   |
| opper_context_sample_12 | 0     | 4s       | 3987   |
| opper_context_sample_13 | 0     | 2s       | 3945   |
| opper_context_sample_14 | 0     | 2s       | 3974   |
| opper_context_sample_15 | 0     | 3s       | 3820   |
| opper_context_sample_16 | 0     | 2s       | 3821   |
| opper_context_sample_17 | 0     | 2s       | 3981   |
| opper_context_sample_18 | 0     | 3s       | 7152   |
| opper_context_sample_19 | 0     | 2s       | 7274   |
| opper_context_sample_20 | 0     | 3s       | 7438   |
| opper_context_sample_21 | 0     | 3s       | 7394   |
| opper_context_sample_22 | 0     | 3s       | 7027   |
| opper_context_sample_23 | 0     | 3s       | 7184   |
| opper_context_sample_24 | 0     | 2s       | 7253   |
| opper_context_sample_25 | 0     | N/A      | 0      |
| opper_context_sample_26 | 0     | N/A      | 0      |
| opper_context_sample_27 | 0     | N/A      | 0      |
| opper_context_sample_28 | 0     | N/A      | 0      |
| opper_context_sample_29 | 0     | N/A      | 0      |
| opper_context_sample_30 | 0     | N/A      | 0      |