Context Reasoning

Context understanding and reasoning tasks test whether a model can produce accurate answers grounded in the provided context. This capability is essential for knowledge-base support bots, policy lookup systems, and internal knowledge Q&A applications. Models are evaluated on how well their answers stay grounded in the given context rather than hallucinating information.
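The benchmark's actual harness is not shown on this page; as a rough illustration of what such a check involves, the sketch below scores a single answer on the 0/50/100 scale used in the results table. The word-overlap heuristic, the 0.8 threshold, and the helper names are assumptions for illustration, not the benchmark's implementation.

```python
# Minimal sketch of a context-grounded QA check (illustrative only, not the real harness).
# Each sample is assumed to provide a context passage, a question, and a reference answer;
# the model's answer is mapped onto the 0/50/100 scale that appears in the table below.

def grounding_score(context: str, answer: str) -> float:
    """Fraction of the answer's word tokens that also occur in the context (naive proxy)."""
    ctx_tokens = set(context.lower().split())
    ans_tokens = [t for t in answer.lower().split() if t.isalpha()]
    if not ans_tokens:
        return 0.0
    return sum(t in ctx_tokens for t in ans_tokens) / len(ans_tokens)

def score_sample(context: str, answer: str, reference: str) -> int:
    """Score one answer: 100 = correct and grounded, 50 = partially so, 0 = wrong/hallucinated."""
    grounded = grounding_score(context, answer) >= 0.8  # 0.8 threshold is an arbitrary choice
    correct = reference.lower() in answer.lower()
    if correct and grounded:
        return 100
    if correct or grounded:
        return 50
    return 0

# Toy usage example with made-up data:
context = "Refunds are accepted within 30 days of purchase with a valid receipt."
answer = "You can get a refund within 30 days of purchase if you have a valid receipt."
print(score_sample(context, answer, "30 days"))
```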

Model: cerebras/llama-4-maverick-17b-128e-instruct
Score: 50
Average duration: 3s
Average tokens: 3162
Average cost: $0.00
| Sample | Score | Duration | Tokens |
|---|---|---|---|
| opper_context_sample_01 | 0 | 3s | 3047 |
| opper_context_sample_02 | 0 | 3s | 2859 |
| opper_context_sample_03 | 100 | 3s | 2886 |
| opper_context_sample_04 | 0 | 4s | 3147 |
| opper_context_sample_05 | 0 | 3s | 2951 |
| opper_context_sample_06 | 100 | 3s | 2862 |
| opper_context_sample_07 | 50 | 3s | 2917 |
| opper_context_sample_08 | 100 | 3s | 2918 |
| opper_context_sample_09 | 100 | 3s | 2929 |
| opper_context_sample_10 | 50 | 3s | 2939 |
| opper_context_sample_11 | 100 | 3s | 3269 |
| opper_context_sample_12 | 100 | 3s | 3264 |
| opper_context_sample_13 | 100 | 3s | 3245 |
| opper_context_sample_14 | 50 | 3s | 3222 |
| opper_context_sample_15 | 100 | 3s | 3192 |
| opper_context_sample_16 | 50 | 3s | 3217 |
| opper_context_sample_17 | 50 | 3s | 3279 |
| opper_context_sample_18 | 100 | 3s | 6048 |
| opper_context_sample_19 | 100 | 4s | 6088 |
| opper_context_sample_20 | 0 | 4s | 6131 |
| opper_context_sample_21 | 50 | 3s | 6275 |
| opper_context_sample_22 | 0 | 21s | 6032 |
| opper_context_sample_23 | 100 | 4s | 6076 |
| opper_context_sample_24 | 100 | 3s | 6081 |
| opper_context_sample_25 | 0 | N/A | 0 |
| opper_context_sample_26 | 0 | N/A | 0 |
| opper_context_sample_27 | 0 | N/A | 0 |
| opper_context_sample_28 | 0 | N/A | 0 |
| opper_context_sample_29 | 0 | N/A | 0 |
| opper_context_sample_30 | 0 | N/A | 0 |
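The summary figures follow directly from the per-sample rows. The sketch below shows that aggregation, assuming the failed samples (duration N/A, 0 tokens) score 0 and are still counted in every average; only a few rows are spelled out here.

```python
# How the headline averages relate to the per-sample rows (sketch; assumes failed runs
# count as 0 toward every average, which is consistent with the reported figures).

rows = [
    # (sample, score, duration_s, tokens); None marks a failed run (duration N/A)
    ("opper_context_sample_01", 0, 3, 3047),
    ("opper_context_sample_03", 100, 3, 2886),
    ("opper_context_sample_25", 0, None, 0),
    # ... remaining rows from the table above
]

n = len(rows)
avg_score = sum(score for _, score, _, _ in rows) / n
avg_duration = sum(d or 0 for _, _, d, _ in rows) / n   # failed runs contribute 0 s
avg_tokens = sum(tokens for _, _, _, tokens in rows) / n

print(f"score={avg_score:.0f}  duration={avg_duration:.0f}s  tokens={avg_tokens:.0f}")
```

With all 30 rows filled in, this yields the reported score of 50 and an average of roughly 3162 tokens; how the harness actually treats the durations of the six failed samples is not stated, so the zero-duration convention above is an assumption.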