Context Reasoning

Context understanding and reasoning tasks test a model's ability to produce accurate answers grounded in the provided context rather than hallucinated from parametric knowledge. This capability is essential for knowledge-base support bots, policy lookup systems, and internal knowledge Q&A applications.
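As a concrete illustration of what "grounded in context" means, the sketch below scores an answer by lexical overlap with the supplied context. This is a naive heuristic of our own, not the grading method used in this benchmark: an answer whose content words all appear in the context scores 100, while answers that introduce unsupported facts score lower.

```python
# Minimal sketch of a context-grounding check (naive lexical-overlap
# heuristic, NOT the benchmark's actual grader): an answer counts as
# grounded to the extent its content words appear in the context.
import re

STOPWORDS = {"the", "a", "an", "is", "are", "was", "were", "of",
             "in", "on", "to", "and", "or", "it", "that", "this"}

def content_words(text: str) -> set[str]:
    """Lowercase word tokens with common stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS}

def grounding_score(answer: str, context: str) -> int:
    """0-100: share of the answer's content words found in the context."""
    words = content_words(answer)
    if not words:
        return 0
    overlap = len(words & content_words(context))
    return round(100 * overlap / len(words))

context = "The refund window for annual plans is 30 days from purchase."
print(grounding_score("The refund window is 30 days.", context))  # → 100
print(grounding_score("The refund window is 90 days.", context))  # → 75
```

Real graders typically use an LLM judge or reference answers instead of word overlap, which is why partial-credit scores such as 50 appear in the results below.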

mistral/mistral-small-eu

Overall score: 32
Average duration: 6s
Average tokens: 4168
Average cost: $0.00
| Sample | Score | Duration | Tokens |
|---|---|---|---|
| opper_context_sample_01 | 0 | 5s | 3430 |
| opper_context_sample_02 | 0 | 10s | 3395 |
| opper_context_sample_03 | 100 | 5s | 3416 |
| opper_context_sample_04 | 0 | 8s | 3519 |
| opper_context_sample_05 | 50 | 4s | 3425 |
| opper_context_sample_06 | 0 | 8s | 3570 |
| opper_context_sample_07 | 0 | 4s | 3443 |
| opper_context_sample_08 | 0 | 3s | 3442 |
| opper_context_sample_09 | 100 | 4s | 3464 |
| opper_context_sample_10 | 50 | 4s | 3469 |
| opper_context_sample_11 | 50 | 17s | 4489 |
| opper_context_sample_12 | 100 | 4s | 4531 |
| opper_context_sample_13 | 100 | 4s | 4436 |
| opper_context_sample_14 | 50 | 4s | 4492 |
| opper_context_sample_15 | 100 | 5s | 4437 |
| opper_context_sample_16 | 100 | 4s | 4366 |
| opper_context_sample_17 | 50 | 3s | 4485 |
| opper_context_sample_18 | 0 | 4s | 8472 |
| opper_context_sample_19 | 0 | 6s | 8439 |
| opper_context_sample_20 | 0 | 14s | 8441 |
| opper_context_sample_21 | 0 | 8s | 8481 |
| opper_context_sample_22 | 0 | 15s | 8437 |
| opper_context_sample_23 | 0 | 14s | 8461 |
| opper_context_sample_24 | 100 | 8s | 8489 |
| opper_context_sample_25 | 0 | N/A | 0 |
| opper_context_sample_26 | 0 | N/A | 0 |
| opper_context_sample_27 | 0 | N/A | 0 |
| opper_context_sample_28 | 0 | N/A | 0 |
| opper_context_sample_29 | 0 | N/A | 0 |
| opper_context_sample_30 | 0 | N/A | 0 |