Context Reasoning

Context understanding and reasoning tasks test whether a model gives accurate answers grounded in the provided context. This capability is essential for knowledge-base support bots, policy lookup systems, and internal knowledge Q&A applications. Models are scored on whether their answers are supported by the given context rather than hallucinated.

Model: fireworks/qwen3-coder-480b-a35b-instruct
Score: 78
Average duration: 8s
Average tokens: 21208
Average cost: $0.00
Sample                    Score  Duration  Tokens
opper_context_sample_01       0       10s    3648
opper_context_sample_02      50        4s    3133
opper_context_sample_03     100        4s    3155
opper_context_sample_04       0        9s    3630
opper_context_sample_05       0        5s    3226
opper_context_sample_06     100        4s    3151
opper_context_sample_07      50        8s    3417
opper_context_sample_08     100        5s    3224
opper_context_sample_09     100        5s    3185
opper_context_sample_10     100        7s    3204
opper_context_sample_11     100        6s    3790
opper_context_sample_12     100        5s    3680
opper_context_sample_13     100        5s    3651
opper_context_sample_14     100        5s    3656
opper_context_sample_15     100        4s    3600
opper_context_sample_16     100        6s    3643
opper_context_sample_17     100        7s    3726
opper_context_sample_18     100       14s    6813
opper_context_sample_19     100       11s    6877
opper_context_sample_20       0        5s    6864
opper_context_sample_21      50        9s    7245
opper_context_sample_22     100        4s    6804
opper_context_sample_23     100        6s    6892
opper_context_sample_24     100        7s    6979
opper_context_sample_25     100       12s   88129
opper_context_sample_26     100       13s   88202
opper_context_sample_27     100       24s   88124
opper_context_sample_28     100       14s   88163
opper_context_sample_29       0       15s   88305
opper_context_sample_30     100       11s   88116
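
As a sanity check, the summary figures can be recomputed from the per-sample results above. The sketch below is illustrative only: it assumes each sample contributes a (score, duration in seconds, tokens) triple and that the summary values are unweighted means rounded to the nearest integer. The names `ROWS` and `summarize` are hypothetical and not part of any benchmark tooling.

```python
from statistics import mean

# (score, duration_seconds, tokens) for opper_context_sample_01 .. _30,
# transcribed by hand from the per-sample results above.
ROWS = [
    (0, 10, 3648), (50, 4, 3133), (100, 4, 3155), (0, 9, 3630), (0, 5, 3226),
    (100, 4, 3151), (50, 8, 3417), (100, 5, 3224), (100, 5, 3185), (100, 7, 3204),
    (100, 6, 3790), (100, 5, 3680), (100, 5, 3651), (100, 5, 3656), (100, 4, 3600),
    (100, 6, 3643), (100, 7, 3726), (100, 14, 6813), (100, 11, 6877), (0, 5, 6864),
    (50, 9, 7245), (100, 4, 6804), (100, 6, 6892), (100, 7, 6979), (100, 12, 88129),
    (100, 13, 88202), (100, 24, 88124), (100, 14, 88163), (0, 15, 88305), (100, 11, 88116),
]


def summarize(rows):
    """Return (mean score, mean duration, mean tokens), each rounded.

    Assumption: the page reports plain unweighted means over all samples.
    """
    scores, durations, tokens = zip(*rows)
    return round(mean(scores)), round(mean(durations)), round(mean(tokens))


if __name__ == "__main__":
    print(summarize(ROWS))  # (78, 8, 21208)
```

Under these assumptions the result agrees with the reported summary: a score of 78, an average duration of 8s, and 21208 average tokens. The six samples with roughly 88,000 tokens each account for most of the token average.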