SQL

Natural-language-to-SQL query generation tests text-to-query fidelity and schema reasoning. The task is particularly relevant for analytics chat assistants and simplified database interfaces, where users query data in natural language. Models must understand both the intent behind a question and the structure of the underlying database schema.
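One common way to check text-to-query fidelity is to run the generated query and a reference query against the same database and compare the results. The sketch below illustrates that idea with Python's built-in `sqlite3` module. The schema, the question, both queries, and the `execution_match` helper are hypothetical, and this is not necessarily how the scores on this page were produced.

```python
import sqlite3


def run_query(conn, sql):
    """Run a query and return its rows sorted, so row order does not matter."""
    return sorted(conn.execute(sql).fetchall())


def execution_match(conn, generated_sql, reference_sql):
    """Return True if both queries run and produce the same set of rows."""
    try:
        return run_query(conn, generated_sql) == run_query(conn, reference_sql)
    except sqlite3.Error:
        # Invalid SQL (for example, a column that is not in the schema) fails.
        return False


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada', 'SE'), (2, 'Bo', 'NO');
INSERT INTO orders VALUES (1, 1, 40.0), (2, 1, 60.0), (3, 2, 25.0);
""")

question = "What is the total order value per customer?"

# Hypothetical reference answer for the question above.
reference_sql = """
SELECT c.name, SUM(o.total) FROM customers c
JOIN orders o ON o.customer_id = c.id GROUP BY c.name
"""

# Hypothetical model output: different wording, same result.
generated_sql = """
SELECT name, SUM(total) FROM orders
JOIN customers ON customers.id = orders.customer_id GROUP BY name
"""

print(execution_match(conn, generated_sql, reference_sql))
```

Comparing results rather than query text lets two differently written but equivalent queries count as a match, while a query that misreads the schema or the intent of the question fails.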

Model: xai/grok-3
Score: 92
Average duration: 12s
Average tokens: 882
Average cost: $0.00
Sample                 Score  Duration  Tokens
opper_sql_sample_01    100    11s       602
opper_sql_sample_02    100    12s       617
opper_sql_sample_03    100    12s       636
opper_sql_sample_04    100    11s       614
opper_sql_sample_05    100    11s       682
opper_sql_sample_06    100    14s       693
opper_sql_sample_07    100    11s       609
opper_sql_sample_08    100    14s       624
opper_sql_sample_09    100    11s       631
opper_sql_sample_10    100    11s       674
opper_sql_sample_11    100    11s       828
opper_sql_sample_12    100    14s       855
opper_sql_sample_13    100    11s       866
opper_sql_sample_14    100    14s       888
opper_sql_sample_15    100    14s       905
opper_sql_sample_16    100    16s       952
opper_sql_sample_17    75     14s       892
opper_sql_sample_18    75     14s       978
opper_sql_sample_19    50     14s       888
opper_sql_sample_20    75     14s       1004
opper_sql_sample_21    100    6s        1019
opper_sql_sample_22    100    13s       1078
opper_sql_sample_23    100    17s       1141
opper_sql_sample_24    75     13s       1110
opper_sql_sample_25    50     13s       1165
opper_sql_sample_26    100    12s       1074
opper_sql_sample_27    100    12s       1070
opper_sql_sample_28    75     12s       1184
opper_sql_sample_29    75     12s       1160
opper_sql_sample_30    100    5s        1014
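
The summary figures can be cross-checked against the per-sample results. The short sketch below recomputes the averages from the 30 samples listed above. It assumes the headline score of 92 is the rounded mean of the per-sample scores, which the page does not state explicitly; the variable names are illustrative.

```python
from statistics import mean

# (score, duration in seconds, tokens) for opper_sql_sample_01 to _30,
# copied from the per-sample results above.
results = [
    (100, 11, 602), (100, 12, 617), (100, 12, 636), (100, 11, 614),
    (100, 11, 682), (100, 14, 693), (100, 11, 609), (100, 14, 624),
    (100, 11, 631), (100, 11, 674), (100, 11, 828), (100, 14, 855),
    (100, 11, 866), (100, 14, 888), (100, 14, 905), (100, 16, 952),
    (75, 14, 892), (75, 14, 978), (50, 14, 888), (75, 14, 1004),
    (100, 6, 1019), (100, 13, 1078), (100, 17, 1141), (75, 13, 1110),
    (50, 13, 1165), (100, 12, 1074), (100, 12, 1070), (75, 12, 1184),
    (75, 12, 1160), (100, 5, 1014),
]

# Assumption: each summary value is the per-sample mean, rounded to an integer.
avg_score = round(mean(r[0] for r in results))
avg_duration = round(mean(r[1] for r in results))
avg_tokens = round(mean(r[2] for r in results))

# Count how many samples received the full score of 100.
full_marks = sum(1 for r in results if r[0] == 100)

print(avg_score, avg_duration, avg_tokens)  # 92 12 882
print(full_marks, "of", len(results), "samples scored 100")  # 22 of 30
```

Under that assumption the recomputed values match the reported summary: 22 samples scored 100, 6 scored 75, and 2 scored 50.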