SQL

This task evaluates natural language to SQL query generation, testing text-to-query fidelity and schema reasoning. It is particularly relevant for analytics chat assistants and simplified database interfaces, where users query data in natural language. Models must understand both the intent behind the question and the structure of the underlying database schema.
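The page does not describe how a generated query is scored. One common way to check text-to-query fidelity is execution matching: run the generated query and a reference query against the same database and compare the result sets. The sketch below illustrates that idea under stated assumptions; the schema, the question, both queries, and the `execution_match` helper are hypothetical and are not taken from the benchmark.

```python
import sqlite3


def run_query(conn, sql):
    """Execute sql and return its rows sorted, so row order does not matter."""
    rows = conn.execute(sql).fetchall()
    return sorted(rows, key=repr)


def execution_match(schema_sql, generated_sql, reference_sql):
    """Return True if both queries yield the same rows on a fresh database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        try:
            generated = run_query(conn, generated_sql)
        except sqlite3.Error:
            # Invalid SQL or a reference to a missing column counts as a miss.
            return False
        return generated == run_query(conn, reference_sql)
    finally:
        conn.close()


# Hypothetical schema and data, for illustration only.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada', 'SE'), (2, 'Bo', 'NO');
INSERT INTO orders VALUES (1, 1, 40.0), (2, 1, 60.0), (3, 2, 15.0);
"""

QUESTION = "What is the total order value per customer?"

REFERENCE = (
    "SELECT c.name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id GROUP BY c.name"
)

# A differently written query that answers the same question.
GENERATED = (
    "SELECT name, SUM(total) FROM orders "
    "JOIN customers ON customers.id = orders.customer_id GROUP BY name"
)

if __name__ == "__main__":
    print(QUESTION)
    print("match:", execution_match(SCHEMA, GENERATED, REFERENCE))
```

Comparing result sets rather than query text lets two differently written but equivalent queries both pass. It can also give false positives when the test data is too small to tell two queries apart, so a real harness would likely need richer fixtures.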

Model: openai/o1
Score: 97
Average duration: 26s
Average tokens: 1786
Average cost: $0.00

Sample               Score  Duration  Tokens
opper_sql_sample_01  100    22s       1258
opper_sql_sample_02  100    20s       1197
opper_sql_sample_03  100    20s       1072
opper_sql_sample_04  100    20s       1072
opper_sql_sample_05  100    52s       1623
opper_sql_sample_06  100    22s       1579
opper_sql_sample_07  100    22s       1192
opper_sql_sample_08  100    20s       1449
opper_sql_sample_09  100    22s       1585
opper_sql_sample_10  100    22s       1617
opper_sql_sample_11  100    22s       1393
opper_sql_sample_12  100    22s       1167
opper_sql_sample_13  100    22s       1568
opper_sql_sample_14  100    22s       2028
opper_sql_sample_15  75     22s       1518
opper_sql_sample_16  100    22s       2500
opper_sql_sample_17  100    22s       1779
opper_sql_sample_18  100    22s       2467
opper_sql_sample_19  100    20s       2060
opper_sql_sample_20  100    22s       1738
opper_sql_sample_21  100    20s       1820
opper_sql_sample_22  100    22s       1990
opper_sql_sample_23  100    1m 7s     2673
opper_sql_sample_24  75     22s       2573
opper_sql_sample_25  100    1m 29s    2192
opper_sql_sample_26  100    21s       1716
opper_sql_sample_27  75     22s       2083
opper_sql_sample_28  75     20s       1992
opper_sql_sample_29  100    24s       3255
opper_sql_sample_30  100    20s       1412
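The headline figures appear to be plain averages of the per-sample values. The page does not state how they are aggregated, so the short check below assumes an arithmetic mean rounded to the nearest integer, using the 30 scores and token counts listed above.

```python
# Per-sample scores keyed by sample number (opper_sql_sample_01 .. _30).
# Samples 15, 24, 27 and 28 scored 75; every other sample scored 100.
scores = {n: 100 for n in range(1, 31)}
for n in (15, 24, 27, 28):
    scores[n] = 75

# Token counts in sample order, copied from the results above.
tokens = [
    1258, 1197, 1072, 1072, 1623, 1579, 1192, 1449, 1585, 1617,
    1393, 1167, 1568, 2028, 1518, 2500, 1779, 2467, 2060, 1738,
    1820, 1990, 2673, 2573, 2192, 1716, 2083, 1992, 3255, 1412,
]

# Assumption: the reported summary values are rounded arithmetic means.
mean_score = sum(scores.values()) / len(scores)
mean_tokens = sum(tokens) / len(tokens)
low_scoring = sorted(n for n, s in scores.items() if s < 100)

if __name__ == "__main__":
    print(f"mean score:  {mean_score:.2f} (reported: 97)")
    print(f"mean tokens: {mean_tokens:.1f} (reported: 1786)")
    print("samples below 100:", low_scoring)
```

Under that assumption both reported values are reproduced, which suggests the four samples scored 75 account for the entire gap between the overall score and 100.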