SQL

Natural-language-to-SQL query generation tests text-to-query fidelity and schema reasoning. The task is particularly relevant for analytics chat assistants and simplified database interfaces, where users query data in natural language. Models must understand both the intent behind the question and the structure of the underlying database schema.
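As a rough illustration of what "fidelity" and "schema reasoning" can mean in practice, the sketch below checks a generated query by running it and a reference query against the same small database and comparing the rows. This is one common way to test text-to-SQL output, not necessarily how this benchmark scores its samples; the schema, data, queries, and the helper names `run_query` and `same_result` are all hypothetical.

```python
import sqlite3

# Hypothetical schema and rows, invented for this example only.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total REAL
);
INSERT INTO customers VALUES (1, 'Ada', 'SE'), (2, 'Linus', 'FI'), (3, 'Grace', 'US');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0);
"""


def run_query(sql):
    """Run one query against a fresh in-memory database and return its rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        rows = conn.execute(sql).fetchall()
    finally:
        conn.close()
    # Sort so that row order does not affect the comparison.
    return sorted(rows, key=repr)


def same_result(generated_sql, reference_sql):
    """True if both queries run and return the same set of rows."""
    try:
        return run_query(generated_sql) == run_query(reference_sql)
    except sqlite3.Error:
        # Invalid SQL or a reference to a missing table/column counts as a miss.
        return False


# Question: "Total order value per customer, including customers with no orders."
REFERENCE = (
    "SELECT c.name, COALESCE(SUM(o.total), 0) FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id GROUP BY c.id, c.name"
)
# Differently written but equivalent: correlated subquery instead of a join.
GOOD = (
    "SELECT name, (SELECT COALESCE(SUM(total), 0) FROM orders "
    "WHERE customer_id = customers.id) FROM customers"
)
# Plausible but wrong: the inner join drops customers without orders.
BAD = (
    "SELECT c.name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id GROUP BY c.id, c.name"
)

print(same_result(GOOD, REFERENCE))
print(same_result(BAD, REFERENCE))
```

Comparing results rather than query text lets differently written but equivalent queries pass, while a query that misreads the intent ("including customers with no orders") fails even though it is valid SQL.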

Model: openai/o4-mini
Score: 94
Average duration: 30s
Average tokens: 1382
Average cost: $0.00

Sample               Score  Duration  Tokens
opper_sql_sample_01  100    22s       1063
opper_sql_sample_02  100    14s       806
opper_sql_sample_03  100    25s       948
opper_sql_sample_04  100    1m 32s    1006
opper_sql_sample_05  100    33s       1052
opper_sql_sample_06  100    26s       1013
opper_sql_sample_07  100    25s       804
opper_sql_sample_08  100    1m 8s     886
opper_sql_sample_09  100    33s       890
opper_sql_sample_10  100    22s       1260
opper_sql_sample_11  100    22s       1085
opper_sql_sample_12  100    26s       1370
opper_sql_sample_13  100    33s       1330
opper_sql_sample_14  100    22s       1206
opper_sql_sample_15  75     33s       1202
opper_sql_sample_16  100    51s       1974
opper_sql_sample_17  100    22s       1260
opper_sql_sample_18  100    22s       1755
opper_sql_sample_19  100    26s       1604
opper_sql_sample_20  75     33s       1519
opper_sql_sample_21  100    14s       1208
opper_sql_sample_22  100    26s       1774
opper_sql_sample_23  100    33s       1634
opper_sql_sample_24  75     22s       1482
opper_sql_sample_25  75     25s       1922
opper_sql_sample_26  100    26s       2261
opper_sql_sample_27  50     25s       1813
opper_sql_sample_28  75     26s       1663
opper_sql_sample_29  100    33s       1861
opper_sql_sample_30  100    22s       1794
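
The headline figures appear to be plain averages of the per-sample values listed above. The following sketch recomputes them under that assumption; the page itself does not state how the aggregates are derived, and the variable names are chosen for this example.

```python
from statistics import mean

# Per-sample scores and token counts, in order from sample_01 to sample_30.
scores = [
    100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
    100, 100, 100, 100, 75, 100, 100, 100, 100, 75,
    100, 100, 100, 75, 75, 100, 50, 75, 100, 100,
]
tokens = [
    1063, 806, 948, 1006, 1052, 1013, 804, 886, 890, 1260,
    1085, 1370, 1330, 1206, 1202, 1974, 1260, 1755, 1604, 1519,
    1208, 1774, 1634, 1482, 1922, 2261, 1813, 1663, 1861, 1794,
]

mean_score = mean(scores)    # 2825 / 30, about 94.17
mean_tokens = mean(tokens)   # 41445 / 30 = 1381.5

print(round(mean_score))
print(mean_tokens)
```

The mean score rounds to the reported 94, and the mean token count of 1381.5 matches the reported 1382 after rounding.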