SQL

This task evaluates natural-language-to-SQL query generation, testing text-to-query fidelity and schema reasoning. It is particularly relevant for analytics chat assistants and simplified database interfaces, where users query data in natural language. Models must understand both the intent behind the question and the structure of the underlying database schema.
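
To make the task concrete, here is a minimal sketch of how a generated query could be checked against a reference query by comparing result sets. The schema, question, queries, and the `results_match` helper are all hypothetical and are not taken from the benchmark; how the benchmark actually scores a sample is not described on this page, so treat result-set comparison as an assumption.

```python
import sqlite3

# Hypothetical schema and data for illustration only (not from the benchmark).
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada', 'SE'), (2, 'Linus', 'FI'), (3, 'Grace', 'SE');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0), (4, 3, 200.0);
"""

# Hypothetical natural-language question the model would receive.
QUESTION = "What is the total order value per country?"

# Hand-written reference answer.
REFERENCE_SQL = """
SELECT c.country, SUM(o.total)
FROM customers c JOIN orders o ON o.customer_id = c.id
GROUP BY c.country
"""

# Stand-in for a model-generated query: different text, same meaning.
CANDIDATE_SQL = """
SELECT customers.country, SUM(orders.total) AS revenue
FROM orders JOIN customers ON customers.id = orders.customer_id
GROUP BY customers.country
"""


def run_query(conn, sql):
    """Execute a query and return its rows in a stable (sorted) order."""
    return sorted(conn.execute(sql).fetchall())


def results_match(schema, reference_sql, candidate_sql):
    """Return True if both queries yield the same rows on a fresh database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        try:
            candidate_rows = run_query(conn, candidate_sql)
        except sqlite3.Error:
            # Invalid SQL or references to missing tables/columns count as a miss.
            return False
        return candidate_rows == run_query(conn, reference_sql)
    finally:
        conn.close()


if __name__ == "__main__":
    print(QUESTION)
    print("match:", results_match(SCHEMA, REFERENCE_SQL, CANDIDATE_SQL))
```

Comparing result sets rather than query text lets two differently written but equivalent queries both pass, which is why the candidate above uses different aliases and join order than the reference.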

Model: openai/gpt-5-mini
Score: 95
Average duration: 22s
Average tokens: 1388
Average cost: $0.00
Sample                Score  Duration  Tokens
opper_sql_sample_01   100    17s       801
opper_sql_sample_02   75     21s       930
opper_sql_sample_03   100    14s       885
opper_sql_sample_04   100    21s       936
opper_sql_sample_05   100    21s       989
opper_sql_sample_06   100    17s       1188
opper_sql_sample_07   100    21s       859
opper_sql_sample_08   100    14s       807
opper_sql_sample_09   100    21s       957
opper_sql_sample_10   100    17s       917
opper_sql_sample_11   100    17s       1072
opper_sql_sample_12   100    17s       1234
opper_sql_sample_13   100    34s       1353
opper_sql_sample_14   100    21s       1259
opper_sql_sample_15   100    21s       1210
opper_sql_sample_16   100    23s       1820
opper_sql_sample_17   100    17s       1329
opper_sql_sample_18   75     21s       1664
opper_sql_sample_19   100    1m 14s    1870
opper_sql_sample_20   100    41s       1827
opper_sql_sample_21   100    13s       1333
opper_sql_sample_22   100    20s       1774
opper_sql_sample_23   75     23s       2030
opper_sql_sample_24   75     23s       1942
opper_sql_sample_25   75     28s       1963
opper_sql_sample_26   100    16s       1656
opper_sql_sample_27   100    16s       1641
opper_sql_sample_28   75     10s       1555
opper_sql_sample_29   100    17s       1906
opper_sql_sample_30   100    19s       1929
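
The summary figures at the top of the page appear to be plain averages over the 30 samples. The sketch below recomputes them from the per-sample values listed above. The `ROWS` layout and the `summarize` helper are illustrative, and rounding to the nearest integer is an assumption about how the page displays its averages. Per-sample costs are not listed, so cost is left out.

```python
# Per-sample values from the results above, in sample order 01..30:
# (score, duration in seconds, tokens). "1m 14s" is written as 74 seconds.
ROWS = [
    (100, 17, 801), (75, 21, 930), (100, 14, 885), (100, 21, 936),
    (100, 21, 989), (100, 17, 1188), (100, 21, 859), (100, 14, 807),
    (100, 21, 957), (100, 17, 917), (100, 17, 1072), (100, 17, 1234),
    (100, 34, 1353), (100, 21, 1259), (100, 21, 1210), (100, 23, 1820),
    (100, 17, 1329), (75, 21, 1664), (100, 74, 1870), (100, 41, 1827),
    (100, 13, 1333), (100, 20, 1774), (75, 23, 2030), (75, 23, 1942),
    (75, 28, 1963), (100, 16, 1656), (100, 16, 1641), (75, 10, 1555),
    (100, 17, 1906), (100, 19, 1929),
]


def summarize(rows):
    """Average each column and round to the nearest integer (assumed rounding)."""
    count = len(rows)
    scores, seconds, tokens = zip(*rows)
    return {
        "score": round(sum(scores) / count),
        "seconds": round(sum(seconds) / count),
        "tokens": round(sum(tokens) / count),
    }


if __name__ == "__main__":
    print(summarize(ROWS))
```

Six of the 30 samples score 75 and the rest score 100, which gives (24 × 100 + 6 × 75) / 30 = 95. The duration and token averages come to about 21.8s and 1387.9, consistent with the reported 22s and 1388.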