SQL

This task evaluates natural-language-to-SQL query generation, testing text-to-query fidelity and schema reasoning. It is particularly relevant for analytics chat assistants and simplified database interfaces, where users query data in natural language. Models must understand both the intent behind the question and the structure of the underlying database schema.
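One common way to check text-to-query fidelity is to run the generated query and a reference query against the same database and compare the rows returned. The sketch below illustrates that idea only. This page does not describe how its scores are computed, so the comparison rule, the demo schema, and every name here (`build_demo_db`, `same_result`, `REFERENCE_SQL`, `GENERATED_SQL`) are assumptions made for illustration.

```python
import sqlite3

# Hypothetical question: "What is the total order value per customer?"
REFERENCE_SQL = """
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
"""

# A differently written query that should return the same rows.
GENERATED_SQL = """
    SELECT name, SUM(total)
    FROM orders
    JOIN customers ON customers.id = orders.customer_id
    GROUP BY name
"""


def build_demo_db():
    """Create a small in-memory database (illustrative schema, not the benchmark's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO orders VALUES (1, 1, 40.0), (2, 1, 60.0), (3, 2, 25.0);
        """
    )
    return conn


def run_query(conn, sql):
    """Execute a query and return its rows sorted, so row order is ignored."""
    return sorted(conn.execute(sql).fetchall())


def same_result(conn, generated_sql, reference_sql):
    """Return True if both queries run and produce the same set of rows."""
    try:
        return run_query(conn, generated_sql) == run_query(conn, reference_sql)
    except sqlite3.Error:
        # A query that fails to execute counts as a mismatch.
        return False


if __name__ == "__main__":
    conn = build_demo_db()
    print(same_result(conn, GENERATED_SQL, REFERENCE_SQL))
```

Comparing results rather than query text lets two differently written but equivalent queries count as a match. The trade-off is that a wrong query can pass by coincidence on a small dataset.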

Model: gcp/gemini-2.5-flash
Score: 93
Average duration: 21s
Average tokens: 1304
Average cost: $0.00
Sample               Score  Duration  Tokens
opper_sql_sample_01    100       19s     723
opper_sql_sample_02    100       19s     693
opper_sql_sample_03    100       18s     701
opper_sql_sample_04    100       19s     690
opper_sql_sample_05    100       18s     827
opper_sql_sample_06    100       19s     904
opper_sql_sample_07    100       19s     681
opper_sql_sample_08    100       18s     709
opper_sql_sample_09    100       18s     709
opper_sql_sample_10    100       19s     826
opper_sql_sample_11    100       18s     912
opper_sql_sample_12    100       18s    1016
opper_sql_sample_13    100       18s    1075
opper_sql_sample_14    100       18s    1051
opper_sql_sample_15    100       18s    1164
opper_sql_sample_16    100       25s    1762
opper_sql_sample_17    100        6s    1082
opper_sql_sample_18     75       25s    1292
opper_sql_sample_19     50       24s    2879
opper_sql_sample_20    100       25s    2024
opper_sql_sample_21    100       25s    1151
opper_sql_sample_22    100       25s    1401
opper_sql_sample_23    100       25s    1516
opper_sql_sample_24     75       24s    1335
opper_sql_sample_25    100       25s    1747
opper_sql_sample_26     75       24s    1384
opper_sql_sample_27    100       24s    2016
opper_sql_sample_28     75       23s    1376
opper_sql_sample_29     50       25s    4257
opper_sql_sample_30    100       24s    1225
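
The summary figures can be cross-checked against the per-sample results. The sketch below assumes each summary value is an unweighted mean over the 30 samples, rounded to the nearest integer. The page does not state this, but the assumption reproduces the reported score, average duration, and average tokens. The names `RESULTS` and `summarize` are illustrative.

```python
from statistics import mean

# (score, duration in seconds, tokens) for opper_sql_sample_01 through
# opper_sql_sample_30, copied from the per-sample results.
RESULTS = [
    (100, 19, 723), (100, 19, 693), (100, 18, 701), (100, 19, 690),
    (100, 18, 827), (100, 19, 904), (100, 19, 681), (100, 18, 709),
    (100, 18, 709), (100, 19, 826), (100, 18, 912), (100, 18, 1016),
    (100, 18, 1075), (100, 18, 1051), (100, 18, 1164), (100, 25, 1762),
    (100, 6, 1082), (75, 25, 1292), (50, 24, 2879), (100, 25, 2024),
    (100, 25, 1151), (100, 25, 1401), (100, 25, 1516), (75, 24, 1335),
    (100, 25, 1747), (75, 24, 1384), (100, 24, 2016), (75, 23, 1376),
    (50, 25, 4257), (100, 24, 1225),
]


def summarize(results):
    """Return rounded unweighted means of (score, duration, tokens)."""
    scores, durations, tokens = zip(*results)
    return round(mean(scores)), round(mean(durations)), round(mean(tokens))


if __name__ == "__main__":
    score, duration, tokens = summarize(RESULTS)
    print(f"score={score} duration={duration}s tokens={tokens}")
```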