SQL

Natural-language-to-SQL query generation tests text-to-query fidelity and schema reasoning. The task is particularly relevant for analytics chat assistants and simplified database interfaces, where users query data in natural language. Models must understand both the intent behind the question and the structure of the underlying database schema.
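To make "text-to-query fidelity" and "schema reasoning" concrete, here is a minimal sketch of one common way to check a generated query: run it and a reference query against the same small database and compare the rows. This is not necessarily how the scores on this page are computed. The schema, the data, the question, and the helper names (`run_query`, `results_match`) are all hypothetical and invented for illustration.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada', 'SE'), (2, 'Bo', 'NO'), (3, 'Cy', 'SE');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 30.0), (3, 2, 75.0), (4, 3, 10.0);
"""


def run_query(conn, sql):
    """Execute a query and return its rows sorted, so row order is ignored."""
    return sorted(conn.execute(sql).fetchall())


def results_match(generated_sql, reference_sql):
    """Return True if both queries yield the same rows on the sample database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        try:
            generated = run_query(conn, generated_sql)
        except sqlite3.Error:
            # Invalid SQL or a reference to a missing table/column counts as a miss.
            return False
        return generated == run_query(conn, reference_sql)
    finally:
        conn.close()


# Hypothetical question: "What is the total order value per country?"
REFERENCE = """
SELECT c.country, SUM(o.total) FROM customers c
JOIN orders o ON o.customer_id = c.id GROUP BY c.country
"""
# A candidate that joins the two tables correctly (different wording, same rows).
GOOD = """
SELECT country, SUM(total) FROM orders
JOIN customers ON customers.id = orders.customer_id GROUP BY country
"""
# A candidate that assumes a column the orders table does not have.
BAD = "SELECT country, SUM(total) FROM orders GROUP BY country"

print(results_match(GOOD, REFERENCE))
print(results_match(BAD, REFERENCE))
```

The second candidate fails because `country` lives on `customers`, not `orders`, which is the kind of schema-reasoning mistake the task description refers to. Comparing result sets rather than query text lets differently worded but equivalent queries pass.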

Model: openai/gpt-4o
Score: 93
Average duration: 21s
Average tokens: 897
Average cost: $0.00
Sample                 Score  Duration  Tokens
opper_sql_sample_01      100       20s     631
opper_sql_sample_02      100       19s     664
opper_sql_sample_03      100       18s     652
opper_sql_sample_04      100       19s     630
opper_sql_sample_05      100       20s     668
opper_sql_sample_06      100       19s     703
opper_sql_sample_07      100       25s     637
opper_sql_sample_08      100       18s     633
opper_sql_sample_09      100       19s     653
opper_sql_sample_10      100       20s     679
opper_sql_sample_11      100       20s     849
opper_sql_sample_12      100       18s     874
opper_sql_sample_13      100       19s     886
opper_sql_sample_14      100       20s     901
opper_sql_sample_15       75       18s     944
opper_sql_sample_16      100       20s     946
opper_sql_sample_17      100       20s     925
opper_sql_sample_18       75       20s    1016
opper_sql_sample_19      100       18s     890
opper_sql_sample_20      100       18s     996
opper_sql_sample_21      100       18s    1048
opper_sql_sample_22       75       18s    1141
opper_sql_sample_23      100       18s    1141
opper_sql_sample_24       75       18s    1135
opper_sql_sample_25      100       18s    1123
opper_sql_sample_26      100       18s    1103
opper_sql_sample_27       50    1m 21s    1119
opper_sql_sample_28       75       20s    1194
opper_sql_sample_29       50       19s    1091
opper_sql_sample_30      100       18s    1031
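The summary figures (93, 21s, 897) appear to be plain means of the per-sample values. The sketch below recomputes them from the rows listed above. It assumes the three numbers in each row are score, duration, and token count, and that the page rounds half up; neither is stated on the page, and the names `ROWS` and `round_half_up` are introduced here for illustration.

```python
import math

# (score, duration in seconds, tokens) for opper_sql_sample_01 .. _30,
# copied from the per-sample results; "1m 21s" is written as 81 seconds.
ROWS = [
    (100, 20, 631), (100, 19, 664), (100, 18, 652), (100, 19, 630), (100, 20, 668),
    (100, 19, 703), (100, 25, 637), (100, 18, 633), (100, 19, 653), (100, 20, 679),
    (100, 20, 849), (100, 18, 874), (100, 19, 886), (100, 20, 901), (75, 18, 944),
    (100, 20, 946), (100, 20, 925), (75, 20, 1016), (100, 18, 890), (100, 18, 996),
    (100, 18, 1048), (75, 18, 1141), (100, 18, 1141), (75, 18, 1135), (100, 18, 1123),
    (100, 18, 1103), (50, 81, 1119), (75, 20, 1194), (50, 19, 1091), (100, 18, 1031),
]


def round_half_up(value):
    """Round to the nearest integer, with .5 rounding up (assumed page behavior)."""
    return math.floor(value + 0.5)


def column_mean(index):
    """Mean of one column across all samples."""
    return sum(row[index] for row in ROWS) / len(ROWS)


summary = tuple(round_half_up(column_mean(i)) for i in range(3))
print(summary)  # (score, duration in seconds, tokens)
```

Under these assumptions the means are 92.5, about 21.1 seconds, and about 896.8 tokens, which round to the reported 93, 21s, and 897. Note that Python's built-in `round` would turn 92.5 into 92, which is why the sketch rounds half up explicitly.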