SQL

Natural language to SQL query generation evaluates text-to-query fidelity and schema reasoning. This task is particularly relevant for analytics chat assistants and simplified database interfaces where users need to query data using natural language. Models must understand both the intent behind the question and the structure of the underlying database schema.
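This page does not describe how the generated queries are scored. One common way to check a text-to-SQL answer is execution matching: run the generated query and a reference query against the same database and compare the rows. The sketch below is a minimal illustration of that idea using Python's built-in `sqlite3` module. The schema, the sample data, both queries, and the helper `results_match` are hypothetical and are not taken from the benchmark.

```python
import sqlite3


def results_match(conn, candidate_sql, reference_sql):
    """Return True if both queries yield the same rows, ignoring row order."""
    candidate_rows = conn.execute(candidate_sql).fetchall()
    reference_rows = conn.execute(reference_sql).fetchall()
    return sorted(candidate_rows) == sorted(reference_rows)


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (1, 1, 40.0), (2, 1, 60.0), (3, 2, 25.0);
""")

# Question: "What is the total order value per customer?"
reference = (
    "SELECT c.name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id GROUP BY c.id, c.name"
)
# A differently worded query that answers the same question.
candidate = (
    "SELECT name, SUM(total) FROM orders "
    "JOIN customers ON customers.id = orders.customer_id GROUP BY name"
)
# A query that misreads the intent (counts orders instead of summing totals).
wrong = (
    "SELECT name, COUNT(*) FROM orders "
    "JOIN customers ON customers.id = orders.customer_id GROUP BY name"
)

match = results_match(conn, candidate, reference)
mismatch = results_match(conn, wrong, reference)
```

Comparing result sets rather than query text lets two differently written but equivalent queries both count as correct, which matters because the same question can usually be expressed in SQL in several ways.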

Model: fireworks/deepseek-r1
Score: 91
Average duration: 17s
Average tokens: 1571
Average cost: $0.00

Sample               Score  Duration  Tokens
opper_sql_sample_01    100       44s     731
opper_sql_sample_02    100       47s     887
opper_sql_sample_03    100       13s     726
opper_sql_sample_04    100       44s     898
opper_sql_sample_05    100        8s    1005
opper_sql_sample_06    100        8s    1005
opper_sql_sample_07    100        7s     714
opper_sql_sample_08    100        9s     996
opper_sql_sample_09    100        7s     823
opper_sql_sample_10    100       12s     888
opper_sql_sample_11    100        7s    1049
opper_sql_sample_12     50       10s    1147
opper_sql_sample_13    100       12s    1528
opper_sql_sample_14    100        9s    1189
opper_sql_sample_15     75       10s    1290
opper_sql_sample_16     75       32s    3397
opper_sql_sample_17    100       12s    1584
opper_sql_sample_18     75       16s    1915
opper_sql_sample_19     75       30s    3393
opper_sql_sample_20    100       14s    1789
opper_sql_sample_21    100        8s    1331
opper_sql_sample_22    100       13s    1853
opper_sql_sample_23    100       15s    2156
opper_sql_sample_24     50       10s    1566
opper_sql_sample_25    100       15s    1887
opper_sql_sample_26    100       11s    1660
opper_sql_sample_27     50       24s    2891
opper_sql_sample_28     75       13s    2006
opper_sql_sample_29    100       33s    3476
opper_sql_sample_30    100        7s    1345
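
The headline figures (91, 17s, 1571) are consistent with rounded arithmetic means of the 30 per-sample values listed above. The page does not state how its aggregates are computed, so the following is only a sanity-check sketch. The values are transcribed from the listing as (score, duration in seconds, tokens), and the variable names are illustrative.

```python
# Per-sample (score, duration in seconds, tokens), samples 01 through 30.
rows = [
    (100, 44, 731), (100, 47, 887), (100, 13, 726), (100, 44, 898),
    (100, 8, 1005), (100, 8, 1005), (100, 7, 714), (100, 9, 996),
    (100, 7, 823), (100, 12, 888), (100, 7, 1049), (50, 10, 1147),
    (100, 12, 1528), (100, 9, 1189), (75, 10, 1290), (75, 32, 3397),
    (100, 12, 1584), (75, 16, 1915), (75, 30, 3393), (100, 14, 1789),
    (100, 8, 1331), (100, 13, 1853), (100, 15, 2156), (50, 10, 1566),
    (100, 15, 1887), (100, 11, 1660), (50, 24, 2891), (75, 13, 2006),
    (100, 33, 3476), (100, 7, 1345),
]

n = len(rows)
avg_score = round(sum(r[0] for r in rows) / n)      # 2725 / 30  -> 91
avg_duration = round(sum(r[1] for r in rows) / n)   # 500 / 30   -> 17
avg_tokens = round(sum(r[2] for r in rows) / n)     # 47125 / 30 -> 1571

print(avg_score, avg_duration, avg_tokens)  # prints: 91 17 1571
```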