Meme Understanding

Evaluates a model’s ability to interpret culture-dependent, tricky, and humor-driven content that feels obvious to humans but is hard for AI.

Model: fireworks/deepseek-r1
Score: 77
Average duration: 10s
Average tokens: 1053
Average cost: $0.00
Test                 Score  Duration  Tokens
opper_memes_meme_01    100       10s    1016
opper_memes_meme_02      0       30s    2918
opper_memes_meme_03    100       18s    1250
opper_memes_meme_04    100       27s    3347
opper_memes_meme_05    100       13s     906
opper_memes_meme_06    100        6s     766
opper_memes_meme_07    100        5s     616
opper_memes_meme_08    100        7s     568
opper_memes_meme_09    100        4s     512
opper_memes_meme_10    100        6s     571
opper_memes_meme_11    100        6s     703
opper_memes_meme_12      0       17s    2108
opper_memes_meme_13    100        9s    1098
opper_memes_meme_14    100        6s     720
opper_memes_meme_15    100        3s     407
opper_memes_meme_16    100        5s     482
opper_memes_meme_17    100        5s     537
opper_memes_meme_18      0       12s    1146
opper_memes_meme_19    100       11s    1158
opper_memes_meme_20      0        5s     522
opper_memes_meme_21    100        8s     750
opper_memes_meme_22    100       10s    1037
opper_memes_meme_23    100        9s     903
opper_memes_meme_24    100        5s     552
opper_memes_meme_25      0       20s    2135
opper_memes_meme_26      0       20s    2530
opper_memes_meme_27      0        6s     723
opper_memes_meme_28    100        8s     951
opper_memes_meme_29    100        7s     794
opper_memes_meme_30    100        3s     401
opper_memes_meme_31    100        4s     509
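
The summary figures are consistent with plain arithmetic means over the 31 per-test results. The page does not state how the aggregates are computed, so the following is only a minimal sketch under that assumption. The names `rows` and `summarize` are illustrative and not part of any benchmark tooling; each tuple holds one test's score, duration in seconds, and token count, in order from meme_01 to meme_31.

```python
# Per-test results as (score, duration_seconds, tokens), meme_01 .. meme_31.
rows = [
    (100, 10, 1016), (0, 30, 2918), (100, 18, 1250), (100, 27, 3347),
    (100, 13, 906), (100, 6, 766), (100, 5, 616), (100, 7, 568),
    (100, 4, 512), (100, 6, 571), (100, 6, 703), (0, 17, 2108),
    (100, 9, 1098), (100, 6, 720), (100, 3, 407), (100, 5, 482),
    (100, 5, 537), (0, 12, 1146), (100, 11, 1158), (0, 5, 522),
    (100, 8, 750), (100, 10, 1037), (100, 9, 903), (100, 5, 552),
    (0, 20, 2135), (0, 20, 2530), (0, 6, 723), (100, 8, 951),
    (100, 7, 794), (100, 3, 401), (100, 4, 509),
]


def summarize(results):
    """Return rounded means, assuming the aggregates are simple averages."""
    n = len(results)
    return {
        "tests": n,
        "passed": sum(1 for score, _, _ in results if score == 100),
        "score": round(sum(score for score, _, _ in results) / n),
        "avg_duration_s": round(sum(dur for _, dur, _ in results) / n),
        "avg_tokens": round(sum(tok for _, _, tok in results) / n),
    }


print(summarize(rows))
# {'tests': 31, 'passed': 24, 'score': 77, 'avg_duration_s': 10, 'avg_tokens': 1053}
```

Under this reading, 24 of 31 tests scored 100, which rounds to the overall score of 77, and the mean duration and token count round to the reported 10s and 1053.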