Meme Understanding

Evaluates a model’s ability to interpret culture-dependent, tricky, and humor-driven content that feels obvious to humans but is hard for AI.

Model: openai/o3
Score: 87
Average duration: 10s
Average tokens: 496
Average cost: $0.00
Test                  Score  Duration  Tokens
opper_memes_meme_01   100    7s        408
opper_memes_meme_02   0      9s        606
opper_memes_meme_03   100    6s        573
opper_memes_meme_04   0      13s       1113
opper_memes_meme_05   100    6s        393
opper_memes_meme_06   100    8s        334
opper_memes_meme_07   100    5s        333
opper_memes_meme_08   100    8s        281
opper_memes_meme_09   100    4s        341
opper_memes_meme_10   100    6s        410
opper_memes_meme_11   100    8s        399
opper_memes_meme_12   0      6s        490
opper_memes_meme_13   100    8s        332
opper_memes_meme_14   100    7s        397
opper_memes_meme_15   100    8s        601
opper_memes_meme_16   100    8s        364
opper_memes_meme_17   100    19s       416
opper_memes_meme_18   0      7s        555
opper_memes_meme_19   100    8s        448
opper_memes_meme_20   100    8s        644
opper_memes_meme_21   100    21s       453
opper_memes_meme_22   100    22s       453
opper_memes_meme_23   100    25s       458
opper_memes_meme_24   100    8s        778
opper_memes_meme_25   100    8s        738
opper_memes_meme_26   100    8s        717
opper_memes_meme_27   100    8s        718
opper_memes_meme_28   100    6s        333
opper_memes_meme_29   100    8s        549
opper_memes_meme_30   100    5s        345
opper_memes_meme_31   100    23s       407
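
The summary figures can be cross-checked against the per-test results. The sketch below is only an illustration: it assumes the headline score is the unweighted mean of the 31 per-test scores and that all averages are rounded to the nearest integer, which the page does not state. The names `RESULTS` and `summarize` are hypothetical and not part of any benchmark tooling.

```python
# Each tuple is (score, duration in seconds, tokens) for
# opper_memes_meme_01 through opper_memes_meme_31, copied from the results above.
RESULTS = [
    (100, 7, 408), (0, 9, 606), (100, 6, 573), (0, 13, 1113),
    (100, 6, 393), (100, 8, 334), (100, 5, 333), (100, 8, 281),
    (100, 4, 341), (100, 6, 410), (100, 8, 399), (0, 6, 490),
    (100, 8, 332), (100, 7, 397), (100, 8, 601), (100, 8, 364),
    (100, 19, 416), (0, 7, 555), (100, 8, 448), (100, 8, 644),
    (100, 21, 453), (100, 22, 453), (100, 25, 458), (100, 8, 778),
    (100, 8, 738), (100, 8, 717), (100, 8, 718), (100, 6, 333),
    (100, 8, 549), (100, 5, 345), (100, 23, 407),
]


def summarize(results):
    """Return the rounded mean score, duration (seconds), and token count.

    Assumption: plain unweighted means, rounded to the nearest integer.
    """
    n = len(results)

    def mean(column):
        return round(sum(row[column] for row in results) / n)

    return {"score": mean(0), "duration_s": mean(1), "tokens": mean(2)}


summary = summarize(RESULTS)
failed = sum(1 for score, _, _ in RESULTS if score == 0)
print(summary)  # {'score': 87, 'duration_s': 10, 'tokens': 496}
print(failed)   # 4
```

Under these assumptions the computed values match the reported score of 87, average duration of 10s, and average of 496 tokens, with 4 of the 31 tests scoring 0.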