meme-understanding with openai/gpt-4.1

Evaluates a model’s ability to interpret culture-dependent, tricky, and humor-driven content that feels obvious to humans but is hard for AI.

Average duration

Average tokens: 267

Average cost: $0.00