Meme Understanding

Evaluates a model’s ability to interpret culture-dependent, tricky, and humor-driven content that feels obvious to humans but is hard for AI.

0
Duration: 6s
Input Tokens: 264
Output Tokens: 226
Cost: $0.00
Context
Input
What is the 17th word in (include only the word): Today, at 10:45 AM, we shipped 1,200 widgets; however, only 97 passed final QA on line-3 today.
Expected output
{
  "word": "today"
}
Model output
{
  "word": "today."
}
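The expected and model outputs differ only by a trailing period. The sketch below is a rough illustration of how that can happen. It assumes words are whitespace-separated tokens, which the prompt does not state. The helper names (`nth_word`, `exact_match`, `normalized_match`) are hypothetical, and the benchmark's real grading rule is not shown in this record.

```python
import string

# The sentence from the prompt, copied verbatim.
SENTENCE = (
    "Today, at 10:45 AM, we shipped 1,200 widgets; however, "
    "only 97 passed final QA on line-3 today."
)


def nth_word(text: str, n: int, strip_punct: bool = True) -> str:
    """Return the n-th (1-indexed) whitespace-separated token of text.

    Assumption: a "word" is any run of non-whitespace characters, so
    "10:45", "1,200" and "line-3" each count as one word.
    """
    tokens = text.split()
    token = tokens[n - 1]
    # Optionally drop leading/trailing punctuation such as "," ";" or ".".
    return token.strip(string.punctuation) if strip_punct else token


def exact_match(expected: str, actual: str) -> bool:
    """Strict comparison: any extra character counts as a mismatch."""
    return expected == actual


def normalized_match(expected: str, actual: str) -> bool:
    """Lenient comparison: ignore case and surrounding punctuation."""
    def norm(s: str) -> str:
        return s.strip().strip(string.punctuation).lower()
    return norm(expected) == norm(actual)


if __name__ == "__main__":
    print(nth_word(SENTENCE, 17))                     # token with punctuation stripped
    print(nth_word(SENTENCE, 17, strip_punct=False))  # raw token
    print(exact_match("today", "today."))
    print(normalized_match("today", "today."))
```

Under these assumptions the sentence has 17 tokens, and the raw 17th token keeps its period. A strict string comparison would therefore reject the model output, while a punctuation-insensitive comparison would accept it.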