Context Reasoning

Context understanding and reasoning tasks test whether a model gives accurate answers that are grounded in the provided context rather than hallucinated. This capability is essential for knowledge-base support bots, policy lookup systems, and internal knowledge Q&A applications.

0
Duration: 7s
Input Tokens: 2771
Output Tokens: 187
Cost: $0.00
Context
Input
Order all the cities named in the text by time of visiting them
Expected output
1. Cartagena (14 Jun 1995)
2. Leticia (mid-Jun 1995, the day after Cartagena)
3. Manchester (3 Oct 2002 entry)
4. Tangier (mid-Apr 2004, one week before 22 Apr)
5. Chefchaouen (mid-Apr 2004, just after Tangier)
6. Granada (22 Apr 2004)
7. Córdoba
8. Sevilla
9. Cádiz
10. Málaga
11. Jaén
12. Toledo
13. Madrid
14. Ávila
15. Salamanca
16. Zamora
17. San Sebastián
18. Bilbao (8 May 2004)
Model output
Cartagena, Colombia; Bilbao, Spain; and Manchester, England.
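
The page does not say how this answer was graded, so the following is only a minimal sketch of one way an ordering answer like this could be compared with the expected output. The names `EXPECTED`, `MODEL_OUTPUT`, `normalize`, `cities_in_order`, and `score` are hypothetical and are not part of the benchmark. The scoring rule itself is also an assumption: it measures the share of expected cities mentioned and checks whether the mentioned cities keep their expected relative order.

```python
import re
import unicodedata

# Expected visiting order, copied from the "Expected output" above (dates omitted).
EXPECTED = [
    "Cartagena", "Leticia", "Manchester", "Tangier", "Chefchaouen", "Granada",
    "Córdoba", "Sevilla", "Cádiz", "Málaga", "Jaén", "Toledo", "Madrid",
    "Ávila", "Salamanca", "Zamora", "San Sebastián", "Bilbao",
]

# Answer copied from the "Model output" above.
MODEL_OUTPUT = "Cartagena, Colombia; Bilbao, Spain; and Manchester, England."


def normalize(text: str) -> str:
    """Lowercase and strip accents so 'Córdoba' also matches 'Cordoba'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def cities_in_order(answer: str, expected: list[str]) -> list[str]:
    """Return the expected cities found in the answer, by position of first mention."""
    norm_answer = normalize(answer)
    found = []
    for city in expected:
        match = re.search(r"\b" + re.escape(normalize(city)) + r"\b", norm_answer)
        if match:
            found.append((match.start(), city))
    return [city for _, city in sorted(found)]


def score(answer: str, expected: list[str]) -> dict:
    """Hypothetical scoring rule: recall, relative-order check, and exact match."""
    mentioned = cities_in_order(answer, expected)
    rank = {city: i for i, city in enumerate(expected)}
    ranks = [rank[city] for city in mentioned]
    return {
        "recall": len(mentioned) / len(expected),
        "order_ok": ranks == sorted(ranks),
        "exact": mentioned == expected,
    }


if __name__ == "__main__":
    # The answer names 3 of the 18 expected cities and puts Bilbao before
    # Manchester, so recall is 3/18 and both order_ok and exact are False.
    print(score(MODEL_OUTPUT, EXPECTED))
```

Under this assumed rule the answer fails on both coverage and ordering, which is consistent with the result shown above, although the benchmark's actual grading method may differ.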