LLM Benchmark: Italian Driving License Theory Exam

LLM | Exams Passed | Exams Failed | Average Errors | Accuracy
* Category consisting mostly of image-based questions.
Every category has 24 questions.
Every model has a thinking budget limited to 2000 tokens.
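
The source does not say how the summary columns are computed. As a minimal sketch, the four figures could be derived from a list of per-exam error counts as shown below. The function name `summarize`, the `Summary` type, and the parameters `questions_per_exam` and `max_errors` are hypothetical, and the values used in the usage example are illustrative only, not the benchmark's actual settings.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Summary:
    exams_passed: int
    exams_failed: int
    average_errors: float
    accuracy: float


def summarize(
    errors_per_exam: Sequence[int],
    questions_per_exam: int,  # assumption: every exam has the same length
    max_errors: int,          # assumption: an exam passes with at most this many errors
) -> Summary:
    """Aggregate per-exam error counts into the four summary columns."""
    if not errors_per_exam:
        raise ValueError("at least one exam result is required")

    n_exams = len(errors_per_exam)
    passed = sum(1 for e in errors_per_exam if e <= max_errors)
    total_errors = sum(errors_per_exam)
    total_questions = questions_per_exam * n_exams

    return Summary(
        exams_passed=passed,
        exams_failed=n_exams - passed,
        average_errors=total_errors / n_exams,
        # Accuracy here is the share of correct answers over all questions asked.
        accuracy=1 - total_errors / total_questions,
    )


if __name__ == "__main__":
    # Illustrative values only: four exams of 30 questions, passing with <= 3 errors.
    print(summarize([0, 2, 5, 3], questions_per_exam=30, max_errors=3))
```

Keeping the exam length and pass threshold as parameters avoids hard-coding rules that the source does not state.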