From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project

09/04/2019
by Peter Clark, et al.

AI has achieved remarkable mastery over games such as Chess, Go, and Poker, and even Jeopardy, but the rich variety of standardized exams has remained a landmark challenge. Even in 2016, the best AI system achieved merely 59.3% on an 8th Grade science exam challenge. This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90% on the exam's non-diagram, multiple choice (NDMC) questions. In addition, our Aristo system, building upon the success of recent language models, exceeded 83% on the corresponding Grade 12 Science Exam NDMC questions. The results, on unseen test questions, are robust across different test years and different variations of this kind of test. They demonstrate that modern NLP methods can result in mastery on this task. While not a full solution to general question-answering (the questions are multiple choice, and the domain is restricted to 8th Grade science), it represents a significant milestone for the field.


