Sunday, March 08, 2009

Microsoft Research Question-Answering Corpus

I ran across this item on the Microsoft Research site and was intrigued.

"This directory contains a set of 1.3K questions collected from 10-13 year old schoolchildren, who were asked 'If you could talk to an encyclopedia, what would you ask it?' Children wrote their questions on slips of paper, and these questions were then keyed in/spell-corrected. A research librarian then spent several months combing through the text of Encarta 98, creating a full recall set for each question. For each answer, a number of features were annotated (see below), including the best sentence match in the corpus."

One of my pet projects is to develop a decision support system (DSS), and I am still thinking about the best way to design its querying and information retrieval. This item got me thinking about the QA and UAT processes for an information-based application.
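As a rough sketch of how a question set with hand-built recall sets could feed that kind of testing, here is a small Python example that scores a search function by mean recall at k. Everything in it is an assumption for illustration: the passage IDs, the sample questions, the `toy_search` stand-in, and the function names are hypothetical and do not reflect the actual format of the Microsoft corpus.

```python
from typing import Callable, Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant passages that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def evaluate(search: Callable[[str], List[str]],
             gold: Dict[str, Set[str]], k: int = 5) -> float:
    """Mean recall@k over every question in the gold set."""
    scores = [recall_at_k(search(q), rel, k) for q, rel in gold.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical gold data: question -> IDs of passages known to answer it.
gold = {
    "why is the sky blue": {"p1", "p2"},
    "how tall is mount everest": {"p3"},
}


def toy_search(question: str) -> List[str]:
    """Stand-in for the system under test; returns ranked passage IDs."""
    canned = {
        "why is the sky blue": ["p1", "p9", "p2"],
        "how tall is mount everest": ["p7", "p8"],
    }
    return canned.get(question, [])


mean_recall = evaluate(toy_search, gold, k=3)
print(f"mean recall@3 = {mean_recall:.2f}")
```

In an acceptance test, `toy_search` would be replaced by the real query function, and the run would pass or fail against whatever recall threshold the project agrees on.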

