Ling 573 - Natural Language Processing Systems and Applications
Spring 2011
Deliverable #2: Question Classification: Due April 19, 2011: 09:00


Goals

In this deliverable, you will begin development of your question-answering system proper, starting with question processing.

Question processing

Question Classification

For this deliverable, you will need to implement a question classification system. You may build on techniques presented in class, described in the reading list, and proposed in other research articles.

We will be building on the Question Classification Taxonomy developed by Li and Roth at UIUC. Their site includes training data for developing your system, as well as the taxonomy itself. You are only required to classify at the top level of the taxonomy (NUMERIC, LOCATION, HUMAN, etc.), though you may try more fine-grained classification if you wish.
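To get started before training a full classifier, a simple heuristic baseline can be useful for sanity-checking the pipeline. The sketch below keys the coarse class off the question's first word; the wh-word-to-class mapping is a rough assumption of mine, not part of the assignment, and a real submission should use a trained classifier over richer features.

```python
# Hypothetical wh-word baseline for coarse question classification.
# The mapping below is an assumption, not taken from the assignment;
# tag names follow the Li & Roth coarse classes (HUM, LOC, NUM, ENTY, ...).
WH_TO_COARSE = {
    "who":   "HUM",
    "where": "LOC",
    "when":  "NUM",
    "how":   "NUM",   # crude: "how many/much/far" are usually numeric
}

def baseline_classify(question):
    """Return a coarse class guess based on the question's first word."""
    first = question.strip().split()[0].lower()
    return WH_TO_COARSE.get(first, "ENTY")   # arbitrary fallback class

print(baseline_classify("Where is the Louvre ?"))  # LOC
```

A baseline like this typically lands well below a trained classifier, but it gives you an end-to-end path to the accuracy script early on.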

Data

Training Data and Development Test Data
You should use the Li and Roth training and test data above to develop your question classification system. This data is of the form
Top_level_tag:fine_tag question_text,
with one question per line.
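Reading this format reduces to two splits per line: the label off the leading whitespace, then the coarse tag off the colon. A minimal sketch, assuming the exact format "COARSE:fine question text" described above:

```python
# Parse one line of the Li & Roth question files.
# Assumed format: "COARSE:fine question text", one question per line.
def parse_question_line(line):
    """Split a line into (coarse_tag, fine_tag, question_text)."""
    label, _, text = line.strip().partition(" ")
    coarse, _, fine = label.partition(":")
    return coarse, fine, text

coarse, fine, text = parse_question_line(
    "LOC:city What is the capital of France ?")
```

Using str.partition (rather than split) keeps the question text intact even if it contains colons.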

You should treat the training data as training data and the test data as development test data. You may tune on it, but you should not use it in the training phase of your classifier.

Additional annotated question data is also available on patas. The data can be found in the /dropbox/10-11/573/Data/Questions/training/ directory on patas. The directory contains two files, with the questions and offset annotation: TREC-2004.xml and TREC-2004.tagged.txt.

The xml file contains the official questions in TREC standard xml format in question series. Each series begins by identifying the target, in the form target id = "1" text = "Crips". The questions themselves are tagged by type - FACTOID, LIST, or OTHER. You can ignore all 'OTHER' questions.
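One way to pull the usable questions out of the series file is with the standard library's ElementTree. The element and attribute names below ("target", "q", "type", "text") are my assumptions about the file's layout based on the description above, and should be checked against the actual xml before use:

```python
import xml.etree.ElementTree as ET

# Sketch of reading the TREC question series; element/attribute names
# are assumptions about the xml layout, not confirmed from the file.
def load_factoid_and_list_questions(xml_text):
    """Return (target_text, question_type, question_text) triples,
    skipping all 'OTHER' questions as the assignment instructs."""
    root = ET.fromstring(xml_text)
    questions = []
    for target in root.iter("target"):
        for q in target.iter("q"):
            if q.get("type") != "OTHER":
                questions.append(
                    (target.get("text"), q.get("type"), q.text.strip()))
    return questions

# Illustrative input, modeled on the description above (not real data).
sample = """<trecqa>
  <target id="1" text="Crips">
    <q id="1.1" type="FACTOID">When was the gang formed ?</q>
    <q id="1.7" type="OTHER">x</q>
  </target>
</trecqa>"""
qs = load_factoid_and_list_questions(sample)
```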

The tagged.txt file contains coarse-grained tags in the Li and Roth style. The lines are of the form:
Question_ID\tQuestion_type.
For 'OTHER' type questions, the Question_type field is left intentionally blank.
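Since the 'OTHER' questions have a blank type field, a loader can drop them simply by skipping empty types. A sketch, assuming the tab-separated format described above:

```python
# Read tagged.txt-style annotations into a dict of qid -> coarse type.
# Assumed line format: "Question_ID<TAB>Question_type"; the type field
# is blank for 'OTHER' questions, which we skip.
def load_coarse_tags(lines):
    tags = {}
    for line in lines:
        qid, _, qtype = line.rstrip("\n").partition("\t")
        if qtype:                      # blank type => 'OTHER', ignore
            tags[qid] = qtype
    return tags

tags = load_coarse_tags(["1.1\tNUM\n", "1.7\t\n"])
```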

Test Data
You should evaluate on two sets of questions: the Li and Roth development test data and the TREC question data.

Evaluation

You will compute accuracy (#_correct_samples/#_total_samples) on the test data. You must produce scores at the coarse-grained level for both sets of test data. You may also include fine-grained scores for the Li and Roth data. These scores should be placed in a file called QC.results in the results directory.
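The accuracy measure defined above is a one-liner; the sketch below assumes gold and predicted labels are given as parallel lists:

```python
# Accuracy as defined above: correct predictions / total predictions.
def accuracy(gold, predicted):
    assert len(gold) == len(predicted), "label lists must be parallel"
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / float(len(gold))

score = accuracy(["LOC", "NUM", "HUM"], ["LOC", "NUM", "LOC"])
```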

You are encouraged to include a more detailed results breakdown in your write-up, for example, analyzing per-class accuracy.

Outputs

Create two output files in the outputs directory by running your question classifier on the two test data files; name them LR.outputs and TREC-2005.outputs. Your output should be similar to the Li and Roth format, one line per question, as below:
Top_level_tag{:fine_tag} question_text
where the fine_tag is optional. For the TREC-2005 data, you will only be able to score the coarse tags, but it is fine if your system produces fine-grained tags. You should ignore the 'OTHER' type questions completely.
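Emitting the optional fine tag only when the classifier produced one can be sketched as follows (a hypothetical helper, assuming fine is None or empty when no fine-grained tag is available):

```python
# Format one output line: "COARSE question" or "COARSE:fine question".
def format_output_line(coarse, fine, question):
    tag = coarse + (":" + fine if fine else "")
    return "%s %s" % (tag, question)

line = format_output_line("LOC", "city", "What is the capital of France ?")
```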

Other Question Processing

You may also perform additional question processing, not required for classification, that you anticipate will help in later processing. Such processing could include shallow chunking or deep parsing, creation of semantic or logical forms, and so on as guided by your reading and systems discussed in class.

NOTE: Query formulation for improvement of passage retrieval is NOT required for this assignment, though it will factor into Deliverable #3.

Extending the project report

This extended version should include all the sections from the original report (with many still as stubs) and additionally include the following new material:

Please name your report D2.pdf.

Presentation

Your presentation may be prepared in any computer-projectable format, including HTML, PDF, PPT, and Word. Your presentation should take about 10 minutes to cover your main content. It should be deposited in your doc directory, but it is not due until the actual presentation time. You may continue working on it after the main deliverable is due.

Summary