Ling 573 - Natural Language Processing Systems and Applications
Spring 2011
Deliverable #2: Question Classification: Due April 19, 2011: 09:00


Goals

In this deliverable, you will begin development of your question-answering system proper, starting with question processing.

Question processing

Question Classification

For this deliverable, you will need to implement a question classification system. You may build on techniques presented in class, described in the reading list, and proposed in other research articles.

We will be building on the Question Classification Taxonomy developed by Li and Roth at UIUC. Their site includes training data for developing your system, as well as the taxonomy itself. You are only required to classify at the top level of the taxonomy (NUMERIC, LOCATION, HUMAN, etc.), though you may try more fine-grained classification if you wish.
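To get started before training a full classifier, a simple heuristic baseline can be useful for sanity-checking the pipeline. The sketch below keys the coarse class off the question's first word; the wh-word-to-class mapping is a rough assumption of mine, not part of the assignment, and a real submission should use a trained classifier over richer features.

```python
# Hypothetical wh-word baseline for coarse question classification.
# The mapping below is an assumption, not taken from the assignment;
# tag names follow the Li & Roth coarse classes (HUM, LOC, NUM, ENTY, ...).
WH_TO_COARSE = {
    "who":   "HUM",
    "where": "LOC",
    "when":  "NUM",
    "how":   "NUM",   # crude: "how many/much/far" are usually numeric
}

def baseline_classify(question):
    """Return a coarse class guess based on the question's first word."""
    first = question.strip().split()[0].lower()
    return WH_TO_COARSE.get(first, "ENTY")   # arbitrary fallback class

print(baseline_classify("Where is the Louvre ?"))  # LOC
```

A baseline like this typically lands well below a trained classifier, but it gives you an end-to-end path to the accuracy script early on.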

Data

Training Data and Development Test Data
You should use the Li and Roth training and test data above to develop your question classification system. This data is of the form
Top_level_tag:fine_tag question_text,
with one question per line.
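Reading this format reduces to two splits per line: the label off the leading whitespace, then the coarse tag off the colon. A minimal sketch, assuming the exact format "COARSE:fine question text" described above:

```python
# Parse one line of the Li & Roth question files.
# Assumed format: "COARSE:fine question text", one question per line.
def parse_question_line(line):
    """Split a line into (coarse_tag, fine_tag, question_text)."""
    label, _, text = line.strip().partition(" ")
    coarse, _, fine = label.partition(":")
    return coarse, fine, text

coarse, fine, text = parse_question_line(
    "LOC:city What is the capital of France ?")
```

Using str.partition (rather than split) keeps the question text intact even if it contains colons.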

You should treat the training data as training data and the test data as development test data. You may tune on it, but you should not use it in the training phase of your classifier.

Additional annotated question data is also available on patas. The data can be found in the /dropbox/10-11/573/Data/Questions/training/ directory on patas. The directory contains two files, with the questions and offset annotation: TREC-2004.xml and TREC-2004.tagged.txt.

The xml file contains the official questions in TREC standard xml format in question series. Each series begins by identifying the target, in the form target id = "1" text = "Crips". The questions themselves are tagged by type - FACTOID, LIST, or OTHER. You can ignore all 'OTHER' questions.
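One way to pull the usable questions out of the series file is with the standard library's ElementTree. The element and attribute names below ("target", "q", "type", "text") are my assumptions about the file's layout based on the description above, and should be checked against the actual xml before use:

```python
import xml.etree.ElementTree as ET

# Sketch of reading the TREC question series; element/attribute names
# are assumptions about the xml layout, not confirmed from the file.
def load_factoid_and_list_questions(xml_text):
    """Return (target_text, question_type, question_text) triples,
    skipping all 'OTHER' questions as the assignment instructs."""
    root = ET.fromstring(xml_text)
    questions = []
    for target in root.iter("target"):
        for q in target.iter("q"):
            if q.get("type") != "OTHER":
                questions.append(
                    (target.get("text"), q.get("type"), q.text.strip()))
    return questions

# Illustrative input, modeled on the description above (not real data).
sample = """<trecqa>
  <target id="1" text="Crips">
    <q id="1.1" type="FACTOID">When was the gang formed ?</q>
    <q id="1.7" type="OTHER">x</q>
  </target>
</trecqa>"""
qs = load_factoid_and_list_questions(sample)
```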

The tagged.txt file contains coarse-grained tags in the Li and Roth style. The lines are of the form:
Question_ID\tQuestion_type.
For 'OTHER' type questions, the Question_type field is left intentionally blank.
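Since the 'OTHER' questions have a blank type field, a loader can drop them simply by skipping empty types. A sketch, assuming the tab-separated format described above:

```python
# Read tagged.txt-style annotations into a dict of qid -> coarse type.
# Assumed line format: "Question_ID<TAB>Question_type"; the type field
# is blank for 'OTHER' questions, which we skip.
def load_coarse_tags(lines):
    tags = {}
    for line in lines:
        qid, _, qtype = line.rstrip("\n").partition("\t")
        if qtype:                      # blank type => 'OTHER', ignore
            tags[qid] = qtype
    return tags

tags = load_coarse_tags(["1.1\tNUM\n", "1.7\t\n"])
```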

Test Data
You should evaluate on two sets of questions: the Li and Roth development test data and the TREC question data.

Evaluation

You will compute accuracy (#_correct_samples/#_total_samples) on the test data. You must produce scores at the coarse-grained level for both sets of test data. You may also include fine-grained scores for the Li and Roth data. These scores should be placed in a file called QC.results in the results directory.
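The accuracy measure defined above is a one-liner; the sketch below assumes gold and predicted labels are given as parallel lists:

```python
# Accuracy as defined above: correct predictions / total predictions.
def accuracy(gold, predicted):
    assert len(gold) == len(predicted), "label lists must be parallel"
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / float(len(gold))

score = accuracy(["LOC", "NUM", "HUM"], ["LOC", "NUM", "LOC"])
```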

You are encouraged to include a more detailed results breakdown in your write-up, for example, analyzing per-class accuracy.

Outputs

Create two output files in the outputs directory by running your question classifier on the two test data files; name them LR.outputs and TREC-2005.outputs. Your output should be similar to the Li and Roth format, one line per question, as below:
Top_level_tag{:fine_tag} question_text
where the fine_tag is optional. For the TREC-2005 data, you will only be able to score the coarse tags, but it is fine if your system produces fine-grained tags. You should ignore the 'OTHER' type questions completely.
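Emitting the optional fine tag only when the classifier produced one can be sketched as follows (a hypothetical helper, assuming fine is None or empty when no fine-grained tag is available):

```python
# Format one output line: "COARSE question" or "COARSE:fine question".
def format_output_line(coarse, fine, question):
    tag = coarse + (":" + fine if fine else "")
    return "%s %s" % (tag, question)

line = format_output_line("LOC", "city", "What is the capital of France ?")
```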

Other Question Processing

You may also perform additional question processing, not required for classification, that you anticipate will help in later processing. Such processing could include shallow chunking or deep parsing, creation of semantic or logical forms, and so on as guided by your reading and systems discussed in class.

NOTE: Query formulation for improvement of passage retrieval is NOT required for this assignment, though it will factor into Deliverable #3.

Extending the project report

This extended version should include all the sections from the original report (with many still as stubs) and additionally include the following new material:

Please name your report D2.pdf.

Presentation

Your presentation may be prepared in any computer-projectable format, including HTML, PDF, PPT, and Word. Your presentation should take about 10 minutes to cover your main content. It should be deposited in your doc directory, but it is not due until the actual presentation time. You may continue working on it after the main deliverable is due.

Summary