Linguistics 573

Phase #3

 

Overview:

 

This is the third and final phase of our project, at least for the portion of the project we can complete in the context of the class.  For phase #2, you processed questions and returned a ranked list of a thousand or fewer documents.  Evaluation was recall-biased, with some modifications favoring higher-ranked answer-bearing documents.  For phase #3, you are to produce a much smaller set of answer-bearing documents.  Evaluation will be more precision-biased: any returned document that is not answer-bearing will incur a penalty.  This phase is much more challenging, since it is essentially passage retrieval in which the passages returned are the documents themselves.

 

Output format:

 

Follow the TREC guidelines for 2005 for answer submissions.  The basic format is as follows:

 

66.1         your-group-name XIE20000822.0203              

66.2         your-group-name NYT20000816.0039             

66.3         your-group-name XIE20000822.0297              

66.4         your-group-name APW20000819.0131           

66.5         your-group-name APW20000821.0185           

66.5         your-group-name XIE20000821.0036              

66.5         your-group-name XIE20000822.0018              

66.5         your-group-name XIE20000822.0046              

66.6         your-group-name XIE20000824.0003              

66.7         your-group-name NYT20000831.0401             

66.7         your-group-name NYT20000831.0401             

67.1         your-group-name APW20000513.0001           

67.2         your-group-name XIE20000611.0169              

67.3         your-group-name APW20000513.0016           

 

The only deviation from the TREC guidelines is the absence of the answers/passages themselves (which would normally be appended to the end of the output record).  Also, you will not be responsible for generating output for the “Other” questions.
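
As an illustration of the record layout above, here is a minimal Python sketch that turns per-question results into output lines.  The function name format_run, the dictionary layout, and the single-space separator are assumptions made for this example and are not part of the TREC guidelines; check the guidelines for the exact column spacing.

```python
from typing import Dict, List


def format_run(results: Dict[str, List[str]], group_name: str) -> List[str]:
    """Return one 'question-id group-name doc-id' line per returned document.

    `results` maps a question id such as "66.1" to the document ids
    returned for that question (hypothetical structure for this sketch).
    """
    lines = []
    for question_id, doc_ids in results.items():
        for doc_id in doc_ids:
            # No answer or passage is appended, per the deviation noted above.
            lines.append(f"{question_id} {group_name} {doc_id}")
    return lines


if __name__ == "__main__":
    # Example data reusing ids from the sample output above.
    sample = {
        "66.1": ["XIE20000822.0203"],
        "66.5": ["APW20000821.0185", "XIE20000821.0036"],
    }
    print("\n".join(format_run(sample, "your-group-name")))
```

Questions appear in the order they were inserted into the dictionary, so build it in question order if the output needs to be sorted.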

 

 

Evaluation:

 

Evaluation will be precision-biased.  Inclusion of irrelevant documents, that is, documents that do not contain answers, will lower your score.  For factoid questions, the presence of one answer-bearing document is sufficient; including more than one will have no effect on your score.  List questions will be evaluated separately, and will be scored based on the number of relevant documents included in the output.
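
For checking your own output while the official tools are pending, the quantities described above can be tallied roughly as follows.  This is only a sketch: the function names and the idea of a set of known answer-bearing document ids per question are assumptions, and it deliberately stops short of a score, since the actual scoring formula has not been announced.

```python
from typing import Iterable, Set, Tuple


def factoid_summary(returned: Iterable[str], answer_bearing: Set[str]) -> Tuple[bool, int]:
    """Return (answered, irrelevant_count) for one factoid question.

    One answer-bearing document is enough for `answered`; additional
    answer-bearing documents change nothing, while every returned
    document that is not answer-bearing is counted as irrelevant.
    """
    docs = list(returned)
    answered = any(doc in answer_bearing for doc in docs)
    irrelevant = sum(1 for doc in docs if doc not in answer_bearing)
    return answered, irrelevant


def list_summary(returned: Iterable[str], answer_bearing: Set[str]) -> Tuple[int, int]:
    """Return (relevant_count, irrelevant_count) for one list question.

    Duplicate document ids are counted once (an assumption of this sketch).
    """
    docs = set(returned)
    relevant = len(docs & answer_bearing)
    return relevant, len(docs) - relevant
```

For example, factoid_summary(["a", "b", "c"], {"a", "b"}) reports the question as answered with one irrelevant document.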

 

More information on evaluation, and the specific tools that will be used, is forthcoming.

 

 

Some notes and rules:

 

1.      You may train against any of the 2003 and 2004 data, including the phase2Eval and phase3Eval output.  2005 data and results must be held out and may not be used for training.

2.      For calculating the factoid results, the inclusion of more than one answer-bearing document will have no effect on your score.

3.      Ignore the “Other” questions.

4.      Get started early.  The due date given is the last possible day and time that could be given, and cannot be revised or extended for any reason.

 

 

Due date:  6 am, Thursday, June 1st

 

Submit the following:

  1. Your code.
  2. Your output file in the above format.
  3. Your progress report, following the guidelines for progress reports previously discussed in class (included in the handout).  Please include your evaluation results.