Ling 472/CSE 472: Introduction to Computational Linguistics
Spring 2020
Final project information
General remarks about the project
- Project components:
- find an NLP package described in a peer-reviewed paper which
reports results using a quantitative evaluation metric, such as precision/recall;
- run it on the dataset that
comes with it (or, in some exceptional cases, on a different dataset);
- and perform a careful error analysis of the results (~100 errors plus some number of correct outputs for comparison)
- Additional specifications:
- All papers (packages) must include quantitative evaluation
(e.g. in terms of precision and recall; a short reminder of how these
metrics are computed appears at the end of these remarks).
- Find a package that you can get running easily and quickly.
- Perform an analysis of the specific test items that the system
does not get right: categorize the errors, count how many fall into
each category, and provide a meaningful discussion of them, including
hypothesized or directly observed reasons why the system makes these
errors, as well as what implications this behavior might have.
- Your error analysis (its method, categories, and results) must be described in detail in the term paper.
- The quality of the term paper must reflect the level of work expected in this upper-division course.
- Further comments:
The error analysis is the main point
of the project, not reproducing the experiment, so beware of choosing a package that you will then waste a lot
of time trying to run. Be especially careful about choosing a package that does not come with a dataset: chances are
you will never even get to the error analysis, spending all your time making the tool run on new data. Also,
we will not approve such a package!
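As a reference for the quantitative evaluation mentioned above, here is a minimal sketch of how precision, recall, and F1 are computed from counts of true positives, false positives, and false negatives. These are the standard definitions; the example counts are invented.

def precision_recall_f1(tp: int, fp: int, fn: int):
    # precision = tp / (tp + fp): of the outputs the system produced, how many were right
    # recall    = tp / (tp + fn): of the gold-standard items, how many the system found
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Invented example: 90 correct outputs, 10 spurious, 30 missed
p, r, f = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")  # precision=0.90 recall=0.75 f1=0.82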
Project groups
- You are invited to complete the final project in groups of 2-3 people
- Form groups by 4/15, so you can work together on project milestone 1
- Please try to form groups where each member brings different expertise
- Only one person submits the project-related files; others submit a Canvas comment or a text file with the names of their partners.
- The project plan must include a description of how the work will be allocated.
- The project write-up must include a clear description of who did what.
Finding candidate packages
Visit the ACL Anthology, which provides access to most publications in computational linguistics/NLP, including the most recent ones.
- Pick a conference or workshop (e.g. ACL, COLING (more linguistics), SIGBIOMED...)
- Pick a year (more recent papers are more likely to have software that still works, but you don't have to do 2019 or even 2018)
- Browse the titles. If the title interests you, open the paper and check that it has a link to their GitHub repository or other place where they store their code and data
- Alternatively, some titles have "software" and/or "dataset" icons right next to them, in which case you may not need to dig in and look for the project repository (e.g. on GitHub). But it is rare
for software and data to be uploaded directly with the paper, so do look inside the paper to see if code and data are
easily discoverable and usable.
- Another option is to browse the listings at Papers with Code.
- VERIFY that you can download both code and data and run the tool on at least sample data (a hypothetical smoke test is sketched below).
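Below is a hypothetical smoke test illustrating this verification step. It assumes the package ships an evaluation script (here called evaluate.py) and a sample data file; substitute the real entry point and paths for the package you pick.

import subprocess

# Hypothetical entry point and sample file -- replace with the package's own.
result = subprocess.run(
    ["python", "evaluate.py", "--input", "data/sample.txt"],
    capture_output=True, text=True, timeout=600,
)
print(result.stdout[:500])  # eyeball the beginning of the output
result.check_returncode()   # fail loudly if the tool crashed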
Resources
Here are a few resources about error analysis in NLP:
Project milestones
- Form groups of 2-3 people by 4/15. Be sure to find people to work with who bring expertise complementary to yours.
- Milestone 1, due 4/24: Submit three (3) alternative proposals for the package/dataset, ranked in your order of preference. For each alternative proposal, specify:
- Paper's bibliographical entry (authors, title, publication venue)
- URL for the paper
- URL for the package download site
- URL for the dataset (tell us explicitly which dataset(s) you are going to use).
- Include sample input and the command that you actually ran to obtain
the output. (In other words: you must have run the tool on
the dataset which you intend to use for your error analysis, and you must have seen enough output,
with some errors in it, to be confident that your plan is feasible. You must convince
the instructors that your plan is feasible, too!)
- Explain why you chose the package.
Your choice will need to be approved by the instructor before
you start working on it.
In general, you should pick packages that you can download and run on your computer.
If you feel like you must use a tool in a different manner, talk to the instructor first.
- Milestone 2, due 5/15: Complete project plan, 1st draft (2-3 pages)
- Submit a clear and detailed description of the package that you chose. (What is the tool for?
How is it implemented (high level)? How is it evaluated? Why did you choose it?)
- Submit a clear and detailed description of the dataset
that comes with the package or, in exceptional circumstances,
of the dataset that you propose to use for the project. How big is it? What is the format?
Will you need to do any preprocessing? Etc.
- Include URL for the package download site
- Include URL for the dataset download site
- Copy the main results table from the paper and explain what sort of evaluation they use (what metrics etc.).
Explain how the reader should interpret the table. What do these numbers mean with respect to what
the tool is doing?
- What error analysis have the authors already done, if any?
- Include a clear plan of how you will perform your error analysis, with
several examples:
- For instance, if you are doing error analysis for a parser:
find several sentences which the parser does not parse correctly, and explain how you would go
about categorizing these errors and what kind of discussion you might develop about them.
- Milestone 3, due 5/29: Complete project plan, 2nd revision.
- Revise your plan for error analysis based on the instructors' feedback.
Submit the revised version.
- Milestone 4, due 6/2: Presentation/demonstration of the package+dataset you picked and
your error analysis, in class on 6/2, 6/4, or 6/5.
- This should include the results of your complete error analysis (over ~100 errors plus some correct outputs for comparison; a sketch of drawing such a sample follows this list).
- Milestone 5, due 6/11: The term paper (the final project write-up), due 11:59pm on 6/11.
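Here is a minimal sketch of one way to draw the ~100-error sample (plus correct outputs for comparison) reproducibly. It assumes you have written the items your system got wrong to errors.tsv and the items it got right to correct.tsv, one item per line; the file names and sample sizes are placeholders.

import random

with open("errors.tsv") as f:    # hypothetical file of items the system got wrong
    errors = f.read().splitlines()
with open("correct.tsv") as f:   # hypothetical file of items it got right
    correct = f.read().splitlines()

random.seed(472)  # any fixed seed, so the whole group sees the same sample
error_sample = random.sample(errors, min(100, len(errors)))
correct_sample = random.sample(correct, min(25, len(correct)))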
Term paper expectations
A good term paper is detailed, focused, and clear. It
brings together what you learned during the entire quarter, as it relates
to the particular tool that you were exploring: levels
of linguistic structure, algorithms, computational approaches, and
evaluation. A good paper demonstrates a strong ability to reason
about all of those things in a manner that is easy
to follow and clear to the reader. The descriptions of the items below are only
minimal guidelines; please be warned that minimal effort may result in
a low score.
Your paper should be at most 8 pages, double-spaced, and include the following sections:
- Introduction and Background: Present the tool that you chose. For example, what is the package doing? Why is it important and interesting?
What is the novelty of their approach?
- Data: What dataset are you using? Describe. If it did not come with
the package, explain why and how you came to use this data and where you got it.
- Replication results: Include both the table from the paper you are working from and a separate table giving the numbers you got when you ran the software and evaluation scripts.
- Methodology: Describe the structure and the meaning of your error analysis.
What error categories did you define?
How did you come up with them?
How did you sample the errors to analyze?
- Error Analysis Results: Present the number of errors of each type in
clear, easy-to-understand tables accompanied by clear, informative
but ideally minimal prose (a minimal tallying sketch appears after this list).
- Discussion: Discuss your findings in detail, particularly
the meaning of the error categories and any technical hypotheses that
you might have about the reasons behind the errors and the
implications of the tool being deployed with these errors. How does
your error analysis inform possible future development of the tool?
What are the implications of this project for broader inquiry in
computational linguistics?
- Work allocation: A clear description of who did what in running software, designing error analysis, performing error analysis, and writing the results. All project members are expected to contribute to the writing.
- Bibliography: For any resources you are using
(corpora, toolkits, etc.) you should include a proper citation, both
in the text (as (Author, year)) and in the bibliography. Likewise for
any works cited.
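Finally, here is a minimal sketch of turning hand-assigned error categories into the kind of counts table asked for in the Error Analysis Results section. The category labels are invented; yours will come from your own analysis.

from collections import Counter

# One hand-assigned label per sampled error (invented labels for illustration).
labels = ["unknown word", "attachment", "unknown word",
          "coordination", "attachment", "unknown word"]

counts = Counter(labels)
total = sum(counts.values())
print(f"{'Category':<15}{'Count':>6}{'%':>8}")
for category, n in counts.most_common():
    print(f"{category:<15}{n:>6}{100 * n / total:>7.1f}%")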