Ling/CSE 472: Assignment 2: Morphology, Parsing

Due October 21st, by the start of class

This assignment does not involve writing any code. As such, please turn in written answers to all of the questions below in class.

1. Morphology and FSTs

(Adapted form of:) Problem 3.2 (p.89)

Give one example of each of the noun, verb and adjective classes in Figure 3.6, and find two stems (noun, verb, or adjective) which are exceptions to the rules, i.e., which fit the apparent definition of the class, but don't in fact have all of the forms that the machine predicts.

Problem 3.4 (p.89)

Write a transducer (or set of transducers) for the K insertion spelling rule in English. Be sure to check it against both positive and negative examples, that is, words the rule should apply to and words it shouldn't.
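One way to collect and check positive and negative examples before drawing the transducer is a small script such as the one below. It is only a sketch and not a transducer: it assumes the rule inserts k after a verb stem ending in vowel + c when the suffix begins with e or i (as in panic, panicked), and it assumes an intermediate-form notation in which ^ marks a morpheme boundary and # marks the end of the word. The function name k_insertion and the example words are illustrative choices, not part of the textbook problem.

```python
import re

VOWELS = "aeiou"


def k_insertion(intermediate: str) -> str:
    """Map an intermediate form such as 'panic^ed#' to a surface form.

    Assumed notation: '^' is a morpheme boundary, '#' is the word end.
    Assumed rule: insert 'k' after vowel + 'c' at a boundary when the
    suffix starts with 'e' or 'i'.
    """
    # Insert k between the stem-final "vowel + c" and the boundary.
    with_k = re.sub(rf"([{VOWELS}]c)\^(?=[ei])", r"\1k^", intermediate)
    # Erase the boundary symbols to obtain the surface spelling.
    return with_k.replace("^", "").replace("#", "")


if __name__ == "__main__":
    # Positive examples: the rule should apply.
    for form in ["panic^ed#", "panic^ing#", "picnic^ing#"]:
        print(form, "->", k_insertion(form))
    # Negative examples: the rule should leave the stem unchanged.
    for form in ["panic^s#", "arc^ed#", "walk^ed#"]:
        print(form, "->", k_insertion(form))
```

Under these assumptions, panic^ed# comes out as panicked, while panic^s# and arc^ed# come out as panics and arced. Your transducer should agree with whatever behavior you decide is correct for cases like these.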

2. Parsing

Start the LKB, and load the grammar

The LKB System (Copestake 2002) (a parser, generator and grammar development environment) is installed on the machines in Denny 109 as well as the PCs in Denny 112.

Start the LKB by going to "My Computer" in the start menu, and then opening C:\Program Files\lkb_windows\windows. In this directory, you'll find lkb.exe. Double click lkb.exe to start the LKB. You should see a window entitled "Lkb Top". This window has the menus that you will use, as well as some space where the LKB prints messages that you should pay attention to.

Next, you'll need to download the grammar, which consists of these files:

Create a directory in My Documents called Grammar, and save the files there. (No need to cut and paste -- just right click the links and select "Save Target As..."). If Windows added the .txt extension to any of the files, use Rename (available in the right-click pop-up menu in Windows Explorer) to get rid of it.
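If several files picked up the extra extension, the renaming can also be scripted. The sketch below assumes Python is available on the machine you are using, which may not be true of the lab machines. The function name strip_txt_extension is a hypothetical helper, and you would need to supply the path to your own Grammar directory.

```python
from pathlib import Path
from typing import List


def strip_txt_extension(directory: str) -> List[str]:
    """Remove a trailing .txt from every file in the given directory.

    Returns the new names of the files that were renamed. A file is
    skipped if a file with the target name already exists.
    """
    renamed = []
    # sorted() builds the full listing first, so renaming while looping is safe.
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() == ".txt":
            target = path.with_suffix("")  # e.g. rules.tdl.txt -> rules.tdl
            if target.exists():
                continue  # never overwrite an existing file
            path.rename(target)
            renamed.append(target.name)
    return renamed
```

Calling strip_txt_extension on your Grammar directory would, for example, turn rules.tdl.txt into rules.tdl and leave files without the .txt extension untouched.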

Load the grammar by selecting Load > Complete Grammar in the Lkb Top window. In the dialogue that pops up, click "My Documents" (on the left hand side), then "Grammar", then "script", and then the "open" button.

A. Charts and trees, edges and nodes

B. Start symbols

C. Parsers and Grammars

For this part of the assignment, you'll need to edit the file rules.tdl. You need to make sure that when you save it, it is still a text file (although without the .txt extension) and not an rtf file or a Word document or anything else. WordPad seems to be okay.

In general, parsers (and grammar development environments) should be written to make as few assumptions about the grammars they parse as possible. For example, the LKB allows the user to define the (set of possible) start symbol(s) for their grammar, rather than assuming it's S (see above). However, parsers (and grammar development environments) do need to define what they will accept as well-formed grammars, check for well-formedness, and anticipate other kinds of user errors.
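To make the idea of a well-formedness check concrete, here is a minimal sketch for a toy context-free grammar. It is not how the LKB checks its grammars (the LKB works with typed feature structures written in TDL); the data layout, the function name check_grammar, and the particular checks are all assumptions chosen for illustration. The grammar is a dictionary from a left-hand-side symbol to a list of right-hand sides, the start symbols are supplied by the user rather than assumed to be S, and the lexicon is given as a set of lexical category names.

```python
from typing import Dict, List, Set

# Toy representation: {"S": [["NP", "VP"]], ...}
Grammar = Dict[str, List[List[str]]]


def check_grammar(rules: Grammar, start_symbols: Set[str],
                  lexical_categories: Set[str]) -> List[str]:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if not start_symbols:
        problems.append("no start symbol declared")
    # Every user-declared start symbol must be expandable by some rule.
    for start in sorted(start_symbols):
        if start not in rules:
            problems.append(f"start symbol {start!r} has no rule")
    for lhs, expansions in sorted(rules.items()):
        for rhs in expansions:
            if not rhs:
                problems.append(f"rule for {lhs!r} has an empty right-hand side")
            # Every symbol used must be defined by a rule or by the lexicon.
            for symbol in rhs:
                if symbol not in rules and symbol not in lexical_categories:
                    problems.append(f"{lhs!r} uses undefined symbol {symbol!r}")
    return problems


if __name__ == "__main__":
    rules = {"S": [["NP", "VP"]], "NP": [["D", "N"]], "VP": [["V"], ["V", "PP"]]}
    print(check_grammar(rules, {"S"}, {"D", "N", "V"}))
```

With the example grammar, the checker reports that VP uses the undefined symbol PP, the kind of user error (say, a mistyped or forgotten rule) that a grammar development environment has to anticipate.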

