Ling/CSE 472: Assignment 3: Feature structures, unification

Due October 28th, by the start of class, via ESubmit

NOTE This assignment may be more difficult than the last one, and the directions may not tell you everything you really need to know. Therefore we ask you to please get started early and ask questions as soon as they come up, rather than puzzling them over by yourself.

In order to give you some experience with feature structures and unification, this assignment asks you to do some grammar engineering, using the LKB (Copestake 2002). In particular, you will implement the feature-based analysis of agreement (subject-verb and determiner-noun) described in the chapter, and a simple feature-based version of subcategorization. Then, you will use types and inheritance to recast the same grammar in a maximally general fashion.

This assignment is to be submitted electronically. You should turn in two complete "grammar" directories (including the files test.items and test.out) one for each part. Please created zipped or tarred/gzipped archives to turn in.

Getting started

Background: the LKB, types, and tdl syntax

In assignment 2, you used the LKB to parse a CFG. In part 1 of this assignment, you'll be adding features to a (different) CFG. In part 2, you will start using typed feature structures to state generalizations.

In fact, all of these grammars are typed feature-structure (TFS) grammars, since that is what the LKB can parse. (The CFGs are just CFGs coded up as TFS grammars.) Furthermore, they are stated in Type Description Language (TDL), with the following syntax:

typename := supertype1 & supertype2 & ... & supertype3 &
 [ FEATURE_1 value_1,
   FEATURE_2 value_2,
      ...
   FEATURE_N value_n ].

Features with complex values look like this:

typename := supertype &
  [ FEATURE_1 [ FEATURE_2 value_2,
                FEATURE_3 value_3 ],
    FEATURE_4 value_4 ].

If you just want to talk about one feature inside the value of another, you can write it like this: (Note the period in between the feature names.)

typename := supertype &
   [ FEATURE_1.FEATURE_2 value2 ].

NB: The following is not equivalent to the above:

typename := supertype &
   [ FEATURE_2 value2 ].

Reentrancy is indicated with variables beginning with #. For example:

typename := supertype &
 [ FEATURE_2 #same,
   FEATURE_3.FEATURE_4 #same ].

Lists are represented as feature structures with two features, FIRST and REST. The value of FIRST is the first element on the list. The value of REST is another list, or *null*. (The relevant types are defined at the bottom of types.tdl.) You can use these features to constrain particular members of lists, or you can use the following notation:

typename := supertype &
  [ LIST_FEATURE < [ FEATURE_1 value_1 ] , ... > ].

typename := supertype &
  [ LIST_FEATURE < othertype , [ FEATURE_1 value_1 ] > ].

The first example above involves a list with at least one element on it. The second involves a list with exactly two elements on it.

The LKB requires that all features be declared for exactly one type (they can be inherited and further constrained by subtypes). If you try to introduce features with the same name on different types (and one is not a subtype of the other) it will complain.

The LKB distinguishes between types and instances, the latter to be found (in this grammar) in rules.tdl and lexicon.tdl. It is only because of this that we can 'overload' symbols like s and use them as both rule names and type names.

Here is a partial LKB/Grammar Engineering FAQ, which may be helpful.

Part 1: Feature-based analyses of agreement and valence

In this part of the assignment, you will modify the starting grammar to account for all of the grammaticality judgments indicated in the file test.items.

Part 2: Using types to capture generalizations

You may have noticed that you were typing the same thing over and over again in Part 1. This part of the assignment asks you to make use of the types to state each constraint exactly once, and make any instance or type that needs it inherit from that supertype.

Hints:

When you have a grammar that doesn't have any repeated constraints and that accounts for all of the data (both positive and negative) in test.items, turn it in!


Back to main course page