Ling/CSE 472: Assignment 2: Morphology

Due April 17th, by 11:59pm

This assignment involves a little bit of coding with xfst. You'll need to turn in some code and results files, which should be submitted via Canvas.

Problem 1. Morphology and FSTs

(Adapted from Problem 3.3, p. 81)

Using xfst, write a finite-state transducer that can generate and analyze a small set of verbs in all of their inflected forms. This FST will handle two spelling change rules: the rule that deletes a final e before -ing or -ed, and the rule that inserts a k when c appears between a vowel and -ing or -ed. The first rule is provided already. Your job is to write the second.
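As a reference for the behavior the finished FST should show, the two spelling changes can be sketched outside xfst. The Python below is only an illustration under stated assumptions: '+' marks the morpheme boundary (as in the picnic+ing example), the vowel set is the five orthographic vowels, and the function names are hypothetical. It is not the xfst solution and does not use the basic regular-expression operators the assignment requires.

```python
import re

VOWELS = "aeiou"  # assumption: orthographic vowels only


def k_insertion(form: str) -> str:
    """Insert k after a stem-final c that follows a vowel, before +ing or +ed."""
    return re.sub(rf"(?<=[{VOWELS}])c\+(?=ing$|ed$)", "ck+", form)


def e_deletion(form: str) -> str:
    """Delete a stem-final e before +ing or +ed."""
    return re.sub(r"e\+(?=ing$|ed$)", "+", form)


def surface(form: str) -> str:
    """Map an underlying form such as 'picnic+ing' to its spelled form."""
    # In this sketch k-insertion runs first; otherwise 'spruce+ed' would
    # become 'spruc+ed' and then wrongly pick up a k.
    form = k_insertion(form)
    form = e_deletion(form)
    return form.replace("+", "")


for underlying in ["picnic+ing", "spruce+ed", "panic+ed", "walk+ing"]:
    print(underlying, "->", surface(underlying))
```

The ordering noted in the comment is a property of this sketch only; how the rules interact in k.xfst is something to check against the script itself.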

Note: xfst defines a language for regular expressions that makes it relatively easy to write morphophonological rewrite rules. For this problem, however, you must stay with the basic operators. No credit will be given for answers that use the xfst operator -> or its kin. On the other hand, if you get stuck, you might find it helpful to write the rule in that notation and then examine the network that xfst produces.

To do this assignment, you'll need the following two files:

Copy them somewhere in your Patas home directory: log in to Patas and run 'wget path_to_file'. If you're using Windows to download them, make sure that it doesn't add any new file extensions.

verb_lexicon is the lexicon of verbs (in citation form) that we'll be working with.
k.xfst is the xfst script that does the work. It is the file you'll need to modify for this part of the assignment.

To start xfst, log onto Patas and type "xfst" (your $path variable should already be set appropriately). You'll get an xfst prompt.

To run the script, enter:

source k.xfst

After you've run the script, there should be an FST on the stack. To apply that FST, try:

apply up spruced

apply down picnic+ing

Observe that it doesn't yet have the right behavior in the second example.

Modify k.xfst until it has the right behavior. The files produced by the script (underlying, onerule, tworules, and threerules) should be helpful in testing it as you go. You can also use apply up and apply down to observe the behavior of the network. Here is a short summary of xfst syntax.

Hint: If you have trouble, try commenting out the first rule and then writing a very simple rule in its place, to make sure you can at least observe some change happening. Make sure you understand the function of the ':' and the concepts of upper and lower tape.
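To make the upper/lower tape idea concrete, here is a small Python sketch (not xfst) of a single path through a transducer, stored as upper:lower symbol pairs. The path itself, the EPSILON constant, and the helper names are hypothetical and chosen for illustration; the empty string plays the role that 0 plays in xfst.

```python
EPSILON = ""  # stands in for the empty symbol (written 0 in xfst)

# One hypothetical path: upper side "picnic+ing", lower side "picnicking".
# Each pair is (upper symbol, lower symbol), like upper:lower in xfst.
PATH = (
    [(ch, ch) for ch in "picnic"]
    + [(EPSILON, "k")]   # nothing on the upper tape, k on the lower tape
    + [("+", EPSILON)]   # boundary on the upper tape, nothing on the lower
    + [(ch, ch) for ch in "ing"]
)


def upper_side(path):
    """Concatenate the upper-tape symbols of a path."""
    return "".join(upper for upper, _ in path)


def lower_side(path):
    """Concatenate the lower-tape symbols of a path."""
    return "".join(lower for _, lower in path)


def apply_down(paths, word):
    """Match word against the upper side and return the lower sides."""
    return [lower_side(p) for p in paths if upper_side(p) == word]


def apply_up(paths, word):
    """Match word against the lower side and return the upper sides."""
    return [upper_side(p) for p in paths if lower_side(p) == word]


print(apply_down([PATH], "picnic+ing"))
print(apply_up([PATH], "picnicking"))
```

A real network encodes many such paths compactly as states and arcs, but the direction convention is the same: apply down reads the upper tape and writes the lower one, and apply up does the reverse.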

To examine a network, type:

print net

The network defined in k.xfst is too large to be usefully examined like this, but you might try some others:

read regex [a b c];

print net

read regex [a+ b c];

print net

read regex [e %+ -> 0 || _ [e|i] ];

print net

Turn in

Problem 2. Text-to-Speech

Text-to-speech systems rely on large pronunciation dictionaries in combination with rules for dealing with unknown words. One large class of typically unknown words is names, including people's names. This assignment will focus on one particular aspect of predicting the pronunciation of names, namely stress assignment. In particular, there are 'name suffixes' (recurring forms at the end of many different names) which leave the stress assignment of the stem unchanged (stress-neutral name suffixes) and those which cause the stress to move (stress-changing suffixes). Stress assignment is important for figuring out pronunciation because a) phonological rules affecting the pronunciation of segments are sensitive to stress and b) lexical stress interacts with other factors to determine the prosody of an utterance.

Your tasks

Don't worry about

Turn in

