Text-to-speech systems rely on large pronunciation dictionaries in combination with rules for dealing with unknown words. One large class of typically unknown words is names, including people's names. This assignment will focus on one particular aspect of predicting the pronunciation of names, namely stress assignment. In particular, there are 'name suffixes' (recurring forms at the end of many different names), some of which leave the stress assignment of the stem unchanged (stress-neutral name suffixes) and some of which cause the stress to move (stress-changing suffixes). Stress assignment is important for figuring out pronunciation because a) phonological rules affecting the pronunciation of segments are sensitive to stress and b) lexical stress interacts with other factors to determine the prosody of an utterance.
[ c a t | d o g ]
Or, using the 'explode' operators, like this:
[ {cat} | {dog} ]
Recall that % is the escape character, and that most punctuation marks have a special meaning in xfst and therefore need to be escaped. More on xfst syntax can be found here.
[ A -> B || C _ D ]
This is the xfst form for "A is rewritten as B when it occurs between C and D" (where C and D refer to the upper tape context). A, B, C and D are all regular expressions.
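To get a feel for what such a rule does to a string, here is a rough simulation in Python rather than xfst. It is only a sketch under simplifying assumptions: A, B, C and D are literal strings (in xfst they may be arbitrary regular expressions), and the function name `rewrite` is made up for this illustration. The contexts are checked with lookaround, so they are tested against the input string, much as || tests them on the upper tape.

```python
import re

def rewrite(text, a, b, left, right):
    """Replace each `a` by `b` wherever it stands between `left` and `right`.

    The contexts are lookarounds, so they are checked on the input string
    and are not themselves consumed or changed.
    """
    pattern = f"(?<={re.escape(left)}){re.escape(a)}(?={re.escape(right)})"
    return re.sub(pattern, b, text)

# Analogue of [ a -> b || c _ d ]
print(rewrite("cad cab", "a", "b", "c", "d"))  # prints "cbd cab"
```

Only the first a changes, because only that one sits between c and d.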
Rules that run in parallel are separated by ,,:
[ A -> B || C _ D ,, E -> F || G _ H ]
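The difference between running rules in parallel and running them one after another can be sketched in Python (again a simulation, not xfst). The helper names `apply_parallel` and `apply_sequential` are hypothetical, and the rules are restricted to literal strings. In the parallel version every rule looks at the original input, so one rule never sees the output of another.

```python
import re

def _rule_pattern(a, left, right, name=None):
    """Regex for literal `a` between literal contexts `left` and `right`."""
    target = f"(?P<{name}>{re.escape(a)})" if name else re.escape(a)
    return f"(?<={re.escape(left)}){target}(?={re.escape(right)})"

def apply_sequential(text, rules):
    """Apply the rules one after another; later rules see earlier output."""
    for a, b, left, right in rules:
        text = re.sub(_rule_pattern(a, left, right), b, text)
    return text

def apply_parallel(text, rules):
    """Apply all rules in a single pass over the original input."""
    combined = re.compile("|".join(
        _rule_pattern(a, left, right, name=f"r{i}")
        for i, (a, b, left, right) in enumerate(rules)
    ))
    # lastgroup names the alternative that matched, e.g. "r1" for rule 1
    return combined.sub(lambda m: rules[int(m.lastgroup[1:])][1], text)

# Analogue of [ a -> b || x _ y ,, b -> a || x _ y ]
rules = [("a", "b", "x", "y"), ("b", "a", "x", "y")]
print(apply_parallel("xay xby", rules))    # prints "xby xay"
print(apply_sequential("xay xby", rules))  # prints "xay xay"
```

In the parallel case a and b are swapped; in the sequential case the second rule undoes the first.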
Rules of epenthesis (which insert something where nothing was before) are written like this:
[ [..] -> A || B _ C ]
This can be read as "nothing goes to A in the context B _ C". If you used 0 (epsilon) instead of [..] in this rule, it would try to insert an infinite number of As between B and C, because there are an infinite number of empty strings between B and C.
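The intended effect of an epenthesis rule, exactly one insertion at each position where the context matches, can be imitated in Python with a zero-width match. This is a sketch and not xfst: the function name `epenthesize` is invented here, and the contexts are literal strings.

```python
import re

def epenthesize(text, inserted, left, right):
    """Insert `inserted` once at every position between `left` and `right`."""
    # The pattern matches the empty string at each qualifying position,
    # and re.sub replaces each such position exactly once.
    pattern = f"(?<={re.escape(left)})(?={re.escape(right)})"
    return re.sub(pattern, inserted, text)

# Analogue of [ [..] -> a || b _ c ]
print(epenthesize("bc abc", "a", "b", "c"))  # prints "bac abac"
```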