LING571

Homework 1

Due 10/6/05

 

 

      Read J&M Chapter 9 before working on Ex 1-3.

 

  1. (35 pts) Finish Exercise 9.2 on Page 355 in J&M, using the following tagset:

(a)    POS tags:

- N, V, P (preposition), Adj, Adv, Pron, Det, Conj

- CD (cardinal number), PDT (pre-determiner), RP (particle), etc.

 

(b)   Syntactic tags: NP, VP, PP, AdjP, AdvP, S, etc.

 

      %%%%%%

     The parse trees have no unique answers.  You will get full credit as long as your

     parse trees look reasonable.

 

 

  2. (20 pts) List ten sentences that can be generated by the following context-free grammar:

(a) S => NP VP

(b) NP => N

(c) NP => Det N

(d) NP => NP PP

(e) VP => V NP

(f) VP => VP PP

(g) PP => P NP

 

(h) N => book / Mary / store / card

(i) Det => the

(j) V => bought / is / was

(k) P => in / with

 

%%%%%

Examples: (each sentence is worth 2 points)

      Mary bought the book in the store

      The book bought Mary    # it is semantically odd, but grammatical

      Mary bought the book in the store with the card

 

Note: sentences such as “Mary is in the store” cannot be generated by the grammar, because generating such a sentence requires either “VP => V PP” or “VP => V”. Neither rule is in the current grammar.
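
The note above can be checked mechanically. What follows is a minimal sketch, not part of the assignment, of a span-based recognizer for the grammar in Ex 2. The names RULES, LEXICON, and generates are illustrative assumptions, and words are lowercased before lookup so that “The” and “Mary” match the lexicon.

```python
from functools import lru_cache

# Phrase-structure rules (a)-(g) of the grammar in Ex 2.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("N",), ("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}

# Lexical rules (h)-(k), stored in lowercase.
LEXICON = {
    "N": {"book", "mary", "store", "card"},
    "Det": {"the"},
    "V": {"bought", "is", "was"},
    "P": {"in", "with"},
}


def generates(sentence, start="S"):
    """Return True if the grammar derives the whitespace-tokenized sentence."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Preterminals cover exactly one word.
        if symbol in LEXICON:
            return j - i == 1 and words[i] in LEXICON[symbol]
        for rhs in RULES[symbol]:
            if len(rhs) == 1:
                if derives(rhs[0], i, j):
                    return True
            else:
                left, right = rhs
                # Both halves are strictly shorter, so left recursion terminates.
                if any(derives(left, i, k) and derives(right, k, j)
                       for k in range(i + 1, j)):
                    return True
        return False

    return len(words) > 0 and derives(start, 0, len(words))


if __name__ == "__main__":
    for s in ["Mary bought the book in the store",
              "The book bought Mary",
              "Mary is in the store"]:
        print(s, "->", generates(s))
```

Under these assumptions the first two sentences are accepted and “Mary is in the store” is rejected, for the reason given in the note: no rule lets a VP consist of V alone or of V followed by PP.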

 

 

  3. (15 pts) Does the grammar in Ex 2 allow ungrammatical sentences? If so, list two of them. How can you modify the grammar so that the two sentences you have chosen are no longer possible under the new grammar?

 

%%%%%%%%%%%

“Ungrammatical” here means ungrammatical with respect to English. In other words, native English speakers would judge those sentences unacceptable.

 

Some examples:  (each sentence is worth 5 points)

        Mary bought book in store

        The Mary bought book   

 

(This part is worth 5 points)

Singular count nouns such as “book” and “store” must be preceded by a determiner in English, but the CFG above does not require one. Conversely, proper names such as “Mary” do not take a determiner, yet the CFG allows one.

 

Possible solutions: split symbols (e.g., split the tag N into several subclasses), or extend the CFG to allow features.
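
As one illustration of the symbol-splitting option, the sketch below splits N into two subclasses and checks that the two example sentences are no longer accepted. The subclass names PropN and CommonN, and the function name accepts, are hypothetical choices made for this example; other splits (or a feature-based grammar) would serve equally well.

```python
from functools import lru_cache

# Modified grammar: N is split into PropN (proper names) and CommonN
# (singular count nouns). Both names are illustrative assumptions.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("PropN",), ("Det", "CommonN"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}

LEXICON = {
    "PropN": {"mary"},
    "CommonN": {"book", "store", "card"},
    "Det": {"the"},
    "V": {"bought", "is", "was"},
    "P": {"in", "with"},
}


def accepts(sentence, start="S"):
    """Return True if the modified grammar derives the sentence."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Preterminals cover exactly one word.
        if symbol in LEXICON:
            return j - i == 1 and words[i] in LEXICON[symbol]
        for rhs in RULES[symbol]:
            if len(rhs) == 1:
                if derives(rhs[0], i, j):
                    return True
            else:
                left, right = rhs
                if any(derives(left, i, k) and derives(right, k, j)
                       for k in range(i + 1, j)):
                    return True
        return False

    return len(words) > 0 and derives(start, 0, len(words))


if __name__ == "__main__":
    for s in ["Mary bought book in store",
              "The Mary bought book",
              "Mary bought the book in the store"]:
        print(s, "->", accepts(s))
```

With this split, a bare common noun can no longer form an NP and a proper name can no longer follow a determiner, so both example sentences are rejected while “Mary bought the book in the store” is still accepted. Semantically odd sentences such as “The book bought Mary” remain grammatical, as intended.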

                                 

       

  4. (30 pts) Given sentences (a)-(d):

(a)    The/Det   car/N    killed/V   the/Det   duck/N

(b)   The/Det   duck/N   died/V   under/P   her/Pron   car/N

(c)    We/Pron  duck/V   under/P   the/Det   car/N

(d)   We/Pron  retrieve/V  the/Det   poor/Adj  duck/N

 

Calculate the following probabilities:    %%% Each prob is worth 5 points.

1.      P (w2 = duck | w1 = the) = 2/5

2.      P (w1 = the | w2 = duck) = 2/4 = 1/2

3.      P (w2 = duck | t2 = N) = 3/6 = 1/2

4.      P (w2 = duck | t1 = Det) = 2/5

5.      P (t2 = V | t1 = N) = 2/6 = 1/3

6.      P (t2 = P | t1 = V, t0 = N) = 1/2
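
The counts behind these answers can be reproduced with a short script. This is a minimal sketch rather than a reference solution: the helper names windows and rel_freq are assumptions, and it adopts the convention implied by the denominators above, namely that every token matching the condition is counted even when it is sentence-final (for example, all six N tokens in item 5).

```python
from fractions import Fraction

# Tagged sentences (a)-(d) from the exercise.
CORPUS = [
    "The/Det car/N killed/V the/Det duck/N",
    "The/Det duck/N died/V under/P her/Pron car/N",
    "We/Pron duck/V under/P the/Det car/N",
    "We/Pron retrieve/V the/Det poor/Adj duck/N",
]

PAD = (None, None)  # stands in for a missing neighbor at a sentence edge


def windows():
    """Yield (previous, current, next) triples of (word, tag) pairs."""
    for line in CORPUS:
        tokens = []
        for item in line.split():
            word, tag = item.rsplit("/", 1)
            tokens.append((word.lower(), tag))
        padded = [PAD] + tokens + [PAD]
        for i in range(1, len(padded) - 1):
            yield padded[i - 1], padded[i], padded[i + 1]


def rel_freq(event, given):
    """Estimate P(event | given) as a ratio of counts over all positions."""
    matches = [w for w in windows() if given(w)]
    return Fraction(sum(event(w) for w in matches), len(matches))


# Each window is (prev, cur, nxt); index 0 is the word, index 1 is the tag.
answers = [
    # 1. P(w2 = duck | w1 = the)
    rel_freq(lambda w: w[2][0] == "duck", lambda w: w[1][0] == "the"),
    # 2. P(w1 = the | w2 = duck)
    rel_freq(lambda w: w[0][0] == "the", lambda w: w[1][0] == "duck"),
    # 3. P(w2 = duck | t2 = N)
    rel_freq(lambda w: w[1][0] == "duck", lambda w: w[1][1] == "N"),
    # 4. P(w2 = duck | t1 = Det)
    rel_freq(lambda w: w[2][0] == "duck", lambda w: w[1][1] == "Det"),
    # 5. P(t2 = V | t1 = N)
    rel_freq(lambda w: w[2][1] == "V", lambda w: w[1][1] == "N"),
    # 6. P(t2 = P | t1 = V, t0 = N)
    rel_freq(lambda w: w[2][1] == "P",
             lambda w: w[0][1] == "N" and w[1][1] == "V"),
]

if __name__ == "__main__":
    for number, value in enumerate(answers, start=1):
        print(f"{number}. {value}")
```

Fraction reduces each ratio automatically, so the script prints the simplified forms of the six values listed above. If sentence-final tokens were excluded from the denominator instead, item 5 would change, because four of the six N tokens end a sentence.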