
To find the 25 most common predicative adjectives (understood as adjectives following be):

tgrep -a 'VP < (/^VB/ < /^.re/|/^.m/|/^.s/|were|was|be|been|being)
  < (ADJP << `/^JJ/)' | sort | uniq -c | sort -nr | head -26

To find the 25 most common attributive adjectives:

tgrep -a '`/^JJ/ $ /^N/' | sort | uniq -c | sort -nr | head -26

Some explanation: I've used head -26 rather than head -25 because tgrep outputs a blank line between matches. sort groups identical lines together and uniq -c counts each group, so the blank line is always the most frequent item in the output and takes the first of the 26 slots, leaving 25 for the adjectives. (Incidentally, since the number of blank lines tracks the number of matches, comparing the blank-line counts from the two searches shows which use of adjectives is more frequent overall.)
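The counting pipeline can be tried out without a treebank. The following is only a sketch: sample_matches is a hypothetical stand-in for tgrep, and the bracketed "(JJ word)" lines with a blank line after each match are an assumed output format chosen for illustration. The second command shows one possible alternative, filtering out the blank lines with grep -v so that head can be given the exact number wanted.

```shell
# Hypothetical stand-in for tgrep output: one bracketed match per line,
# each followed by a blank line (assumed format, for illustration only).
sample_matches() {
  printf '(JJ new)\n\n(JJ other)\n\n(JJ new)\n\n(JJ last)\n\n(JJ new)\n\n(JJ other)\n\n'
}

# Top 2 adjectives: ask for 3 lines, since the blank line takes the first slot.
sample_matches | sort | uniq -c | sort -nr | head -3

# Alternative: drop the blank lines first, then ask for exactly 2 lines.
sample_matches | grep -v '^$' | sort | uniq -c | sort -nr | head -2
```

In the first command the count for the blank line comes out on top, followed by the two most frequent adjectives. In the second the blank lines never reach uniq, so the counts start directly with the adjectives.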

The -a flag causes tgrep to return every match, not just one per sentence. Since we're counting here, it's important to make sure we're catching everything.

The backquote (`) before the pattern that matches the adjective tag (/^JJ/) causes just that node and its descendants (i.e., the adjective) to be printed. The tag is printed along with the word, but since it will always be JJ, JJR, or JJS (and for any given adjective, only one of these), the counts are not split and this works fine. It is possible to ask tgrep to print just the adjective itself by using a wildcard, but that slows down the search:

tgrep -a '/^JJ/ $ /^N/ < `__' | more
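If the wildcard search is too slow, another option (not the method used above, just a sketch) is to keep the faster backquoted search and strip the tag afterwards with sed. This assumes each match is printed on a single line in the form "(JJ word)"; strip_tag is a hypothetical helper name, and the printf line merely imitates tgrep output under that assumption.

```shell
# Assumption: each match is printed on one line as "(TAG word)", e.g. "(JJR bigger)".
# strip_tag keeps only the word; lines that do not fit the pattern pass through unchanged.
strip_tag() {
  sed -E 's/^\(JJ[RS]? +([^() ]+)\)$/\1/'
}

# Imitated tgrep output, piped through the helper and then counted as before.
printf '(JJ big)\n\n(JJR bigger)\n\n(JJ big)\n\n' |
  strip_tag | grep -v '^$' | sort | uniq -c | sort -nr
```

With real data the printf line would be replaced by the backquoted tgrep command from above.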


-----

Emily M. Bender
Last modified: Fri Dec 8 12:00:20 2000