Distributed under the MIT license, see LICENSE.

This directory contains the grammars and treebanks described in:

Bender, Emily M. 2010. "Reweaving a Grammar for Wambaya: A Case Study
in Grammar Engineering for Linguistic Hypothesis Testing." Linguistic
Issues in Language Technology 3(3) pp.1-34.

Any work using these data should cite the original source:

Nordlinger, Rachel. 1998. A Grammar of Wambaya, Northern Australia.
Canberra: Pacific Linguistics.

Work using the grammars or treebanks should cite Bender 2010 in
addition to Nordlinger 1998.

These grammar versions are frozen, but the grammar is still under
development. A more up-to-date snapshot can be found here:

http://wiki.delph-in.net/moin/WambayaTop

Table of contents:

README              This file
LICENSE             license file
original-grammar/   grammar as it was at the start of the project
arg-comp-grammar/   final version of the argument composition branch
aux+vc-grammar/     aux+verb cluster grammar at the end of the project
dot.tsdbrc          configuration information for [incr tsdb()]
tsdb/               treebanks and skeletons

These grammars are compatible with DELPH-IN technology, specifically
the LKB (Copestake 2002), PET (Callmeier 2002), and [incr tsdb()]
(Oepen 2001). The software can be downloaded here:

http://wiki.delph-in.net/moin/LogonTop

The file dot.tsdbrc defines processing configurations for each of the
grammar versions using the fast PET parser, as well as parameters
pointing the software to the relevant directories to find treebanks
and skeletons. This file should be copied to ~/.tsdbrc. It assumes
that this directory resides in ~/wmb, and should be edited if you are
using a different location.

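The copy-and-edit step for dot.tsdbrc can also be scripted. The sketch
below is illustrative only and is not part of this distribution: the
function name install_tsdbrc is hypothetical, and it assumes that the
install location appears in dot.tsdbrc as the literal string ~/wmb. If
the file spells the path differently, edit it by hand instead.

```python
from pathlib import Path


def install_tsdbrc(source, target, new_root=None, old_root="~/wmb"):
    """Copy dot.tsdbrc to target, optionally rewriting the install path.

    ASSUMPTION: the install location is written in the file as the
    literal string given by old_root (default "~/wmb").

    Returns the number of occurrences of old_root that were rewritten.
    """
    text = Path(source).expanduser().read_text()
    rewritten = 0
    if new_root is not None and new_root != old_root:
        # Count first so the caller can see whether anything changed.
        rewritten = text.count(old_root)
        text = text.replace(old_root, new_root)
    Path(target).expanduser().write_text(text)
    return rewritten
```

With the default layout, install_tsdbrc("~/wmb/dot.tsdbrc", "~/.tsdbrc")
performs a plain copy. If a new_root is passed and the function returns
0, nothing was rewritten and the installed file should be checked by
hand.
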
There are six testsuite profiles in tsdb/home:

original/        original analyses, treebanked but not thinned
original-gold/   `thinned' version, with only the trees annotated
                 as correct
arg-comp/        analyses from the final arg-comp grammar, not thinned
arg-comp-gold/   `thinned' version, with only the trees annotated
                 as correct
aux+vc/          analyses from the final aux+vc grammar, not thinned
aux+vc-gold/     `thinned' version, with only the trees annotated
                 as correct

The aux+vc profile can be recreated with the following steps (assuming
that the LOGON software is installed and dot.tsdbrc is copied to
~/.tsdbrc and modified appropriately):

1. Run emacs

2. In emacs, type M-x logon

3. At the lisp prompt, evaluate:

   (lkb::read-script-file-aux "~/wmb/aux+vc-grammar/lkb/script")

4. At the lisp prompt, evaluate:

   (tsdb:tsdb :cpu :wmb-aux+vc :task :parse :file t :count 4)

5. In the [incr tsdb()] podium, select:

   File | Create | Examples from Nordlinger 1998, Ch 3-8

6. In the [incr tsdb()] podium, select:

   Process | All items

7. When that has finished, in the [incr tsdb()] podium, select:

   Compare | Source database | aux+vc
   Trees | Update

8. When that has finished, in the [incr tsdb()] podium, select:

   Trees | Switches | Automatic Update (to turn this option off)
   Options | TSQL Condition | Unannotated
   Trees | Annotate

9. There should be 58 trees to reject. When you have finished that
   process, select:

   Options | TSQL Condition | No condition
   Trees | Normalize

References

Bender, Emily M. 2010. "Reweaving a Grammar for Wambaya: A Case Study
in Grammar Engineering for Linguistic Hypothesis Testing." Linguistic
Issues in Language Technology 3(3) pp.1-34.

Callmeier, Ulrich. 2002. "Preprocessing and encoding techniques in
PET." In S. Oepen, D. Flickinger, J. Tsujii, and H. Uszkoreit,
eds. Collaborative Language Engineering: A Case Study in Efficient
Grammar-based Processing. Stanford: CSLI.

Copestake, Ann. 2002. Implementing Typed Feature Structure
Grammars. Stanford: CSLI.

Nordlinger, Rachel. 1998. A Grammar of Wambaya, Northern Australia.
Canberra: Pacific Linguistics.

Oepen, Stephan. 2001. [incr tsdb()] --- Competence and Performance
Laboratory. User manual. Technical report, Saarbruecken, Germany.