Computational Linguistics in Support of Linguistic Theory
This page provides links to the slides from our presentation
at the 83rd Annual Meeting of the Linguistic Society of America
(San Francisco, CA, January 9, 2009) as well as to the various
resources we mentioned in the slides.
Slides
Slides can be found here.
Projects/Resources
Note: This list is not meant to be exhaustive; instead, it focuses
on the projects and resources that were mentioned in our talk. We hope
that this will provide an interesting starting point for exploration.
Items within each category are listed in alphabetical order.
Corpora
- CHILDES: Child Language Data Exchange System
- ELRA: European Language Resources Association
- LDC: Linguistic Data Consortium
- Penn Treebank
Lexical
- FrameNet: Text annotated with semantic frames and roles
- LIFT (Lexicon Interchange FormaT): XML format for lexical information in dictionaries
- WordNet: Lexical resource structured around synonymy
Phonetics/Phonology
OT
- Erculator: Web-based software for creating, manipulating, and formatting OT tableaux.
- HaLP: Harmonic Grammar with Linear Programming
- OTSoft: Optimality Theory Software
Interlinear Glossed Text
Grammar engineering/parsing
- DELPH-IN: Deep Linguistic Processing with HPSG
- The Grammar Matrix: Multilingual grammar configuration system (HPSG)
- Grammix: Bootable CD-ROM with the TRALE grammar development environment and sample HPSG grammars.
- KPML: Multilingual text generation (SFG)
- ParGram: Parallel Grammar and Parallel Semantics Projects (LFG)
- TRALE: HPSG parser
- XMG: eXtensible MetaGrammar
Typology
- Autotyp: Precision typological databases
- WALS Online: The World Atlas of Language Structures Online
General/Misc
Journals
Examples from other fields
Bibliography
- Arnon, Inbal and Neal Snider. 2009. More than words: Speakers are sensitive to the frequency of multi-word sequences. Paper presented at the 83rd Annual Meeting of the LSA.
- Atkins, Sue, Charles J. Fillmore and Christopher R. Johnson.
2003. Lexicographic Relevance: Selecting Information From Corpus
Evidence. International Journal of Lexicography 16(3):251-280.
- Baldridge, Jason, Sudipta Chatterjee, Alexis Palmer and Ben
Wing. 2007. DotCCG and VisCCG: Wiki and Programming
Paradigms for Improved Grammar Engineering with OpenCCG. In King
and Bender (eds), Proceedings of the GEAF Workshop.
Stanford: CSLI.
- Bateman, John A. 1997. Enabling Technology for Multilingual
Natural Language Generation: The KPML Development Environment.
Journal of Natural Language Engineering 3(1):15-55.
- Bickel, Balthasar and Johanna Nichols. 2002. Autotypologizing
databases and their use in fieldwork. In Proc. Int. LREC
Workshop on Resources and Tools in Field Linguistics.
- Bickel, Balthasar. 2007. Typology in the 21st century: Major current developments. Linguistic Typology 11:239-251.
- Bird, S. & G. Simons. 2001. The OLAC metadata set and controlled
vocabularies. Proceedings of the ACL 2001 Workshop on Sharing Tools and
Resources 15:7-18.
- Boersma, Paul and Bruce Hayes. 2001. Empirical Tests of the Gradual Learning Algorithm. Linguistic Inquiry 32:45-86.
- Bow, C., B. Hughes & S. Bird. 2003. Towards a General Model for
Interlinear Text. Paper presented at the 2003 E-MELD workshop, Detroit
MI.
- Butt, Miriam, Helge Dyvik, Tracy Holloway King, Hiroshi Masuichi
and Christian Rohrer. 2002. The
Parallel Grammar Project. In Carroll, Oostdijk and Sutcliffe
(eds), Proceedings of the Workshop on Grammar Engineering and
Evaluation at the 19th International Conference on Computational
Linguistics.
- Butt, Miriam, Tracy Holloway King, María-Eugenia
Niño, and Frédérique Segond. 1999. A
Grammar Writer's Cookbook. Stanford: CSLI.
- Copestake, Ann. 2002. Implementing
Typed Feature Structure Grammars. Stanford: CSLI.
- Crabbé, Benoit and Denys Duchier. 2005. Metagrammar
Redux. In Constraint Solving and Language
Processing. Berlin: Springer.
- Farrar, Scott and Steve Moran. 2008. The
e-Linguistics Toolkit. In Proceedings of e-Humanities--an
emerging discipline: Workshop in the 4th IEEE International
Conference on e-Science. IEEE Press.
- Han, Chung-hye and Anthony Kroch. 2000. The rise of do-support in English: implications for clause structure. Proceedings of NELS 30.
- Haspelmath, Martin. 2003. The Leipzig Glossing Rules for interlinear morpheme-by-morpheme translations. Linguist List 14.1805(1).
- Haspelmath, Martin, Matthew S. Dryer, David Gil and Bernard
Comrie (eds). 2008. The World Atlas of Language Structures
Online. Munich: Max Planck Digital Library. http://wals.info.
- Hayes, Bruce, Bruce Tesar, and Kie Zuraw. 2003. "OTSoft 2.1,"
software package, http://www.linguistics.ucla.edu/people/hayes/otsoft/
- The International Phonetic Association. 1999. Handbook of
the International Phonetic Association: A Guide to the Use of the
International Phonetic Alphabet. Cambridge: Cambridge University Press.
- Jaeger, T. Florian, Austin Frank, Carlos Gomez Gallo, and Susan
Wagner Cook. 2009. Rational language production: Evidence for uniform
information density. Paper presented at the 83rd meeting of the LSA,
San Francisco, CA.
- Lewis, William D. 2006. ODIN:
A Model for Adapting and Enriching Legacy Infrastructure. In
Proceedings of the e-Humanities Workshop, held in cooperation with
e-Science 2006: 2nd IEEE International Conference on e-Science and
Grid Computing, Amsterdam.
- MacWhinney, Brian. 2000. The CHILDES Project: Tools for
Analyzing Talk. Lawrence Erlbaum Associates.
- Marcus, M., B. Santorini & M. Marcinkiewicz. 1993. Building a
large annotated corpus of English: The Penn Treebank. Computational
Linguistics 19:313-330.
- Moran, Steven and Richard A. Wright. 2009. Phonetics Information Base and Lexicon (PHOIBLE). Online: phoible.org.
- Nakhleh, L., D. Ringe, and T. Warnow. 2005. Perfect Phylogenetic
Networks: A New Methodology for Reconstructing the Evolutionary
History of Natural Languages. Language 81(2):382-420.
- Potts, Christopher and Florian Schwarz. 2008. Exclamatives
and Heightened Emotion: Extracting Pragmatic Generalizations from
Large Corpora. Ms., UMass Amherst.
- Riggle, Jason, Maximillian Bane, James Kirby, and Jeremy O'Brien. 2007. Efficiently Computing OT Typologies. Paper presented at the 81st Annual Meeting of the LSA.
- Silverman, Kim, Mary Beckman, John Pitrelli, Mari Ostendorf, Colin Wightman, Patti Price, Janet Pierrehumbert, and Julia Hirschberg. 1992. ToBI: A Standard for Labeling English Prosody. Proceedings of the Second International Conference on Spoken Language Processing (ICSLP'92).
- Simons, Gary F. 2008. The Rise of Documentary Linguistics and a New Kind of Corpus. Paper presented at the 5th National Natural Language Research Symposium, De La Salle University, Manila, 25 Nov 2008.
- The Unicode Consortium. 2007. The Unicode Standard, Version 5.0. Boston, MA: Addison-Wesley.
- Verhagen, M., A. Stubbs & J. Pustejovsky. 2007. Combining independent
syntactic and semantic annotation schemes. Proceedings of the Linguistic
Annotation Workshop, 109-112, Prague, June 2007.
Last modified: Wed Jan 14 22:30:56 PST 2009