Linguistics 575: Lexical Acquisition
Course Info
Instructor Info
Syllabus
Description
Hand-built precision grammars can produce high-quality semantic
representations from input strings and generate well-formed strings
from input semantic representations. Resources such as the Grammar
Matrix can greatly speed up the creation of such precision grammars.
However, the expansion of the lexicon remains an important hurdle in
scaling up such grammars to practical coverage.
Recent work by members of the DELPH-IN consortium and other research
groups has demonstrated the possibility of automatic lexical
acquisition from corpora for broad-coverage precision grammars. This
seminar will review this work with an eye to how it could be applied
in the case of relatively small precision grammars of low-density
languages.
Prereqs: Ling 566, 570, and 571 or consent of instructor
Note: To request academic accommodations due to a
disability, please contact Disabled Student Services, 448 Schmitz,
206-543-8924 (V/TTY). If you have a letter from Disabled Student
Services indicating that you have a disability which requires academic
accommodations, please present the letter to the instructor so we can
discuss the accommodations you might need in this class.
Requirements
Schedule of Topics and Assignments (may be updated)
Bibliography
- Alishahi, A. and S. Stevenson (2007). "A
cognitive model for the representation and acquisition of verb
selectional preferences." Proceedings of the ACL-2007 Workshop on
Cognitive Aspects of Computational Language Acquisition,
41-48. Prague, Czech Republic.
- Baldwin, Timothy (2005a) General-Purpose Lexical Acquisition:
Procedures, Questions and Results, In Proceedings of the Pacific
Association for Computational Linguistics 2005, Tokyo, Japan, pp. 23
- Baldwin, Timothy (2005b) The Deep Lexical
Acquisition of English Verb-particle Constructions, Computer
Speech and Language, Special Issue on Multiword Expressions, Volume
19, Issue 4, pp. 398
- Baldwin, Timothy (2007) Scalable Deep Linguistic Processing: Mind
the Lexical Gap, In Proceedings of the 21st Pacific Asia Conference on
Language, Information and Computation (PACLIC21), Seoul, Korea,
pp. 3-12.
- Bangalore, Srinivas and Aravind K. Joshi. 1999. Supertagging: An
approach to almost parsing. Computational Linguistics,
25(2):237--65.
- Basili, R., M.T. Pazienza, P. Velardi (1996) Integrating General Purpose
and Corpus-Based Verb Classifications, Computational Linguistics, Volume 22,
Number 4, December 1996.
- Basili, R., M.T. Pazienza, M. Vindigni (1997) Corpus-driven
Unsupervised Learning of Verb Subcategorization Frames, in AI*IA 97:
Advances in Artificial Intelligence, M. Lenzerini Ed., Lecture Notes
in Artificial Intelligence n. 1321, Springer Verlag, Berlin,
Heidelberg.
- Boleda, Gemma, Toni Badia and Eloi Batlle. 2004. Acquisition of
Semantic Classes for Adjectives from Distributional Evidence. In
Proceedings of the 20th International Conference on Computational
Linguistics (COLING 2004), pp. 1119-1125, Geneva, Switzerland
- Boleda, G., T. Badia, S. Schulte im Walde. 2005. Morphology vs. Syntax in Adjective Class Acquisition. In
Proceedings of the ACL-SIGLEX 2005 Workshop on Deep Lexical
Acquisition, June 30, Ann Arbor, USA
- Blunsom, Phil and Timothy Baldwin (2006) Multilingual Deep Lexical
Acquisition for HPSGs via Supertagging, In Proceedings of the 2006
Conference on Empirical Methods in Natural Language Processing (EMNLP
2006), Sydney, Australia, pp. 164
- Bouillon, Pierrette, Vincent Claveau, Cécile Fabre, and Pascale
Sébillot. (2002) Acquisition of qualia elements from corpora--evaluation
of a symbolic learning method. In LREC 2002.
- Brent, Michael R. 1993. From grammar to
lexicon: Unsupervised learning of lexical syntax. Computational
Linguistics, 19(2):243--62.
- Briscoe, Ted and John Carroll. 1997. Automatic extraction of subcategorization from corpora. In Proc. of the 5th
Conference on Applied Natural Language Processing (ANLP), pages 356--63, Washington DC, USA.
- Cimiano, Philipp and Johanna Wenderoth (2005) Automatically Learning Qualia
Structures from the Web. In Timothy Baldwin, Anna Korhonen, Aline
Villavicencio (eds), Proceedings of the ACL Workshop on Deep Lexical
Acquisition, pp. 28-37. Association for Computational Linguistics, Ann
Arbor, Michigan
- Cimiano, Philipp and Johanna Wenderoth (2007) Automatic Acquisition of
Ranked Qualia Structures from the Web. In Proceedings of the Annual
Meeting of the Association for Computational Linguistics (ACL),
pp. 888--895.
- Clark, Stephen and James R. Curran. 2004. The importance of
supertagging for wide-coverage CCG parsing. In Proc. of the 20th
International Conference on Computational Linguistics (COLING 2004),
pages 282--8, Geneva, Switzerland.
- Cucerzan, Silviu and David Yarowsky. 2003. Minimally
supervised induction of grammatical gender. In Proc. of the 3rd
International Conference on Human Language Technology Research and 4th
Annual Meeting of the NAACL (HLT-NAACL 2003), pages 40--7, Edmonton,
Canada.
- Fazly, Afsaneh; North, Ryan, and Stevenson, Suzanne (2005) Automatically
distinguishing literal and figurative usages of highly polysemous
verbs, in Proceedings of the ACL 2005 Workshop on Deep Lexical
Acquisition, Ann Arbor, USA, June 2005
- Fazly, Afsaneh; Stevenson, Suzanne, and North, Ryan (2007) Automatically
learning semantic knowledge about multiword predicates, Language
Resources and Evaluation, 41(1), 29 pages. The original publication is
available from Springer.
- Joanis, E., S. Stevenson, and D. James (2007). "A
General Feature Space for Automatic Verb Classification." Natural
Language Engineering. Forthcoming.
- Kim, Su Nam and Timothy Baldwin (2006) Automatic
Identification of English Verb Particle Constructions using Linguistic
Features, In Proceedings of the Third ACL-SIGSEM Workshop on
Prepositions, Trento, Italy, pp. 65
- Korhonen, Anna. 2008. Tools and Procedures for the Acquisition of
Morphological and Syntactical Information from Corpora. To Appear in
the International Handbook of Dictionaries. Mouton de Gruyter, Berlin.
- Lewis, W. D. & Xia, F. 2008. Automatically Identifying Computationally
Relevant Typological Features, in Proceedings of The Third
International Joint Conference on Natural Language Processing
(IJCNLP). Hyderabad, January 2008.
- Light, Marc. 1996. Morphological cues
for lexical semantics. In Proc. of the 34th Annual Meeting of the
ACL, pages 25--31, Santa Cruz, USA
- Manning, Christopher D. 1993. Automatic acquisition of a large
subcategorization dictionary from corpora. In Proc. of the 31st Annual
Meeting of the ACL, pages 235--42.
- Mayol, L., G. Boleda, T. Badia. 2005. Automatic acquisition of
syntactic verb classes with basic resources. Language Resources and
Evaluation, 39(4):295-312
- Merlo, P. and S. Stevenson (2005). "Structure and Frequency in
Verb Classification." In L. Bruge, G. Giusti, N. Munaro,
W. Schweikert, and G. Turano, Eds., Contributions to the Thirtieth
"Incontro di Grammatica Generativa", 43-61. Venice, Italy, February
2004.
- Merlo, P., S. Stevenson, V. Tsang and G. Allaria (2002). "A
Multilingual Paradigm for Automatic Verb Classification." Proceedings
of the 40th Annual Meeting of the Association for Computational
Linguistics (ACL-2002), 207-214. Philadelphia, Pennsylvania, July,
2002.
- Mihalcea, Rada and Timothy Chklovski. (2003) The Web as Collective Mind: Building Large Annotated Corpora with User's Help. 1st Meaning Workshop, San Sebastian, Basque Country.
- Moschitti, Alessandro and Roberto Basili. (2005) Verb
subcategorization kernels for automatic semantic labeling. In
Proceedings of the ACL05 Workshop on Deep Lexical Acquisition, Ann
Arbor (MI), USA, 2005.
- Nicholson, Jeremy, Timothy Baldwin, and Phil Blunsom. 2006. Die
morphologie (f): Targeted lexical acquisition for languages other than
English. In Proc. of the Australasian Language Technology Workshop
2006, pages 67--74, Sydney, Australia.
- Oepen, Stephan, Kristina Toutanova, Stuart Shieber, Christopher
Manning, Dan Flickinger, and Thorsten Brants. 2002. The LinGO Redwoods
Treebank: Motivation and preliminary applications. In Proc. of the
19th International Conference on Computational Linguistics (COLING
2002), pages 1253--7, Taipei, Taiwan.
- Preiss, Judita, Ted Briscoe and Anna Korhonen. 2007. A System for
Large-scale Acquisition of Verbal, Nominal and Adjectival
Subcategorization Frames from Corpora. In Proceedings of the 45th
Annual Meeting of the Association for Computational
Linguistics. Prague, Czech Republic.
- Schulte im Walde, Sabine (forthcoming)
The Induction of Verb Frames and Verb Classes from Corpora.
In: Anke Lüdeling and Merja Kytö (eds)
Corpus Linguistics. An International Handbook. Mouton de Gruyter, Berlin.
- Sun, Lin, Anna Korhonen, and Yuval Krymolowski. 2008. Automatic
Classification of English Verbs Using Rich Syntactic Features. In
Proceedings of the 3rd International Joint Conference on Natural
Language Processing. Hyderabad, India.
- Tsang, V., S. Stevenson and P. Merlo (2002). "Crosslinguistic
Transfer in Automatic Verb Classification." Proceedings of the
19th International Conference on Computational Linguistics
(COLING-2002), 1023-1029. Taipei, Taiwan, August, 2002
- Villavicencio, Aline, Valia Kordoni, Yi Zhang, Marco Idiart, and
Carlos Ramisch (2007) Validation and
evaluation of automatically acquired multiword expressions for grammar
engineering. In Proceedings of the 2007 Joint Conference on
Empirical Methods in Natural Language Processing and Computational
Natural Language Learning (EMNLP-CoNLL 2007), pages 1034
- Xia, F. & Lewis, W. D. (2007) Multilingual Structural Projection across
Interlinearized Text, in The Annual Conference of the North American
Chapter of the Association for Computational Linguistics (NAACL-HLT
2007), Rochester, NY.
- Xia, F. & Lewis, W. D. (2008) Repurposing
Theoretical Linguistic Data for Tool Development and Search, in
Proceedings of The Third International Joint Conference on Natural
Language Processing (IJCNLP). Hyderabad, India.
- Zhang, Yi, Timothy Baldwin and Valia Kordoni (2007) The Corpus and the
Lexicon: Standardising Deep Lexical Acquisition Evaluation, In
Proceedings of the ACL 2007 Workshop on Deep Linguistic Processing,
Prague, Czech Republic, pp. 152
- Zhang, Yi and Valia Kordoni (2006) Automated
deep lexical acquisition for robust open texts processing. In
Proceedings of the 5th International Conference on Language Resources
and Evaluation (LREC 2006), pages 275
- Zhang, Yi and Valia Kordoni (2005) A
statistical approach towards unknown word type prediction for deep
grammars. In Proceedings of the Australasian Language Technology
Workshop 2005, pages 24