Linguistics 575: MRS in Applications

Spring quarter, 2016

Course Info

Lecture: Wednesdays, 3:30-5:50 in SAV 130 and online
Our GoPost bulletin board
Our CollectIt dropbox
Our CommonView page for course recordings and files not otherwise accessible on the web.

Instructor Info

Emily M. Bender
Office Hours: (most) Fridays 1:30-3 & by appointment
Office: GUG 418-B (If I'm not in my office, check the Treehouse.)
Phone: 543-6914 (nb: I pick up email before I pick up voice mail)
Email: ebender at u

Syllabus

Description

The English Resource Grammar (Flickinger 2000, 2011) is a broad-coverage precision grammar for English, written in HPSG (Pollard and Sag 1994) and producing semantic representations in the format of Minimal Recursion Semantics (Copestake et al 2005). It encompasses analyses of a wide range of phenomena in English, and a key piece of each analysis is the design of the resulting semantic representation. The MRS representations are built compositionally by the grammar and represent a significant abstraction away from the surface string.

The goal of this seminar is to explore how the MRS representations can be used to inform semantically-sensitive NLP tasks, such as anaphora resolution, event detection, or relation extraction. We will begin with an overview of MRS, and then move on to an exploration of candidate tasks and how to create machine learning features from MRSs to augment existing solutions to those tasks. Term projects (which may be done in pairs) will involve selecting an existing annotated data set for a semantically-sensitive task as well as an existing baseline solution and then attempting to improve on the baseline by adding MRS-based features.

Prereqs: This is a hands-on course that presupposes sufficient knowledge of NLP systems to work with and augment existing solutions. Students should have taken Ling 570 (or equivalent) and ideally also Ling 571/572 or be concurrently enrolled in those courses. Ling 566 may be beneficial, but is not required.

Note: To request academic accommodations due to a disability, please contact Disabled Student Services, 448 Schmitz, 206-543-8924 (V/TTY). If you have a letter from Disabled Student Services indicating that you have a disability which requires academic accommodations, please present the letter to the instructor so we can discuss the accommodations you might need in this class.

Requirements

KWLA paper (approx. 7 pages) (20%)
Smaller assignments (10%)
In-class presentations (5%)
Participation in discussions (incl. GoPost) (15%)
Term project (50%)

Schedule of Topics and Assignments (still subject to change)

Date	Topic	Reading	Due
3/30	Introduction, organization; Why use semantics?; The DELPH-IN ecology	Bender 2013: Ch 9; Bender et al 2015
4/4			Sample MRS output from the ERG
4/6	Minimal Recursion Semantics	Copestake et al 2005; DELPH-IN wiki page on EDS	KWLA: K and W due
4/13	Syntactic features in shared tasks	2-3 papers from the list below under "Papers about Tasks", or others that you propose
4/20	Target task/baseline system presentations	Wicke Monteverde; Garnick; Preddy
4/27	Target task/baseline system presentations	Team BioNLP; LaTerza; Shintani
5/2			Target task/baseline system descriptions
5/4	Evaluation and Error Analysis	Discussion prep
5/9			Evaluation plans
5/11	MRS Feature Design	Zhang et al (ms) (Sec 5 & 6); Flickinger et al 2013 (main page, Basics, then choose 2-3 phenomena pages to read); Kramer and Gordon 2014
5/18	MRS Feature Design	Lien and Kouylekov 2015; Tanaka et al 2007; Packard et al 2014
5/23			Feature design
5/25	Term project presentations	LaTerza; Preddy; Lane+Horn
6/1	Term project presentations	Garnick; Shintani; Wicke-Monteverde
6/3			KWLA papers due
6/9			Final projects due 11pm

Bibliography

General background

Bender, Emily M., Dan Flickinger, Stephan Oepen, Woodley Packard and Ann Copestake. 2015. Layers of Interpretation: On Grammar and Compositionality. In Proceedings of the 11th International Conference on Computational Semantics (IWCS 2015), London. pp. 239-249.
Bender, Emily. 2013. Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax. Synthesis Lectures on Human Language Technologies #20. Morgan & Claypool Publishers.
Copestake, A., Flickinger, D., Pollard, C., & Sag, I. A. 2005. Minimal recursion semantics: An introduction. Research on Language & Computation, 3 (4), 281-332.
Dridan, Rebecca and Stephan Oepen. 2011. Parser evaluation using elementary dependency matching. Proceedings of the 12th International Conference on Parsing Technologies. Association for Computational Linguistics.
Flickinger, Dan, Bender, Emily M. and Oepen, Stephan. 2013. ERG Semantic Documentation. Available online at http://www.delph-in.net/esd. Accessed on 2014-02-11.

Papers about tasks

Clinical Temp Eval

Bethard, Steven, et al. Semeval-2015 task 6: Clinical tempeval. Proc. SemEval (2015).
Xu Y, Wang Y, Liu T, et al. An end-to-end system to identify temporal relation in discharge summaries: 2012 i2b2 challenge. J Am Med Inform Assoc 2013;20:849–58.
Cherry C, Zhu X, Martin J, et al. A la Recherche du Temps Perdu—extracting temporal relations from medical text in the 2012 i2b2 NLP challenge. J Am Med Inform Assoc 2013;20:843–8.
Miller, T. A., Bethard, S., Dligach, D., Lin, C., & Savova, G. K. (2010). Extracting Time Expressions from Clinical Text.
Ling, X., & Weld, D. S. (2010, March). Temporal Information Extraction. In AAAI (Vol. 10, pp. 1385-1390).

Risk Factor Identification (i2b2 2014)

Stubbs, Amber, et al. Identifying risk factors for heart disease over time: overview of 2014 i2b2/UTHealth shared task Track 2. Journal of biomedical informatics 58 (2015): S67-S77.

De-identification

Sibanda, T., He, T., Szolovits, P., & Uzuner, O. (2006). Syntactically-informed semantic category recognizer for discharge summaries. In AMIA annual symposium proceedings (Vol. 2006, p. 714). American Medical Informatics Association.
Uzuner, Ö., Sibanda, T. C., Luo, Y., & Szolovits, P. (2008). A de-identifier for medical discharge summaries. Artificial intelligence in medicine, 42(1), 13-35.

Ontology Extraction

Park, Jinsoo, Wonchin Cho, and Sangkyu Rho. Evaluating ontology extraction tools using a comprehensive evaluation framework. Data & Knowledge Engineering 69.10 (2010): 1043-1061.
Poon, Hoifung, and Pedro Domingos. Unsupervised ontology induction from text. Proceedings of the 48th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2010.

Robust Textual Entailment/Semantic Textual Similarity

Agichtein, Eugene, Walt Askew, and Yandong Liu. 2008. Combining lexical, syntactic, and semantic evidence for textual entailment classification. Proceedings of TAC 31.
Agirre, Eneko, et al. Semeval-2015 task 2: Semantic textual similarity, English, Spanish and pilot on interpretability. Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). 2015.
Mehdad, Yashar, Alessandro Moschitti, and Fabio Massimo Zanzotto. 2010. Syntactic/semantic structures for textual entailment recognition. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 1020-1028. Association for Computational Linguistics.
Šarić, Frane, Goran Glavaš, Mladen Karan, Jan Šnajder, and Bojana Dalbelo Bašić. 2012. Takelab: Systems for measuring semantic text similarity. In Proceedings of the First Joint Conference on Lexical and Computational Semantics-Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation, pp. 441-448. Association for Computational Linguistics.
Wang, Mengqiu, and Christopher D. Manning. 2010. Probabilistic tree-edit models with structured latent variables for textual entailment and question answering. In Proceedings of the 23rd International Conference on Computational Linguistics, pp. 1164-1172. Association for Computational Linguistics.
Vo, N. P. A., & Popescu, O. Learning the Impact and Behavior of Syntactic Structure: A Case Study in Semantic Textual Similarity.
Aliaksei Severyn, Massimo Nicosia, and Alessandro Moschitti. 2013. ikernels-core: Tree kernel learning for textual similarity. In Proceedings of the Second Joint Conference on Lexical and Computational Semantics, volume 1, pages 53–58. Citeseer.
Marsi, E., & Krahmer, E. (2010, August). Automatic analysis of semantic similarity in comparable text through syntactic tree matching. In Proceedings of the 23rd International Conference on Computational Linguistics (pp. 752-760). Association for Computational Linguistics.

Coreference resolution

Bergsma, Shane, and Dekang Lin. 2006. Bootstrapping path-based pronoun resolution. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, pp. 33-40. Association for Computational Linguistics.
Haghighi, Aria, and Dan Klein. 2009. Simple coreference resolution with rich syntactic and semantic features. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3, pp. 1152-1161. Association for Computational Linguistics.
Ng, Vincent. 2010. Supervised noun phrase coreference research: The first fifteen years. In Proceedings of the 48th annual meeting of the association for computational linguistics, pp. 1396-1411. Association for Computational Linguistics.

Question Answering

Ravichandran, Deepak, and Eduard Hovy. Learning surface text patterns for a question answering system. Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 2002.

Discourse Processing

Xue, Nianwen, et al. The conll-2015 shared task on shallow discourse parsing. Proceedings of CoNLL. 2015.

Sentiment Analysis

Jia, Lifeng, Clement Yu, and Weiyi Meng. 2009. The effect of negation on sentiment analysis and retrieval effectiveness. In Proceedings of the 18th ACM conference on Information and knowledge management, pp. 1827-1830. ACM.
Nakagawa, Tetsuji, Kentaro Inui, and Sadao Kurohashi. 2010. Dependency tree-based sentiment classification using CRFs with hidden variables. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 786-794. Association for Computational Linguistics.
Socher, Richard, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

Summarization

Louis, Annie, Aravind Joshi, and Ani Nenkova. 2010. Discourse indicators for content selection in summarization. In Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pp. 147-156. Association for Computational Linguistics.
Wang, Sicui, Weijiang Li, Feng Wang, and Hui Deng. 2010. A Survey on Automatic Summarization. In Information Technology and Applications (IFITA), 2010 International Forum on, vol. 1, pp. 193-196. IEEE.
Christensen, Janara, Mausam, Stephen Soderland, and Oren Etzioni. 2013. Towards Coherent Multi-Document Summarization. In Proceedings of NAACL-HLT, pp. 1163-1173.

Word Sense Disambiguation

Lu, Wenpeng, Heyan Huang, and Chaoyong Zhu. 2012. Feature Words Selection for Knowledge-based Word Sense Disambiguation with Syntactic Parsing. Przegląd Elektrotechniczny 88, no. 1b: 82-87.
Navigli, Roberto. 2009. Word sense disambiguation: A survey. ACM Computing Surveys (CSUR) 41, no. 2: 10.
Pitler, Emily, and Ani Nenkova. 2009. Using syntax to disambiguate explicit discourse connectives in text. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pp. 13-16. Association for Computational Linguistics.
Tratz, Stephen, Antonio Sanfilippo, Michelle Gregory, Alan Chappell, Christian Posse, and Paul Whitney. 2007. PNNL: a supervised maximum entropy approach to word sense disambiguation. In Proceedings of the 4th International Workshop on Semantic Evaluations, pp. 264-267. Association for Computational Linguistics.

MRS feature design

Fujita, Sanae, Francis Bond, Stephan Oepen, and Takaaki Tanaka. 2010. Exploiting semantic information for HPSG parse selection. Research on Language and Computation 8(1):1-22.
Kramer, Jared and Clara Gordon. 2014. Improvement of a Naive Bayes Sentiment Classifier Using MRS-Based Features. Proceedings of Starsem 2014.
Lien, Elisabeth and Kouylekov, Milen. 2015. Semantic parsing for textual entailment. In Proceedings of the 14th International Conference on Parsing Technologies (pp. 40–49). Bilbao, Spain.
Oepen, Stephan, Erik Velldal, Jan Tore Lønning, Paul Meurer, Victoria Rosén, and Dan Flickinger. 2007. Towards hybrid quality-oriented machine translation. On linguistics and probabilities in MT. In: 11th International Conference on Theoretical and Methodological Issues in Machine Translation: TMI2007. [pdf available from course CommonView]
Packard, Woodley, Emily M. Bender, Jonathon Read, Stephan Oepen and Rebecca Dridan. 2014. Simple Negation Scope Resolution through Deep Parsing: A Semantic Solution to a Semantic Problem. Proceedings of ACL 2014, Baltimore, MD. [data/software]
Pozen, Zinaida. 2013. Using Lexical and Compositional Semantics to Improve HPSG Parse Selection. MS thesis, University of Washington, 2013.
Tanaka, Takaaki, Francis Bond, Timothy Baldwin, Sanae Fujita, and Chikara Hashimoto. 2007. Word Sense Disambiguation Incorporating Lexical and Structural Semantic Information. In EMNLP-CoNLL, pp. 477-485.
Zhang, Y., Oepen, S., Dridan, R., Flickinger, D., and Krieger, H.U. Robust Parsing, Meaning Composition, and Evaluation. Integrating Grammar Approximation, Default Unification, and Elementary Semantic Dependencies (unpublished manuscript).

ebender at u dot washington dot edu

Last modified: 4/7/16