Office hours (Autumn 2022)
(Most) Mondays 4-5pm, (most) Thursdays 11am-noon & by appointment, online only, email for Zoom link
Guggenheim 414C
I have been a member of the faculty at the University of Washington since 2003. I am currently a Professor in the Department of Linguistics, the faculty director of the CLMS program, and the director of the Computational Linguistics Laboratory. For 2019-2022, I was honored to be the Howard and Frances Nostrand Endowed Professor. I am an Adjunct Professor in both the School of Computer Science and Engineering and the Information School at UW, and a member of the Tech Policy Lab, Value Sensitive Design Lab, and RAISE.
I am the past Chair (2016-2017) of the Executive Board of NAACL and have previously served as a member of the ICCL (2014-2018; the committee responsible for Coling). I will be serving on the Executive Board of the Association for Computational Linguistics from 2022 to 2025 as VP Elect, VP, President, and then Past President.
Prior to coming to UW, I held temporary positions at Stanford University and UC Berkeley, and worked in industry at YY Technologies. I received my PhD from the Linguistics Department at Stanford University, where I joined the HPSG and LinGO projects at CSLI. My AB (also in Linguistics) is from UC Berkeley, and I've also studied at Tohoku University in Sendai, Japan.
In 2012, LINGUIST List asked me to write an essay about how I came to be a linguist and in 2018, the LSA interviewed me for their member spotlight feature. My pronouns are she/her and my Erdős number is 4.
Multilingual Grammar Engineering
My grammar engineering work centers on the LinGO Grammar Matrix, an open-source starter kit for the development of broad-coverage precision HPSG grammars. These grammars map strings to detailed linguistic representations in the framework of Minimal Recursion Semantics. The AGGREGATION project is investigating the automatic creation of grammars from IGT with the Grammar Matrix for the benefit of language documentation.
The Grammar Matrix is developed in the context of the DELPH-IN consortium, and Matrix-derived grammars are compatible with the DELPH-IN suite of open-source tools. The Grammar Matrix itself represents an approach to computational linguistic typology, using computational methodology to combine depth of formal methods (creating grammars which map surface strings to semantic representations) with the breadth of typological investigation (attempting to cover the known range of variants across languages for each phenomenon we approach).
Linguistics in NLP/Computation in Linguistics
I'm interested in both how computational methods can serve the purposes of linguistic analysis (as with grammar engineering) and how linguistic knowledge can be deployed to improve the performance of NLP systems. I have written two books which present linguistic concepts in a manner accessible to NLP practitioners: Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax (2013) and Linguistic Fundamentals for Natural Language Processing II: 100 Essentials from Semantics and Pragmatics (2019; with Alex Lascarides).
Societal Impacts of Language Technology
Since 2016, I have been working on societal impacts of language technology, what they mean for how we carry out research and design technology, and how to adapt the NLP curriculum to include this focus. This work has included teaching seminars on the topic since early 2017, co-chairing the ethics review committee for NAACL 2021, as well as research on data documentation (the data statements project) and on the dangers of specific technology (such as large language models, or chatbots used for search). Most of my public scholarship has concerned these topics.
I am also interested in sociolinguistic variation, or the ways in which speakers manipulate the possibilities allowed by their languages to create style and register. This interest led to my involvement in the LiCORICE project, investigating the ways in which speakers express and deploy claims to authority and align with or against interlocutors. My dissertation (available online) explored how competence grammar can accommodate the relationship between non-categorical constraints on sociolinguistic variation and social meaning. In my work on the societal impacts of language technology, I frequently draw on sociolinguistic insight into language variation, linguistic discrimination, and the role of language in the production of style.
- Data Statements
- Grammar Matrix
- Language CoLLAGE
- ERG Semantic Documentation
- EL-STEC: Endangered Languages--Shared Task Evaluation Challenge
- Cyberling blog
- AMTRL: Alliance for Multilingual Teaching, Research, and Learning
- Olga Zamaraeva 2021 Assembling Syntax: Modeling Constituent Questions in a Grammar Engineering Framework
- Kristen Howell 2020 Inferring Grammars from Interlinear Glossed Text: Extracting Typological and Lexical Properties for the Automatic Generation of HPSG Grammars
- Joshua Crowgey 2019 Braiding Language (by Computer): Lushootseed Grammar Engineering
- David Inman 2019 Multi-predicate Constructions in Nuuchahnulth
- Ned Letcher 2018 Discovering Syntactic Phenomena with and within Precision Grammars (U. Melbourne)
- Michael Goodman 2018 Semantic Operations for Transfer-based Machine Translation
- Antske Fokkens 2014 Enhancing Empirical Research for Linguistically Motivated Precision Grammars (U. Saarlandes)
- Sanghoun Song 2014 A Grammar Library for Information Structure
- Steven Moran 2012 Phonetics Information Base and Lexicon
- David Goss-Grubbs 2010 Deep Processing for a Portable Natural Language Interface to Databases
- Scott Drellishak 2009 Widespread but Not Universal: Improving the Typological Coverage of the Grammar Matrix
- Allison Dods (CLMS) 2022 Automatically Inferring Grammar Specifications for Adnominal Possession from Interlinear Glossed Text
- Elizabeth Conrad (CLMS) 2021 Tracing and Reducing Lexical Ambiguity in Automatically Inferred Grammars
- Preeti Mohan (CLMS) 2020 An Analysis of Gender Bias in K-12 Assigned Literature Through Comparison of Non-Contextual Word Embedding Models
- Lonny Alaskuk Strunk (CLMS) 2020 A Finite-State Morphological Analyzer for Central Alaskan Yup’ik
- Elizabeth Nielsen (CLMS) 2018 Modeling Adnominal Possession in the LinGO Grammar Matrix
- Chris Curtis (CLMS) 2018 A Parametric Implementation of Valence-changing Morphology in the LinGO Grammar Matrix
- Michael Haeger (CLMS) 2017 An Evidentiality Library for the LinGO Grammar Matrix
- Michael Lockwood (CLMS) 2016 Automated Gloss Mapping for Inferring Grammatical Properties
- Woodley Packard (CLMS) 2015 Full Forest Treebanking
- TJ Trimble (CLMS) 2014 Adjectives in the LinGO Grammar Matrix
- David Wax (CLMS) 2014 Automated Grammar Engineering for Verbal Morphology
- Megan Schneider (CLMS) 2013 Comparative Analysis of DeepBank and the Penn Treebank
- Zina Pozen (CLMS) 2013 Using Lexical and Compositional Semantics to Improve HPSG Parse Selection
- Francesca Gola (CLMS) 2012 An Analysis of Translation Divergence Patterns using PanLex Translation Pairs
- Glenn Slayden (CLMS) 2012 Array TFS Storage for Unification Grammars
- Matt Hohensee (CLMS) 2012 It's Only Morpho-Logical: Modeling Agreement in Cross-linguistic Dependency Parsing
- Varya Gracheva 2013 Markers of Contrast in Russian: A Corpus-Based Study
- Joshua Crowgey 2012 The Syntactic Exponence of Sentential Negation: A Model for the LinGO Grammar Matrix
- Safiyyah Saleem (CLMA) 2010 Argument Optionality: A New Library for the Grammar Matrix Customization System
- Michael Goodman (CLMA) 2009 Egad: Efficiently Evaluating and Extracting Errors from Deep Grammars
- Kelly O'Hara (CLMA) 2008 A Morphotactic Infrastructure for a Grammar Customization System
- Laurie Poulson 2006 Evaluating a Cross-Linguistic Grammar Model: Methodology and Test-Suite Resource Development