Linguistics 575: Societal Impacts of Language Technology

Autumn Quarter, 2023

Course Info

Instructor Info

Syllabus

Description

The goal of this course is to better understand the ethical considerations that arise in the deployment of NLP technology, including how to identify people likely to be impacted by the use of the technology (direct and indirect stakeholders), what kinds of risks the technology poses, and how to design systems in ways that better support stakeholder values.

Through discussions of readings in the growing research literature on fairness, accountability, transparency and ethics (FATE) in NLP and allied fields, and value sensitive design, we will seek to answer the reading questions listed under each topic in the schedule below.

Course projects are expected to take the form of a term paper analyzing some particular NLP task or data set in terms of the concepts developed through the quarter and looking forward to how ethical best practices could be developed for that task/data set.

Prerequisites: Graduate standing. The primary audience for this course is expected to be CLMS students, but graduate students in other programs are also welcome.

Accessibility policies

If you have already established accommodations with Disability Resources for Students (DRS), please communicate your approved accommodations to me at your earliest convenience so we can discuss your needs in this course.

If you have not yet established services through DRS, but have a temporary health condition or permanent disability that requires accommodations (conditions include, but are not limited to: mental health, attention-related, learning, vision, hearing, physical, or health impacts), you are welcome to contact DRS at 206-543-8924 or uwdrs@uw.edu or disability.uw.edu. DRS offers resources and coordinates reasonable accommodations for students with disabilities and/or temporary health conditions. Reasonable accommodations are established through an interactive process between you, your instructor(s), and DRS. It is the policy and practice of the University of Washington to create inclusive and accessible learning environments consistent with federal and state law.

Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW's policy, including more information about how to request an accommodation, is available at Faculty Syllabus Guidelines and Resources. Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form available at https://registrar.washington.edu/students/religious-accommodations-request/.

[Note from Emily: The above language is all language suggested by UW and in the immediately preceding paragraph in fact required by UW. I absolutely support the content of both and am struggling with how to contextualize them so they sound less cold. My goal is for this class to be accessible. I'm glad the university has policies that help facilitate that. If there is something you need that doesn't fall under these policies, I hope you will feel comfortable bringing that up with me as well.]

Requirements

Schedule of Topics and Assignments (still subject to change)

Date | Topic | Reading | Due
9/27 Introduction, organization
Why are we here? What do we hope to accomplish?
No reading assumed for first day  
9/29     KWLA papers: K & W due 11pm
10/4 Foundational readings Choose two articles from Foundations below, and be prepared to discuss our reading questions:
  • What "common sense" do we take for granted when thinking about "AI" technology that might not necessarily be true?
  • How does technology exacerbate societal harms? (examples and mechanisms)
  • How do we define "ethical" or "unethical"?
  • What expertise are the authors writing from?
  • How much do people who are benefitting from the tech know about how it's harming others?
  • How do we think about identity categories/types of communities in this space and which ones are particularly salient? How do we identify affected communities?
 
10/11 Value sensitive design Choose two articles from Value sensitive design below, and be prepared to discuss our reading questions:
  • What kinds of values recur across different projects? What values are specific to different communities?
  • How does "values" differ from "ethics"?
  • How do you evaluate the success of a design process/product in terms of supporting values?
  • Is value sensitive design intended to reduce harm? If so, how?
  • Who is considered the target demographic within a given value sensitive design study?
  • How can these methodologies be applied in computational linguistics?
  • What does the opposite look like? What is value agnostic design?
 
10/18 Bias and discrimination Choose two articles from Bias/Discrimination below, and be prepared to discuss our reading questions:
  • How is bias defined and measured?
  • Is it possible to over-interpret measures of bias?
  • What kind of taxonomies of harm arising from bias could we create?
  • How do the papers address intersectional identities?
  • Where is the line between generalizability and discrimination?
  • What are the ways that bias sneaks in?
  • What solutions do the authors offer?
 
10/25 Language variation and emergent bias
Translation technologies
Choose two articles from Language Variation and Emergent Bias and/or Translation Technologies below, and be prepared to discuss our reading questions:
  • How well does model adaptation work, especially across related languages, to boost performance in low-resource languages?
  • How are language attitudes reflected in the design of NLP tools?
  • How does knowledge of sociolinguistics help in the improvement of cross-variety performance?
  • How do MT products affect the study of translation and literature?
  • What kinds of biases can machine translation amplify? How can these be mitigated?
  • What positive and negative impacts does MT have on the world? How can negative impacts be mitigated?
    • What are the risks of using MT as opposed to human translators?
  • Bonus, perhaps not well-matched to these readings: How do tech policies accommodate different languages and their different needs?
 
10/27 Term paper proposals due
11/1 Scicomm and ethics education Choose two articles from SciComm and Ethics Education below, and be prepared to discuss our reading questions:
  • How can we account for cultural differences when doing science communication?
  • How might technology (internet, NLP, social media) make science communication easier or more difficult?
  • How do you decide what to communicate about in science communication? (How do you know, "Here's something I need to tell the public about?")
  • How do you decide what level of detail to include?
  • What are the incentives for researchers to engage in good science communication?
  • How do you get the science communication artifact out where it can be seen?
  • What are the pushbacks against ethics education?
  • What's the current state of ethics education in NLP? What is/isn't being talked about?
  • Why are we currently lacking in ethics education in NLP?
 
11/3 Scicomm exercise due
11/8 Documentation and transparency Choose two articles from Documentation and Transparency below, and be prepared to discuss our reading questions:
  • What kinds of metadata are requested by the documentation toolkits? How should the information be collected?
  • Who has access to the documentation?
  • How can transparency benefit others in the field & how can it benefit the public? (What makes documentation actionable?)
  • How does the process of documentation help dataset & model creators think about societal impact?
  • How are data documentation toolkits evaluated?
  • What ethical considerations are raised about data and model documentation?
  • How has the practice of dataset and model documentation changed over time? How can documentation be kept current with information needs?
  • What incentives and disincentives exist for doing dataset & model documentation?
 
11/13 Term paper outline due
11/15 Content moderation and toxicity detection Choose two articles from Content moderation and toxicity detection below, and be prepared to discuss our reading questions:
  • What metrics are used to classify speech as hateful or toxic?
  • What language varieties are more or less likely to be classified as abusive/hateful?
  • What strategies can be employed to recognize and handle culturally specific forms of hate speech and abusive language?
  • What steps are taken to prevent automated content moderation systems from performing unintended censorship?
  • What areas are often neglected by existing toxicity detection systems and taxonomies of abuse & harm?
  • How do these systems enforce existing systems of power, and how can that be prevented? Whose perspectives are privileged?
  • How important is model transparency in NLP applications for content moderation, and how can it be achieved?
 
11/22 Policy, regulation, and guidelines;
Ethics statements
Choose one article from Changing Practice: Policy, regulation, and guidelines below, and be prepared to discuss our reading questions (as relevant to that piece). Also, bring your draft ethical considerations section.
  • What are the differences between guidelines, guardrails, principles, policies, and regulations? How are they decided on, and how are they made binding or non-binding?
  • Who is the target audience of the policy/guidelines/etc.?
  • How have policy proposals evolved/adapted to new technologies over time?
  • How are these policies enforced/what are the incentives or consequences involved?
  • Who is responsible for proposing policies? Whose interests and perspectives are consulted?
Ethical considerations section draft (bring to class)
11/27 Ethical considerations section
11/29 Privacy Choose two articles from Privacy below, and be prepared to discuss our reading questions:
  • What is PII and what does it include? What does it leave out?
  • What communities may be disproportionately impacted by privacy concerns?
  • Is privacy an individual or collective right?
  • What personal information isn't really private (practically speaking)?
  • How much is tacit/coerced consent relied on to skirt privacy concerns? What are the issues with this?
  • How do we navigate tensions between privacy and public safety? Who resolves this and how?
  • What laws/policies are in place to protect privacy/personal information? How can policies be revised?
  • How do privacy issues impact research design decisions and our responsibility as researchers?
 
12/1     Term paper draft due
12/6 ChatGPT/synthetic media machines
  • TBD
Comments on partner's paper draft due
12/8     KWLA papers due
12/12     Final papers due 11pm


Bibliography

NOTE This is still very much a work in progress! I have more papers to add, and some of these need to be recategorized.

Overviews/Calls to Action

Foundations

Philosophical Underpinnings

Value Sensitive Design and Other Design Approaches

Documentation and Transparency

Other Best Practices

Bias/Discrimination

Fairness

Other resources on bias

More papers in the Proceedings of the First Workshop on Gender Bias in Natural Language Processing

Demographic variables

Chatbots

Privacy

Social Media

Content moderation/Toxicity detection

See also the papers in the Proceedings Workshop on Online Abuse and Harms: WOAH 2021, WOAH 2022, WOAH 2023

Crowdsourcing and labor conditions

Language Variation and Emergent Bias

Translation Technologies

Biomedical NLP, Mental Health and Social Media

NLP Apps Addressing Ethical Issues/NLP for Social Good

Other Issues in NLP: Carbon Emissions, Generation, ...

SciComm and Ethics Education

Changing Practice: Policy, regulation, and guidelines

Ethics Statements

Reading notes

Papers:

Synthetic Media Machines

Other Readings

Links

Conferences/Workshops

Other lists of resources

Other courses


ebender at u dot washington dot edu