Linguistics 575: Societal Impacts of NLP

Autumn Quarter, 2021

Course Info

Instructor Info

Syllabus

Description

The goal of this course is to better understand the ethical considerations that arise in the deployment of NLP technology, including how to identify people likely to be impacted by the use of the technology (direct and indirect stakeholders), what kinds of risks the technology poses, and how to design systems in ways that better support stakeholder values.

Through discussions of readings in the growing research literature on fairness, accountability, transparency and ethics (FATE) in NLP and allied fields, and value sensitive design, we will seek to answer the following questions:

Course projects are expected to take the form of a term paper analyzing some particular NLP task or data set in terms of the concepts developed through the quarter and looking forward to how ethical best practices could be developed for that task/data set.

Prerequisites: Graduate standing. The primary audience for this course is expected to be CLMS students, but graduate students in other programs are also welcome.

Accessibility policies

If you have already established accommodations with Disability Resources for Students (DRS), please communicate your approved accommodations to me at your earliest convenience so we can discuss your needs in this course.

If you have not yet established services through DRS, but have a temporary health condition or permanent disability that requires accommodations (conditions include but are not limited to: mental health, attention-related, learning, vision, hearing, physical or health impacts), you are welcome to contact DRS at 206-543-8924 or uwdrs@uw.edu or disability.uw.edu. DRS offers resources and coordinates reasonable accommodations for students with disabilities and/or temporary health conditions. Reasonable accommodations are established through an interactive process between you, your instructor(s) and DRS. It is the policy and practice of the University of Washington to create inclusive and accessible learning environments consistent with federal and state law.

Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW's policy, including more information about how to request an accommodation, is available at Faculty Syllabus Guidelines and Resources. Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form available at https://registrar.washington.edu/students/religious-accommodations-request/.

[Note from Emily: The above language is all language suggested by UW and, in the case of the immediately preceding paragraph, in fact required by UW. I absolutely support the content of both and am struggling with how to contextualize them so they sound less cold. My goal is for this class to be accessible. I'm glad the university has policies that help facilitate that. If there is something you need that doesn't fall under these policies, I hope you will feel comfortable bringing that up with me as well.]

Requirements

Schedule of Topics and Assignments (still subject to change)

NOTE: still need to place the ethics statement assignment.
Date | Topic | Reading | Due
9/29 Introduction, organization
Why are we here? What do we hope to accomplish?
No reading assumed for first day  
10/1     KWLA papers: K & W due 11pm
10/6 Foundational readings Choose two articles from Foundations below, and be prepared to discuss our reading questions:
  • Whose "truth" or "right" should we be considering?
  • In a perfect world, what does this look like? (And what is a "perfect world"?)
  • What are the harms that the paper identifies? How are they quantified and what perspective is involved in that?
  • How does the paper engage with systems of power?
  • What are the implications for or connections to NLP?

NB: For book-length pieces, it's fine to choose a chapter to read.

 
10/13 Value sensitive design Choose two articles from Value sensitive design below, and be prepared to discuss our reading questions:
  • What techniques are proposed in this paper?
  • What's the relationship between value sensitive design and NLP or AI? How can we apply the ideas in this paper to NLP related tasks?
  • What has changed and what has stayed the same between old and new? What are key principles that emerge as central to the enterprise?
  • Whose notions of ethics/whose values are centered?
 
10/20 Bias and discrimination Choose two articles from Bias/Discrimination below, and be prepared to discuss our reading questions:
  • What is the operational definition of bias in this paper?
  • What kinds of bias/bias against whom did the author try to mitigate and what solutions did they find?
  • What are the real-world harms linked to the bias?
  • How was the bias discovered (how did they think to check for this?) and how was it quantified/measured?
  • What planning is proposed to consider before building a dataset/system?
  • What has changed between older and newer papers in this space in terms of how we think about/talk about bias?
 
10/27 Labor conditions/crowdsourcing + demographic variables Choose two articles from Demographic variables and Crowdsourcing below (with at least one from Demographic variables), and be prepared to discuss our reading questions:
  • What are the implications for workers' power and solidarity?
  • What are the connections to or implications for NLP?
  • What are the best practices suggested/normative suggestions in the paper?
  • What are the main ethical issues brought up in these papers?
  • What is the relationship between this paradigm and data collection paradigms based on surveillance capitalism?
  • What conditions and complications does crowd-sourcing hide?
 
10/29 Term paper proposals due
11/3 SciComm and Ethics Education Choose two articles from SciComm and Ethics Education below, and be prepared to discuss our reading questions:
  • Who is the audience of the SciComm work to be produced?
  • How does the article address complexity, in the context of making complex issues accessible to the public?
  • What is recommended about how to do SciComm well?
  • To what extent and why should we feel obligated to communicate to the public?
 
11/5 Scicomm exercise due
11/10 Content Moderation/Toxicity Detection Choose two articles from Content Moderation/Toxicity Detection below, and be prepared to discuss our reading questions:
  • What is the definition of abuse/toxic content?
  • Who gets to define that/decide what counts in this context?
  • What is the intended use case?
  • What are the failure modes (false positive, false negative) and who would be affected by them?
  • Who should moderation be protecting? (In particular, is there value in protecting bots from harassment?)
  • How is automated content moderation motivated and how is it related to human moderation? Why automate / how is the automated system situated within a deployed process?
  • What is the evidence that the system is applicable across a relevant set of domains? What are the dependencies on existing resources that would limit the applicability?
 
11/12 Term paper outline due
11/17 Social Media + Privacy Choose two articles from Social Media and Privacy below, and be prepared to discuss our shared investigation questions:
  • What does "privacy" mean and why do the communities we represent value it?
  • How do we balance privacy against competing values?
  • How should social media data be best used (or not) in research?
  • What stance should commercial entities take in providing data to the public or researchers for research purposes?
 
11/24 Policy/regulation/guides + NLP for social good Choose two articles total from Changing Practice and NLP for Social Good below, and be prepared to discuss our reading questions:
  • Why should we write ethical considerations sections?
  • How are ethical considerations sections perceived: by researchers, by regulators, by the lay public?
  • What is the purpose of adopting a code of ethics?
  • What other practices can be recommended and why (open source, licensing, IRBs, ...)?
  • What positive impact has come from NLP for social good projects?
  • What is "social good" and who does it benefit?
  • What particular ethical concerns do "NLP for social good" projects raise?
Term paper draft due
12/1 Language Variation and Emergent Bias Choose two articles from Language Variation and Emergent Bias below, and be prepared to discuss our reading questions:
  • What are the real-world consequences of emergent bias in language technology?
  • How can emergent bias be measured and what different types do we observe?
  • Is it possible to prevent implicit standardization?
  • Dual use consideration: does more inclusive language tech enable more surveillance of marginalized groups?
 
12/3     KWLA papers due
Comments on partner's paper draft due
12/8 Documentation and Transparency Choose two articles from Documentation and Transparency below, and be prepared to discuss our reading questions:
  • What types of information should be included in the documentation?
  • How does documentation help mitigate harms and what types of harms can be mitigated based on documentation?
  • How to balance transparency and data subject privacy?
  • How to balance transparency with corporate IP and other forces pushing for secrecy?
  • How to balance documentation and "benefits" of dataset scale?
12/10 Paper annotations due
12/14     Final papers due 11pm


Bibliography

NOTE: This is still very much a work in progress! I have more papers to add, and some of these need to be recategorized.

Overviews/Calls to Action

Foundations

Philosophical Underpinnings

Value Sensitive Design and Other Design Approaches

Documentation and Transparency

Other Best Practices

Bias/Discrimination

Fairness

Other resources on bias

More papers in the Proceedings of the First Workshop on Gender Bias in Natural Language Processing

Demographic variables

Chatbots

Privacy

Social Media

Content moderation/Toxicity detection

See also the papers in the Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)

Crowdsourcing and labor conditions

Language Variation and Emergent Bias

Biomedical NLP, Mental Health and Social Media

NLP Apps Addressing Ethical Issues/NLP for Social Good

Other Issues in NLP: Carbon Emissions, Generation, ...

SciComm and Ethics Education

Changing Practice: Policy, regulation, and guidelines

Ethics Statements

Reading notes

Papers:

Other Readings

Links

Conferences/Workshops

Other lists of resources

Other courses


ebender at u dot washington dot edu