Linguistics 575: Ethics in NLP
Autumn Quarter, 2019
Course Info
- Lecture: Wednesdays, 3:30-5:50 in SAV 130 and online (Zoom link in Canvas)
- Course Canvas (discussion board, assignment submission, grades)
Instructor Info
- Emily M. Bender
- Office Hours: (most) Thursdays 10-11 and (most) Fridays 10:30-11:30
- Office: GUG 414-C
- Email: ebender at u
Syllabus
Description
The goal of this course is to better understand the ethical
considerations that arise in the deployment of NLP technology,
including how to identify people likely to be impacted by the use of
the technology (direct and indirect stakeholders), what kinds of risks
the technology poses, and how to design systems in ways that better
support stakeholder values.
Through discussions of readings in ethics from various philosophical
traditions, the growing research literature on fairness,
accountability, transparency and ethics (FATE) in NLP and allied
fields, and value sensitive design, we will seek to answer the
following questions:
- What ethical considerations arise in the design and deployment of NLP technologies?
- Which of these are specific to NLP (as opposed to AI or technology more generally)?
- What best practices can/should NLP developers deploy in light of the ethical concerns identified?
Course projects are expected to take the form of a term paper
analyzing some particular NLP task or data set in terms of the
concepts developed through the quarter and looking forward to how
ethical best practices could be developed for that task/data set.
Prerequisites: Graduate standing. The primary audience for this course
is expected to be CLMS students, but graduate students in other
programs are also welcome.
Accessibility policies
If you have already established accommodations with Disability
Resources for Students (DRS), please communicate your approved
accommodations to me at your earliest convenience so we can discuss
your needs in this course.
If you have not yet established services through DRS, but have a
temporary health condition or permanent disability that requires
accommodations (conditions include but are not limited to: mental health,
attention-related, learning, vision, hearing, physical or health
impacts), you are welcome to contact DRS at 206-543-8924 or
uwdrs@uw.edu
or disability.uw.edu. DRS
offers resources and coordinates reasonable accommodations for
students with disabilities and/or temporary health conditions.
Reasonable accommodations are established through an interactive
process between you, your instructor(s) and DRS. It is the policy and
practice of the University of Washington to create inclusive and
accessible learning environments consistent with federal and state
law.
Washington state law requires that UW develop a policy for
accommodation of student absences or significant hardship due to
reasons of faith or conscience, or for organized religious
activities. The UW's policy, including more information about how to
request an accommodation, is available at Faculty Syllabus Guidelines
and Resources. Accommodations must be requested within the first two
weeks of this course using the Religious Accommodations Request form
available
at https://registrar.washington.edu/students/religious-accommodations-request/.
[Note from Emily: The above language is all language suggested
by UW and in the immediately preceding paragraph in fact required
by UW. I absolutely support the content of both and am struggling with
how to contextualize them so they sound less cold. My goal is for
this class to be accessible. I'm glad the university has policies that
help facilitate that. If there is something you need that doesn't
fall under these policies, I hope you will feel comfortable bringing
that up with me as well.]
Requirements
Schedule of Topics and Assignments (still subject to change)
- 9/25: Introduction, organization. Why are we here? What do we hope to accomplish?
  Reading: Hovy and Spruit 2016, plus at least two other papers/articles listed under Overviews/Calls to Action below (or just one, if you pick something particularly long).
- 9/27: KWLA papers: K & W due 11pm.
- 10/2: Philosophical foundations.
  Reading: Two items from Philosophical Foundations below, at least one of which comes from an author whose perspective varies greatly from your own life experience. Be prepared to discuss the following:
  - What is the main thesis of the reading?
  - What is their definition of ethics?
  - In what ways do they contrast their definition with others?
  - How does this reading relate to ethics in NLP?
- 10/9: Philosophical foundations (cont.).
- 10/16: Value sensitive design.
  Reading: Sections 1-2 of Friedman et al. (2017), plus any two other papers from Value Sensitive Design below. Reading questions:
  - How could you apply VSD theoretical constructs and methods to the NLP tasks you are most concerned with? Prepare two or three concrete examples.
  - How do VSD theoretical constructs and methods build on or provide a counterpoint to what you read in Philosophical Foundations?
  In addition, for an NLP project you are interested in:
  - Make a list of the direct and indirect stakeholders. Identify how each stakeholder group you identify might benefit from or be harmed by the technology you are considering.
  - For those who choose the paper by Nathan et al. on value scenarios: write a value scenario like those illustrated in the paper for the technology you are interested in investigating.
- 10/23: Word Embeddings & Language Behavior as Ground Truth.
  Reading: The excerpt from Bender & Lascarides on the Canvas files page (on the distributional hypothesis), plus two papers from Word Embeddings & Language Behavior as Ground Truth below. If you would like further background on word embeddings in general, see Camacho-Collados and Pilehvar 2018.
  For the papers you read, answer the following questions:
  - How do the word embedding readings relate to the distributional hypothesis? ("You shall know a word by the company it keeps.")
  - What, if any, bias did the authors discover? What impacts do they describe following from that bias?
  - What, if any, means of mitigating the bias do the authors propose? How are they evaluated?
  - How do the scenarios described relate to the issue of using descriptive models prescriptively?
- 10/25: Term paper proposals due.
- 10/30: Language Variation and Emergent Bias; Exclusion/Discrimination/Bias.
  Reading: Three papers total from the sections on Language Variation and Emergent Bias and Exclusion/Discrimination/Bias below. Be sure to read at least one from each set. Reading questions are listed with each of the sets below.
- 11/6: Ethics statements. Draft Ethics Statement due (bring to class).
- 11/8: Ethics Statement due.
- 11/13: Chatbots; Biomedical, mental health, social media. Term paper outline due.
  Reading: Three papers total from the sections on Chatbots and Biomedical NLP, Mental Health and Social Media below. Be sure to read at least one from each set. Reading questions are listed with each of the sets below.
- 11/20: Privacy; Documentation and Transparency.
  Reading: Three papers total from the sections on Privacy and Documentation and Transparency below. Be sure to read at least one from each set. Reading questions are listed with each of the sets below.
- 11/22: Term paper draft due.
- 11/27: NLP Applications Addressing Ethical Issues.
  Reading: Three papers (or two, if they're long) from the set under NLP Apps Addressing Ethical Issues/NLP for Social Good below. Discussion/reading questions are listed together with the papers.
- 12/2: KWLA papers due; comments on partner's paper draft due.
- 12/4: SciComm and Ethics in NLP Education.
  Reading: Three papers from SciComm and Ethics Education below. We will be discussing them in light of what they mean for ethics and NLP:
  - What do we as language technologists have a responsibility to communicate to the public?
  - What are our goals in doing that communication?
  - How do these responsibilities and goals inform how we should undertake science communication?
  - What are the key takeaways in terms of best practices for science communication, given our goals?
- 12/6: OpEd/Letter to the Editor due.
- 12/12: Final papers due 11pm.
Bibliography
Overviews/Calls to Action
- Amblard, M. (2016). Pour un TAL responsable. Traitement Automatique des Langues, 57(2), 21-45.
- boyd, danah. (Sept 13, 2019). Facing the great reckoning head-on. Medium.
- Crawford, K., & Calo, R. (2016). There is a blind spot in AI research. Nature, 538 (7625), 311.
- Executive Office of the President National Science and Technology Council Committee on Technology. (2016). Preparing for the future of artificial intelligence. (See especially p. 30)
- Fort, K., Adda, G., & Cohen, K. B. (2016). Éthique et traitement automatique des langues et de la parole: entre truismes et tabous. Traitement Automatique des Langues, 57(2), 7-19.
- Grissom II, A. (2019). Thinking about how NLP is used to serve power: Current and future trends. Presentation at Widening NLP 2019. [Slides] [Video]
- Hovy, D., & Spruit, S. L. (2016, August). The social impact of natural language processing. In Proceedings of the 54th annual meeting of the association for computational linguistics (volume 2: Short papers) (pp. 591-598). Berlin, Germany: Association for Computational Linguistics.
- Lefeuvre-Halftermeyer, A., Govaere, V., Antoine, J.-Y., Allegre, W., Pouplin, S., Departe, J.-P., et al. (2016). Typologie des risques pour une analyse éthique de l'impact des technologies du TAL. Traitement Automatique des Langues, 57 (2), 47-71.
- Markham, A. (May 18, 2016). OKCupid data release fiasco: It's time to rethink ethics education. Data & Society: Points.
- O'Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. NY: Crown Publishing Group.
- Rogaway, P. (2015). The moral character of cryptographic work.
- Shneiderman, B. (2016). Opinion: The dangers of faulty, biased, or malicious algorithms requires independent oversight. Proceedings of the National Academy of Sciences, 113 (48), 13538-13540.
- Sourour, B. (Nov 13, 2016). The code I'm still ashamed of. Medium.
Philosophical Foundations
- Bartky, S. L. (2002). "Sympathy and solidarity" and other essays (Vol. 32). Rowman & Littlefield.
- Bryson, J. J. (2015). Artificial intelligence and pro-social behaviour. In C. Misselhorn (Ed.), Collective agency and cooperation in natural and artificial systems: Explanation, implementation and simulation (pp. 281-306). Cham: Springer International Publishing.
- Butler, J. (2005). Giving an account of oneself. Oxford University Press. (Available online, through UW libraries)
- Cho, S., Crenshaw, K. W., & McCall, L. (2013). Toward a field of intersectionality studies: Theory, applications, and praxis. Signs: Journal of Women in Culture and Society, 38 (4), 785-810.
- Crenshaw, K. (1989). Demarginalizing the intersection of race and sex: A black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics. U. Chi. Legal f., 139.
- Crenshaw, K. (1990). Mapping the margins: Intersectionality, identity politics, and violence against women of color. Stan. L. Rev., 43, 1241.
- De La Torre, M. A. (2013). Ethics: A liberative approach. Fortress Press. (Available online through UW Libraries; read intro + chapter of choice)
- Edgar, S. L. (2003). Morality and machines: Perspectives on computer ethics. Jones & Bartlett Learning. (Available online through UW libraries)
- Fieser, J., & Dowden, B. (Eds.). (2016). Internet encyclopedia of philosophy: Entries on Ethics
- Liamputtong, P. (2006). Researching the vulnerable: A guide to sensitive research methods. Sage. (Available online, through UW libraries)
- Prabhumoye, S., Mayfield, E., & Black, A. W. (2019, August). Principled frameworks for evaluating ethics in NLP systems. In Proceedings of the 2019 workshop on widening nlp (pp. 118-121). Florence, Italy: Association for Computational Linguistics.
- Quinn, M. J. (2014). Ethics for the information age. Pearson.
- Zalta, E. N. (Ed.). (2019). The Stanford encyclopedia of philosophy (Winter 2016 edition): Entries on Ethics
Exclusion/Discrimination/Bias
Reading questions:
- What went wrong?
- Who was harmed?
- Who benefitted?
- What (if anything) is offered as a way to mitigate such harm in the future?
- How does the reading you did for "philosophical foundations" relate to this issue?
- What (if any) analogies do you see to the kind of NLP tasks you work on?
More papers in the Proceedings of the First Workshop on Gender Bias in Natural Language Processing
- Angwin, J., & Larson, J. (Dec 30, 2016). Bias in criminal risk scores is mathematically inevitable, researchers say. ProPublica.
- boyd, d. (2015). What world are we building? (Everett C Parker Lecture. Washington, DC, October 20)
- Brennan, M. (2015). Can computers be racist? big data, inequality, and discrimination. (online; Ford Foundation)
- Clark, J. (Jun 23, 2016). Artificial intelligence has a `sea of dudes' problem. Bloomberg Technology.
- Crawford, K. (Apr 1, 2013). The hidden biases in big data. Harvard Business Review.
- Daumé III, H. (Nov 8, 2016). Bias in ML, and teaching AI. (Blog post, accessed 1/17/17)
- Emspak, J. (Dec 29, 2016). How a machine learns prejudice: Artificial intelligence picks up bias from human creators--not from hard, cold logic. Scientific American.
- Friedman, B., & Nissenbaum, H. (1996). Bias in computer systems. ACM Transactions on Information Systems (TOIS), 14(3), 330-347.
- Guynn, J. (Jun 10, 2016). `Three black teenagers' Google search sparks outrage. USA Today.
- Hardt, M. (Sep 26, 2014). How big data is unfair: Understanding sources of unfairness in data driven decision making. Medium.
- Jacob. (May 8, 2016). Deep learning racial bias: The Avenue Q theory of ubiquitous racism. Medium.
- Larson, B. (2017). Gender as a variable in natural-language processing: Ethical considerations. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 1-11). Valencia, Spain: Association for Computational Linguistics.
- Larson, J., Angwin, J., & Parris Jr., T. (Oct 19, 2016). Breaking the black box: How machines learn to be racist. ProPublica.
- Morrison, L. (Jan 9, 2017). Speech analysis could now land you a promotion. BBC capital.
- Noble, S. U. (2018). Algorithms of oppression: How search engines reinforce racism. NYU Press.
- Rao, D. (n.d.). Fairness in machine learning. (slides)
- Rudinger, R., May, C., & Van Durme, B. (2017). Social bias in elicited natural language inferences. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 74-79). Valencia, Spain: Association for Computational Linguistics.
- Sweeney, L. (May 1, 2013). Discrimination in online ad delivery. Communications of the ACM, 56 (5), 44-54.
- Žliobaitė, I. (2015). On the relation between accuracy and fairness in binary classification. CoRR, abs/1505.05723.
Word Embeddings & Language Behavior as Ground Truth
Reading questions:
- How do the word embedding readings relate to the distributional hypothesis? ("You shall know a word by the company it keeps.")
- What, if any, bias did the authors discover? What impacts do they describe following from that bias?
- What, if any, means of mitigating the bias do the authors propose? How are they evaluated?
- How do the scenarios described relate to the issue of using descriptive models prescriptively?
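To make these questions concrete, here is a minimal sketch (not drawn from any of the readings themselves) of the kind of probe that work like Bolukbasi et al. 2016 builds on: identifying a "gender direction" in vector space and measuring how occupation words align with it. The vectors below are made-up toy values; a real experiment would load pretrained embeddings such as word2vec or GloVe.

```python
# Toy illustration of a bias probe on word embeddings. The 3-d vectors
# here are hypothetical placeholders; real studies use pretrained
# embeddings with hundreds of dimensions.
import numpy as np

vecs = {
    "he":       np.array([ 1.0, 0.1, 0.2]),
    "she":      np.array([-1.0, 0.1, 0.2]),
    "engineer": np.array([ 0.7, 0.8, 0.1]),
    "nurse":    np.array([-0.6, 0.8, 0.1]),
}

def cos(a, b):
    # Cosine similarity: the standard closeness measure for embeddings.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# A one-dimensional "gender direction", in the spirit of analogy-style probes.
gender_direction = vecs["he"] - vecs["she"]

# Positive projection = closer to "he"; negative = closer to "she".
for word in ("engineer", "nurse"):
    print(f"{word:9s} gender projection: {cos(vecs[word], gender_direction):+.2f}")
```

If the training corpus associates occupations more strongly with one gender, that association surfaces in exactly this kind of measurement, which is where the question about using descriptive models prescriptively gets its bite.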
Papers:
- Ananya, Parthasarthi, N., & Singh, S. (2019). GenderQuant: Quantifying mention-level genderedness. In Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers) (pp. 2959-2969). Minneapolis, Minnesota: Association for Computational Linguistics.
- Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, & R. Garnett (Eds.), Advances in neural information processing systems 29 (pp. 4349-4357). Curran Associates, Inc.
- Bordia, S., & Bowman, S. R. (2019). Identifying and reducing gender bias in word-level language models. In Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: Student research workshop (pp. 7-15). Minneapolis, Minnesota: Association for Computational Linguistics.
- Caliskan, A., Bryson, J., & Narayanan, A. (2016). A story of discrimination and unfairness. (Talk presented at HotPETS 2016)
- Caliskan, A., Bryson, J. J., & Narayanan, A. (2016a). Semantics derived automatically from language corpora necessarily contain human biases. CoRR, abs/1608.07187.
- Daumé III, H. (2016). Language bias and black sheep. (Blog post, accessed 12/29/16)
- Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences , 115 (16), E3635-E3644.
- Gonen, H., & Goldberg, Y. (2019). Lipstick on a pig: Debiasing methods cover up systematic gender biases in word embeddings but do not remove them. In Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers) (pp. 609-614). Minneapolis, Minnesota: Association for Computational Linguistics.
- Herbelot, A., Redecker, E. von, & Müller, J. (2012). Distributional techniques for philosophical enquiry. In Proceedings of the 6th workshop on language technology for cultural heritage, social sciences, and humanities (pp. 45-54). Avignon, France: Association for Computational Linguistics.
- Manzini, T., Yao Chong, L., Black, A. W., & Tsvetkov, Y. (2019). Black is to criminal as caucasian is to police: Detecting and removing multiclass bias in word embeddings. In Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers) (pp. 615-621). Minneapolis, Minnesota: Association for Computational Linguistics.
- May, C., Wang, A., Bordia, S., Bowman, S. R., & Rudinger, R. (2019). On measuring social biases in sentence encoders. In Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers) (pp. 622-628). Minneapolis, Minnesota: Association for Computational Linguistics.
- Nissim, M., Noord, R. van, & Goot, R. van der. (2019). Fair is better than sensational: Man is to doctor as woman is to doctor.
- Noble, S. U. (2018). Algorithms of oppression: How search engines reinforce racism. NYU Press.
- Schluter, N. (2018). The word analogy testing caveat. In Proceedings of the 2018 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 2 (short papers) (pp. 242-246). New Orleans, Louisiana: Association for Computational Linguistics.
- Schmidt, B. (2015). Rejecting the gender binary: A vector-space operation. (Blog post, accessed 12/29/16)
- Speer, R. (2017). Conceptnet numberbatch 17.04: better, less-stereotyped word vectors. (Blog post.)
- Webster, K., Recasens, M., Axelrod, V., & Baldridge, J. (2018). Mind the GAP: A balanced corpus of gendered ambiguous pronouns. Transactions of the Association for Computational Linguistics, 6, 605-617.
- Zhao, J., Wang, T., Yatskar, M., Cotterell, R., Ordonez, V., & Chang, K.-W. (2019). Gender bias in contextualized word embeddings. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers) (pp. 629-634). Minneapolis, Minnesota: Association for Computational Linguistics.
Chatbots
Reading questions:
- How does intent (user intent, system designer intent) relate to language generation tools?
- In what ways do we find tension between user satisfaction and potential ethical considerations?
- In what ways are chat bots beneficial?
- What are the implications of gendering virtual assistants?
Papers:
- Cercas Curry, A., & Rieser, V. (2018). #MeToo Alexa: How conversational systems respond to sexual harassment. In Proceedings of the second ACL workshop on ethics in natural language processing (pp. 7-14). New Orleans, Louisiana, USA: Association for Computational Linguistics.
- Elder, A. (2019). Conversation from beyond the grave? A neo-Confucian ethics of chatbots of the dead. Journal of Applied Philosophy.
- Fessler, Leah. (Feb 22, 2017). SIRI, DEFINE PATRIARCHY: We tested bots like Siri and Alexa to see who would stand up to sexual harassment. Quartz.
- Fung, P. (Dec 3, 2015). Can robots slay sexism? World Economic Forum.
- Mott, N. (Jun 8, 2016). Why you should think twice before spilling your guts to a chatbot. Passcode.
- Paolino, J. (Jan 4, 2017). Google home vs Alexa: Two simple user experience design gestures that delighted a female user. Medium.
- Seaman Cook, J. (Apr 8, 2016). From Siri to sexbots: Female AI reinforces a toxic desire for passive, agreeable and easily dominated women. Salon.
- Twitter. (Apr 7, 2016). Automation rules and best practices. (Web page, accessed 12/29/16)
- Yao, M. (n.d.). Can bots manipulate public opinion? (Web page, accessed 12/29/16)
Privacy
Reading questions:
- How are people addressing privacy, and within which ethical frameworks?
- How is privacy defined?
- What is privacy in tension with?
- What purpose does privacy serve/why is it valued?
- How has the notion of privacy changed over the last few decades?
- What unique concerns are there in NLP and privacy?
Papers:
- Abadi, M., Chu, A., Goodfellow, I., Brendan McMahan, H., Mironov, I., Talwar, K., et al. (2016). Deep Learning with Differential Privacy. ArXiv e-prints.
- Amazon.com. (2017). Memorandum of Law in Support of Amazon's Motion to Quash Search Warrant.
- Brant, T. (Dec 27, 2016). Amazon Alexa data wanted in murder investigation. PC Mag.
- Friedman, B., Kahn Jr, P. H., Hagman, J., Severson, R. L., & Gill, B. (2006). The watcher and the watched: Social judgments about privacy in a public place. Human-Computer Interaction, 21(2), 235-272.
- Golbeck, J., & Mauriello, M. L. (2016). User perception of facebook app data access: A comparison of methods and privacy concerns. Future Internet, 8(2), 9.
- Grissom II, A. (2019). Thinking about how NLP is used to serve power: Current and future trends. Presentation at Widening NLP 2019. [Slides] [Video]
- Lewis, D., Moorkens, J., & Fatema, K. (2017). Integrating the management of personal data protection and open science with research ethics. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 60-65). Valencia, Spain: Association for Computational Linguistics.
- Narayanan, A., & Shmatikov, V. (2010). Myths and fallacies of "personally identifiable information". Communications of the ACM, 53 (6), 24-26.
- Nissenbaum, H. (2009). Privacy in context: Technology, policy, and the integrity of social life. Stanford: Stanford University Press.
- Solove, D. J. (2007). 'I've got nothing to hide' and other misunderstandings of privacy. San Diego Law Review, 44 (4), 745-772.
- Steel, E., & Angwin, J. (Aug 4, 2010). On the Web's cutting edge, anonymity in name only. The Wall Street Journal.
- Tene, O., & Polonetsky, J. (2012). Big data for all: Privacy and user control in the age of analytics. Northwestern Journal of Technology and Intellectual Property, 11(45), 239-273.
- Vitak, J., Shilton, K., & Ashktorab, Z. (2016). Beyond the Belmont principles: Ethical challenges, practices, and beliefs in the online data research community. In Proceedings of the 19th ACM conference on computer-supported cooperative work & social computing (pp. 941-953).
Papers:
- Hallinan, B., Brubaker, J. R., & Fiesler, C. (2019). Unexpected expectations: Public reaction to the Facebook emotional contagion study. New Media & Society, 1-19. [Tweet thread]
- Metcalf, J., & Crawford, K. (2016). Where are human subjects in big data research? The emerging ethics divide.Big Data & Society 3(1).
- Shilton, K., & Sayles, S. (2016). "We aren't all going to be on the same page about ethics": Ethical practices and challenges in research on digital and social media. In 2016 49th Hawaii international conference on system sciences (HICSS) (pp. 1909-1918).
- Townsend, L., & Wallace, C. (2015). Social media research: A guide to ethics. The University of Aberdeen.
- Williams, M. L., Burnap, P., & Sloan, L. (2017). Towards an ethical framework for publishing twitter data in social research: Taking into account users views, online context and algorithmic estimation. Sociology, 51 (6), 1149-1168.
- Woodfield, K. (Ed.). (2017). The ethics of online research. Emerald Publishing Limited.
- See especially Chs 2, 5, 7 and 8
- Bederson, B. B., & Quinn, A. J. (2011). Web workers unite! Addressing challenges of online laborers. In CHI'11 extended abstracts on human factors in computing systems (pp. 97-106).
- Callison-Burch, C. (2016). Crowd workers. (Slides from Crowdsourcing and Human Computation, accessed online 12/30/16)
- Callison-Burch, C. (2016). Ethics of crowdsourcing. (Slides from Crowdsourcing and Human Computation, accessed online 12/30/16)
- Fort, K., Adda, G., & Cohen, K. B. (2011). Amazon mechanical turk: Gold mine or coal mine? Computational Linguistics, 37 (2), 413-420.
- Snyder, J. (2010). Exploitation and sweatshop labor: Perspectives and issues. Business Ethics Quarterly, 20 (2), 187-213.
Language Variation and Emergent Bias
Reading questions:
- In what ways did the language vary?
- What social categories did it vary with?
- How did the language variation affect system performance?
- How would that differential performance lead to problems in the world?
- How could the system be made more robust to language variation?
- How could such a system be deployed more responsibly?
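For the performance questions in particular, the basic methodology in papers like Hovy & Søgaard 2015 and Tatman 2017 is disaggregated evaluation: scoring the system separately for each demographic group rather than reporting one aggregate number. Here is a minimal sketch, with entirely hypothetical labels and groups:

```python
# Disaggregated evaluation sketch: per-group accuracy for a tagger.
# The (gold, predicted, group) triples are hypothetical; in a real study
# the group would come from author metadata (age, dialect, gender, ...).
from collections import defaultdict

results = [
    ("NOUN", "NOUN", "under_35"),
    ("VERB", "VERB", "under_35"),
    ("NOUN", "VERB", "over_35"),
    ("VERB", "VERB", "over_35"),
]

correct, total = defaultdict(int), defaultdict(int)
for gold, pred, group in results:
    total[group] += 1
    correct[group] += int(gold == pred)

for group in sorted(total):
    print(f"{group}: accuracy = {correct[group] / total[group]:.2f} (n={total[group]})")
```

Any gap between the per-group numbers is the differential performance the questions above ask about; a single aggregate score would hide it.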
Papers:
- Broussard, M. (10 May 2018). Agenda: Why the Scots are such a struggle for Alexa and Siri. The Herald.
- Garimella, A., Banea, C., Hovy, D., & Mihalcea, R. (2019). Women's syntactic resilience and men's grammatical luck: Gender-bias in part-of-speech tagging and dependency parsing. In Proceedings of the 57th annual meeting of the association for computational linguistics (pp. 3493-3498). Florence, Italy: Association for Computational Linguistics.
- Hovy, D., & Søgaard, A. (2015). Tagging performance correlates with author age. In Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 2: Short papers) (pp. 483-488). Beijing, China: Association for Computational Linguistics.
- Huang, X., & Paul, M. J. (2019). Neural user factor adaptation for text classification: Learning to generalize across author demographics. In Proceedings of the eighth joint conference on lexical and computational semantics (*SEM 2019) (pp. 136-146). Minneapolis, Minnesota: Association for Computational Linguistics.
- Jørgensen, A., Hovy, D., & Søgaard, A. (2015). Challenges of studying and processing dialects in social media. In Proceedings of the workshop on noisy user-generated text (pp. 9-18). Beijing, China: Association for Computational Linguistics.
- Jurgens, D., Tsvetkov, Y., & Jurafsky, D. (2017). Incorporating dialectal variability for socially equitable language identification. In Proceedings of the 55th annual meeting of the association for computational linguistics (volume 2: Short papers) (pp. 51-57). Vancouver, Canada: Association for Computational Linguistics.
- Tatman, R. (2017). Gender and dialect bias in YouTube's automatic captions. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 53-59). Valencia, Spain: Association for Computational Linguistics.
Biomedical NLP, Mental Health and Social Media
Reading questions:
- What medical or health intervention is proposed?
- How were the data collected? Any privacy concerns?
- If the system were to be deployed, what possible positive and negative impacts would it have?
- What, if anything, does the paper have to say about ethical considerations?
Papers:
- Andrade, N. N. Gomes de, Pawson, D., Muriello, D., Donahue, L., & Guadagno, J. (2018, Dec 01). Ethics and artificial intelligence: Suicide prevention on Facebook. Philosophy & Technology, 31(4), 669-684.
- Barnett, I., & Torous, J. (2019, April). Ethics, transparency, and public health at the intersection of innovation and Facebook's suicide prevention efforts. Annals of Internal Medicine, 170(8), 565-566.
- Benton, A., Coppersmith, G., & Dredze, M. (2017). Ethical research protocols for social media health research. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 94-102). Valencia, Spain: Association for Computational Linguistics.
- Elder, A. (2019). Conversation from beyond the grave? A neo-Confucian ethics of chatbots of the dead. Journal of Applied Philosophy.
- Linthicum, K. P., Schafer, K. M., & Ribeiro, J. D. (2019). Machine learning in suicide science: Applications and ethics. Behavioral Sciences & the Law, 37(3), 214-222.
- McKernan, L. C., Clayton, E. W., & Walsh, C. G. (2018). Protecting life while preserving liberty: Ethical recommendations for suicide prevention with artificial intelligence. Frontiers in Psychiatry, 9.
- Šuster, S., Tulkens, S., & Daelemans, W. (2017, April). A short review of ethical challenges in clinical natural language processing. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 80-87). Valencia, Spain: Association for Computational Linguistics.
- Tucker, R. P., Tackett, M. J., Glickman, D., & Reger, M. A. (2019). Ethical and practical considerations in the use of a predictive model to trigger suicide prevention interventions in healthcare settings. Suicide and Life-Threatening Behavior , 49 (2), 382-392.
Papers:
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8).
- Seabrook, J. (2019). The next word: Where will predictive text take us? The New Yorker, 14 Oct 2019.
- Parra Escartín, C., Reijers, W., Lynn, T., Moorkens, J., Way, A., & Liu, C.-H. (2017). Ethical considerations in NLP shared tasks. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 66-73). Valencia, Spain: Association for Computational Linguistics.
- Smiley, C., Schilder, F., Plachouras, V., & Leidner, J. L. (2017). Say the right thing right: Ethics issues in natural language generation systems. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 103-108). Valencia, Spain: Association for Computational Linguistics.
- Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th annual meeting of the association for computational linguistics (pp. 3645-3650). Florence, Italy: Association for Computational Linguistics.
- Šuster, S., Tulkens, S., & Daelemans, W. (2017). A short review of ethical challenges in clinical natural language processing. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 80-87). Valencia, Spain: Association for Computational Linguistics.
- Torbati, Y. (Sept 2019). Google says Google Translate can't replace human translators. Immigration officials have used it to vet refugees. ProPublica.
- Vincent, J. (Feb 2019). AI researchers debate the ethics of sharing potentially harmful programs. The Verge.
NLP Apps Addressing Ethical Issues/NLP for Social Good
Reading questions:
- What was the social issue addressed?
- How well did it work/how could you carry out an evaluation if one wasn't done?
- Design noir: What could go wrong?
Papers:
- Demszky, D., Garg, N., Voigt, R., Zou, J., Shapiro, J., Gentzkow, M., et al. (2019). Analyzing polarization in social media: Method and application to tweets on 21 mass shootings. In Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers) (pp. 2970-3005). Minneapolis, Minnesota: Association for Computational Linguistics.
- Fokkens, A. (2016). Reading between the lines. (Slides presented at Language Analysis Portal Launch event, University of Oslo, Sept 2016)
- Jurgens, D., Hemphill, L., & Chandrasekharan, E. (2019). A just and comprehensive strategy for using NLP to address online abuse. In Proceedings of the 57th annual meeting of the association for computational linguistics (pp. 3658-3666). Florence, Italy: Association for Computational Linguistics.
- Lee, N., Bang, Y., Shin, J., & Fung, P. (2019). Understanding the shades of sexism in popular TV series. In Proceedings of the 2019 workshop on widening NLP (pp. 122-125). Florence, Italy: Association for Computational Linguistics.
- Gershgorn, D. (Feb 27, 2017). NOT THERE YET: Alphabet's hate-fighting AI doesn't understand hate yet. Quartz.
- Google.com. (2017). The women missing from the silver screen and the technology used to find them. Blog post, accessed March 1, 2017.
- Greenberg, A. (2016). Inside Google's Internet Justice League and its AI-powered war on trolls. Wired.
- Kelion, L. (Mar 1, 2017). Facebook artificial intelligence spots suicidal users. BBC News.
- Madnani, N., Loukina, A., Davier, A. von, Burstein, J., & Cahill, A. (2017). Building better open-source tools to support fairness in automated scoring. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 41-52). Valencia, Spain: Association for Computational Linguistics.
- Munger, K. (2016). Tweetment effects on the tweeted: Experimentally reducing racist harassment. Political Behavior, 1-21.
- Munger, K. (Nov 17, 2016). This researcher programmed bots to fight racism on Twitter. It worked. Washington Post.
- Murgia, M. (Feb 23, 2017). Google launches robo-tool to flag hate speech online. Financial Times.
- The Times is partnering with Jigsaw to expand comment capabilities. (Sep 20, 2016). The New York Times.
- Qian, J., ElSherief, M., Belding, E., & Wang, W. Y. (2019). Learning to decipher hate symbols. In Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers) (pp. 3006-3015). Minneapolis, Minnesota: Association for Computational Linguistics.
- Waseem, Z. (2016). Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proceedings of the first workshop on nlp and computational social science (pp. 138-142). Austin, Texas: Association for Computational Linguistics.
- Waseem, Z., & Hovy, D. (2016). Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In Proceedings of the naacl student research workshop (pp. 88-93). San Diego, California: Association for Computational Linguistics.
- Wiegand, M., Ruppenhofer, J., & Kleinbauer, T. (2019). Detection of Abusive Language: the Problem of Biased Datasets. In Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers) (pp. 602-608). Minneapolis, Minnesota: Association for Computational Linguistics.
Ethics Statements
Reading notes:
- Schmaltz 2018 is a proposal around this practice. The other references here are examples of papers that include an ethics statement. For this week, we'll be writing our own ethics statements, either for our own papers or for others we have selected.
Papers:
- Al-khazraji, S., Berke, L., Kafle, S., Yeung, P., & Huenerfauth, M. (2018). Modeling the speed and timing of American Sign Language to generate realistic animations. In Proceedings of the 20th international ACM SIGACCESS conference on computers and accessibility (pp. 259-270). New York, NY, USA: ACM.
- Chen, H., Cai, D., Dai, W., Dai, Z., & Ding, Y. (2019). Charge-based prison term prediction with deep gating network. (To appear at EMNLP 2019)
- Schmaltz, A. (2018). On the utility of lay summaries and AI safety disclosures: Toward robust, open research oversight. In Proceedings of the second ACL workshop on ethics in natural language processing (pp. 1-6). New Orleans, Louisiana, USA: Association for Computational Linguistics.
Value Sensitive Design
- Borning, A., & Muller, M. (2012). Next steps for value sensitive design. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 1125-1134).
- Friedman, B. (1996). Value-sensitive design. ACM Interactions, 3 (6), 17-23.
- Friedman, B., & Hendry, D. G. (2019). Value sensitive design: Shaping technology with moral imagination. MIT Press.
- Friedman, B., Hendry, D. G., Borning, A., et al. (2017). A survey of value sensitive design methods. Foundations and Trends in Human-Computer Interaction , 11 (2), 63-125.
- Friedman, B., & Kahn Jr., P. H. (2008). Human values, ethics, and design. In J. A. Jacko & A. Sears (Eds.), The human-computer interaction handbook (Revised second ed., pp. 1241-1266). Mahwah, NJ.
- Leidner, J. L., & Plachouras, V. (2017). Ethical by design: Ethics best practices for natural language processing. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 30-40). Valencia, Spain: Association for Computational Linguistics.
- Nathan, L. P., Klasnja, P. V., & Friedman, B. (2007). Value scenarios: a technique for envisioning systemic effects of new technologies. In CHI'07 extended abstracts on human factors in computing systems (pp. 2585-2590).
- Schnoebelen, T. (2017). Goal-oriented design for ethical machine learning and NLP. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 88-93). Valencia, Spain: Association for Computational Linguistics.
- Young, M., Magassa, L., & Friedman, B. (2019). Toward inclusive tech policy design: A method for underrepresented voices to strengthen tech policy documents. Ethics and Information Technology, 21(2), 89-103.
Documentation and Transparency
Reading questions:
- What would be required to collect the information requested?
- What kinds of ethical issues that we've considered would this help with and how?
- What kinds of ethical issues that we've considered would this not help with?
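As a rough picture of "the information requested", here is a sketch of the categories a data statement covers. This paraphrases the schema in Bender & Friedman 2018; the field names and prompts below are approximations, not the paper's exact wording:

```python
# Paraphrased skeleton of the data statement schema from Bender &
# Friedman 2018. Field names and prompts are approximations; in an
# actual data statement the answers would be prose.
data_statement = {
    "curation_rationale":    "Why were these texts selected and included?",
    "language_variety":      "Which language(s) and variety(ies), e.g. en-US?",
    "speaker_demographic":   "Age, gender, race/ethnicity, native language, ...",
    "annotator_demographic": "The same categories, for the annotators",
    "speech_situation":      "Time, place, modality, intended audience",
    "text_characteristics":  "Genre, topic, register",
    "recording_quality":     "For speech data: equipment and conditions",
    "other":                 "Anything else bearing on likely biases",
    "provenance_appendix":   "Data statements for any re-used source data",
}

for field, prompt in data_statement.items():
    print(f"{field}: {prompt}")
```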
Papers:
- Bender, E. M., & Friedman, B. (2018). Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6, 587-604.
- Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., et al. (2018). Datasheets for datasets.
- Holland, S., Hosny, A., Newman, S., Joseph, J., & Chmielinski, K. (2018). The dataset nutrition label: A framework to drive higher data quality standards.
- Mieskes, M. (2017, April). A quantitative study of data in the NLP community. In Proceedings of the first ACL workshop on ethics in natural language processing (pp. 23-29). Valencia, Spain: Association for Computational Linguistics.
- Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., et al. (2019). Model cards for model reporting. In Proceedings of the conference on fairness, accountability, and transparency (pp. 220-229). New York, NY, USA: ACM.
- Partnership on AI. (2019). ABOUT-ML: Annotation and benchmarking on understanding and transparency of machine learning lifecycles (ABOUT ML).
Reading questions:
- What is shared with value sensitive design?
- What contrasts with value sensitive design?
- How could this be applied to [insert your favorite NLP task]?
Papers:
- Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). Concrete problems in AI safety. CoRR, abs/1606.06565.
- Markham, A. (2012). Fabrication as ethical practice: Qualitative inquiry in ambiguous Internet contexts. Information, Communication & Society, 15(3), 334-353.
- Ratto, M. (2011). Critical making: Conceptual and material studies in technology and social life. The Information Society, 27 (4), 252-260.
- Russell, S., Dewey, D., & Tegmark, M. (2015). Research priorities for robust and beneficial artificial intelligence. AI Magazine.
- Shilton, K., & Anderson, S. (2016). Blended, not bossy: Ethics roles, responsibilities and expertise in design. Interacting with Computers.
- Shilton, K., & Sayles, S. (2016). "We aren't all going to be on the same page about ethics": Ethical practices and challenges in research on digital and social media. In 2016 49th Hawaii international conference on system sciences (HICSS) (pp. 1909-1918).
SciComm and Ethics Education
Papers:
- Burns, T. W., O'Connor, D. J., & Stocklmayer, S. M. (2003). Science communication: A contemporary definition. Public Understanding of Science, 12(2), 183-202.
- Di Bari, M., & Gouthier, D. (2002). Tropes, science and communication. Journal of Communication, 2(1).
- Fischhoff, B. (2013). The sciences of science communication. Proceedings of the National Academy of Sciences, 110(Supplement 3), 14033-14039.
- Mooney, C. (2010). Do scientists understand the public? American Academy of Arts & Sciences.
- Ngumbi, E. (2018, January 26). If you want to explain your science to the public, here's some advice. Scientific American.
- Phillips, C. M. L., & Beddoes, K. (2013). Really changing the conversation: The deficit model and public understanding of engineering. In Proceedings of the 120th ASEE annual conference & exposition.
- Shepherd, M. (2016, November 22). 9 tips for communicating science to people who are not scientists. Forbes, 1-4.
- Simis, M. J., Madden, H., Cacciatore, M. A., & Yeo, S. K. (2016). The lure of rationality: Why does the deficit model persist in science communication? Public Understanding of Science, 25 (4), 400-414.
(Proposals for) Codes of Ethics
- Cohen, K. B., Pestian, J., & Fort, K. (2015). Annotateurs volontaires investis et éthique de l'annotation de lettres de suicidés. In ETeRNAL (ethique et traitement automatique des langues).
- Fort, K., & Couillault, A. (2016). Yes, we care! Results of the ethics and natural language processing surveys. In Proceedings of the tenth international conference on language resources and evaluation (LREC 2016). Paris, France: European Language Resources Association (ELRA).
- Gillespie, T. (2014). The relevance of algorithms. In T. Gillespie, P. J. Boczkowski, & K. A. Foot (Eds.), Media technologies: Essays on communication, materiality, and society (pp. 167-194). MIT Press.
- Hardt, M., Price, E., & Srebro, N. (2016). Equality of opportunity in supervised learning. (Accessed online, 12/30/16)
- Kleinberg, J. M., Mullainathan, S., & Raghavan, M. (2016). Inherent trade-offs in the fair determination of risk scores. CoRR, abs/1609.05807.
- Metcalf, J., Keller, E. F., & boyd, d. (2016). Perspectives on big data, ethics, and society. (Accessed 12/30/16)
- Meyer, M. N. (2015). Two cheers for corporate experimentation: The A/B illusion and the virtues of data-driven innovation. Colo. Tech. L.J., 13, 273.
- Wallach, H. (Dec 19, 2014). Big data, machine learning, and the social sciences: Fairness, accountability, and transparency. Medium.
- Wattenberg, M., Viégas, F., & Hardt, M. (Oct 7, 2016). Attacking discrimination with smarter machine learning.
Links
Conferences/Workshops
- ACM FAT* Conference 2020 Barcelona, Spain, January 2020
- ALW3: 3rd Workshop on Abusive Language Online at ACL 2019, Florence, Italy, August 2019
- 1st ACL Workshop on Gender Bias in Natural Language Processing at ACL 2019, Florence, Italy, August 2019
- ACM FAT* Conference 2019 Atlanta, GA, January 2019
- ALW2: 2nd Workshop on Abusive Language Online at EMNLP 2018, Brussels, Belgium, October 2018
- FAT/ML 2018 at ICML 2018, Stockholm, Sweden, July 2018
- Ethics in Natural Language Processing at NAACL 2018, New Orleans LA, June 2018
- ACM FAT* Conference 2018 NY, NY, February 2018
- FAT/ML 2017 at KDD 2017, Halifax, Canada, August 2017
- ALW1: 1st Workshop on Abusive Language Online at ACL 2017, Vancouver, Canada, August 2017
- Ethics in Natural Language Processing at EACL 2017, Valencia, Spain, April 2017
- 3rd International Workshop on AI, Ethics and Society 4th or 5th February 2017 San Francisco, USA
- PDDM16 The 1st IEEE ICDM International Workshop on Privacy and Discrimination in Data Mining December 12, 2016 - Barcelona
- Machine Learning and the Law NIPS Symposium 8 December, 2016 Barcelona, Spain
- FAT/ML 2016 NY, NY, November 2016
- AAAI Fall Symposium on Privacy and Language Technologies, November 2016
- Workshop on Data and Algorithmic Transparency (DAT'16) November 19, 2016, New York University Law School
- WSDM 2016 Workshop on the Ethics of Online Experimentation, February 22, 2016
San Francisco, California
- ETHI-CA2 2016: ETHics In Corpus Collection, Annotation and Application, at LREC 2016, Portorož, Slovenia.
- FAT/ML 2015, at ICML 2015, Lille, France, July 2015
- ETeRNAL - Ethique et TRaitemeNt Automatique des Langues June 22, 2015, Caen
- Éthique et Traitement Automatique des Langues, Journée d'étude de l'ATALA
Paris, France, November 2014
- Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) 2014, at NIPS 2014, Montréal, Canada, December 2014
Other lists of resources
Other courses
ebender at u dot washington dot edu