
Visualizing Data and Models

CSSS 569

Good visual displays uncover patterns quantitative scientists might otherwise miss, and can make or break a paper. This course takes the design of graphics and tables seriously, and surveys a variety of visual techniques for exploring data and summarizing statistical models. Emphasis on principles of effective visualization, novel visual displays, examples from the social sciences, and implementation of recommended techniques in R.



Offered every Winter at the
University of Washington

Syllabus  

Readings  



Winter 2024

Class meets:
MW 4:30-5:50 pm
Smith Hall 211

TA:

Brian Leung
(UW Political Science)

Section meets:
F 3:30 pm–5:20 pm
Taught by Zoom

Lectures

Click on lecture titles to view slides, or on the buttons to download them as PDFs.

Short Course

Visualizing Model Inference and Robustness  

This is a 9-hour short course version of the full Data Visualization course; the lectures for the full term course are below. Students taking the short course will also need these additional resources:


Topic 1

Course Introduction  


Topic 2

Principles for the Visual Display of Scientific Information  


Topic 3

Cognitive Issues in Visualization  


Topic 4

Graphical Programming in R  

In the first part of the lecture, we will consider examples from ggplot2 collected in this R script, which relies on this dataset.


In the second half of the lecture, time permitting, we will work through an R script that uses grid graphics to solve a basic task: showing confidence intervals around a regression line, using this dataset. Finally, a more advanced grid graphics script replaces ticks with gridlines and packs the grid graphics code inside a more general and usable function, contained in this required helper file. The final graphic can be viewed here.
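As a rough illustration of the basic task described above (not the course script itself), the sketch below fits a bivariate regression and draws a shaded 95% confidence band around the fitted line. It makes two assumptions: simulated data stand in for the linked dataset, and base R graphics stand in for grid graphics to keep the example short. The names dat, pred, and plot_ci are hypothetical.

```r
# Simulated data stand in for the course dataset (an assumption for illustration)
set.seed(569)
x <- runif(50, 0, 10)
y <- 1 + 0.5 * x + rnorm(50, sd = 1)
dat <- data.frame(x = x, y = y)

# Fit a bivariate linear regression
fit <- lm(y ~ x, data = dat)

# Predict over an evenly spaced grid of x values, with 95% confidence intervals
grid_x <- data.frame(x = seq(min(dat$x), max(dat$x), length.out = 100))
pred <- as.data.frame(
  predict(fit, newdata = grid_x, interval = "confidence", level = 0.95)
)
pred$x <- grid_x$x

# Draw the interval as a shaded band, then overlay the data and the fitted line
# (base graphics here, not grid; plot_ci is a hypothetical helper)
plot_ci <- function(dat, pred) {
  plot(dat$x, dat$y, type = "n", xlab = "x", ylab = "y")
  polygon(c(pred$x, rev(pred$x)), c(pred$lwr, rev(pred$upr)),
          col = "grey85", border = NA)
  points(dat$x, dat$y, pch = 16, col = "grey40")
  lines(pred$x, pred$fit, lwd = 2)
}
plot_ci(dat, pred)
```

Drawing the band first and the line last keeps the fitted line visible on top of the shading. A grid graphics version would follow the same order of layers within a viewport.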


Interested students can find detailed instructions for downloading, installing, and learning my recommended software for quantitative social science here. Focus on steps 1.1 and 1.3 for now, and then, optionally, step 1.2. (Note: These recommendations may seem dated, as many students prefer to use RStudio as an integrated development environment in combination with R Markdown. You are free to follow that model, which minimizes start-up costs. I still prefer a combination of Emacs, the plain R console, and LaTeX/XeLaTeX for my own productivity, with occasional use of Adobe Illustrator for graphics touch-up.)


Topic 5

Exploratory Data Analysis: Between Data & Model    


Topic 6

Visualizing Inference  

Download instructions for the tile package can be found under the Software tab at left. We will discuss up to four examples in detail:


Topic 7

Interactive Visual Displays with R + Shiny

The Shiny package makes it easy to convert your R code and graphics, including those made with the tile package, into interactive displays for the web. We’ll work through this written Shiny tutorial.


We will discuss several Shiny interactives written by your instructor. Feel free to study the applications and code and come to class with questions:


Topic 8

Advanced LaTeX for Scientific Typesetting

Time permitting, we will consider the use of modern LaTeX typesetting tools, especially XeTeX and the fontspec package. I offer three stylesheets for students looking to spruce up their documents. (Students new to LaTeX should read the Not So Short Introduction to LaTeX before embarking on any of the advanced stylesheets below.)

  • caxetexFreeOL (manual). A powerful XeLaTeX stylesheet using free typefaces and implemented for the popular, easy-to-use LaTeX platform Overleaf. You can find everything you need to get started with caxetexFreeOL at this Overleaf project. Note in particular the template for research papers.
  • caxetexFree (manual). The same powerful XeLaTeX stylesheet using free typefaces, but for use on your local computer's TeX installation. You will need to download the relevant typefaces as instructed in the manual.
  • caxetexBook (manual). The main XeLaTeX stylesheet I use in my own publishing. You will need to purchase the commercial typefaces listed in the manual if you wish to use this stylesheet.


Gallery 1

Scales and Storytelling  


Gallery 2

Maps as Visual Displays of Information  


Gallery 3

Time Series as Narrative  

See also this excellent confection from XKCD explaining the scale of global temperature variation over the last 20,000 years.


Gallery 4

Grayscale Images of Continuous Data  


Gallery 5

Turning Tables into Graphs  


Gallery 6

Heatmaps for Visualizing Continuous Dyadic Data    


Gallery 7

Ternary Plots for Compositional Data Analysis  


Student Assignments

Problem Set 1  

Due by 24 January 2024; turn in problem 2 early if possible

You will need these data.


Problem Set 2  

Due 14 February 2024


Problem Set 3  

Due 13 March 2024


Breakout Group

Individual memo due before group meets; Group essay due by 27 February

Students will join a small group to discuss a visual display problem of common interest; creation and organization of these groups to be coordinated through the web. Students will write a 2-5 page memo before the first group meeting, and each group will write a 5-8+ page essay for the class on what they have learned, to be distributed by 26 February. Groups will answer questions from the class during the week of 26 February. See the syllabus for further details.


Final Poster

Presented during the final two classes

On an assigned day during the last week of the course, each poster group will present a poster applying the tools learned in class to their own research. Alternatively, students can take an article published in their field and show how better visuals would either more clearly convey the findings or cast doubt on them. The final presentation may address problems raised in the breakout session or problem sets, but it is usually more fruitful for students to tackle a new problem.


Labs

Lab 1

Intro to labs; intermediate R and prediction  

Supplementary material: We will go through this R Markdown file centered on the O-ring example and different methods to compute and visualize confidence intervals. If you want to follow the section while writing your own code, you can use this clean R Markdown file with no code but just notes. See this PDF report directly "knitted" from the R Markdown file, which is one of the recommended ways to produce and submit your homework.
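As a hedged sketch of one common way to compute such intervals (not the lab's own code), the example below fits a logit model to a binary outcome, builds 95% intervals on the link scale, and transforms them back to probabilities. Simulated data stand in for the O-ring data, and the variable names temp, fail, and ci are hypothetical.

```r
# Simulated binary outcomes stand in for the lab's data (an assumption)
set.seed(1)
temp <- round(runif(40, 50, 85))
fail <- rbinom(40, 1, plogis(10 - 0.15 * temp))

# Logit model of failure as a function of temperature
fit <- glm(fail ~ temp, family = binomial)

# Predict on the link (logit) scale, where a normal approximation
# to the sampling distribution is usually more reasonable
newdat <- data.frame(temp = 50:85)
p <- predict(fit, newdata = newdat, type = "link", se.fit = TRUE)

# Build the interval on the logit scale, then map it to probabilities
z <- qnorm(0.975)
ci <- data.frame(
  temp = newdat$temp,
  fit  = plogis(p$fit),
  lwr  = plogis(p$fit - z * p$se.fit),
  upr  = plogis(p$fit + z * p$se.fit)
)
head(ci)
```

Transforming the endpoints, and not the standard errors, keeps the interval inside the unit interval. Simulating from the estimated sampling distribution of the coefficients is another approach and can give slightly different bounds.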

For a crash course in base R and tidyverse, see supplementary slides (also, the source .Rmd; two datasets for practice and knitting: econ.csv and pop.csv).


Lab 2

R Markdown, LaTeX and Overleaf  

Supplementary material: We will go through this R Markdown sample file, which covers most of R Markdown's basic functionality; you will need this JPG to knit the final PDF output.

We will also go through this HW 1 starting file; you need these two figures (Fig 1 and Fig 2) to knit this PDF output.

We will also learn basic LaTeX with Overleaf. We will go through this template, which requires this figure and this bib file to be compiled.


Lab 3

ggplot2  

Supplementary material: To reproduce the figures for the electric vehicles in Washington State example, you need two datasets, ev_data.csv and county_data.csv, along with our customized ggplot2 theme. A supplementary .Rmd file containing the main code chunks for reproducing the graphs is available for reference.


Lab 4

advanced ggplot2  

Supplementary material: To reproduce the figures for the exercises, you need four datasets: Nobel prize winners data, Measles data, 92 Presidential election data, and Cy Young award data. A supplementary .Rmd file containing the main code chunks for reproducing the graphs is available for reference.


Lab 5

Intro to tile  

Supplementary material: Inequality example's script and data; voting example's script and data; crime example's script.


Lab 6

Visualizing spatial data  

Supplementary material: To reproduce the graphs, download the Washington Post dataset, and also the shapefiles for the United States (cb_2018_us_state_20m.zip) and New York City.


Lab 7

Visualizing relational data  

Supplementary material: To reproduce the graphs, download the Medici data and migration flow data. A supplementary .Rmd file containing the main code chunks for reproducing the graphs is available for reference.


Lab 8

Interactive Visual Display with R + Shiny  

Supplementary material: Simple Shiny app script 1, script 2, and a script to demonstrate the use of an action button.


Lab 9

Latest extension packages for visualization  

Supplementary material: The underlying .Rmd written with flipbookr + xaringan packages.



Designed by
Chris Adolph & Erika Steiskal

Copyright 2011–2024