Lectures

Click on lecture titles to view slides or the buttons to download them as PDFs.

Topic 1 Introduction to the Course, Probability, and R

You may want to read through Kevin Quinn’s matrix algebra and probability distribution reviews, or consult my undergrad lectures on discrete and continuous distributions. There are also R code and data for exploratory data analysis using histograms and boxplots, code and data for a simple bivariate linear regression, and code and data for a multiple regression example. Finally, you’ll find detailed instructions for downloading, installing, and learning my recommended software for quantitative social science here. Focus on steps 1.1 and 1.3 for now, and then, optionally, step 1.2. (Note: These recommendations may seem dated, as many students prefer to use RStudio as an integrated development environment in combination with RMarkdown. You are free to follow that model, which minimizes start-up costs. I still prefer a combination of Emacs, the plain R console, and LaTeX/XeLaTeX for my own productivity, with occasional use of Adobe Illustrator for graphics touch-up.)

Topic 2 Introduction to Maximum Likelihood

The R code to simulate heteroskedastic data and fit a heteroskedastic normal model to that data by maximum likelihood is here. There are additional mini-lectures on two topics. The first is a review of Bayes' Rule (download PDF). The second presents the Monty Hall problem (download PDF). Three versions of the Monty Hall simulation code are available: the first uses a loop and many small steps, the second uses a loop and more compact code, and the third uses lapply() to avoid looping.

Topic 3

There are separate R scripts for interpreting and selecting binary logit models, as well as an example dataset. The goodness of fit code also relies on R functions for computing the percent correctly predicted and making predicted-versus-actual plots and ROC plots, which you should place in your working directory.
An example trio of plots showing actual versus predicted probabilities, error versus predicted probabilities, and the ROC curve can be seen here.

Topic 4

R code and data for an ordered probit; the code produces expected value plots and first difference plots.

Topic 5

R code for a multinomial logit, which produces a variety of graphical summaries of the model: expected values plotted together, expected values plotted separately in a tiled format, first differences plotted for a single scenario and all categories, relative risks plotted for a single scenario and all categories, and relative risks plotted for many scenarios at once.

Topic 6

Two code examples are discussed in this lecture.

The first example analyzes bounded counts using Binomial, Beta-Binomial, and Quasibinomial models of turnout in the 2004 general election in Washington state. Example output includes this plot of expected counts under different models for various counterfactual scenarios. Note that this example uses multiple imputation to fill in missing data. You will need:

The second example analyzes unbounded counts using Poisson, Negative Binomial, Quasipoisson, Zero-inflated Poisson, and Zero-inflated Negative Binomial models of foreclosure filings by Houston, Texas area Home Owner Associations (HOAs). Example output includes this plot of expected values from a zero-inflated negative binomial model. You will need:

Advanced Topic 1 Missing Data and Multiple Imputation

See the Topic 6 example on turnout for R code using multiple imputation of missing data. Also available is an example (R script, data, plot) showing the use of overimputation to compute coverage of multiple imputation prediction intervals for real data.

Advanced Topic 2 Introduction to Multilevel Models

For the curious, the R script used to construct the example plots in the first half of this lecture is here.
Self-Study Lecture 1 Introduction to Contingency Tables

This lecture and the two below it introduce log-linear models of tabular data, and will not be presented as part of POLS/CSSS 510. They are posted here for interested students, especially those who want to use mosaic plots to investigate cross-tabulated data (covered in this lecture and in the third lecture on multidimensional tables). Students interested in a CSSS course on log-linear models should investigate CSSS 536.

Self-Study Lecture 2 Log-linear Models of Contingency Tables: 2D tables

Self-Study Lecture 3 Log-linear Models of Contingency Tables: 3D+ tables

Student Assignments

Due in Canvas by start of class Wednesday 9 October 2024

Due in Canvas by start of class Monday 21 October 2024

Due in Canvas by start of class Wednesday 6 November 2024
Data for problem 1 in comma-separated values format.

Due in Canvas by start of class Wednesday 20 November 2024
Data for problem 1 in comma-separated values format.

Due in Canvas by start of class Monday 2 December 2024
Data for problem 1 in comma-separated values format; data for problem 2 in R data format.

Poster Presentations
2 to 6 December 2024

Requirements and suggestions for poster presentations will be given in class.

Final Paper
Due Tuesday 10 December 2024, 3:00 pm by email

See the syllabus for paper requirements, and see my guidelines and recommendations for quantitative research papers.

Labs

Lab 1 R Review + Intro to RMarkdown and Overleaf

Supplementary material: Find our Slack channel here, and the recurring Zoom lab sessions, held every Friday, here. For today's lab, you will need to download the review_script.R and RMarkdownSample.Rmd files. You will also need to load the following datasets: pop.csv and gapminder.csv. As optional homework for next week, you can download the files Lab1_practice.Rmd and lab1_data.csv. You can download all the materials in the following ZIP file. You can access the lab recording at this link.
Lab 2 Probability Distributions and Statistical Inference

Supplementary material: For today's lab, we will go over this .Rmd file. We will also work with Lab2_practice.Rmd. You can also find the answer key for last week's code practice file. You can download all the materials in the following ZIP file. You can access the lab recording at this link.

Lab 3

Supplementary material: For today's lab, we will go over this .Rmd file. We will also work with Lab3_practice.Rmd. You can also find the answer key for last week's code practice file. You can download all the materials in the following ZIP file. You can access the lab recording at this link.

Lab 4 Prediction and Quantities of Interest (QoI)

Supplementary material: For today's lab, we will go over this .Rmd file. We will briefly review Lab3_practice_key.Rmd. You will also need to load the following datasets: crime.csv and nes00a.csv. You can download all the materials in the following ZIP file. You can access the lab recording at this link. You can also find the script solution for Problem Set 1.

Lab 5 Binary Models and Intro to tile

Supplementary material: For today's lab, we will go over this .Rmd file. Note that this lab includes a custom ggplot theme, theme_cavis, created by former TA Brian Leung. You will also need to load the following dataset: nes00a.csv. You can download all the materials in the following ZIP file. You can access the lab recording at this link.

Lab 6 In- and Out-of-Sample Goodness of Fit, and Cross-Validation for Binary Models

Supplementary material: For today's lab, we will go over this .Rmd file. Note that this lab includes Chris' source code binPredict to compute predicted versus actual plots. You will also need to load the following dataset: nes00a.csv. You can download all the materials in the following ZIP file. You can also find the script solution for Problem Set 2. You can access the lab recording at this link.
Lab 7

Supplementary material: For today's lab, we will go over this .Rmd file. Furthermore, you will need to load the following dataset: ordwarm2.csv. You can download all the materials in the following ZIP file.

Designed by Chris Adolph & Erika Steiskal · Copyright 2011–2024 · Privacy · Terms of Use