Term project

The term project is required only of students enrolled for the 4-credit option. Students enrolled for 2 credits have an alternative term assignment. Both are described below.

4-credit option:

The term project involves using R to address a research question in your subfield. You will demonstrate your ability to 1) use basic R functions (from the base R package), and 2) locate and use an R package or library that is new to you. You will import a dataset into R and write a script that analyzes it in a way that clearly addresses your research question. You'll describe what your script does in your presentation. Your submission will be a small data analysis that uses the new package, with a script that you can share on your research laboratory's or roundtable group's website.

To address your research question, you will choose a preexisting toy dataset to work with for the quarter. This can be one of the corpora available in the department's LDC holdings on Patas (see "Links" above), a dataset you already have (e.g., for a thesis project), or another corpus that is appropriate for the project you envision. Datasets must be cleared with the instructor. We use a toy dataset to avoid the time and effort needed to craft or collect one from scratch. We assume you focus on best practices for collecting your data in other courses; in this class, we focus on how you dig into your data.
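As a rough illustration of the import-and-analyze workflow described above, here is a minimal sketch in base R. Everything in it is hypothetical: the dataset, the column names (speaker, group, duration), and the research question (whether two speaker groups differ in duration) are invented for the example, and your own project will use a real dataset cleared with the instructor plus a package that is new to you.

```r
# Hypothetical toy dataset: durations (ms) for two speaker groups.
# All names and values are invented for illustration only.
toy <- data.frame(
  speaker  = c("s1", "s2", "s3", "s4", "s5", "s6"),
  group    = c("A", "A", "A", "B", "B", "B"),
  duration = c(100, 110, 120, 130, 140, 150)
)

# Write the toy data to a tab-delimited .txt file, then import it,
# standing in for the step of reading a preexisting dataset into R.
toy_path <- tempfile(fileext = ".txt")
write.table(toy, toy_path, sep = "\t", row.names = FALSE, quote = FALSE)
dat <- read.delim(toy_path, stringsAsFactors = TRUE)

# Inspect the structure and contents with base R functions.
str(dat)
summary(dat$duration)

# Mean duration per group.
group_means <- tapply(dat$duration, dat$group, mean)
print(group_means)

# A simple test of the hypothesis that the groups differ
# (t.test comes from the stats package, which ships with R).
result <- t.test(duration ~ group, data = dat)
print(result$p.value)
```

A script for the project would replace the invented data frame with a call that reads your own file, and would add a library() call for the package you have chosen.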

Term Project Milestones:

Week 3: Submit in Canvas the research question on which you will focus for your presentation/term project. This should be a question of real research interest to you, defined narrowly enough to allow you to state a clear, testable hypothesis.

Week 4: Give a 2-3 minute in-class progress report on your search for R packages in your subfield.

Weeks 6-8: Incorporate into your presentation a discussion of what you want your script to do.

Week 7: Come to class prepared to talk about the structure of the database you will use for your final project.

Week 10: List (in writing, in Canvas) the functions to be incorporated into the script you're writing for your final project.

Week 11: Give an in-class report on the progress you've made writing your script. Show us the functions you intend to use and the sample dataset on which the script will be run.

To be turned in (in Canvas):

1. Your script (.txt format). The script should have a transparent file name.

2. An introductory paragraph (.txt format) that describes what the script does and can be published with the script on a department lab or roundtable website.

3. A toy database (.txt format) on which the script runs, and which can be linked to the website where the script is published.

4. A 5-7 page write-up (in .pdf format) containing all of the following sections:

a. Research Question: articulate your research question, and state your hypothesis or hypotheses

b. Methods: set forth your methods for addressing the research question

c. Materials: describe your dataset (cite its source, describe its structure and summarize its contents)

d. Procedures: describe your analysis procedures (including your script-writing in R)

e. Results: present your results

f. Discussion: supply a summary discussion. How does this project lead you to respond to your research question? This section should also include an assessment of your toy dataset: what were its limitations? How would the database need to differ for you to consider it "really good" data for this project? Also consider how you might improve your script if you re-ran your study.

2-credit option:

Students enrolled for 2 credits are NOT expected to complete term projects. However, they are responsible for:

1. Adding any functions covered in the class session they lead to the shared list of R functions in the Sociolinguistics Laboratory Wiki (link is included on the syllabus).

2. Supplying a usage example for the function(s) added in (1).

The purpose of this requirement is to ensure that all students have hands-on work related to their subfield of interest.