Corpus Methods in Syntax

Here is the syllabus from the course I taught on corpus methods in syntax at UC Berkeley, Fall 2000. The labs (with answer keys) are all available here.


This class will cover (1) tools and techniques for using natural language corpora and (2) the research methodology of previous corpus-based studies of syntax.


Course requirements


Now that the lecture has been reduced from one and a half hours to one hour, the lab will take the "full" two hours (minus :10, of course) from 1-3.

You can also go to the lab during open hours to use those computers to log in to corpora.berkeley.edu. The open hours are subject to change, so be sure to check (at the link above) when you plan to go.

Term project

The primary course requirement is a term project involving original, corpus-based research. You will be asked to turn in a proposal early in the semester so that I can help you design and execute your project. Proposals will also be discussed in class.

S/U credit

S/U credit will be given for attendance and participation. However, S/U students who want to do a term project, or who have some other project they'd like to "workshop", are welcome to do so.


Term project proposal (1 page): 10/2
Term project abstract (BLS/CLS style): 11/27
Term paper writing up results of project: 12/15

Schedule of Topics

This schedule is tentative. The lab topics may change depending on what resources we can acquire and when. The lecture topics may change depending on how many students wish to workshop their projects.

Week of / Lecture / Lab

Why corpus syntax?
(Kennedy Ch 1)
Lab orientation
Web resources

9/4 Corpora
Kennedy Ch 2
Intro to Unix

9/11 English corpus syntax: Lexis
Kennedy Ch 3.1--3.2
regular expressions

9/18 English corpus syntax: Lexis (cont)
Kennedy Ch 3.2
Update on ANC
tagged corpora

9/25 English corpus syntax: Grammar
Kennedy Ch 3.3--3.4
Fillmore on FrameNet: 2-4pm 6th floor ICSI

10/2 English corpus syntax: Variation
Kennedy Ch 3.5
tgrep I
Project proposal due 10/2

10/9 Corpus-based analysis
Kennedy Ch 5.1--5.2
tgrep II

10/16 Project workshop
Catch-up/review

10/23 Project workshop
Perl

10/30 Statistical sampling
Perl

11/6 Applying corpus methods
Arnold et al. 2000
NO LAB (Veterans Day)

11/13 Frequencies and competence
MacDonald 1994

11/20 Frequencies and grammaticality
Sampson 1987
NO LAB (Thanksgiving)

11/27 The balancing act
Abney 1996
Project abstract due 11/28

12/4 Wrap-up/review
Perl

Final paper due 12/15

Emily M. Bender (bender at csli dot stanford dot edu)
Last modified: Nov 6 2001