To ensure compliance with the licenses for the various
corpora we have installed, we have instituted the following policies.
| Title | LDC Catalogue number | Restricted access | Language(s) | License link |
| CALLHOME Egyptian Arabic Transcripts Supplement | LDC2002T38 | | Arabic | general |
| Arabic Gigaword | LDC2003T12 | | Arabic | general |
| Arabic Treebank: Part 2 v 2.0 | LDC2004T02 | | Arabic | general |
| Arabic Treebank: Part 3 v 1.0 | LDC2004T11 | | Arabic | general |
| Arabic News Translation Text Part 1 | LDC2004T17 | | Arabic | general |
| Arabic Treebank: Part 1 v 3.0 (POS with full vocalization + syntactic analysis) | LDC2005T02 | | Arabic | general |
| CALLHOME Egyptian Arabic Transcripts | LDC97T19 | | Arabic | general |
| TIDES Extraction (ACE) 2003 Multilingual Training Data | LDC2004T09 | | Arabic, Chinese, English | general |
| ACE 2004 Multilingual Training Corpus | LDC2005T09 | | Arabic, Chinese, English | general |
| Arabic Treebank Part 1 --- 10K-word English translation | LDC2003T07 | | Arabic, English | general |
| Multiple-Translation Arabic (MTA) Part 1 | LDC2003T18 | | Arabic, English | general |
| Arabic English Parallel News Part 1 | LDC2004T18 | | Arabic, English | general |
| Multiple-Translation Arabic (MTA) Part 2 | LDC2005T05 | | Arabic, English | general |
| TREC Mandarin | LDC2000T52 | | Chinese | specific |
| Chinese Gigaword | LDC2003T09 | | Chinese | general |
| Chinese Treebank 5.0 | LDC2005T01 | | Chinese | general |
| Mandarin Chinese News Text | LDC95T13 | | Chinese | specific |
| CALLHOME Mandarin Chinese Transcripts | LDC96T16 | | Chinese | general |
| TDT2 Multilanguage Text Version 4.0 | LDC2001T57 | | Chinese, English | general |
| TDT3 Multilanguage Text Version 2.0 | LDC2001T58 | | Chinese, English | general |
| Chinese-English Translation Lexicon (v3.0) | LDC2002L27 | | Chinese, English | general |
| Multiple-Translation Chinese Corpus | LDC2002T01 | | Chinese, English | general |
| SummBank 1.0 | LDC2003T16 | | Chinese, English | general |
| Multiple-Translation Chinese (MTC) Part 2 | LDC2003T17 | | Chinese, English | general |
| Multiple-Translation Chinese (MTC) Part 3 | LDC2004T07 | | Chinese, English | general |
| Hong Kong Parallel Text | LDC2004T08 | | Chinese, English | specific |
| Chinese News Translation Text Part 1 | LDC2005T06 | | Chinese, English | general |
| Chinese-English News Magazine Parallel Text | LDC2005T10 | | Chinese, English | general |
| Czech Broadcast News Transcripts | LDC2004T01 | | Czech | general |
| Prague Dependency Treebank 1.0 | LDC2001T10 | | Czech, English | general |
| Prague Czech-English Dependency Treebank Version 1.0 | LDC2004T25 | | Czech, English | general |
| Grassfields Bantu Fieldwork: Dschang Lexicon | LDC2003L01 | | Dschang | general |
| Grassfields Bantu Fieldwork: Dschang Tone Paradigms | LDC2003S02 | | Dschang | general |
| CELEX 2 | LDC96L14 | | Dutch, German, English | specific |
| Santa Barbara Corpus of Spoken American English Part-I | LDC2000S85 | | English | general |
| BLLIP 1987-89 WSJ Corpus Release 1 | LDC2000T43 | | English | specific |
| MUC 7 | LDC2001T02 | | English | general |
| Temporal Evaluation Examples | LDC2002E05 | | English | general |
| RST Discourse Treebank | LDC2002T07 | | English | general |
| The AQUAINT Corpus of English News Text | LDC2002T31 | | English | general |
| Santa Barbara Corpus of Spoken American English Part-II | LDC2003S06 | | English | general |
| ACE-2 Version 1.0 | LDC2003T11 | | English | general |
| MUC 6 | LDC2003T13 | | English | general |
| SLX Corpus of Classic Sociolinguistic Interviews | LDC2003T15 | | English | general |
| ANC First Release | LDC2003T20 | Restricted access | English | specific |
| Santa Barbara Corpus of Spoken American English III | LDC2004S10 | | English | general |
| Proposition Bank I | LDC2004T14 | | English | general |
| ACE Time Normalization (TERN) 2004 English Training Data v1.0 | LDC2005T07 | | English | general |
| English Gigaword Second Edition | LDC2005T12 | | English | general |
| CCGbank | LDC2005T13 | | English | general |
| HCRC Map Task Corpus | LDC93S12 | | English | general |
| ACL/DCI | LDC93T1 | | English | specific |
| North American News Text Corpus | LDC95T21 | Restricted access | English | specific |
| English Treebank 2 | LDC95T7 | | English | general |
| COMLEX Syntax Text Corpus Version 2.0 | LDC96T11 | Restricted access | English | specific |
| DSO Corpus of Sense-Tagged English | LDC97T12 | | English | general |
| CALLHOME American English Transcripts | LDC97T14 | | English | general |
| COMLEX English syntax Lexicon | LDC98L21 | Restricted access | English | specific |
| North American News Text Supplement | LDC98T30 | Restricted access | English | specific |
| Treebank-3 | LDC99T42 | | English | general |
| Hansard French/English | LDC95T20 | | French, English | general |
| European Language Newspaper Text | LDC95T11 | Restricted access | French, German, Portuguese | specific |
| UN Parallel Text (Complete) | LDC94T4A | | French, Spanish, English | specific |
| CALLHOME German Transcripts | LDC97T15 | | German | general |
| Japanese Business News Text | LDC95T8 | Restricted access | Japanese | specific |
| CALLHOME Japanese Transcripts | LDC96T18 | | Japanese | general |
| Japanese Business News Text Supplement | LDC99T34 | Restricted access | Japanese | specific |
| Korean Newswire | LDC2000T45 | | Korean | general |
| Korean Telephone Conversations Transcripts | LDC2003T08 | | Korean | general |
| Klex: Finite-State Lexical Transducer for Korean | LDC2004L01 | | Korean | general |
| Morphologically Annotated Korean Text | LDC2004T03 | | Korean | general |
| Korean English Treebank Annotations | LDC2002T26 | | Korean, English | specific |
| ECI Multilingual Text | LDC94T5 | | Multi | specific |
| Grassfields Bantu Fieldwork: Ngomba Tone Paradigms | LDC2001S16 | | Ngomba | general |
| Cetempublico | LDC2001T62 | | Portuguese | specific |
| Portuguese Newswire Text | LDC99T40 | | Portuguese | general |
| CALLHOME Spanish Dialogue Act Annotation | LDC2001T61 | | Spanish | general |
| Spanish Newswire Text, Volume 2 | LDC99T41 | | Spanish | general |