Corpus resources: Software
The following list is in no particular order.
A search engine for searching treebanks, using a query language akin to TFS formalisms.
Kenji Kita's list of all kinds of software related to corpus linguistics and NLP (includes POS taggers, parsers, stemmers, language identification, and corpus tools).
: the X for Mac program we're using in the lab.
The PLUG word aligner
Analysis and Prediction of Innovation in the Lexicon.
An adaptation of Brill's tagger for French and Windows.
A commercial POS tagger (also gives partial grammatical analysis) for French.
A prototype morphological-semantic parser for English compound participles.
Windows software providing some of the functionality of the SARA tool distributed with the BNC (SARA is a UNIX tool).
Bigram Statistics Package by Ted Pedersen.
by William H. Fletcher. This is software you download to your computer to do KWIC searches of the web.
Computational Linguistics Group, University of Wolverhampton
Anaphora resolution and other software.
Technical University of Catalonia
Tagger, Parser and Morphosyntactic Analyzer for English, Spanish, and Catalan.
A new and improved version of tgrep by Douglas L. T. Rohde.
A prototype program for processing multilingual texts, initially written for processing Sheffield University's METER corpus. By Scott Songlin Piao.
Segmenter/tagger for Chinese
(This link takes you to a registration page, which displays in Chinese.)
(Japanese Morphological Analyzer -- Tokenizer and POS tagger)
Emily M. Bender (bender at csli dot stanford dot edu)
Last modified: Jan 18 2002