Publications (by research area)

    Bridging NLP and Linguistics (The RiPLes project)

  • William Lewis and Fei Xia. "Developing ODIN: A Multilingual Repository of Annotated Language Data for Hundreds of the World's Languages," Journal of Literary and Linguistic Computing (LLC).

  • Fei Xia, William Lewis and Hoifung Poon. "Language ID in the Context of Harvesting Language Data off the Web," The 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2009), Athens, Greece, March 30 - April 3, 2009. [pdf]

  • William Lewis and Fei Xia. "Parsing, Projecting & Prototypes: Repurposing Linguistic Data on the Web," The 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2009), Demo session, Athens, Greece, March 30 - April 3, 2009. [pdf]

  • Fei Xia, Carrie Lewis, and William Lewis. "Language ID for a Thousand Languages," LSA-2010, Baltimore, Maryland, Jan 7-10, 2010.

  • Fei Xia and William Lewis. "Applying NLP Technologies to the Collection and Enrichment of Language Data on the Web to Aid Linguistic Research," The workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities, and Education (LaTeCH-SHELT&R 2009), in conjunction with EACL-2009, Athens, Greece, March 30, 2009. [pdf]

  • Fei Xia and William Lewis. "Repurposing Theoretical Linguistic Data for Tool Development and Search," The Third International Joint Conference on Natural Language Processing (IJCNLP-2008), Hyderabad, India, Jan 7-12, 2008. [pdf]

  • William Lewis and Fei Xia. "Automatically Identifying Computationally Relevant Typological Features," The Third International Joint Conference on Natural Language Processing (IJCNLP-2008), Hyderabad, India, Jan 7-12, 2008. [pdf]

  • Fei Xia and William Lewis. "Multilingual Structural Projection across Interlinearized Text," The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007), Rochester, NY, April 22-27, 2007. [pdf]

  • William Lewis, Fei Xia, and Dan Jinguji. "Enriching Language Data through Projected Structures", The Workshop on Computational Linguistics for Less-studied Languages, organized by Texas Linguistics Society (TLSX), Austin, Texas, Nov 3-5, 2006. [pdf]

    Treebank development

  • Martha Palmer, Rajesh Bhatt, Bhuvana Narasimhan, Owen Rambow, Dipti Misra Sharma, and Fei Xia. "Hindi Syntax: Annotating Dependency, Lexical Predicate-Argument Structure, and Phrase Structure", The 7th International Conference on Natural Language Processing (ICON-2009), Hyderabad, India, Dec 14-17, 2009. [pdf]

  • Rajesh Bhatt, Bhuvana Narasimhan, Martha Palmer, Owen Rambow, Dipti Misra Sharma, Fei Xia. 2009. "A Multi-Representational and Multi-Layered Treebank for Hindi/Urdu," The Third Linguistic Annotation Workshop (The LAW III) in conjunction with ACL/IJCNLP 2009. Singapore. Aug 6-7, 2009. [pdf]

  • Fei Xia, Owen Rambow, Rajesh Bhatt, Martha Palmer, and Dipti Misra Sharma. "Towards a Multi-Representational Treebank," the 7th International Workshop on Treebanks and Linguistic Theories (TLT 2009), Groningen, Netherlands, Jan 23-24, 2009. [pdf]

  • Nianwen Xue, Fei Xia, Fu-dong Chiou, and Martha Palmer. "The Penn Chinese Treebank: Phrase Structure Annotation of a Large Corpus", Journal of Natural Language Engineering, 11(2): 207-238, 2005.

  • Fei Xia and Martha Palmer. "Converting Dependency Structures to Phrase Structures", the 1st Human Language Technology Conference (HLT-2001), San Diego, Mar 18-21, 2001. [pdf]

  • Fei Xia, Martha Palmer, Nianwen Xue, Mary Ellen Okurowski, John Kovarik, Fu-Dong Chiou, Shizhe Huang, Tony Kroch, Mitch Marcus. "Developing Guidelines and Ensuring Consistency for Chinese Text Annotation", the 2nd International Conference on Language Resources and Evaluation (LREC-2000), Athens, Greece, May 31 - June 2, 2000. [pdf]

  • Fei Xia. "The Segmentation Guidelines for the Penn Chinese Treebank (3.0)", IRCS Report 00-06, University of Pennsylvania, Oct 2000. [pdf]

  • Fei Xia. "The Part-of-Speech Guidelines for the Penn Chinese Treebank (3.0)", IRCS Report 00-07, University of Pennsylvania, Oct 2000. [pdf]

  • Nianwen Xue and Fei Xia. "The Bracketing Guidelines for the Penn Chinese Treebank (3.0)", IRCS Report 00-08, University of Pennsylvania, Oct 2000. [pdf]

    Bio-NLP

  • Imre Solti, Colin R. Cooke, Fei Xia, and Mark M. Wurfel, "Peeling Away the Black Box Label: Clinical Validation of a MaxEnt Machine Learning Character N-gram Feature Set for Acute Lung Injury," 2010 AMIA Summit on Translational Bioinformatics, San Francisco, CA, March 10-12, 2010. [pdf]

  • Scott Russell Halgrim, Fei Xia, Imre Solti, Eithon Cadag, and Ozlem Uzuner. "Statistical Extraction of Medication Information from Clinical Records," 2010 AMIA Summit on Translational Bioinformatics, San Francisco, CA, March 10-12, 2010. [pdf]

  • Imre Solti, Colin Cooke, Fei Xia, and Mark Wurfel. "Automated classification of radiology reports for acute lung injury: Comparison of keyword and machine learning based natural language processing approaches," NLP Workshop, IEEE International Conference on Bioinformatics and Biomedicine (BIBM-2009), Washington DC, November 1-4, 2009. [pdf]

  • Ozlem Uzuner, Imre Solti, and Fei Xia, 2009. "i2b2 Medication Extraction Challenge Preliminary Annotation Guidelines," Manuscript.

  • Ozlem Uzuner, Imre Solti, and Fei Xia, 2009. "i2b2 Medication Extraction Challenge Evaluation Metrics," Manuscript.

    Machine Translation

  • Achim Ruopp and Fei Xia. "Finding parallel texts on the web using cross-language information retrieval", The Workshop on Cross Language Information Access in conjunction with IJCNLP-2008. Hyderabad, India, Jan 7-12, 2008. [pdf]

  • Fei Xia and Michael McCord. "Improving a Statistical MT System with Automatically Learned Rewrite Patterns", the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland, Aug 22-29, 2004. [pdf]

  • Christoph Tillmann and Fei Xia. "A Phrase-Based Unigram Model for Statistical Machine Translation", the 3rd Human Language Technology Conference (HLT/NAACL 2003), Edmonton, Canada, May 27 - June 2, 2003. [pdf]

  • Y. Al-Onaizan, R. Florian, M. Franz, H. Hassan, Y. S. Lee, S. McCarley, K. Papineni, S. Roukos, J. Sorensen, C. Tillmann, T. Ward, F. Xia. "TIPS: A Translingual Information Processing System", The 3rd Human Language Technology Conference (HLT/NAACL-2003), Demo Session, Edmonton, Canada, May 27 - June 2, 2003. [pdf]

  • Hiyan Alshawi, Adam Buchsbaum and Fei Xia. "A Comparison of Head Transducers and Transfer for a Limited Domain Translation", the 35th Annual Meeting of the Association for Computational Linguistics (ACL-1997), Madrid, Spain, July 7-11, 1997. [pdf]

  • Hiyan Alshawi and Fei Xia. "English-to-Mandarin Speech Translation with Head Transducers", the Workshop of Spoken Language Translation (SLT-1997), Madrid, Spain, July 11, 1997. [pdf]

    Grammar Extraction (LexTract)

  • Fei Xia, Martha Palmer and Aravind Joshi. "From Treebank to Tree-Adjoining Grammar", in Complexity of Lexical Descriptions and its Relevance to Natural Language Processing: A Supertagging Approach, edited by Srinivas Bangalore and Aravind K. Joshi, MIT Press, To appear.

  • Fei Xia, Chung-hye Han, Martha Palmer and Aravind Joshi. "Automatically Extracting and Comparing Lexicalized Grammars for Different Languages", the 17th International Joint Conference on Artificial Intelligence (IJCAI-2001), Seattle, Aug 4-10, 2001. [pdf]

  • Fei Xia, Martha Palmer, and Aravind Joshi. "A Uniform Method of Grammar Extraction and Its Applications", the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/VLC-2000), Hong Kong, Oct 7-8, 2000. [pdf]

  • Fei Xia, Chung-hye Han, Martha Palmer, and Aravind Joshi. "Comparing Lexicalized Treebank Grammars Extracted from Chinese, Korean, and English Corpora", the 2nd Chinese Language Processing Workshop (CLP-2000), Hong Kong, Oct 8, 2000. [pdf]

  • Fei Xia and Martha Palmer. "Evaluating the Coverage of LTAGs on Annotated Corpora", the Workshop on Using Evaluation within HLT Programs: Results and Trends, Athens, Greece, May 30, 2000. [pdf]

  • Fei Xia and Tonia Bleam. "A Corpus-based Evaluation of Syntactic Locality in TAGs", the 5th International Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+ 2000), Paris, France, May 25-27, 2000. [pdf]

  • Fei Xia, Martha Palmer. "Comparing and Integrating Tree Adjoining Grammars", the 5th International Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+ 2000), Paris, France, May 25-27, 2000. [pdf]

  • Fei Xia, "Extracting Tree Adjoining Grammars from Bracketed Corpora", the 5th Natural Language Processing Pacific Rim Symposium (NLPRS-99), Beijing, China, Nov. 1999. [pdf]

    Grammar Generation (LexOrg)

  • Fei Xia, Martha Palmer, and K. Vijay-Shanker, "Developing Tree-Adjoining Grammars with Lexical Descriptions," in Complexity of Lexical Descriptions and its Relevance to Natural Language Processing: A Supertagging Approach, edited by Srinivas Bangalore and Aravind K. Joshi, MIT Press, To appear.

  • Fei Xia, Martha Palmer and K. Vijay-Shanker. "Automatically Generating Tree Adjoining Grammars from Abstract Specifications", Journal of Computational Intelligence, 21(3), 246-287, 2005. [pdf]

  • Fei Xia, Martha Palmer, K. Vijay-Shanker. "Towards Semi-automating Grammar Development", the 5th Natural Language Processing Pacific Rim Symposium (NLPRS-99), Beijing, China, Nov. 1999. [pdf]

  • Fei Xia, Martha Palmer, K. Vijay-Shanker and Joseph Rosenzweig. "Consistent Grammar Development Using Partial-Tree Descriptions for LTAGs", the 4th International Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+ 1998), Philadelphia, Aug 1-3, 1998. [pdf]

    Other Topics on LTAG

  • Anoop Sarkar, Fei Xia, and Aravind Joshi. "Some Experiments on Indicators of Parsing Complexity for Lexicalized Grammars", Efficiency in Large-Scale Parsing Systems Workshop, Luxembourg, Aug 5, 2000. [pdf]

  • Christy Doran, Beth Ann Hockey, Anoop Sarkar, B. Srinivas and Fei Xia. "Evolution of the XTAG System", in Tree Adjoining Grammars: Formalisms, Linguistic Analysis and Processing, a CSLI volume edited by Anne Abeille and Owen Rambow, 2000. [pdf]

  • Martha Palmer, Chung-hye Han, Fei Xia, Dania Egedi and Joseph Rosenzweig. "Constraining Lexical Selection across Languages Using Tree Adjoining Grammars", in Tree Adjoining Grammars: Formalisms, Linguistic Analysis and Processing, a CSLI volume edited by Anne Abeille and Owen Rambow, 2000. [pdf]

  • C. Doran, B. Hockey, P. Hopely, J. Rosenzweig, A. Sarkar, B. Srinivas, F. Xia, A. Nasr and O. Rambow. "Maintaining the Forest and Burning out the Underbrush in XTAG", the Workshop on Computational Environments for Grammar Development and Language Engineering (ENVGRAM-1997), Madrid, Spain, July 12, 1997. [pdf]

  • Chung-hye Han, Fei Xia, Martha Palmer and Joseph Rosenzweig. "Capturing Language Specific Constraints on Lexical Selection with Feature-Based LTAGs", International Conference on Chinese Computing (ICCC-1996), Singapore, June 1996. [pdf]

    Morphology and POS tagging

  • Michael Tepper and Fei Xia. "Inducing Morphemes Using Light Knowledge," To appear.

  • Michael Tepper and Fei Xia. "A Hybrid Approach to the Induction of Underlying Morphology," The Third International Joint Conference on Natural Language Processing (IJCNLP-2008), Hyderabad, India, Jan 7-12, 2008. [pdf]

  • Fei Xia and Lap Cheung, "Features, Bagging, and System Combination for the Chinese POS Tagging Task," The 5th SIGHAN Workshop on Chinese Language Processing (SIGHAN 2006), Sydney, Australia, July 22-23, 2006. [pdf]

    Teaching CL

  • Fei Xia. 2008. "The evolution of a statistical NLP course," In Proceedings of the Third ACL Workshop on Effective Tools and Methodologies for Teaching NLP and CL, Columbus, Ohio, June 19-20, 2008. [pdf]

  • Emily Bender, Fei Xia, and Erik Bansleben. 2008. "Building a flexible, collaborative, intensive master's program in computational linguistics," In Proceedings of the Third ACL Workshop on Effective Tools and Methodologies for Teaching NLP and CL, Columbus, Ohio, June 19-20, 2008. [pdf]