Selected Publications (by research area)
1. Bridging NLP and Linguistics (The RiPLes project)
- Developing ODIN:
- William Lewis and Fei Xia, 2010.
Developing ODIN: A Multilingual Repository of Annotated Language Data for Hundreds of the World's Languages,
Journal of Literary and Linguistic Computing (LLC), 25(3):303-319.
[pdf]
- Fei Xia, Carrie Lewis and William D. Lewis, 2010.
The Problems of Language Identification within Hugely Multilingual Data Sets,
Proceedings of the 7th International Conference on Language Resources and
Evaluation (LREC 2010), pages 2790-2797, Valletta, Malta, May 19-21, 2010.
[pdf]
- Fei Xia, William Lewis and Hoifung Poon, 2009.
Language ID in the Context of Harvesting Language Data off the Web,
Proceedings of the 12th Conference of the European Chapter of the ACL
(EACL-2009), pages 870-878, Athens, Greece, March 30 - April 3, 2009.
[pdf]
- Fei Xia and William Lewis, 2008.
Repurposing Theoretical Linguistic Data for Tool Development and Search,
Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP-2008), pages 529-536, Hyderabad, India, Jan 7-12, 2008.
[pdf]
- Building language profiles and comparing languages:
- Ryan Georgi, Fei Xia, and Will Lewis, 2010.
Comparing Language Similarity across Genetic and Typologically-Based Groupings,
Proceedings of the 23rd International Conference on Computational Linguistics
(COLING 2010), pages 385-393, Beijing, China, August 23-27, 2010.
[pdf]
- Fei Xia and William Lewis, 2009.
Applying NLP Technologies to the Collection and Enrichment of Language Data on the Web to Aid Linguistic Research,
Proceedings of the EACL 2009 Workshop on Language Technology and Resources for
Cultural Heritage, Social Sciences, Humanities, and Education (LaTeCH-SHELT&R 2009), pages 51-59, Athens, Greece, 30 March 2009.
[pdf]
- William Lewis and Fei Xia, 2008.
Automatically Identifying Computationally Relevant Typological Features,
Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP-2008), pages 685-690, Hyderabad, India, Jan 7-12, 2008.
[pdf]
- Structural projection:
- Ryan Georgi, Fei Xia, and William D. Lewis, 2012. Improving Dependency Parsing with Interlinear Glossed Text and Syntactic Projection, short paper, in Proceedings of COLING, Mumbai, India, Dec 2012.
[pdf]
- Ryan Georgi, Fei Xia, and William D. Lewis, 2012.
Measuring the Divergence of Dependency Structures Cross-Linguistically to Improve Syntactic Projection Algorithms,
In Proceedings of LREC, Istanbul, Turkey, May 22-25, 2012.
[pdf]
- Fei Xia and William Lewis, 2007.
Multilingual Structural Projection across Interlinearized Text,
Proceedings of NAACL HLT 2007, pages 452-459, Rochester, NY, April 22-27, 2007.
[pdf]
2. Treebank development
- Conversion from dependency structure to phrase structure:
- Rajesh Bhatt, Owen Rambow, and Fei Xia, 2012. Creating a Tree Adjoining Grammar from a Multilayer Treebank, in Proceedings of the 11th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+11), pages 162-170, Paris, France, September 2012.
[pdf]
- Rajesh Bhatt and Fei Xia, 2012.
Challenges in Converting between Treebanks: a Case Study from the HUTB,
in Proceedings of META-RESEARCH Workshop on Advanced Treebanking, in conjunction with LREC-2012, Istanbul, Turkey.
[pdf]
- Rajesh Bhatt, Owen Rambow, and Fei Xia, 2011.
Linguistic Phenomena, Analyses, and Representations: Understanding Conversion between Treebanks,
In the Proc. of the IJCNLP, Chiang Mai, Thailand, Nov 9-13, 2011.
[pdf]
- Fei Xia, Owen Rambow, Rajesh Bhatt, Martha Palmer, and Dipti Misra Sharma, 2009.
Towards a Multi-Representational Treebank,
Proceedings of the 7th International Workshop on Treebanks and Linguistic Theories (TLT 2009), pages 159-170, Groningen, Netherlands, Jan 23-24, 2009.
[pdf]
- Fei Xia and Martha Palmer, 2001.
Converting Dependency Structures to Phrase Structures,
Proceedings of the 1st Human Language Technology Conference (HLT-2001), San
Diego, Mar 18-21, 2001.
[pdf]
- The Hindi/Urdu Treebank Project:
- Archna Bhatia, Rajesh Bhatt, Bhuvana Narasimhan, Martha Palmer, Owen Rambow,
Dipti Misra Sharma, Michael Tepper, Ashwini Vaidya, Fei Xia, 2010.
Empty Categories in a Hindi Treebank,
Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pages 1863-1870, Valletta, Malta, May 19-21, 2010.
[pdf]
- Martha Palmer, Rajesh Bhatt, Bhuvana Narasimhan, Owen Rambow, Dipti
Misra Sharma, and Fei Xia, 2009.
Hindi Syntax: Annotating Dependency, Lexical Predicate-Argument
Structure, and Phrase Structure,
Proceedings of the 7th International Conference on Natural Language Processing (ICON-2009), pages 259-268, Hyderabad, India, Dec 14-17, 2009.
[pdf]
- Rajesh Bhatt, Bhuvana Narasimhan, Martha Palmer, Owen Rambow,
Dipti Misra Sharma, and Fei Xia, 2009.
A Multi-Representational and Multi-Layered Treebank for Hindi/Urdu,
Proceedings of the Third Linguistic Annotation Workshop (LAW 2009), ACL-IJCNLP 2009, pages 186-189, Singapore, 6-7 August 2009.
[pdf]
- The Chinese Penn Treebank Project:
- Nianwen Xue, Fei Xia, Fu-dong Chiou, and Martha Palmer, 2005.
The Penn Chinese Treebank: Phrase Structure Annotation of a Large Corpus,
Journal of Natural Language Engineering, 11(2): 207-238, 2005.
Cambridge University Press.
[pdf]
- Fei Xia, Martha Palmer, Nianwen Xue, Mary Ellen Okurowski, John Kovarik,
Fu-Dong Chiou, Shizhe Huang, Tony Kroch, and Mitch Marcus, 2000.
Developing Guidelines and Ensuring Consistency for Chinese Text Annotation,
Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC-2000), Athens, Greece, May 31 - June 2, 2000.
[pdf]
- Fei Xia, 2000.
The Segmentation Guidelines for the Penn Chinese Treebank (3.0),
IRCS Report 00-06, University of Pennsylvania, Oct 2000.
[pdf]
- Fei Xia, 2000.
The Part-of-Speech Guidelines for the Penn Chinese Treebank (3.0),
IRCS Report 00-07, University of Pennsylvania, Oct 2000.
[pdf]
- Nianwen Xue and Fei Xia, 2000.
The Bracketing Guidelines for the Penn Chinese Treebank (3.0),
IRCS Report 00-08, University of Pennsylvania, Oct 2000.
[pdf]
3. Bio-NLP
- Phenotype detection:
- Cosmin Adrian Bejan, Lucy Vanderwende, Fei Xia, Meliha Yetisgen-Yildiz. Assertion modeling and its role in clinical phenotype identification, accepted by Journal of Biomedical Informatics.
[pdf]
- Cosmin Adrian Bejan, Fei Xia, Lucy Vanderwende, Mark M. Wurfel, and Meliha Yetisgen-Yildiz, 2012.
Pneumonia identification using statistical feature selection,
Journal of the American Medical Informatics Association (JAMIA), 19(5): 817-823.
[pdf]
- Meliha Yetisgen-Yildiz, Bradford Glavan, Fei Xia, Lucy Vanderwende, and Mark Wurfel, 2011.
Identifying Patients with Pneumonia from Free-Text Intensive Care Unit Reports.
In Proc. of the ICML workshop on Learning from Unstructured Clinical Text, Bellevue, WA, July 2, 2011.
[pdf]
- Imre Solti, Colin Cooke, Fei Xia, and Mark Wurfel, 2009.
Automated classification of radiology reports for acute lung injury: Comparison of keyword and machine learning based natural language processing approaches,
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine Workshop (BIBM-2009), pages 314-319, Washington DC, November 1-4, 2009.
[pdf]
- Detecting critical recommendations:
- Meliha Yetisgen-Yildiz, Martin Gunn, Fei Xia, and Tom Payne, 2011.
Automatic Identification of Critical Follow-Up Recommendation Sentences in Radiology Reports.
In Proc. of the AMIA 2011 Annual Symposium, Washington DC, Oct 22-26, 2011.
[pdf]
- Clinical corpus annotation:
- Fei Xia and Meliha Yetisgen-Yildiz, 2012.
Clinical corpus annotation: challenges and strategies,
in Proceedings of the third Workshop on Building and Evaluating Resources for Biomedical Text Mining, in conjunction with LREC-2012, Istanbul, Turkey.
[pdf]
- Ozlem Uzuner, Imre Solti, Fei Xia, and Eithon Cadag, 2010.
Community Annotation Experiment for Ground Truth Generation for the i2b2 Medication Challenge,
Journal of the American Medical Informatics Association (JAMIA), 17:519-523.
[pdf]
- Meliha Yetisgen-Yildiz, Imre Solti, Fei Xia, and Scott Halgrim, 2010.
Preliminary Experiments with Amazon's Mechanical Turk for Annotating Medical Named Entities,
Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, pages 180-183, Los Angeles, June 2010.
[pdf]
- Extracting medication information (the 2009 i2b2 challenge):
- Scott Halgrim, Fei Xia, Imre Solti, Eithon Cadag, Ozlem Uzuner, 2011.
A cascade of MaxEnt classifiers applied to extracting medication information from discharge summaries,
Journal of Biomedical Semantics 2011, 2 (Suppl 3):S2.
[pdf]
- Fei Xia, Imre Solti, and Ozlem Uzuner, 2009.
UW Internal Annotation Guidelines for the 2009 i2b2 Challenge and UW Medication IE System, Manuscript.
[manuscript]
- Ozlem Uzuner, Imre Solti, and Fei Xia, 2009.
The i2b2 Medication Extraction Challenge Preliminary Annotation Guidelines,
Manuscript.
[manuscript]
- Ozlem Uzuner, Imre Solti, and Fei Xia, 2009.
The i2b2 Medication Extraction Challenge Evaluation Metrics,
Manuscript.
[manuscript]
- Other Bio-NLP topics:
- Louise Deleger, Katalin Molnar, Guergana Savova, Fei Xia, Todd Lingren, Qi Li, Keith Marsolo, Anil G. Jegga, Megan Kaiser, Laura Stoutenborough, and Imre Solti, 2013.
Large Scale Evaluation of Automated Clinical Note De-identification and its Impact on Information Extraction.
Journal of the American Medical Informatics Association (JAMIA),
20(1): 84-94.
[pdf]
- Michael Tepper, Daniel Capurro, Fei Xia, Lucy Vanderwende, and Meliha Yetisgen-Yildiz, 2012.
Statistical Section Segmentation in Free-Text Clinical Records.
In the Proceedings of the LREC, Istanbul, Turkey, May 22-25, 2012.
[pdf]
- Cuijun Wu, Fei Xia, Louise Deleger, and Imre Solti, 2011.
Statistical Machine Translation for Biomedical Text: Are We There Yet?
In the Proc. of the AMIA 2011 Annual Symposium, Washington DC, Oct 22-26, 2011.
[pdf]
4. Chinese NLP
- Domain adaptation:
- Dong Wang and Fei Xia, 2012.
Effort of Genre Variation and Prediction of System Performance,
In Proceedings of LREC, Istanbul, Turkey, May 22-25, 2012.
[pdf]
- Yang Song and Fei Xia, 2012.
Using a Goodness Measurement for Domain Adaptation: A Case Study on Chinese Word Segmentation,
In Proceedings of LREC, Istanbul, Turkey, May 22-25, 2012.
[pdf]
- POS tagging:
- Alex Cheng, Fei Xia, and Jianfeng Gao, 2010.
A comparison of unsupervised methods for Part of Speech Tagging in Chinese,
Proceedings of the 23rd International Conference on Computational Linguistics
(COLING 2010), Poster Volume, pages 135-143, Beijing, China, August 23-27, 2010.
[pdf]
- The Chinese Penn Treebank Project (see the "Treebank Development" section)
5. Machine Translation
- Statistical MT:
- Fei Xia and Michael McCord, 2004.
Improving a Statistical MT System with Automatically Learned Rewrite Patterns,
Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland, Aug 22-29, 2004.
[pdf]
- Christoph Tillmann and Fei Xia, 2003.
A Phrase-Based Unigram Model for Statistical Machine Translation,
Proceedings of the 3rd Human Language Technology Conference (HLT/NAACL 2003), Edmonton, Canada, May 27 - June 2, 2003.
[pdf]
- Transfer-based MT:
- Hiyan Alshawi, Adam Buchsbaum, and Fei Xia, 1997.
A Comparison of Head Transducers and Transfer for a Limited Domain Translation,
Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL-1997), pages 360-365, Madrid, Spain, July 7-11, 1997.
[pdf]
6. Tree Adjoining Grammar
- Grammar Extraction (LexTract):
- Fei Xia and Martha Palmer, 2010.
From Treebank to Tree-Adjoining Grammar,
In Supertagging: Using Complex Lexical Descriptions in Natural Language Processing, edited by Srinivas Bangalore and Aravind K. Joshi, pages 35-72, MIT Press, 2010.
[pdf]
- Fei Xia, Chung-hye Han, Martha Palmer and Aravind Joshi, 2001.
Automatically Extracting and Comparing Lexicalized Grammars for Different Languages,
Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI-2001), pages 1321-1326, Seattle, Aug 4-10, 2001.
[pdf]
- Fei Xia, Martha Palmer, and Aravind Joshi, 2000.
A Uniform Method of Grammar Extraction and Its Applications,
Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/VLC-2000), pages 53-62, Hong Kong, Oct 7-8, 2000.
[pdf]
- Fei Xia and Martha Palmer, 2000.
Evaluating the Coverage of LTAGs on Annotated Corpora,
Proceedings of the Workshop on Using Evaluation within HLT Programs: Results and Trends, Athens, Greece, May 30, 2000.
[pdf]
- Fei Xia and Tonia Bleam, 2000.
A Corpus-based Evaluation of Syntactic Locality in TAGs,
Proceedings of the 5th International Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+ 2000), pages 215-220, Paris, France, May 25-27, 2000.
[pdf]
- Fei Xia and Martha Palmer, 2000.
Comparing and Integrating Tree Adjoining Grammars,
Proceedings of the 5th International Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+ 2000), pages 265-268, Paris, France, May 25-27, 2000.
[pdf]
- Fei Xia, 1999.
Extracting Tree Adjoining Grammars from Bracketed Corpora,
Proceedings of the 5th Natural Language Processing Pacific Rim Symposium (NLPRS-99), pages 398-403, Beijing, China, Nov. 1999.
[pdf]
- Grammar Generation (LexOrg):
- Fei Xia, Martha Palmer, and K. Vijay-Shanker, 2010.
Developing Tree-Adjoining Grammars with Lexical Descriptions,
in Supertagging: Using Complex Lexical Descriptions in Natural Language Processing, edited by Srinivas Bangalore and Aravind K. Joshi, pages 73-110, MIT Press, 2010.
[pdf]
- Fei Xia, Martha Palmer and K. Vijay-Shanker, 2005.
Automatically Generating Tree Adjoining Grammars from Abstract Specifications,
Journal of Computational Intelligence, 21(3): 246-287, 2005.
[pdf]
- Fei Xia, Martha Palmer, and K. Vijay-Shanker, 1999.
Towards Semi-automating Grammar Development,
Proceedings of the 5th Natural Language Processing Pacific Rim Symposium (NLPRS-99), pages 96-101, Beijing, China, Nov. 1999.
[pdf]
- Fei Xia, Martha Palmer, K. Vijay-Shanker and Joseph Rosenzweig, 1998.
Consistent Grammar Development Using Partial-Tree Descriptions for LTAGs,
Proceedings of the 4th International Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+ 1998), pages 180-183, Philadelphia, Aug 1-3, 1998.
[pdf]
- Other Topics on LTAG:
- Anoop Sarkar, Fei Xia, and Aravind Joshi, 2000.
Some Experiments on Indicators of Parsing Complexity for Lexicalized Grammars,
In Proceedings of Efficiency in Large-Scale Parsing Systems Workshop, Luxembourg, Aug 5, 2000.
[pdf]
- Christy Doran, Beth Ann Hockey, Anoop Sarkar, B. Srinivas and Fei Xia, 2000.
Evolution of the XTAG System,
in Tree Adjoining Grammars: Formalisms, Linguistic Analysis and Processing,
a CSLI volume edited by Anne Abeille and Owen Rambow, pages 371-404, 2000.
[pdf]
7. Other topics
- Morphological induction:
- Michael Tepper and Fei Xia, 2010.
Inducing Morphemes Using Light Knowledge,
Journal of ACM Transactions on Asian Language Information Processing (TALIP), 9(3): 1-38, 2010.
[pdf]
- Social media:
- Kelly Peterson, Matt Hohensee, and Fei Xia, 2011.
Email Formality in the Workplace: A Case Study on the Enron Corpus,
In Proceedings of the 2011 ACL Workshop on Language in Social Media (LSM 2011), Portland, Oregon, June 23, 2011.
[pdf]
- Teaching CL:
- Emily Bender, Fei Xia, and Erik Bansleben, 2008.
Building a flexible, collaborative, intensive master's program in computational linguistics,
Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics (TeachCL-2008), pages 10-18, Columbus, Ohio, June 19-20, 2008.
[pdf]
Last modified on Jan 23, 2013.