MEBI 591C - Biomedical & Health Informatics Research Colloquium
Data and Text Mining in Biomedical Informatics
Logistics:
- Instructor: Meliha Yetisgen-Yildiz
- Time: Wednesdays, 3:30-4:20 p.m.
- Location: Health Sciences, Room E-212
- Email List: mebi591c_sp10@u.washington.edu
Description:
In this seminar, we will survey current issues in biomedical data and text mining. Students will choose one or both of the following assignments:
- Presentation: Survey or paper presentation on a selected data/text mining research area.
- System Design: Participation in the design of a text mining system for the 2010 i2b2 shared task. If there is enough interest, we will form a team to discuss the problem, analyze the challenge data, and research potential solutions. Details of the i2b2 challenge can be found at: https://www.i2b2.org/NLP/Relations/
Schedule:
- Week #1 - 03/31/2010: Introduction & Planning - presenter: melihay - Slides
- Week #2 - 04/07/2010: Text Mining/NLP Subproblems - presenter: melihay - Slides
- Week #3 - 04/14/2010: Secondary use of EMR: temporal interval abstractions - presenter: Daniel Capurro Nario - Slides
- Week #4 - 04/21/2010: An Overview of Biomedical Entity Recognition - presenter: Jeffry Scott - Slides - Notes
- Week #5 - 04/28/2010: UTR motif discovery to investigate differential protein expression in Leishmanias - presenter: Carol Louise Farris - Slides
- Week #6 - 05/05/2010: Information Prescription: a New Trend in Biomedical Informatics to Manage Uncertainty in the Health Care System - presenter: Francisco Saaverda - Slides
- Week #7 - 05/12/2010: Predicting Survival in Patients from In-hospital Resuscitation: Machine Learning VS Logistic Regression - presenter: Tsung-Chien (Jonathan) Lu - Slides
- Week #8 - 05/19/2010: Wynona Black
- Week #9 - 05/26/2010: Karl Jablonowski
- Week #10 - 06/02/2010: Stella Podgornik
Suggested Text Mining & Natural Language Processing (NLP) Readings:
Biomedical:
- Jensen LJ, Saric J, Bork P. Literature mining for the biologist: from information retrieval to biological discovery. Nat Rev Genet. 2006 Feb;7(2):119-29. PubmedLink
- Cohen AM, Hersh WR. A survey of current work in biomedical text mining. Brief Bioinform. 2005 Mar;6(1):57-71. PubmedLink
- Krallinger M, Erhardt RA, Valencia A. Text-mining approaches in molecular biology and biomedicine. Drug Discov Today. 2005 Mar 15;10(6):439-45. PubmedLink
- Scherf M, Epple A, Werner T. The next generation of literature analysis: integration of genomic analysis into text mining. Brief Bioinform. 2005 Sep;6(3):287-97. PubmedLink
- Hirschman L, Park JC, Tsujii J, Wong L, Wu CH. Accomplishments and challenges in literature data mining for biology. Bioinformatics. 2002 Dec;18(12):1553-61. PubmedLink
Clinical:
- Chapman WW, Cohen KB. Current issues in biomedical text mining and natural language processing. J Biomed Inform. 2009 Oct;42(5):757-9. PubmedLink
- Demner-Fushman D, Chapman WW, McDonald CJ. What can natural language processing do for clinical decision support? J Biomed Inform. 2009 Oct;42(5):760-72. PubmedLink
- Hripcsak G, Austin JHM, Alderson PO, Friedman C. Use of natural language processing to translate clinical information from a database of 889,921 chest radiographic reports. Radiology 2002;224:157-63.
- Ware H, Mullett CJ, Jagannathan V. Natural language processing framework to assess clinical conditions. JAMIA 2009;16:585-9. PDF.
- Friedman C, Alderson PO, Austin JH, Cimino JJ, Johnson SB. A general natural-language text processor for clinical radiology. J Am Med Inform Assoc. 1994 Mar-Apr;1(2):161-74. PubmedLink
Papers from Previous i2b2 Challenges:
2009-Obesity Challenge:
- Uzuner O. Recognizing Obesity and Comorbidities in Sparse Data. J Am Med Inform Assoc. 2009 Jul-Aug;16(4):561-570. PubmedLink
- Childs LC, Enelow R, Simonsen L, Heintzelman NH, Kowalski KM, Taylor RJ. Description of a rule-based system for the i2b2 challenge in natural language processing for clinical data. J Am Med Inform Assoc. 2009 Jul-Aug;16(4):571-5. PubmedLink
- Mishra NK, Cummo DM, Arnzen JJ, Bonander J. A rule-based approach for identifying obesity and its comorbidities in medical discharge summaries. J Am Med Inform Assoc. 2009 Jul-Aug;16(4):576-9. PubmedLink
- Solt I, Tikk D, Gal V, Kardkovacs ZT. Semantic classification of diseases in discharge summaries using a context-aware rule-based classifier. J Am Med Inform Assoc. 2009 Jul-Aug;16(4):580-4. Epub 2009 Apr 23. PubmedLink
- Ware H, Mullett CJ, Jagannathan V. Natural language processing framework to assess clinical conditions. J Am Med Inform Assoc. 2009 Jul-Aug;16(4):585-9. PubmedLink
- Ambert KH, Cohen AM. A system for classifying disease comorbidity status from medical discharge summaries using automated hotspot and negated concept detection. J Am Med Inform Assoc. 2009 Jul-Aug;16(4):590-5. PubmedLink
- Yang H, Spasic I, Keane JA, Nenadic G. A text mining approach to the prediction of disease status from clinical discharge summaries. J Am Med Inform Assoc. 2009 Jul-Aug;16(4):596-600. PubmedLink
- Farkas R, Szarvas G, Hegedus I, Almasi A, Vincze V, Ormandi R, Busa-Fekete R. Semi-automated construction of decision rules to predict morbidities from clinical texts. J Am Med Inform Assoc. 2009 Jul-Aug;16(4):601-5. PubmedLink
2008-Smoking Challenge:
- Uzuner O, Goldstein I, Luo Y, Kohane I. Identifying patient smoking status from medical discharge records. J Am Med Inform Assoc. 2008 Jan-Feb;15(1):14-24. PubmedLink
- Savova GK, Ogren PV, Duffy PH, Buntrock JD, Chute CG. Mayo clinic NLP system for patient smoking status identification. J Am Med Inform Assoc. 2008 Jan-Feb;15(1):25-8. PubmedLink
- Wicentowski R, Sydes MR. Using implicit information to identify smoking status in smoke-blind medical discharge summaries. J Am Med Inform Assoc. 2008 Jan-Feb;15(1):29-31. PubmedLink
- Cohen AM. Five-way smoking status classification using text hot-spot identification and error-correcting output codes. J Am Med Inform Assoc. 2008 Jan-Feb;15(1):32-5. PubmedLink
- Clark C, Good K, Jezierny L, Macpherson M, Wilson B, Chajewska U. Identifying smokers with a medical extraction system. J Am Med Inform Assoc. 2008 Jan-Feb;15(1):36-9. PubmedLink
- Heinze DT, Morsch ML, Potter BC, Sheffer RE Jr. Medical i2b2 NLP smoking challenge: the A-Life system architecture and methodology. J Am Med Inform Assoc. 2008 Jan-Feb;15(1):40-3. PubmedLink