Knowledge Representation and Biomedical Applications
MEBI 550, Fall 2013

Project 3: Building and using a (toy) OWL ontology for cancer staging
Due: Nov 8 (individually) and 13 (group)

In this project, you will explore, in a small toy example, the ability of OWL to use reasoning to answer questions. More specifically, the staging of solid tumor cancers is an important first step for a number of therapeutic decisions (not to mention prognosis!). I have built a small OWL ontology that captures the set of real rules defined by the American Joint Committee on Cancer (AJCC). Specifically, look at the AJCC Stage and TNM Definitions on the NCI page for non-small cell lung cancer. These are also summarized in a nifty Poster, "for professionals and patients alike". For lung cancer, there are 7 stages, from stage IA to stage IV, where the higher the stage, the worse the prognosis. Stages are formally defined by a set of rules that relate tumor size ("T"), lymph node involvement ("N"), and metastatic status ("M"). Thus, these rules are amenable to formal logic definitions of classes in OWL.

Learning objectives:

  • Understand how to formally encode some English rules into description logic class definitions
  • Know how to apply description logic reasoning to use those class definitions to answer questions
  • Become (more) familiar with Protege 4 for OWL editing and use
  • Learn a bit about Oncology (and anatomy)

I also want you to work & learn in a small peer group. Thus, as described in deliverables, I want you to hand in an individually-created ontology, and then meet in groups of three students to build a "best" consensus solution ontology. Your grade will be a combination of the individual work and the peer work. I have emailed out the small groups.

To make this assignment more practical, I have provided a starter ontology that has most of the structure and classes you will need to accomplish this task. (Without this, there are too many possible ways of solving the problem!) As a disclaimer, this task is completely toy -- an experienced oncologist would not need or use any sort of automatic decision support tool to determine cancer stage, given T, N and M information. It's not that hard a task, and there aren't that many stages. However, the assignment demonstrates the ability of OWL and DLs to capture a set of rules and apply them to data.

In detail, here's what I'd like you to try:

  1. Read about tumor staging at the NCI web page for non-small cell lung cancer. Note that this is for "health care professionals"; you can certainly also read the "for patients" version, but as BHI students, I'd recommend the "professional" version, even if some of the clinical terminology is tough.
  2. Open (in Protege 4) and study the "starter" ontology I provide as your starting step. This ontology has 5 important top level classes:
    • Anatomical structure. This one is obvious. Alas, it doesn't make direct reference to the FMA.
    • CancerDiagnosis. This class contains two different sorts of subclasses. First, it contains a few definitions of Ns, Ms, and Ts. One of your main tasks will be to provide additional subclasses of CancerDiagnosis that define all of these that you will need to do staging. Second, it contains a class called "LungCancerExamps". This class contains 8 subclasses that stand in the place of "the data" to be classified. Each subclass can be thought to represent characteristics of a patient. (This isn't necessarily the best ontological design approach, but it helps to demonstrate Description Logics and classification.) The task is to decide what stages each patient is in; see also step 3, below.
    • StagedLungCancer. This is where you'll put the seven definitions of stages. Only the first two (1A and 1B) are provided; you'll define the rest.
    • Tumour (British spelling). This class includes size and location subclasses for the primary tumor.
    • Metastasis. Perhaps a misnomer -- its subclasses connect the anatomy to particular sorts of metastases. This class includes all of these sorts of subclasses you should need. You shouldn't need to look closely at these definitions.
  3. Within Protege, invoke a DL reasoner. As soon as you do this, an inferred hierarchy will be filled in, and a number of inferred facts will appear for each of the subclasses of "LungCancerExamps". In particular, the stage 1A example will be classified correctly. Your job is to complete the ontology so that all of these subclasses get classified as subclasses of one of the StagedLungCancer definitional classes.
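
To make step 3 concrete, here is a minimal Python sketch of what "classification" means. It is not how a DL reasoner works internally, and every name in it (Stage_IA, Examp_A, hasT, and so on) is a hypothetical stand-in rather than an identifier from the starter ontology. The two stage rules shown are a simplified subset that I am assuming for illustration; check them against the AJCC table before relying on them. Each "defined class" lists the conditions that are sufficient for membership, and an example is placed under every stage whose conditions it is known to satisfy.

```python
# Toy model of classification by definition. All names are hypothetical
# stand-ins, not identifiers from the starter ontology.

# A defined class maps each property to the set of acceptable fillers.
# An example belongs to the class only if ALL listed properties match.
# Assumed, simplified rules: Stage IA = T1 N0 M0; Stage IV = any T, any N, M1.
STAGE_DEFINITIONS = {
    "Stage_IA": {"hasT": {"T1"}, "hasN": {"N0"}, "hasM": {"M0"}},
    "Stage_IV": {"hasM": {"M1"}},
}

# Example "patient" descriptions: property -> asserted filler.
EXAMPLES = {
    "Examp_A": {"hasT": "T1", "hasN": "N0", "hasM": "M0"},
    "Examp_B": {"hasT": "T3", "hasN": "N2", "hasM": "M1"},
    "Examp_C": {"hasT": "T2", "hasN": "N1", "hasM": "M0"},  # no matching definition yet
    "Examp_D": {"hasT": "T1", "hasN": "N0"},                # M status not stated
}


def subsumes(definition, description):
    """True if the description is known to satisfy every condition.

    A property missing from the description counts as "not known", so
    the example is not placed under the class.
    """
    return all(description.get(prop) in fillers
               for prop, fillers in definition.items())


def classify(examples, definitions):
    """Return, for each example, the sorted list of stages that subsume it."""
    return {
        name: sorted(stage for stage, definition in definitions.items()
                     if subsumes(definition, description))
        for name, description in examples.items()
    }


inferred = classify(EXAMPLES, STAGE_DEFINITIONS)
unclassified = [name for name, stages in inferred.items() if not stages]

for name, stages in inferred.items():
    print(name, "->", stages or "unclassified")
```

In this sketch, Examp_C stays unclassified because no definition covers it, and Examp_D stays unclassified because its M status is never stated. Both situations are worth watching for in your own ontology: the first calls for a new stage definition, the second for a more complete description of the example.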

Finally, a few words about oncology: The primary difference between N2 and N3 is ipsilateral versus contralateral nodal metastases -- i.e., whether the involved node is in the same lung (ipsilateral) or in the opposite lung (contralateral) from the primary tumor. Also, the AJCC wording for N1 is atrocious, from a logical point of view: "Metastasis to ipsilateral peribronchial and/or ipsilateral hilar lymph nodes, and intrapulmonary nodes . . . ". Please interpret the above to mean: Metastases to ipsilateral {peribronchial, or hilar, or intrapulmonary} nodes. (Basically, I think that all uses of the word "and" in this AJCC sentence are misleading.)
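
The difference between the two readings of the N1 sentence can be sketched in a few lines of Python. This is only an illustration of the "or" versus "and" question: the function names and the (laterality, node group) encoding are my own assumptions, and the sketch ignores everything else that distinguishes N1 from N2 and N3.

```python
# Node groups named in the AJCC N1 wording.
N1_NODE_GROUPS = frozenset({"peribronchial", "hilar", "intrapulmonary"})


def ipsilateral_n1_groups(involved):
    """Collect the N1 node groups that have an ipsilateral metastasis.

    `involved` is an iterable of (laterality, node_group) pairs; this
    encoding is a hypothetical one chosen for the sketch.
    """
    return {group for side, group in involved
            if side == "ipsilateral" and group in N1_NODE_GROUPS}


def is_n1_disjunctive(involved):
    """Intended reading: a metastasis in ANY one of the groups suffices."""
    return len(ipsilateral_n1_groups(involved)) > 0


def is_n1_conjunctive(involved):
    """Literal "and" reading: ALL three groups must be involved."""
    return ipsilateral_n1_groups(involved) == N1_NODE_GROUPS


# A patient with a single ipsilateral hilar node metastasis.
patient = [("ipsilateral", "hilar")]
print(is_n1_disjunctive(patient))   # satisfied under the intended reading
print(is_n1_conjunctive(patient))   # not satisfied under the literal reading
```

In OWL terms, the intended reading corresponds to a union of the three node classes as the filler of the metastasis restriction, rather than an intersection of three separate restrictions.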

Credit goes to Olivier Dameron, now a professor in medical informatics at the University of Rennes in France, and OWL guru, for creating the original TNM ontology. He did this work a long time ago (10 years?) so I've had to bring it up to date. Although he's obviously French, I assume that's where the British spelling of "tumour" came from...

Resources:

As noted on the assignments page, a good reference for OWL is Matt Horridge's OWL tutorial for Protege 4.

Individual deliverable (due 5pm, Fri, Nov 8th):

1. The OWL ontology. Obviously, your version of the ontology should be able to classify all of the subclasses of "SpecificTumorExamps". Please hand it in using Turtle syntax, and give me a version with the Reasoner set to "none", so that I can turn it on and verify that the inference happens.

Final deliverable (due pm, Wed, Nov 13th):

2. The team-based OWL ontology.

3. An individual essay that describes the differences between the team-based ontology and your individual version (deliverable #1). You should comment on what you learned from your classmates, and how the group built this "consensus" answer. Rather than self-grading the whole project, I'd like you to evaluate how "far off" your individual ontology is compared to the group ontology. E.g., if the group ontology gets a 4.0, what might yours get? (Of course, you can and should use approximate grades.) The essay is also an opportunity to discuss unanswered questions, or to vent a bit about the complexities of OWL and Protege. As always, your essay should be clear and well-organized.

Please hand in these documents via the course Catalyst drop box, as usual.

Grading rubric: (to be expanded...)

 
  • Ontology: Individual construction
    • 3.0 - 3.2: Correctly classifies all (or nearly all) LungCancerExamps.
    • 3.3 - 3.7: All defined classes are complete and correct.
    • 3.8 - 4.0: Student successfully designs some additional, clinically relevant classes.
  • Ontology: Team solution
    • 3.0 - 3.2: Team successfully develops a "best" solution that improves on (or at least matches) all individual solutions.
    • 3.3 - 3.7: Team solution includes additional, relevant classes and classification examples.
  • Essay: self-reflection
    • 3.0 - 3.2: Student describes how the team solution was developed and how their ontology differs from the team solution.
    • 3.3 - 3.7: Essay shows insight into the group dynamic; describes pros and cons of different solutions.
  • Essay: open questions
    • 3.0 - 3.2: Essay coherently raises some frustrations and questions about the process.
    • 3.3 - 3.7: Essay displays the student's learning in oncology and/or OWL modeling. Student raises some interesting issues.
    • 3.8 - 4.0: Essay raises research-caliber issues around choices in description logic modeling. Student demonstrates some outside research.

Last Updated:
Oct, '12

Contact the instructor at: gennari@u.washington.edu