MEBI 550 Project 4

Knowledge Representation and Biomedical Applications
MEBI 550, Fall 2013

Project 4: Demonstrating inference and KR choices
Due: Nov 20th (checkpoint), Dec 6th (written report), & Dec 10th (oral presentation)

In this open-ended project, the goal is for you and a partner to explore the different levels of expressivity that are inherent in knowledge representation choices. To do this exploration, you must create an ontology, and demonstrate inference (or query answering) using more than one representational formalism. The main learning objective of the assignment is to demonstrate the tradeoffs among representational choices. I expect that some formalisms will make it "easier" to capture and store the knowledge you care about (building an ontology), whereas others may have more effective query-answering capabilities or performance characteristics.

This project is designed to be done in pairs, where each person on a team selects and implements one of the four choices listed below. However, the team must agree on the domain, and must coordinate work for both the final essay and the oral presentation. This assignment is completely open-ended in the choice of domain. Each team should select some domain within Biomedical & Health Informatics that its members are passionate about, build two different (relatively small) ontologies or knowledge bases, and then demonstrate what you can do with these knowledge bases (i.e., demonstrate inference).

As I see it, here is the set of knowledge representation (KR) formalisms we've covered in class. You may select any two of these for your project. Each formalism may limit what you can say about your domain of interest and how you can reason about that knowledge:

1. RDFS (frames), capturing knowledge as a hierarchy of classes.
2. Rules -- SWRL or Jess to encode constraints or logical rules for the knowledge in your ontology.
3. Description logics (OWL) and reasoning via subsumption (using tools like Pellet).
4. Probabilistic knowledge representations (most pragmatically, using Bayes Nets).

If you choose to include KR framework #1 or #2, you will probably want to use Protege-frames (v. 3.5). Note that the Jess plug-in is designed for Protege 3, not 4. If you choose to include #4, you will probably want to download and use the Netica software. If you use rules within OWL, I would recommend the Pellet reasoner, and not FaCT++.

For all choices (except for perhaps frames) I might recommend some additional reading. Ask for recommendations, but as an example, for OWL modeling, you might look at chapters 12-14 in the 2nd edition of the Allemang & Hendler text.

Choosing your domain:

An important aspect of this assignment is that it is open-ended with respect to the BHI domain that you will model via ontologies and knowledge bases. I'd strongly recommend that the domain be one that is personally interesting to you, i.e., in your area of research interest. However, you should also consider pragmatic aspects required for this assignment -- your research passion may not fit well with the requirement to build two versions of an ontology. Of course, since this is a team project, you'll have to be flexible and/or find a partner who has compatible interests.

Although I want the domain to be "real-world", it should be obvious that you can't build a real-world ontology (or two!) de novo in 3 weeks. For this assignment, I am more interested in (a) your own choices in ontology development, (b) demonstration of some form of inference, and (c) a clear demonstration that you understand the differences among KR formalisms. Thus, you may want to focus your ontology development effort in areas where there is a difference in reasoning or uses of that ontology.

Of course, many BHI domains already have ontologies (see the BioPortal resource). You should consider these, but I also want to know how you would make ontology development choices. For example, if your domain of interest is biochemical reactions, you should certainly look at the BioPAX ontology. However, you should be able to find some aspect of that ontology that seems non-obvious, or strange, or inappropriate for your particular competency questions. You can then pick that sub-area, and develop your own ontology that better suits your purposes.

Demonstrating inference / query answering:

In addition to building two versions of your ontology, you must also consider how both versions will be used. That is, how would users or applications leverage or benefit from your ontology? What are the "competency questions" that it can answer? Of course, your two systems will probably be able to answer completely different sorts of competency questions (and that is okay). It is also okay to have a very minimalist demonstration of inference in your frame-based system, if that is one of your selected KR formalisms.

Because the two versions may be answering somewhat different competency questions, the two versions will not contain exactly the same knowledge. So it is fine to include information in, for example, a Bayes net representation that is completely missing from your frame-based ontology.
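
As a rough illustration of knowledge that sits naturally in a Bayes net but has no obvious place in a class hierarchy, the Python sketch below encodes a hypothetical two-node network (Disease -> TestResult) and computes one posterior with Bayes' rule. The variable names and every probability are invented for this example. In practice you would probably build the network in a tool such as Netica rather than by hand.

```python
# Hypothetical two-node Bayes net: Disease -> TestResult.
# All probabilities below are invented for illustration only.
P_DISEASE = 0.01            # prior P(disease)
P_POS_GIVEN_DISEASE = 0.95  # P(positive test | disease)
P_POS_GIVEN_HEALTHY = 0.05  # P(positive test | no disease)


def posterior_disease_given_positive(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' rule."""
    joint_disease = prior * sensitivity
    joint_healthy = (1.0 - prior) * false_positive_rate
    return joint_disease / (joint_disease + joint_healthy)


posterior = posterior_disease_given_positive(
    P_DISEASE, P_POS_GIVEN_DISEASE, P_POS_GIVEN_HEALTHY
)
print(f"P(disease | positive) = {posterior:.3f}")
```

With these made-up numbers the script prints `P(disease | positive) = 0.161`. A frame-based ontology could state that the test is associated with the disease, but it would have no way to store or use the conditional probabilities that drive this answer.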

Of course, real-world use of an ontology would include development of some specific user interface and perhaps a programmatic use of reasoning systems. For this assignment, you should mostly ignore issues of usability and user-interface design. (Although you might want to discuss usability in your essay.) However, you must demonstrate at least some toy use of query-answering or reasoning or inference. For a frame-based system, the query answering should leverage taxonomic knowledge. For rules, you should use Jess (Protege 3) or Pellet (Protege 4) and demonstrate some simple question-answering (that involves rule-chaining, perhaps). For OWL, you must show some classification task, or DL reasoning task with your ontology.
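
To give a sense of scale for a "toy" demonstration, here is a minimal Python sketch of two of the ideas above: a taxonomic query that walks subclass links, and a forward-chaining loop in which one rule's conclusion triggers another rule. It is not Protege, Jess, or Pellet, and it does not use their syntax. The class names, fact names, and rules are hypothetical and chosen only for illustration.

```python
# Hypothetical toy knowledge base; all names here are invented.
SUBCLASS_OF = {
    "Aspirin": "NSAID",
    "Ibuprofen": "NSAID",
    "NSAID": "Analgesic",
    "Analgesic": "Drug",
}


def ancestors(cls):
    """Return all superclasses of cls by following subclass-of links."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_a(cls, candidate):
    """Taxonomic query: is cls equal to, or a subclass of, candidate?"""
    return cls == candidate or candidate in ancestors(cls)


# Each rule is (set of premises, conclusion); facts are plain strings.
RULES = [
    ({"takes_NSAID", "has_ulcer"}, "bleeding_risk"),
    ({"bleeding_risk"}, "needs_review"),
]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


print(is_a("Aspirin", "Analgesic"))

facts = {"has_ulcer"}
if is_a("Aspirin", "NSAID"):  # taxonomic knowledge feeds the rule base
    facts.add("takes_NSAID")
print(sorted(forward_chain(facts, RULES)))
```

The first query succeeds only because of the two-step path Aspirin -> NSAID -> Analgesic, and the fact `needs_review` appears only because the first rule's conclusion satisfies the second rule's premise. Your own demonstration should show an inference of roughly this kind, using the actual tool for your chosen formalism.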

Learning objectives:

This project aims at synthesis across the entire course: I will be looking for a good demonstration of what you've learned about KR in this course. More specifically, the objectives are

To demonstrate and explain (in your essay) the limits and capabilities of different KR choices
To become more familiar with inference
To gain experience as a novice "ontologist" -- i.e. building your own ontology
To communicate these lessons succinctly in both written and oral form.

Checkpoint -- Nov. 20th, 1pm

To give you some early feedback, I'm asking for a preliminary deliverable on Wed., Nov 20th. The first challenge is to establish your teams, and then meet to talk about the domain. Each team has to agree on two points: what the domain and the sorts of competency questions are, and which of the four knowledge representation formalisms each person will choose. So for this checkpoint, I'd like a 1-2 paragraph description of the domain and scope of your ontologies, followed by a statement that says which person will be in charge of which KR formalism. Optionally, you can list some questions or example inferences you hope to demonstrate for each of the two KR choices.

This checkpoint deliverable will NOT be graded. It's also not a binding contract, so it is okay to change course about what you are doing after the checkpoint. However, I will give you written feedback by Friday, Dec 2. The better the checkpoint material, the more I can provide as feedback. Hand in via Catalyst (project 4a).

Deliverable #1: The ontologies / knowledge bases

This deliverable will have to be multiple files. At a minimum, I'm expecting (a) two different knowledge bases, and (b) some transcript or brief technical description of how at least one of your knowledge bases can answer a question, or perform some inference. I do not expect these documents to stand completely on their own -- some portion of the essay (see below) should further explain both the constructs in your ontologies / knowledge bases, and the inferences your system is carrying out or competency questions it can answer. However, part (b) should be sufficient for me to replicate your inference behavior (at least after I read your essay).

(I am using the phrase "ontologies / knowledge bases" to be clear that, for example, if you choose a rule-based system, you might need to hand in both a Protege ontology with taxonomic knowledge AND a Jess rule-base that constitutes the rest of the knowledge base.)

Deliverable #2: The essay

An important part of your work will be a formal essay (5-8 pages long). This document will hopefully build on your checkpoint document (but need not, if you choose to change course!). It must include:

A description of your domain and what you wanted to capture about that domain in your ontology, including the sorts of competency questions you'd like to be able to answer.
Discussion of your ontology / modeling decisions for your two versions. This is a guide to deliverable #1a.
Discussion of the inferences and query answering capabilities you were able to create (and also perhaps some discussion of ones you were NOT able to create!). This is a guide to deliverable #1b.
Discussion of the differences, strengths and weaknesses in general of the two KR formalisms you used.

Your essay may include screenshots of your ontology or system "in action". I will certainly grade this essay on clarity / organization, as well as content, so please pay attention to this aspect of the project. You should write this document collaboratively, with your partner. Please hand in these documents via the course Catalyst drop box, as usual.

Deliverable #3: An oral presentation

On Tuesday of Finals week (12/10), I will ask each team to give a 20-minute presentation of your work. During the last week of classes, I will provide a more detailed explanation of my expectations (and biases) about scientific oral presentations.

Please hand in your PPT file (or whatever presentation format you use) to Catalyst as Project 4c by Monday.

Grading rubric: Each team (all members) will get a grade as follows:

	3.0 -- 3.2	3.3 -- 3.7	3.8 -- 4.0
Ontologies	Both ontologies capture a reasonable portion of knowledge about the domain. In both ontologies/frameworks, I can test and demonstrate some level of inference.	The ontologies are impressively rich and large. More than one type of competency question (per framework) is answerable via inference.	Ontologies leverage existing work and best practices in ontology design. Inferences are plausibly useful to end-users.
Essay	Coherent and well-written. Domain and competency questions are reasonable. The student demonstrates understanding of the strengths and limitations of their two selected paradigms.	In addition, the student argues persuasively for one framework as being the better choice for their domain and set of competency questions.	The student identifies research-level issues, and possible solutions / workarounds. The essay is a plausible early draft for a conference paper.
Presentation	Well-organized slides that capture the main message learned from the project.	Strong and compelling oral delivery of information. Innovative or esthetically pleasing slides.	It is hard for me to find any faults in the presentation.

Last Updated:
Nov, '13

Contact the instructor at: gennari@u.washington.edu