Knowledge Representation and Biomedical Applications
Project 1: Modeling with RDF

In this assignment, you will explore some web resources that provide biomedical informatics data and knowledge. Your goal is to develop your own small model of some tiny portion of the data, and then instantiate this model using the RDF triple-store formalism. Although the assignment is open-ended, I strongly suggest one of four possible starting points:
Each of these resources contains data, and, taken in their entirety, the data are large and complex enough that there is no single, simple "right" answer for knowledge representation. These resources are also vetted -- they've been around a long time, are well used, and are robust. Starting from one of these four pages, read and browse. Additionally, unless you are already an expert in one of these topics, I'd strongly recommend using related Wikipedia pages. Learn something.

Then, your goal is to formalize and store a small portion of this knowledge as a set of RDF triples. Although I'm asking you to look at "real" domain knowledge -- each of these sites has incredibly deep and detailed knowledge -- you should capture only a very small, toy amount of knowledge in this exercise. Furthermore, although it may be tempting (and appropriate in the real world) to start by looking at other knowledge bases or ontologies, for learning purposes I want you to try to build an RDF knowledge base from scratch. For example, the BioPAX ontology provides an excellent framework for storing knowledge about biochemical reactions and pathways such as those stored in Reactome. However, although you are welcome to look at BioPAX (and perhaps you should), for this assignment I want you to re-create those ideas in your own RDF store.

In addition, since a learning goal is to use and understand RDF, you may actually "make up" additional facts that aren't explicitly listed in your data source, if those facts help you "connect the dots" in your RDF graph. These new facts should be reasonable, not nonsensical -- e.g., that a patient has a particular primary care physician, even though your data source (MIMIC or PGP) might not list such a fact.

For this very small, toy RDF store, you can just use Turtle syntax and plain text, with whatever text editor you prefer. I would recommend that you validate your RDF syntax via one (or more) of the tools listed below.
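As a rough illustration of the scale and style intended, here is a minimal Turtle sketch. It assumes a hypothetical `ex:` namespace (`http://example.org/toy#`) and invented class and property names (`ex:Patient`, `ex:Physician`, `ex:hasPrimaryCarePhysician`, `ex:ageInYears`); none of these come from MIMIC, PGP, or any existing ontology, and the individuals are made up. It shows a two-class schema plus one "connect the dots" fact of the kind described above.

```turtle
# Hypothetical toy model: every name under the ex: prefix is invented for illustration.
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/toy#> .

# A tiny schema: two classes and one property that links them.
ex:Patient   a rdfs:Class .
ex:Physician a rdfs:Class .

ex:hasPrimaryCarePhysician a rdf:Property ;
    rdfs:domain ex:Patient ;
    rdfs:range  ex:Physician .

# Instance data. The physician link is a made-up fact, not taken from any data source.
ex:patient001 a ex:Patient ;
    rdfs:label "Patient 001" ;
    ex:ageInYears 64 ;
    ex:hasPrimaryCarePhysician ex:drSmith .

ex:drSmith a ex:Physician ;
    rdfs:label "Dr. Smith" .
```

Each statement here is one subject-predicate-object triple; the `;` shorthand simply repeats the subject, and `a` abbreviates `rdf:type`. A real submission would of course use its own classes and properties drawn from the chosen resource.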
Although I expect these RDF knowledge bases to be small, they should be "big enough" to explore some interesting relationships and a variety of bits of information you may have learned and extracted from the web bio-informatics knowledge resources. Under "deliverables", I've defined some (somewhat arbitrary) boundaries for what is "big enough".

Resources:
Deliverables (due Noon, Wednesday, Oct 9th):
Both documents should be handed in via the course Catalyst drop box, as will be true for all assignments.

Grading rubric:
Last Updated:
Contact the instructor at: gennari@u.washington.edu