Validating Clustering for Gene Expression Data
Ka Yee Yeung, David R. Haynor,
Walter L. Ruzzo
Supplementary Web Site
- Technical Report UW-CSE-00-01-01 (January 2000)
- Dec 2000 version (published in Bioinformatics,
2001, volume 17, number 4, pages 309-318)
- Most recent written description of this work (Dec 2001)
in Chapter 3 of Ka Yee's thesis.
- Enlarged and colored figures in Dec 2000 version
- Additional figures not shown in Dec 2000 version
- Pdf file for the details of clustering algorithms implemented
- Pdf file for the details of simulated data sets
- Gene Expression data sets used:
- The original web site from which the rat CNS data
(Wen et al. 1998) was available for
download is no longer working. Here is the
raw data in tab-delimited text format
(without any normalization) we used.
- The original web site containing the yeast cell cycle data
by Cho et al. 1998 no longer seems to work.
If you are interested in the full data, you can get the processed
data from Spellman et al. or the raw data from
SMD. In both cases, you will need to select the experimental
conditions that you need.
Due to popular demand, we are making the subset of 384 genes we used in
Ka Yee's dissertation available (as text-delimited file).
We are also making the subset of 237 genes (corresponding to 4 MIPS
categories) used in Ka Yee's dissertation available (as text-delimited file).
- Human hematopoietic differentiation data set (Tamayo et al. 1999)
- Barrett's esophagus data (Barrett et al. 2002)
- The ovary data set is not publicly available yet. Unfortunately, we do
not have permission to distribute this data.
- TM4: Mev: a free and
open-source implementation of FOM and CAST (by the Quackenbush Lab at
Due to popular demand, this web site was updated on 3/24/2006.
If you have any questions or comments on this paper, feel free to email Ka Yee.