I'm a statistician with broad interests in statistical machine learning and high-dimensional data. I use tools from convex optimization to tackle large-scale problems, and I'm particularly interested in developing statistical machine learning techniques for problems in genomics and neuroscience.
For an overview of my research interests and background, [read my bio]. For a full list of publications, [see my CV].
I am currently PI of the following grants: NIH R01 EB026908, NIH R01 GM123993, NIH R01 DA047869, NSF CAREER DMS-1252624, and a Simons Investigator Award in Mathematical Modeling of Living Systems. In the past, my research has been funded by an NIH Director's Early Independence Award and a Sloan Research Fellowship.
Statistical Methods: Selected Publications
- Chen S, Shojaie A, Shea-Brown E, and D Witten (2017) The multivariate Hawkes process in high dimensions: beyond mutual excitation. [arxiv]
- Gao LL, Bien J, and D Witten (2019) Are clusterings of multiple data views independent? To appear in Biostatistics. [arxiv]
- Jewell S, Hocking TD, Fearnhead P, and D Witten (2019) Fast nonconvex deconvolution of calcium imaging data. To appear in Biostatistics. [arxiv] [software]
- Jewell S and D Witten (2018) Exact spike train inference via l0 optimization. Annals of Applied Statistics 12(4): 2457-2482. [arxiv] [software]
- Petersen A, Simon N, and D Witten (2018) SCALPEL: Extracting neurons from calcium imaging data. Annals of Applied Statistics 12(4): 2430-2456. [arxiv] [r library]
- Chen S, Witten D, and A Shojaie (2017) Nearly assumptionless screening for the mutually-exciting multivariate Hawkes process. Electronic Journal of Statistics 11(1): 1207-1234.
- Chen S, Shojaie A, and D Witten (2017) Network reconstruction from high-dimensional ordinary differential equations. Journal of the American Statistical Association 112(520): 1697-1707.
- Morrison J, Simon N, and D Witten (2017) Simultaneous Detection and Estimation of Trait Associations With Genomic Phenotypes. Biostatistics 18(1): 147-164.
- Tan KM, Ning Y, Witten D, and H Liu (2016) Replicates in high dimensions, with applications to latent variable graphical models. Biometrika 103(4): 761-777.
- Petersen A, Simon N, and D Witten (2016) Convex regression with interpretable sharp partitions. Journal of Machine Learning Research 17(94): 1-31. [pdf] [r library]
- Petersen A, Witten D, and N Simon (2016) Fused lasso additive model. Journal of Computational and Graphical Statistics 25(4): 1005-1025. [arxiv] [Check out the Shiny app!] [r library]
- Haris A, Witten D, and N Simon (2016) Convex modeling of interactions with strong heredity. Journal of Computational and Graphical Statistics 25(4): 981-1004. [arxiv] [r library]
- Tan KM and D Witten (2015) Statistical properties of convex clustering. Electronic Journal of Statistics 9(2): 2324-2347.
- Chen S, Witten D, and A Shojaie (2015) Selection and estimation for mixed graphical models. Biometrika 102(1):47-64. [arxiv]
- Tan KM, London P, Mohan K, Lee SI, Fazel M, and D Witten (2014) Learning graphical models with hubs. Journal of Machine Learning Research 15:3297-3331. [arxiv] [r library]
- Tan KM and D Witten (2014) Sparse biclustering of transposable data. Journal of Computational and Graphical Statistics 23(4):985-1008. [pdf] [r library]
- Voorman A, Shojaie A, and D Witten (2014) Graph estimation with joint additive models. Biometrika 101(1):85-101. [arxiv] [r library] A. Voorman won a 2013 David Byar Travel Award for a preliminary version of this paper.
- Mohan K, London P, Fazel M, Witten D, and SI Lee (2014) Node-based learning of multiple Gaussian graphical models. Journal of Machine Learning Research 15:445-488. [arxiv]
- Danaher P, Wang P, and D Witten (2014) The joint graphical lasso for inverse covariance estimation across multiple classes. Journal of the Royal Statistical Society, Series B 76(2): 373-397. [arxiv] [r library]
- Mohan K, Chung M, Han S, Witten D, Lee SI, and M Fazel (2012) Structured sparse learning of multiple Gaussian graphical models. Advances in Neural Information Processing Systems (NIPS). [pdf]
- Li J, Witten DM, Johnstone I, and R Tibshirani (2012) Normalization, testing, and false discovery rate estimation for RNA-sequencing data. Biostatistics 13(3): 523-538. [pdf] [r library]
- Clemmensen L, Hastie T, Witten D, and B Ersboll (2011) Sparse Discriminant Analysis. Technometrics 53(4): 406-413.
- Witten DM (2011) Classification and clustering of sequencing data using a Poisson model. Annals of Applied Statistics 5(4): 2493-2518. [r library]
- Witten DM, Friedman JH, and N Simon (2011) New insights and faster computations for the graphical lasso. Journal of Computational and Graphical Statistics 20(4): 892-900. [pdf] [r library]
- Witten DM and R Tibshirani (2011) Penalized classification using Fisher's linear discriminant. Journal of the Royal Statistical Society, Series B 73(5): 753-772. [pdf] [r library] Winner of 2011 David Byar Young Investigator Award.
- Witten DM and R Tibshirani (2010) A framework for feature selection in clustering. Journal of the American Statistical Association 105(490): 713-726. [r library]
- Witten DM and R Tibshirani (2010) Survival analysis with high-dimensional covariates. Statistical Methods in Medical Research 19(1): 29-51.
- Witten DM, Tibshirani R, and T Hastie (2009) A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10(3): 515-534. [pdf] [r library]
- Witten DM and R Tibshirani (2009) Covariance-regularized regression and classification for high-dimensional problems. Journal of the Royal Statistical Society, Series B 71(3): 615-636. [pdf] Note that this version of the manuscript contains a clarification to Section 3.2: if scout is performed using an alternative covariance estimator, then that estimator should be positive definite. [r library]
- Witten DM, Hastie T, and R Tibshirani (2009) Discussion of "On consistency and sparsity of principal components analysis in high dimensions" by Johnstone and Lu. Journal of the American Statistical Association 104(486): 698-699.
- Witten DM and R Tibshirani (2008) Testing significance of features by lassoed principal components. Annals of Applied Statistics 2(3): 986-1012. [pdf] [r library]
Statistical Applications: Selected Publications
- Kircher M*, Witten DM*, Jain P, O'Roak BJ, Cooper GM, and J Shendure (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nature Genetics. (* denotes equal contribution)
- Brunner AL, Beck AH, Edris B, Sweeney RT, Zhu SX, Li R, Montgomery K, Varma S, Gilks T, Guo X, Foley JW, Witten DM, Giacomini CP, Flynn RA, Pollack JR, Tibshirani R, Chang HY, van de Rijn M, and RB West (2012) Transcriptional profiling of lncRNAs and novel transcribed regions across a diverse panel of archived human cancers. Genome Biology 13(8):R75.
- Christine M. Micheel, Sharyl J. Nass, Gilbert S. Omenn, Editors; Committee on the Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials; Board on Health Care Services; Board on Health Sciences Policy; Institute of Medicine (2012) Evolution of Translational Omics: Lessons Learned and the Path Forward. National Academy of Sciences Press. 300 pages.
- Patwardhan RP, Hiatt JB, Witten DM, Kim MJ, Smith RP, May D, Lee C, Andrie JM, Lee SI, Cooper GM, Ahituv N, Pennacchio LA, and J Shendure (2012) Massively parallel functional dissection of mammalian enhancers in vivo. Nature Biotechnology 30(3):265-270.
- Witten DM and WS Noble (2012) On the assessment of statistical significance of three-dimensional colocalization of sets of genomic elements. Nucleic Acids Research 40(9):3849-3855.
- Witten D, Tibshirani R, Gu SS, Fire A, and WO Lui (2010) Ultra-high throughput sequencing-based small RNA discovery and discrete statistical biomarker analysis in a collection of cervical tumours and matched controls. BMC Biology 8(1): 58.
- Beck AH, Weng Z, Witten DM, Zhu S, Foley JW, Lacroute P, Smith C, Tibshirani R, van de Rijn M, Sidow A, and RB West (2010) 3'-end sequencing for expression quantification (3SEQ) from archival tumor samples. PLoS ONE 5(1): e8768.
- Beck AH, Lee CH, Witten DM, Gleason BC, Edris B, Espinosa I, Zhu S, Li R, Montgomery KD, Marinelli RJ, Tibshirani R, Hastie T, Jablons DM, Rubin BP, Fletcher CD, West RB, and M van de Rijn (2010) Discovery of molecular subtypes in leiomyosarcoma through integrative molecular profiling. Oncogene 29: 845-854.
- Somervaille TCP, Matheny CJ, Spencer GJ, Iwasaki M, Rinn JL, Witten DM, Chang HY, Shurtleff SA, Downing JR, and ML Cleary (2009) Hierarchical maintenance of MLL myeloid leukemia stem cells employs a transcriptional program shared with embryonic rather than adult stem cells. Cell Stem Cell 4(2): 129-140.
- Macpherson JM, Gonzalez J, Witten DM, Davis JC, Singh ND, Hirsh AE, and DA Petrov (2008) High error rate in detecting partial selective sweeps in Drosophila under assumptions of panmixis. Molecular Biology and Evolution 25(6): 1025-1042. [pdf]