Statistical Genetics Methods papers
These papers provide new methods or (where noted) transfer statistical
technology to genetics research. For applications, see Publications.
- Li X, et al. A statistical framework for multi-trait
rare variant analysis in large-scale whole-genome sequencing studies. Nat Comput
Sci. 2025 Feb;5(2):125-143. doi: 10.1038/s43588-024-00764-8.
PMID: 39920506; NIHMSID:NIHMS2065948
- Huang YJ, Kurniansyah N,
Levey DF, Gelernter J, Huffman JE, Cho K, Wilson PWF, Gottlieb DJ, Rice
KM, Sofer T. A semi-empirical Bayes approach for calibrating weak instrumental
bias in sex-specific Mendelian randomization studies. medRxiv.
2025 Jan 2. doi:
10.1101/2025.01.02.25319889. PubMed PMID: 39802770; PubMed Central
PMCID: PMC11722449
- Little A, et al. General Kernel Machine Methods for Multi-Omics
Integration and Genome-Wide Association Testing with Related Individuals. Genet Epidemiol. 2025
Jan;49(1):e22610. doi:
10.1002/gepi.22610. PubMed PMID: 39812506
- Li X, et al. Powerful, scalable and resource-efficient
meta-analysis of rare variant associations in large whole genome
sequencing studies. Nat Genet. 2023
Jan;55(1):154-164. doi:
10.1038/s41588-022-01225-6. PubMed PMID: 36564505.
- Li Z, et al. A framework for detecting noncoding
rare-variant associations of large-scale whole-genome sequencing studies. Nat Methods. 2022
Dec;19(12):1599-1611. doi:
10.1038/s41592-022-01640-x. PubMed PMID: 36303018; NIHMSID:NIHMS1862492.
- Sofer T, et al. BinomiRare: A
robust test for association of a rare genetic variant with a binary
outcome for mixed models and any case-control proportion. HGG Adv. 2021 Jul
8;2(3). doi: 10.1016/j.xhgg.2021.100040. PubMed
PMID: 34337551; PubMed Central PMCID: PMC8321319.
- Sofer T, Zheng X, Laurie CA, Gogarten SM, Brody JA,
Conomos MP, Bis JC, Thornton TA, Szpiro A, O'Connell JR, Lange EM, Gao Y, Cupples LA, Psaty BM, Rice KM. Variant-specific
inflation factors for assessing population stratification at the
phenotypic variance level. Nat Commun.
2021 Jun 9;12(1):3506. doi:
10.1038/s41467-021-23655-2. PubMed PMID: 34108454; PubMed Central
PMCID: PMC8190158.
This paper shows how non-constant
variance (heteroskedasticity) of outcomes can invalidate some forms of
cross-population analysis, and provides methods that correct this problem.
- Li et al. Dynamic incorporation of multiple in
silico functional annotations empowers rare variant association analysis
of large whole-genome sequencing studies at scale. Nat Genet. 2020
Sep;52(9):969-983. doi:
10.1038/s41588-020-0676-4. PubMed PMID: 32839606; PubMed Central
PMCID: PMC7483769.
- Gogarten SM, Sofer T, Chen H, Yu C, Brody JA, Thornton
TA, Rice KM, Conomos MP. Genetic association testing using the GENESIS
R/Bioconductor package. Bioinformatics.
2019 Jul 22. doi:
10.1093/bioinformatics/btz567. PubMed PMID: 31329242.
- Sofer T, Zheng X, Gogarten SM, Laurie CA, Grinde K, Shaffer JR, Shungin
D, O'Connell JR, Durazo-Arvizu RA, Raffield L,
Lange L, Musani S, Vasan
RS, Cupples LA, Reiner AP; NHLBI Trans-Omics for
Precision Medicine (TOPMed) Consortium, Laurie
CC, Rice KM. A fully adjusted two-stage procedure for rank-normalization
in genetic association studies. Genet Epidemiol. 2019
Apr;43(3):263-275. doi: 10.1002/gepi.22188. Epub 2019 Jan 17. PMID: 30653739
Inverse-normal transformations are widely used in the analysis of quantitative
phenotypes, but this paper shows that combining them with covariate
adjustment requires care. We propose methods that let the two steps
work together.
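As a rough illustration of the idea, the sketch below adjusts for a covariate both before and after rank-normalization. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names are hypothetical, only a single covariate is handled, the (rank - 0.5) / n quantile convention is an assumption, and ties are broken by input order.

```python
from statistics import NormalDist, mean


def rank_inverse_normal(values, offset=0.5):
    """Replace each value by the standard normal quantile of its rank.

    The (rank - offset) / n convention is an assumption; ties are broken
    by input order rather than averaged.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        out[i] = normal.inv_cdf((rank - offset) / n)
    return out


def residualize(y, x):
    """Residuals from least-squares regression of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]


def two_stage_genotype_effect(trait, covariate, genotype):
    """Genotype slope, adjusting for the covariate before and after normalization."""
    # Stage 1: adjust the trait for the covariate, then rank-normalize the residuals.
    normalized = rank_inverse_normal(residualize(trait, covariate))
    # Stage 2: regress the normalized residuals on genotype and covariate.
    # Residualizing both sides on the covariate gives the same genotype slope
    # as the joint regression (Frisch-Waugh-Lovell).
    y2 = residualize(normalized, covariate)
    g2 = residualize(genotype, covariate)
    return sum(g * y for g, y in zip(g2, y2)) / sum(g * g for g in g2)
```

Calling two_stage_genotype_effect(trait, covariate, genotype) with three equal-length numeric lists returns the estimated genotype effect on the rank-normalized scale. A real analysis would use several covariates and a mixed model.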
- Chen H, Huffman JE, Brody JA, Wang C, Lee S, Li Z,
Gogarten SM, Sofer T, Bielak LF, Bis JC, Blangero
J, Bowler RP, Cade BE, Cho MH, Correa A, Curran JE, de Vries PS, Glahn DC, Guo X, Johnson AD, Kardia S, Kooperberg C,
Lewis JP, Liu X, Mathias RA, Mitchell BD, O'Connell JR, Peyser PA, Post
WS, Reiner AP, Rich SS, Rotter JI, Silverman EK, Smith JA, Vasan RS, Wilson JG, Yanek LR; NHLBI Trans-Omics for
Precision Medicine (TOPMed) Consortium; TOPMed Hematology and Hemostasis Working Group,
Redline S, Smith NL, Boerwinkle E, Borecki IB, Cupples LA, Laurie CC, Morrison AC, Rice KM, Lin X.
Efficient Variant Set Mixed Model Association Tests for Continuous and
Binary Traits in Large-Scale Whole-Genome Sequencing Studies. Am J Hum Genet. 2019 Feb
7;104(2):260-274. doi:
10.1016/j.ajhg.2018.12.012. Epub 2019 Jan
10. PMID: 30639324
- Lumley T, Brody J, Peloso G, Morrison A, Rice K. FastSKAT: Sequence kernel association tests for very
large sets of markers. Genetic
epidemiology. 2018; NIHMSID: NIHMS972871 PMID: 29932245 PMCID:
PMC6129408
This paper (and the grant that
supported it) transfers recent results in random matrix theory to genetic
association work, speeding up calculations by orders of magnitude.
- Sondhi A, Rice KM. Fast permutation tests and related
methods, for association between rare variants and binary outcomes. Ann Hum Genet. 2017 Dec
18. PubMed PMID: 29250767.
- Brody JA, et al. Analysis commons, a team approach to
discovery in a big-data environment for genetic epidemiology. Nature genetics. 2017;
49(11):1560-1563. PMID: 29074945
- Sofer T, Heller R, Bogomolov
M, Avery CL, Graff M, et al. A powerful statistical framework for
generalization testing in GWAS, with application to the HCHS/SOL. Genet Epidemiol. 2017
Apr;41(3):251-258. PubMed PMID: 28090672; NIHMSID: NIHMS829142; PubMed
Central PMCID: PMC5340573.
- Castaldi PJ, Cho MH, Liang L,
Silverman EK, Hersh CP, et al. Screening for interaction effects in gene
expression data. PLoS One. 2017;12(3):e0173847.
PubMed PMID: 28301596; PubMed Central PMCID: PMC5354413
- Rich SS, Wang ZY, Sturcke A, Ziyabari L, Feolo M,
O'Donnell CJ, Rice K, Bis JC, Psaty BM. Rapid evaluation of phenotypes,
SNPs and results through the dbGaP CHARGE
Summary Results site. Nature genetics.
2016; 48(7):702-3. PMID: 27350599
- Sitlani CM, Dupuis J, Rice KM, Sun F, Pitsillides AN, Cupples LA,
Psaty BM. Genome-wide gene-environment interactions on quantitative traits
using family data. European
Journal of Human Genetics. 2016; 24(7):1022-8. PMID: 26626313 PMCID:
PMC5070904
- Sung YJ, Winkler TW, Manning AK, Aschard
H, Gudnason V, et al. An Empirical Comparison of
Joint and Stratified Frameworks for Studying GxE
Interactions: Systolic Blood Pressure and Smoking in the CHARGE
Gene-Lifestyle Interactions Working Group. Genet Epidemiol.
2016 Jul;40(5):404-15. PubMed PMID: 27230302; NIHMSID: NIHMS781298;
PubMed Central PMCID: PMC4911246.
- Chen H, Wang C, Conomos M, Stilp A, Li Z, Sofer T,
Szpiro A, Chen W, Brehm J, Celedon J, Redline S,
Papanicolaou G, Thornton T, Laurie C, Rice K, Lin X. Control for
population structure and relatedness for binary traits in genetic association
studies using logistic mixed models. American Journal of Human
Genetics. 2016; 98(4):653-66. PMID: 27018471 PMCID: PMC4833218
- Wang S, Zhao JH, An P, Guo X,
Jensen RA, Marten J, Huffman JE, Meidtner K,
Boeing H, Campbell A, Rice KM, Scott RA, Yao J, Schulze MB, Wareham NJ, Borecki IB, Province MA, Rotter JI, Hayward C, Goodarzi MO, Meigs JB, Dupuis J. General Framework for
Meta-Analysis of Haplotype Association Tests. Genetic
epidemiology. 2016; 40(3):244-52. NIHMSID: NIHMS789332 PMID: 27027517
PMCID: PMC4869684
- Sitlani CM, Rice KM, Lumley T, McKnight B, Cupples LA, Avery CL, Noordam R, Stricker BH, Whitsel EA, Psaty BM. Generalized estimating equations
for genome-wide association studies using longitudinal phenotype data. Stat Med. 2015 Jan
15;34(1):118-30. doi: 10.1002/sim.6323. Epub 2014 Oct 9. PubMed PMID: 25297442.
- Li S, Mukherjee B, Taylor JM, Rice KM, Wen X, Rice JD,
Stringham HM, Boehnke M. The role of
environmental heterogeneity in meta-analysis of gene-environment
interactions with quantitative traits. Genet Epidemiol.
2014 Jul;38(5):416-29. doi: 10.1002/gepi.21810.
Epub 2014 May 6. PubMed PMID: 24801060; PubMed
Central PMCID: PMC4108593.
- Gogarten SM, Bhangale T,
Conomos MP, Laurie CA, McHugh CP, Painter I, Zheng X, Crosslin
DR, Levine D, Lumley T, Nelson SC, Rice K, Shen J, Swarnkar
R, Weir BS, Laurie CC (2012) GWASTools: an R/Bioconductor package for quality control and
analysis of Genome-Wide Association Studies. Bioinformatics. Oct 10.
PMID: 23052040
- Voorman A, Rice K, Lumley T.
(2012) Fast computation for genome-wide association studies using boosted
one-step statistics. Bioinformatics.
Jul 15;28(14):1818-22. PMID: 22592383 PMCID: PMC3389774
- Voorman A, Lumley T, McKnight
B, Rice K (2011) Behavior of QQ-Plots and Genomic Control in Studies of
Gene-Environment Interaction. PLoS
ONE 6(5): e19416. doi:10.1371/journal.pone.0019416
PMID: 21589913 PMCID: PMC3093379
- Divers J, Redden DT, Rice KM, et al (2011) Comparing
self-reported ethnicity to genetic background measures in the context of
the Multi-Ethnic Study of Atherosclerosis (MESA). BMC Genetics 12(3)
Article Number 28. PMID: 21375750 PMCID: PMC3068121
- Manning A, LaValley M, Liu C,
Rice K, An P, Liu Y, Miljkovic I, Rasmussen-Torvik L, Harris T, Province M, Borecki
I, Florez J, Meigs J, Cupples
L, Dupuis J (2011) Meta-analysis of Gene-Environment interaction: joint
estimation of SNP and SNP x Environment regression coefficients, Genetic Epidemiology,
35(1) 11-18. PMID: 21181894
- Buzkova P, Lumley T, Rice K
(2011) Permutation and parametric bootstrap tests for gene-gene and
gene-environment interactions, Annals of Human Genetics,
75(1) 36-45. doi: 10.1111/j.1469-1809.2010.00572.x. PMID: 20384625
This article uses testing theory to
show how, contrary to several published claims, there is no exact
permutation approach for testing certain hypotheses about interactions.
Different approaches for better (but non-exact) permutation tests are also
considered.
- Laurie C, Doheny K, Mirel D,
et al (2010) Quality control and quality assurance in genotypic data for
genome-wide association studies. Genetic
Epidemiology, 34(6) 591-602. PMID: 20718045, PMCID: PMC3061487
- Lumley T, Rice K (2010) Potential for Revealing
Individual-Level Information in Genome-wide Association Studies. JAMA
303(7) 659-660. PMID: 20159874
This commentary uses concepts from within-sample
prediction, scaled up to genome-wide scale, to show how publishing summary
results can “leak” individual-level phenotype information, potentially
violating study participants’ informed consent.
- French B, Lumley T, Monks SA, Rice KM, Hindorff LA, Reiner AP, Psaty BM (2006) Simple
estimates of haplotype relative risks in case-control data. Genetic Epidemiology 30
(6): 485-494. PMID: 16755519
- Rice K, Holmans P. (2003)
Allowing for genotyping error in analysis of unmatched case-control
studies. Annals of
Human Genetics, 67(2), 165-174. PMID: 12675691