Here are some references that discuss regression model selection/specification:

BIOLOGY
Posada D, Crandall KA (2001) "Selecting the best-fit model of nucleotide substitution" Syst Biol, 50(4): 580-601.
Link WA, Barker RJ (2006) "Model weights and the foundations of multimodel inference" Ecology, 87(10): 2626-35.
Whittingham MJ, Stephens PA, Bradbury RB, Freckleton RP (2006) "Why do we still use stepwise modelling in ecology and behaviour?" Journal of Animal Ecology, 75(5): 1182-9.
Hoeting JA, Davis RA, Merton AA, Thompson SE (2006) "Model selection for geostatistical models" Ecol Appl, 16(1): 87-98.
Johnson JB, Omland KS (2004) "Model selection in ecology and evolution" Trends in Ecology and Evolution, 19(2): 101-8.
Alfaro ME, Huelsenbeck JP (2006) "Comparative performance of Bayesian and AIC-based measures of phylogenetic model uncertainty" Syst Biol, 55(1): 89-96.
Berrar D, Bradbury I, Dubitzky W (2006) "Avoiding model selection bias in small-sample genomic datasets" Bioinformatics, 22(10): 1245-50.
Leuenberger C, Wegmann D (2009) "Bayesian computation and model selection without likelihoods" Genetics, (Epub ahead of print).

PSYCHOLOGY
Wasserman L (2000) "Bayesian model selection and model averaging" J Math Psychol, 44(1): 92-107.
Myung IJ (2000) "The importance of complexity in model selection" J Math Psychol, 44(1): 190-204.
Pitt MA, Myung IJ, Zhang S (2002) "Toward a method of selecting among computational models of cognition" Psychol Rev, 109(3): 472-491.
Forster MR (2000) "Key concepts in model selection: performance and generalizability" J Math Psychol, 44(1): 205-231.
Pitt MA, Myung J (2002) "When a good fit can be bad" Trends in Cognitive Sciences, 6(10): 421-425.
Bozdogan H (1987) "Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions" Psychometrika, 52(3): 345-370.

EPIDEMIOLOGY
Greenland S (1989) "Modeling and variable selection in epidemiologic analysis" American Journal of Public Health, 79(3): 340-349.
Mickey RM, Greenland S (1989) "The impact of confounder selection criteria on effect estimation" American Journal of Epidemiology, 129(1): 125-137.
Bagley SC, White H, Golomb BA (2001) "Logistic regression in the medical literature: standards for use and reporting, with particular attention to one medical domain" Journal of Clinical Epidemiology, 54(10): 979-985.
Sun G-W, Shook TL, Kay GL (1996) "Inappropriate use of bivariable analysis to screen risk factors for use in multivariable analysis" Journal of Clinical Epidemiology, 49(8): 907-916.
Maldonado G, Greenland S (1993) "Simulation study of confounder-selection strategies" American Journal of Epidemiology, 138(11): 923-936.
Budtz-Jorgensen E, Keiding N, Grandjean P, Weihe P (2007) "Confounder selection in environmental epidemiology: assessment of health effects of prenatal mercury exposure" Annals of Epidemiology, 17(1): 27-35.
Royston P, Sauerbrei W (2005) "Building multivariable models with continuous covariates in clinical epidemiology" Methods Inf Med, 44(4): 561-71.
Greenland S (1993) "Methods for epidemiologic analyses of multiple exposures: a review and comparative study of maximum-likelihood, preliminary-testing, and empirical-Bayes regression" Statistics in Medicine, 12(8): 717-36.
Harbord RM, Whiting P, Sterne JA, Egger M, Deeks JJ, Shang A, Bachmann LM (2008) "An empirical comparison of methods for meta-analysis of diagnostic accuracy showed hierarchical models are necessary" J Clin Epi, 61(11): 1095-1103.

APPLIED STATISTICS
Chatfield C (1995) "Model uncertainty, data mining, and statistical inference" J R Statistical Soc Series A, 158: 419-466.
Pan W (2001) "Model selection in estimating equations" Biometrics, 57(2): 529-34.
Pocock SJ, Assmann SE, Enos LA, Kasten LE (2002) "Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems" Statistics in Medicine, 21: 2917-2930.
Collett D, Stepniewska K (1999) "Some practical issues in binary data analysis" Statistics in Medicine, 18: 2209-2221.
Steyerberg EW, Eijkemans MJC, Harrell FE, Habbema JDF (2000) "Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets" Statistics in Medicine, 19: 1059-1079.
Zheng B, Agresti A (2000) "Summarizing the predictive power of a generalized linear model" Statistics in Medicine, 19: 1771-1781.
Hurvich CM, Tsai C-L (1990) "The impact of model selection on inference in linear regression" American Statistician, 44(3): 214-217.
Touloumi G, Samoli E, Pipikou M, LeTertre A, Atkinson R, Katsouyanni K (2006) "Seasonal confounding in air pollution and health time-series studies: effect on air pollution effect estimates" Statistics in Medicine, 25(24): 4164-78.
Raftery AE (1995) "Bayesian model selection in social research" Sociological Methodology, 25: 111-163.
Chen MH, Huang L, Ibrahim JG, Kim S (2008) "Bayesian variable selection and computation for generalized linear models with conjugate priors" Bayesian Analysis, 3(3): 585-614.
Hu J, Johnson VE (2008) "Bayesian model selection using test statistics" JRSS-B, 71(1): 143-158.
Wang D, Zhang W, Bakhai A (2004) "Comparison of Bayesian model averaging and stepwise methods for model selection in logistic regression" Statistics in Medicine, 23(22): 3451-67.
Ni X, Zhang D, Zhang HH (2009) "Variable selection for semiparametric mixed models in longitudinal studies" Biometrics, (Epub ahead of print).
Li R, Liang H (2008) "Variable selection in semiparametric regression modeling" Annals of Statistics, 36(1): 261-286.
Alber SA, Weiss RE (2009) "A model selection approach to analysis of variance and covariance" Statistics in Medicine, 28(13): 1821-40.

last updated 06 Jan 2010