## My useful and productive page

Teaching --- Methodology --- Talks --- Real Statistics --- Statistical Computing

### Methodological Research

Greater precision should come from weighted estimation that takes advantage of simple models for the correlation structure, but still uses model-free variance estimators to protect the validity of inference. This is very similar to MQL; the main advance is being able to do it without inverting or storing big matrices.

I'm also interested in empirical process theory and semiparametrics for dependent data but, as Barbie famously said, "Math Is Hard".

Indirect comparisons. Suppose you compare A to B and B to C. What can you conclude about A vs C? I've been looking at two aspects of this. The first is meta-analysis of clinical trials taking into account indirect as well as direct comparisons. The hard part is getting the estimation to fail when it should. A worked example of this network meta-analysis is now available.

The second aspect is the question of when two-sample tests are transitive. For the t-test, if the t-statistic comparing A and B and the t-statistic comparing B and C are both positive, then the t-statistic comparing A and C is also positive, because the sign of a t-statistic is the sign of the difference in sample means. For the Wilcoxon test this need not happen. The problem is to characterize the transitive two-sample tests.
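A minimal sketch of how the Wilcoxon test can fail to be transitive. The three samples are made up for illustration (they are a standard set of non-transitive dice, not real data), and `centered_u` is a hypothetical helper written for this example rather than a function from any package. It computes the Mann-Whitney form of the Wilcoxon rank-sum statistic, centered at its null expectation, so that its sign matches the sign of the standardized test statistic.

```python
from itertools import product


def centered_u(x, y):
    """Mann-Whitney U for x versus y, minus its null expectation.

    Counts the pairs (a, b) with a > b, with ties counting one half,
    then subtracts len(x) * len(y) / 2. A positive value means that
    values in x tend to be larger than values in y.
    """
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for a, b in product(x, y))
    return u - len(x) * len(y) / 2


# Illustrative samples only (non-transitive dice), not real data.
A = [2, 2, 4, 4, 9, 9]
B = [1, 1, 6, 6, 8, 8]
C = [3, 3, 5, 5, 7, 7]

ab = centered_u(A, B)  # A compared with B
bc = centered_u(B, C)  # B compared with C
ac = centered_u(A, C)  # A compared with C

print("A vs B:", ab)
print("B vs C:", bc)
print("A vs C:", ac)
```

Here the statistic is positive for A vs B and for B vs C but negative for A vs C. All three samples have the same mean, so no difference in means is driving the result; the cycle comes entirely from the pairwise ordering that the rank statistic measures.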

Source apportionment. Given multivariate time series of chemical or size composition of air pollution particles, the source apportionment problem is to work out how much of the pollution comes from which source. There are a number of methods, but the statistical properties of all of them are unknown.

### Real Statistics

i.e., stuff with live data. A bit of particulate air pollution stuff as above, and at the Northwest Center for Particulate Matter and Health. I also work at the Cardiovascular Health Study, a big study of the risk factors for cardiovascular disease in older people, and at the Cardiovascular Health Research Unit, which is interested in drug-gene interactions and related topics.

### Statistical computing and graphics

This is mostly Free (open-source) software.

XLISP-Stat has nice dynamic graphics and is quite fast. It is allegedly dead, but that doesn't make it any less useful.

I have an implementation of Generalised Estimating Equation models for XLISP-Stat. It includes diagnostic plots and a wide range of link, variance and correlation options. I even have documentation for it. If you think I should be implementing random intercept models instead, then see why I disagree. There's also Cox regression.

R is a free interpreter for a dialect of the S language. The initial system came from Ross Ihaka and Robert Gentleman in Auckland, but R now has a core development team of about a dozen people spread across the world (including me). It is available for Windows and for Unix systems from the Comprehensive R Archive Network.

One of my main projects in R is a package for complex survey analysis. It's fairly general, but of course is a bit slow, especially if you don't have enough memory.

Another project is a system for implementing and evaluating anomaly detection algorithms in syndromic surveillance.

I have a few thoughts on cryptography here.

Other things I have worked on can be deduced from my CV.


Thomas Lumley