The (formula) function takes a quoted list describing a model as its argument and returns (design-matrix names block-names block-indices) where design-matrix is a design matrix, names is a list of names for its columns, block-names is a list of names for multicolumn model terms such as categorical predictors and block-indices is a list of lists of integers describing which columns of the design matrix comprise which blocks.
The canonical form for input is like this
(formula '((factor a) (factor b) (term x) (interaction (list a b)) (interaction (list a x) :is-factor (list t nil))))
The function extracts names from the quoted list so if you try something clever to save typing like storing (interaction (list a x) :is-factor (list t nil)) in a variable you may get silly results. Don't do that, then.
You can change the default variable names (though not the default block names) using the relevant keyword arguments in factor and interaction. See the documentation for the functions (below) and the example to find out how. You need to specify the new name everywhere in order to make the labels come out consistent. It's probably easier to use the right variable name to begin with.
Note that the :is-factor keyword in (interaction) is optional but that the default is to assume everything is a factor. If one of the variables is continuous and you forget :is-factor the results will be obviously wrong (even if you don't run out of stack space). This is the best I can do with a function that accepts lists of numbers as input.
(formula-display) takes a vector of coefficient estimates, a covariance matrix and the output of (formula) and displays the Wald chisquare test for each block (factor, term or interaction), with or without the separate estimates and p-values for each variable.
(formula-display beta cov-mat model-formula &key (block-only t) (intercept t)) beta is the coefficient estimates, cov-mat is the covariance matrix, model-formula is the output of the original formula command. Set block-only to nil to get tests for each separate variable. If the model does not have an intercept set intercept to nil. It is obviously better to have this installed as a method for your regression objects.
Example: > (def a (repeat (iseq 1 2) 30)) A > (def b (repeat (iseq 1 3) 20)) B >(def x (normal-rand 60)) X > (def y (+ x a (normal-rand 60))) Y > (def result (formula '((term x) (factor a) (factor b) (interaction (list a b))))) RESULT
> (def model1 (regression-model (first result) y :predictor-names (second result))) Least Squares Estimates: Constant 0.867491 (0.308863) ((A 2) (B 2)) -0.481680 (0.621703) ((A 2) (B 3)) -0.666604 (0.624665) (B 2) 0.782557 (0.441825) (B 3) 0.538866 (0.444595) (A 2) 1.16702 (0.437606) X 1.30912 (0.155051) R Squared: 0.634662 Sigma hat: 0.976272 Number of cases: 60 Degrees of freedom: 53 MODEL1
> (def model1.cov (* (^ (send model1 :sigma-hat) 2) (send model1 :xtxinv))) MODEL1.COV > (formula-display (send model1 :coef-estimates) model1.cov result :block-only nil) Block Wald Chisq p-value X 71.287 0.0000 Variable Estimate Std.Err. p-value X 1.3091 (0.155051) 0.0000 A 7.1119 0.0077 Variable Estimate Std.Err. p-value (A 2) 1.1670 (0.437606) 0.0077 B 3.3431 0.1880 Variable Estimate Std.Err. p-value (B 3) 0.53887 (0.444595) 0.2255 (B 2) 0.78256 (0.441825) 0.0765 (INTERACTION A B) 1.2326 0.5399 Variable Estimate Std.Err. p-value ((A 2) (B 3)) -0.66660 (0.624665) 0.2859 ((A 2) (B 2)) -0.48168 (0.621703) 0.4385 > (formula-display (send model1 :coef-estimates) model1.cov result) Block Wald Chisq p-value X 71.287 0.0000 A 7.1119 0.0077 B 3.3431 0.1880 (INTERACTION A B) 1.2326 0.5399 NIL
Now we recode the predictors and relabel.
> (def a (repeat (list "Male" "Female") 30)) A > (def b (repeat (list "Never" "Past" "Current") 20)) B > (def result1 (formula '((term x :namebase "Age") (factor a :namebase "Sex") (factor b :namebase "Smoking") (interaction (list a b) :namebases (list "Sex" "Smoking"))))) RESULT1 > (formula-display (send model1 :coef-estimates) model1.cov result1) Block Wald Chisq p-value Age 71.287 0.0000 Sex 7.1119 0.0077 Smoking 3.3431 0.1880 (INTERACTION Sex Smoking) 1.2326 0.5399 NIL
> (formula-display (send model1 :coef-estimates) model1.cov result1 :block-only nil) Block Wald Chisq p-value Age 71.287 0.0000 Variable Estimate Std.Err. p-value Age 1.3091 (0.155051) 0.0000 Sex 7.1119 0.0077 Variable Estimate Std.Err. p-value (Sex Male) 1.1670 (0.437606) 0.0077 Smoking 3.3431 0.1880 Variable Estimate Std.Err. p-value (Smoking Past) 0.53887 (0.444595) 0.2255 (Smoking Never) 0.78256 (0.441825) 0.0765 (INTERACTION Sex Smoking) 1.2326 0.5399 Variable Estimate Std.Err. p-value ((Sex Male) (Smoking Past)) -0.66660 (0.624665) 0.2859 ((Sex Male) (Smoking Never)) -0.48168 (0.621703) 0.4385 NIL