See Concept Index for: ibd-based tests.
lm_ibdtests and civil
See References, for details of the cited papers.
The program lm_ibdtests
uses identity-by-descent (ibd) based and
likelihood-ratio based statistics to construct linkage detection tests. The
current version allows only discrete trait data (affected or unaffected or
unknown phenotypic status).
The ibd scoring approach involves construction of an ibd measure (T) that is a function of the inheritance vectors and affectation status of the individuals in pedigrees. The program uses realizations of the inheritance vectors conditional only on the marker data (Y) to compute a Monte Carlo estimate of the test statistic E(T|Y). Four different ibd measures are implemented in the program. Two of these measures, T=Slambda and T=Saffunaff (developed by Saonli Basu), allow incorporation both of affected and of unaffected individuals in the analysis. The test statistic is used to test the null hypothesis of no linkage between the trait and a set of markers. For this approach, two different testing options have been implemented; one is a normality-based test and the other is a permutation test. The permutation test keeps the observed marker data unchanged and permutes the affectation status. In the normality-based test, test statistics (T=Spairs, for example) are computed for each realization and averaged over realizations. The program then reports the p-values from each test at the marker loci. For more details of these methods, see [Bas08].
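The permutation option described above can be sketched in a few lines. This is a minimal illustration, not the lm_ibdtests implementation: the function names, the toy pairwise measure, and the `sharing` matrix (assumed here to hold a Monte Carlo average of ibd sharing for each pair of individuals, given the marker data) are all hypothetical, and the sketch ignores pedigree structure when permuting.

```python
import random

def s_pairs(affected, sharing):
    """Toy Spairs-like measure: total ibd sharing over all pairs of affected individuals."""
    ids = [i for i, is_affected in enumerate(affected) if is_affected]
    return sum(sharing[i][j] for k, i in enumerate(ids) for j in ids[k + 1:])

def permutation_p_value(affected, sharing, n_perm=999, seed=0):
    """Keep the (marker-based) sharing fixed and permute the affectation status."""
    rng = random.Random(seed)
    t_obs = s_pairs(affected, sharing)
    labels = list(affected)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if s_pairs(labels, sharing) >= t_obs:
            exceed += 1
    # Assumption: the observed statistic is counted among the permutations.
    return (1 + exceed) / (n_perm + 1)

# Toy data: 10 individuals, the first 4 affected and sharing fully with each other.
n = 10
sharing = [[1.0 if (i < 4 and j < 4) else 0.0 for j in range(n)] for i in range(n)]
affected = [i < 4 for i in range(n)]
p = permutation_p_value(affected, sharing)
```

With 999 permutations the smallest attainable p-value in this sketch is 1/1000, which is one reason a small number of permutations gives only a coarse p-value.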
A new (lambda,p) model has been implemented in lm_ibdtests. The (lambda,p)
model describes the trait-dependent segregation of inheritance vectors at a
locus given the trait data on individuals, and constructs a chi-square test
for linkage detection. The (lambda,p) model incorporates both affected and
unaffected individuals in the analysis. The delta model is also implemented in
the program. The current version of lm_ibdtests allows only the ibd measure
T=Spairs in the delta model set-up. The program returns the p-values of the
likelihood-ratio statistics under each of these two models. For a detailed
description of the (lambda,p) and delta models, see [Bas10]. For a real data
analysis using lm_ibdtests, see [Sie05].
The program civil is due to Yanming Di; see [DT09]. It is still a beta-test
version. The program performs marginal and conditional inheritance vector tests
for linkage detection and localization. The name civil is an acronym for
Conditional Inheritance Vector test In Linkage analysis.
In an inheritance vector test, the test statistic is a score that measures the
connection between the observed trait values and the inheritance vector at the
test position. An excess of such connection provides evidence for genetic
linkage. civil implements two such scores: a variance component type score
(the vc-score) and a score developed by Yanming Di (the w-score).
civil computes marginal and conditional test p-values using a Monte Carlo
method: to approximate the null distributions of the test statistics, the
program holds the trait values fixed and resamples the inheritance vectors. In
genomic regions free of causal genetic variants, the inheritance vectors along
a chromosome should follow a Markov chain distribution. In a marginal test, the
null inheritance vectors are sampled from the marginal distribution of the
Markov chain, which is uniform over the set of all possible inheritance
vectors (see Introduction to lm_auto gl_auto and lm_pval).
In a conditional inheritance vector test, the inheritance vectors
are sampled from the conditional distribution of the inheritance vector at the
test position given the observed inheritance vectors at the two conditioning
positions, as determined by the Markov chain distribution.
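The conditional sampling step can be illustrated for a single meiosis. The sketch below assumes no genetic interference, so that each meiosis indicator is an independent two-state Markov chain that switches between adjacent positions with probability equal to the recombination fraction, and it uses the Haldane map function to convert distances. The function names are hypothetical and this is not the code used by civil.

```python
import math
import random

def haldane_theta(distance_cm):
    """Recombination fraction for a map distance in centimorgans (Haldane map function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))

def conditional_prob_one(left, right, theta_l, theta_r):
    """P(S_test = 1 | S_left = left, S_right = right) for one meiosis.

    theta_l and theta_r are the recombination fractions between the test
    position and the left and right conditioning positions (0 < theta <= 0.5).
    """
    def step(a, b, theta):
        # Transition probability of the two-state chain from state a to state b.
        return 1.0 - theta if a == b else theta
    w1 = step(left, 1, theta_l) * step(1, right, theta_r)
    w0 = step(left, 0, theta_l) * step(0, right, theta_r)
    return w1 / (w0 + w1)

def sample_conditional_vector(left_vec, right_vec, theta_l, theta_r, rng):
    """Draw an inheritance vector at the test position, one meiosis at a time."""
    return [1 if rng.random() < conditional_prob_one(a, b, theta_l, theta_r) else 0
            for a, b in zip(left_vec, right_vec)]

rng = random.Random(0)
theta = haldane_theta(10.0)
draw = sample_conditional_vector([0, 1, 1, 0], [0, 1, 0, 1], theta, theta, rng)
```

When the two conditioning positions agree for a meiosis and are close to the test position, the conditional probability concentrates on the shared value, which is why wide conditioning intervals leave more randomness in the null distribution.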
A significant conditional test result provides linkage localization information: it suggests that a linkage signal exists in the region bounded by the two conditioning positions, and the conditional p-value gives the false positive probability. A significant marginal test result does not allow such an interpretation. For conditional tests, there is a trade-off between power and precision. When the two conditioning positions are farther apart, the conditional test will be more powerful, but a significant conditional test result will provide less precise localization information.
See Concept Index for: lm_ibdtests introduction, civil introduction,
vc-score and w-score.
lm_ibdtests parameter file
The example parameter file for lm_ibdtests, ‘ped73_ibdt_IBD.par’,
may be found in the ‘TraitTests’ subdirectory of ‘MORGAN_Examples’.
Several lines in the example parameter file have been explained in previous
sections of the tutorial; only the sections requiring additional explanation
are shown below.
    sample by scan
    set L-sampler probability 0.5
    set burn-in iterations 1000
    check progress MC iterations 1000
    compute ibd statistics
    set ibd measures Spairs Srobdom
    set ibd tests norm permu
    set ibd permutations 999
    compute scores every 100 iterations
The statement ‘sample by scan’ indicates that all loci or all meioses are updated successively in an order determined by random permutation. The alternative ‘sample by step’ updates only one locus (L-sampler) or one meiosis (M-sampler) in each iteration. The ‘set L-sampler probability’ statement specifies that an L-sampler step/scan will be used at each MCMC iteration with probability 0.5: otherwise the single-meiosis M-sampler will be used. The ‘set burn-in iterations’ statement specifies 1000 iterations to be performed initially, with one trait locus (if any) unlinked to the marker map. The ‘check progress’ statement instructs the program to print the current iteration number to ‘stdout’ every 1000 iterations.
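The two sampling choices described above (a random scan order, and a random choice between the L-sampler and the M-sampler at each iteration) can be pictured with the following toy sketch. It only illustrates the control flow; the function names are hypothetical and nothing here reproduces MORGAN's samplers.

```python
import random

def scan_order(n_units, rng):
    """Random permutation of the loci (or meioses) to be updated in one scan."""
    order = list(range(n_units))
    rng.shuffle(order)
    return order

def choose_sampler(l_prob, rng):
    """Return "L" with probability l_prob, otherwise "M"."""
    return "L" if rng.random() < l_prob else "M"

rng = random.Random(1)
for iteration in range(3):
    kind = choose_sampler(0.5, rng)   # corresponds to 'set L-sampler probability 0.5'
    order = scan_order(5, rng)        # corresponds to 'sample by scan'
    print(iteration, kind, order)
```

Under ‘sample by step’, the analogue would be to update only the first element of such an order in each iteration.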
The ‘compute ibd statistics’ statement must be included in the parameter
file when running lm_ibdtests. The next line instructs the program to use
Spairs and Srobdom to perform the ibd tests. The ‘set ibd tests’
command calls for both normal and permutation tests to be run. The next line is
needed since permutation tests were requested in the previous line; it
specifies how many permutations are to be used in the calculations. In this
case, the default (999) is specified; it is recommended that at least 50
permutations are used. The last line in the parameter file specifies how often
scores are computed; the default is every MCMC iteration.
See Concept Index for: sample parameter file for lm_ibdtests.
lm_ibdtests output
Under the subdirectory ‘TraitTests’, run the example with the following command:
    ./lm_ibdtests ped73_ibdt_IBD.par
The part of the output that tabulates test statistics and p values is shown below. The upper table provides the permutation-test p-values for each of the two test statistics Spairs and Srobdom at each of the 10 marker-locus positions, these positions being given for both the male and female genetic maps. It is apparent that there is no significant association of the trait with any of these marker positions; the p-values at markers 5 and 6 are somewhat smaller, but do not achieve (e.g.) a 0.05 significance level. The lower table gives the same result, but this time using a Normal distribution approximation to obtain the p-value. In this case the standardized (N(0,1)) value of the test statistic is given, as well as the corresponding p-value. Again there are no significant results in this small example. There is a broad qualitative correspondence between the p-values of the two tables, but the results are not close. This may be due to the small number of permutations used, or, more likely, due to the inadequacies of the Normal approximation.
    ************************************
    p Value for Permutation Test for IBD
    ************************************
                  pos(Haldane cM)      Spairs    Srobdom
    locus         male      female     p-value   p-value
    marker-1      0.000     0.000      0.9020    0.9300
    marker-2      10.000    10.000     0.8780    0.8450
    marker-3      20.000    20.000     0.8130    0.7800
    marker-4      30.000    30.000     0.5080    0.5190
    marker-5      40.000    40.000     0.2550    0.2480
    marker-6      50.000    50.000     0.2950    0.2510
    marker-7      60.000    60.000     0.3850    0.5090
    marker-8      70.000    70.000     0.5100    0.6660
    marker-9      80.000    80.000     0.6610    0.7750
    marker-10     90.000    90.000     0.5640    0.7470

    *******************************
    p Value for Normal Test for IBD
    *******************************
                  pos(Haldane cM)
    locus         male      female     Spairs    p-value   Srobdom   p-value
    marker-1      0.000     0.000      -0.7843   0.7951    -0.2867   0.6167
    marker-2      10.000    10.000     -0.9574   0.8166    -0.3841   0.6567
    marker-3      20.000    20.000     -1.1825   0.8816    -0.2260   0.5692
    marker-4      30.000    30.000     -0.6437   0.7381    -0.1272   0.5552
    marker-5      40.000    40.000      0.2478   0.4103     0.0986   0.4743
    marker-6      50.000    50.000     -0.2270   0.5752    -0.3275   0.6252
    marker-7      60.000    60.000     -0.1503   0.5612    -0.3514   0.6437
    marker-8      70.000    70.000     -0.3096   0.6372    -0.3587   0.6557
    marker-9      80.000    80.000     -0.4877   0.6902    -0.2706   0.6037
    marker-10     90.000    90.000     -0.2924   0.6222    -0.1136   0.5662
Your values may differ because of different random seeds in your seed file.
For more details about the lm_ibdtests methods, see [Bas08].
See Concept Index for: lm_ibdtests sample output.
civil parameter file
civil bases its tests on the inheritance vectors at the test or conditioning
positions. Since these are not observable, a randomized-test strategy is used.
To perform marginal and conditional tests using civil, the user must first run
the MORGAN program gl_auto to draw an MCMC sample of the inheritance vectors
jointly at all genomic positions involved, including all possible test
positions and conditioning positions. For either the marginal or the
conditional test, at each test position, civil will compute N test statistic
values and N p-values, one for each MCMC realization of inheritance vectors,
where N is the size of the MCMC sample. The collection of the N p-values
provides an empirical distribution of a randomized (or latent) p-value.
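The randomized-test strategy can be sketched as a loop over the MCMC realizations, each of which yields its own Monte Carlo p-value. The sketch below uses the marginal (uniform) null, a placeholder score, and a p-value that counts the observed statistic among the null draws; all three choices, and all names, are assumptions made for illustration and are not taken from civil.

```python
import random

def mc_p_value(t_obs, null_stats):
    """Monte Carlo p-value: (1 + #{null >= observed}) / (n_mc + 1)."""
    exceed = sum(1 for t in null_stats if t >= t_obs)
    return (1 + exceed) / (len(null_stats) + 1)

def randomized_p_values(realizations, score, n_mc, rng):
    """One p-value per MCMC realization of the inheritance vector at a test position."""
    p_values = []
    for vec in realizations:
        # Marginal null: every bit of the inheritance vector is 0 or 1 with equal probability.
        null_stats = [score([rng.randint(0, 1) for _ in vec]) for _ in range(n_mc)]
        p_values.append(mc_p_value(score(vec), null_stats))
    return p_values

toy_score = sum  # placeholder score; the vc-score and w-score also depend on the trait values
rng = random.Random(2019)
realizations = [[rng.randint(0, 1) for _ in range(12)] for _ in range(10)]  # N = 10
p_values = randomized_p_values(realizations, toy_score, n_mc=199, rng=rng)
```

The list `p_values` plays the role of the N p-values described above; its empirical distribution, rather than any single entry, is what carries the evidence.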
Typically, five files are required for running civil:
*.par, *.xtra, *.ped, *.markers, and *.oscor.
An optional seed file can also be used.
The parameter file ‘*.par’ for civil should be based on the one used by
gl_auto to generate the MCMC realizations of the segregation indicators. It
should include MORGAN statements about pedigrees, quantitative traits, markers
and sampler seeds. Additional information on the gl_auto output file and on
the marginal and conditional test setup is specified in an extra parameter
file ‘*.xtra’ and provided to civil through the ‘input extra file’ statement.
For example, in the civil parameter file ‘Autozyg/Gold/civil.vc.par’, the
pedigree and marker information is specified as
    input pedigree file 'civil.ped'
    input marker data file 'civil.markers'
    select all markers
The pedigree and marker information should be the same as in the
gl_auto par file, except that civil requires a quantitative trait to be
specified, so a column of quantitative trait values needs to be added to the
input pedigree file if it is not already there.
In the same par file, a quantitative trait is specified as
    select trait 2
    set trait data quantitative
    input pedigree record trait 2 real 3
    set trait 2 tloc 12
    set trait 2 for tloc 12 genotype means 0.2000000, 4.9000000, 9.6000000
    set trait 2 additive variance 2.0
    set trait 2 residual variance 15.0
    set tloc 12 allele freqs 0.3 0.7
    map test tloc 12 all interval proportions 0.3 0.7
    map test tloc 12 external recomb fracts 0.1 0.3 0.45
The two ‘map test tloc’ statements are required by MORGAN, but the numbers in
those lines will not be used by civil. The values of ‘additive variance’
and ‘residual variance’ specified here will be used by civil only when
‘use_sample_sd’ is set to ‘no’ in the extra parameter file (see below).
The ‘genotype means’ will be used only if ‘use_sample_mean’ is set to ‘no’
in the extra parameter file.
Additional information about the marginal and conditional test setup is
provided to civil through an ‘extra file’.
    input extra file 'civil.vc.xtra'
The outline of the extra file is as follows (for an example, see ‘Autozyg/Gold/civil.vc.xtra’):
    ## inheritance vector file name (.oscor file)
    civil.oscor
    ## output file directory
    .
    ## output file keyword
    civil
    ## info on the oscor file ...
    n_mcmc 10
    order 0
    ## trait model parameters ...
    pD 0.3
    use_sample_mean yes
    mu 0
    use_sample_sd yes
    ## marginal test parameters
    test_statistic vc
    n_mc 9999
    n_pos 101
    test_pos 0 4 8 12 ...
    ## conditional test parameters
    test_statistic vc
    n_mc 999
    n_pos 81
    test_pos 40 44 48 ...
    test_pos_l 0 4 8 ...
    test_pos_r 80 84 88 ...
The first 6 lines provide the name of the gl_auto output file (line 2), the
name of the output directory (line 4), and a keyword for naming the output
files (line 6). civil will create four output files, suffixed by
‘*.miv.p.out’, ‘*.miv.t.out’, ‘*.civ.p.out’, and ‘*.civ.t.out’, in the output
directory. The four files store marginal and conditional test statistic values
and p-values.
The section following ‘## info on the oscor file ...’ specifies the number of
MCMC scans in the gl_auto output file and whether the output is arranged by
component or not, with 1 meaning yes and 0 no. If the lines in the gl_auto
output are arranged by component, they will be rearranged so that they are
ordered by MCMC scan, and a new file will be created to store the rearranged
output.
The section following ‘## trait model parameters ...’ specifies the rare
allele frequency of the putative causal variant and specifies how to estimate
the mean trait value and the residual standard error for the trait values. If
‘use_sample_mean yes’, then civil will use the raw sample mean to estimate the
mean trait value; otherwise the mean value specified in the next line will be
used. If ‘use_sample_sd yes’, then civil will use the sample sd to estimate the
residual standard error; otherwise the residual standard error will be
estimated by sqrt(residual variance + additive variance) using values provided
in the main civil parameter file.
The section following ‘## marginal test parameters’ specifies the test
statistic, the number of Monte Carlo runs for simulating the null distribution
(not to be confused with the count of MCMC realizations in the gl_auto
output scores file), the number of tests requested and the indices to the test
positions for the marginal tests. Currently, two test statistic options
‘vc’ and ‘w’ are available. In this example par file, we ask civil to perform
101 marginal tests at positions indexed by 0, 4, 8, ..., 400.
The section following ‘## conditional test parameters’ specifies the test
statistic, the number of Monte Carlo runs for simulating the null
distribution, the number of tests requested, the indices to the test positions,
and the indices to the left and right conditioning positions (one line for each
set of positions) for conditional tests. In this example par file, we ask civil
to perform 81 conditional tests. The first conditional test will be at the
position indexed by 40 and will be conditioned on positions 0 and 80.
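Because the extra file lists every position index explicitly, it can be convenient to generate those lines rather than type them. The following sketch assumes that the elided indices in the example continue with the same spacing of 4 as the first few listed values; that spacing, and the helper name `index_line`, are assumptions and not something the example file states.

```python
def index_line(keyword, start, step, count):
    """Return the list of indices and the matching one-line entry for the xtra file."""
    positions = [start + step * k for k in range(count)]
    return positions, keyword + " " + " ".join(str(p) for p in positions)

# Marginal tests: n_pos 101, starting at index 0.
marginal, marginal_line = index_line("test_pos", 0, 4, 101)

# Conditional tests: n_pos 81, each test position flanked by its two conditioning positions.
cond_test, cond_line = index_line("test_pos", 40, 4, 81)
cond_left, left_line = index_line("test_pos_l", 0, 4, 81)
cond_right, right_line = index_line("test_pos_r", 80, 4, 81)

print(marginal_line[:30], "...")
print(cond_line[:30], "...")
```

Each generated line stays on a single line, which matters because the extra file does not allow an entry to be broken across lines.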
Note that the test positions have to be a subset of marker positions. The idea
is to run gl_auto
using a set of dense markers that should include all potential test
and conditioning positions, although not necessarily all markers
in the marker data file. When performing marginal and conditional tests,
less dense marker positions can be used.
Currently, this extra file has rigid format requirements. Comment lines (starting with ##) can be modified, but no line should be deleted or added, nor should existing lines be broken into multiple lines. The example xtra file ‘Gold/civil.vc.xtra’ can be used as a template for creating a new xtra file.
See Concept Index for: sample parameter file for civil, latent p-values,
randomized p-values.
civil output
Since civil is still a beta-test program, it does not have
an example in the ‘MORGAN_Examples’ directory. Instead, reference
is made to the gold standard examples in the main MORGAN source
directory, in the subdirectory ‘Autozyg/Gold’.
Before running the program civil, the user needs to run gl_auto to obtain an
MCMC sample of whole chromosome realizations of meiosis indicators.
See Running gl_auto example and sample output, for details.
Under the directory ‘Autozyg/Gold’, the output file ‘civil.oscor’ from a
previous gl_auto run is provided for demonstration and testing purposes.
Before running civil, an output subdirectory must exist. If vc is specified as
the test statistic, create a subdirectory named ‘vc’ for storing temporary
files in the user specified output file directory; if w is specified as the
test statistic, create a subdirectory named ‘w’.
To run the example in ‘Autozyg/Gold’ make sure
the following files are present there:
civil.vc.par, civil.vc.xtra, civil.ped, civil.markers, civil.oscor.
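The two preparation steps above (checking the input files and creating the temporary-file subdirectory) can be scripted. This is a small convenience sketch, not part of MORGAN: the function name `prepare_run` is hypothetical, and it assumes the output file directory is the run directory itself, as in the example xtra file.

```python
from pathlib import Path

REQUIRED = ["civil.vc.par", "civil.vc.xtra", "civil.ped", "civil.markers", "civil.oscor"]

def prepare_run(directory, statistic="vc", required=REQUIRED):
    """Return the list of missing input files; if none are missing,
    make sure the temporary-file subdirectory ('vc' or 'w') exists."""
    base = Path(directory)
    missing = [name for name in required if not (base / name).is_file()]
    if not missing:
        (base / statistic).mkdir(exist_ok=True)
    return missing

if __name__ == "__main__":
    missing = prepare_run(".")
    if missing:
        print("missing input files:", ", ".join(missing))
    else:
        print("ready to run civil")
```

For a run with the w-score, the same check would be called with `statistic="w"` and the corresponding list of par and xtra file names.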
In the ‘Autozyg/Gold’ directory, run civil by typing
    ../civil civil.vc.par > civil.vc.out
Information on the progress of the program will be printed to stdout, together
with summary information about the pedigrees, markers, trait values, and
marginal and conditional test setup. For a large number of pedigrees,
civil can take several hours to finish. Once the program is finished, four
output files,
*.miv.?.t.out, *.miv.?.p.out, *.civ.?.t.out, *.civ.?.p.out,
will be written to the specified output file directory:
‘*’ is the output file keyword specified in the xtra file and ‘?’
is the name of the specified test statistic (‘w’ or ‘vc’).
They store, respectively, the marginal test statistic values, the marginal
test p-values, the conditional test statistic values, and the conditional test
p-values.
The upper left portion of a marginal test p-values file ‘Autozyg/Gold/civil.miv.m.p.out’ is shown below:
    test_pos  test_map  pval0     pval1     pval2     ...
    0         0.000000  0.214400  0.098700  0.357800  ...
    4         1.000000  0.305700  0.108900  0.142800  ...
    8         2.000000  0.327400  0.133200  0.132700  ...
    ...
In this output file, the first row is the header. Each of the remaining rows corresponds to one marginal test. The first two columns are the index and the map position of the test position. The columns 3 to N + 2 are the test p-values, one for each MCMC realization of the meiosis indicators. The layout of the marginal test statistic file is similar.
The conditional test p-values file ‘Autozyg/Gold/civil.civ.m.p.out’ has more columns. For each test, the first 6 columns now correspond to indices to conditional test position, left conditioning position and right conditioning position; then map positions of the conditional test position, left conditioning position and right conditioning position. Starting from column 7 are the N p-values, one for each MCMC realization.
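A p-values file with the layout described above can be read with a few lines of code. The sketch below assumes the file is whitespace-delimited with a single header row; `n_lead` is the number of leading index and map columns (2 for the marginal file, 6 for the conditional file). The function name is hypothetical, the sample text reuses only the three p-value columns shown in the excerpt, and the per-position mean is just one simple summary, not one prescribed by civil.

```python
def parse_p_file(text, n_lead=2):
    """Parse a civil p-values table into (header, {test position index: [p-values]})."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    table = {}
    for fields in rows:
        index = int(fields[0])                 # first column: index of the test position
        table[index] = [float(x) for x in fields[n_lead:]]
    return header, table

sample = """test_pos test_map pval0 pval1 pval2
0 0.000000 0.214400 0.098700 0.357800
4 1.000000 0.305700 0.108900 0.142800
8 2.000000 0.327400 0.133200 0.132700
"""

header, table = parse_p_file(sample)
# Summaries of the randomized p-value at each position.
mean_p = {pos: sum(ps) / len(ps) for pos, ps in table.items()}
frac_below = {pos: sum(p <= 0.05 for p in ps) / len(ps) for pos, ps in table.items()}
```

For the conditional file, calling `parse_p_file(text, n_lead=6)` skips the three index columns and three map columns before the p-values.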
Many temporary files will also be created under the subdirectories ‘vc’ or ‘w’ of the output directory. These files store intermediate results for computing the test scores. These results will be reused to save time when more tests need to be performed: for example, the user may want to perform more marginal and conditional tests at different test or conditioning positions.
However, if pedigree structures or trait values in the pedigree file, or trait
parameters in the ‘extra file’, have changed since the last run, these
temporary files should not be reused and should be deleted before running
civil. If pedigree structures have changed, gl_auto also needs to be rerun.
Use the overwrite option for the gl_auto output scores file to overwrite the
previous file, and/or rename the previous file if you wish to retain it.
See Concept Index for: civil sample output.
lm_ibdtests and civil statements
The programs lm_ibdtests and civil use the pedigree, genetic map, and marker
statements of previous sections. For lm_ibdtests, see MCMC parameter
statements. For gl_auto statements used by civil, see Autozyg statements.
The following statements are specific to lm_ibdtests:
compute (ibd | likelihood-ratio) statistics
Required: one of the two options must be specified.
output (sampler | permutation) seeds only
The program lm_ibdtests uses random seeds for its permutation testing
in addition to the usual MCMC sampler seeds.
If an output seed file is named, both ending permutation and sampler
seeds will be saved unless only one or the other is requested.
set ibd measures [Spairs] [Srobdom] [Saffect] [Slambda]
Optional. lm_ibdtests uses 1 to 4 measures to perform ibd tests for
linkage; these are specified in the order [Spairs] [Srobdom] [Saffect] [Slambda].
Spairs, Srobdom, and Slambda may be specified for both normal and permutation
tests; Saffect may not currently be specified with the normal tests option.
set ibd tests [normal] [permutation]
Optional. Normal and/or permutation tests may be specified.
set ibd permutations I
Optional. Needs to be specified when the permutation test is requested through ‘set ibd tests’. The default is 999. It is recommended that at least 50 permutations are used.
set likelihood-ratio lambda-p model gridpoints I1 I2
When the lambda_p measure is used for the chi-square likelihood-ratio test, the number of gridpoints may be specified. The number I1 is the number of gridpoints in the interval for the lambda-parameters of the model, and I2 is the number of gridpoints in the interval for p. The defaults are 6 and 9, respectively.
set likelihood-ratio measures [delta][lambda_p]
When computing the chi-square likelihood-ratio test, the choice of measures is delta and/or lambda_p, in the order [delta] [lambda_p]. The default is ‘delta’.
set likelihood-ratio tests
When computing likelihood-ratio statistics, chi-squared tests are performed. Thus, this statement is presently redundant, as there is no choice in tests.
set permutation seeds H1 H2
The program lm_ibdtests uses random seeds for its permutation testing
in addition to the usual MCMC sampler seeds. The seeds may be specified in
the ‘input seed file’ or in the parameter file; otherwise default
seeds will be used.
The program civil has no program-specific parameter statements.
Instead, information is provided to civil using the input extra file statement:
input extra file filename
Required.
For information about the contents of the extra file see Sample civil parameter file.
See Concept Index for: lm_ibdtests statements, civil statements,
ibd measures, likelihood-ratio measures.
This document was generated by Elizabeth Thompson on September 6, 2019 using texi2html 1.82.