lm_map parameter file
The two sample parameter files for lm_map
can be found in the directory
`/MORGAN_Examples/Map'. The two files are `map_G.par' and `map_P.par',
along with the corresponding marker data files `map_G.markers' and `map_P.markers'.
Thus there are two examples, one for genotypic markers (G)
and one for phenotypic markers (P). "G" denotes that marker genotypes
are observed without error. "P" denotes the possibility of error, so
that the observed marker phenotype may differ from the underlying
true marker genotype. Both examples use the pedigree file `map.ped'.
`map_G.par' and `map_P.par' have the following statements in common:
    input pedigree file './map.ped'
    input marker data file './map_[G|P].markers'
    select all markers
    set marker 1 2 3 freqs .2 .2 .2 .2 .2
    set marker names DS123 DS456 DS789
    map gender F marker recomb fract .18 .18   # true F map (cM): 20 20
    map gender M marker recomb fract .08 .08   # true M map (cM): 10 10
    limit recomb fracts .001
    use sequential imputation for setup
    use 100 sequential imputation realizations for setup
    set burn-in iterations 100
    sample by scan
    set L-sampler probability .8
    set MC iterations 50     # The initial number of MCMC scans per step
    limit EM iterations 10   # The total number of MCEM steps
As seen in previous examples, the `select all markers' statement instructs the program to use all markers on the chromosome for computation. The alternative is to use only selected markers, which can be achieved with the `select markers' statement (see Autozyg computing requests). The `set marker 1 2 3 freqs .2 .2 .2 .2 .2' statement specifies the marker allele frequencies for markers 1, 2, and 3. As constructed, this statement requires markers 1, 2, and 3 to each have five alleles, each with a frequency of 0.2. If the number of alleles or the allele frequencies vary from marker to marker, a separate `set marker freqs' statement is needed for each marker (see markerdrop population model parameters). The `set marker names' statement overrides the default behavior, which labels markers consecutively: marker-1, marker-2, etc.
The two `map gender [] marker recomb fract' statements specify the marker map in terms of recombination fractions.
The `limit recomb fracts 0.001' statement is optional and places lower and upper bounds on the estimated recombination fractions of the map. For markers that are separated by little or no recombination, the MCEM algorithm may yield estimated recombination fractions of zero, which could severely bias the results. As a safeguard, this statement places a lower bound of 0.001 and an upper bound of 0.5 - 0.001 = 0.499 on the estimated recombination fractions of the map.
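The effect of these bounds can be pictured with a small Python sketch. It is only an illustration of the bounding rule described above, not lm_map's actual code; the function name `clamp_recomb_fracts` and its arguments are hypothetical.

```python
def clamp_recomb_fracts(estimates, limit=0.001):
    """Illustrative only: keep each estimate inside [limit, 0.5 - limit].

    `estimates` is a list of recombination fraction estimates, one per
    marker interval; `limit` plays the role of the value given in the
    `limit recomb fracts' statement. Both names are assumptions.
    """
    lower, upper = limit, 0.5 - limit
    return [min(max(theta, lower), upper) for theta in estimates]


# An estimate of zero is raised to the lower bound, an estimate of 0.5 is
# lowered to the upper bound, and values in between are left unchanged.
clamped = clamp_recomb_fracts([0.0, 0.18, 0.5])
print(clamped)
```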
The statement `use sequential imputation for setup' instructs lm_map
to initialize the maternal and paternal meiosis indicators for all
non-founder members of the pedigree before the Monte Carlo simulation
begins. This statement specifies the default behavior; the alternative
is to `use locus-by-locus sampling for setup'. The statement
`use 100 sequential imputation realizations for setup' is optional
and overrides the default number of realizations used for setup by
sequential imputation (which is 10% of the MC iterations). The next
three lines in the parameter files contain statements introduced in the
Autozyg examples of this tutorial. For an explanation of
`set burn-in iterations', `sample by scan', and
`set L-sampler probability', see Autozyg MCMC parameters and options.
The statement `set MC iterations 50' sets the number of MC iterations
performed at each MCEM step; the comment in the parameter file describes
this as the initial number of MCMC scans per step. The statement
`limit EM iterations' was introduced in the multivar example and puts
an upper bound, here 10, on the number of MCEM iterations.
Now we'll take a look at the remaining statements in `map_G.par':
    output maps gender averaged specific
    set map estimation model with no mistyping
    set EM convergence .01
    use MCEM and SA for maximization
    set SA curvature iterations 10
    set SA ascent iterations 10
    set SA gradient iterations 10
    set SA convergence .001
The `output maps gender averaged specific' statement specifies the type
of map to be estimated by lm_map. In this example, the default
behavior is specified, which instructs lm_map to automatically compute
the likelihood ratio test statistic for testing the null hypothesis of a
sex-averaged map. The statement `set map estimation model with no mistyping'
instructs lm_map to assume that the genotypes are observed without error.
The `set EM convergence' statement instructs lm_map to stop the
MCEM algorithm if all recombination fraction updates are within 0.01 of their
previous values.
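The stopping rule just described can be sketched in Python as follows. This is a minimal illustration of an "all updates within a tolerance" check, not lm_map's implementation; the function name `em_converged` and its arguments are hypothetical, and whether the program treats the boundary as inclusive is an assumption.

```python
def em_converged(previous, updated, tolerance=0.01):
    """Illustrative only: True if every update is within `tolerance`.

    `previous` and `updated` hold the recombination fraction estimates
    before and after one MCEM step, one value per marker interval.
    The inclusive comparison (<=) is an assumption.
    """
    if len(previous) != len(updated):
        raise ValueError("maps must have the same number of intervals")
    return all(abs(new - old) <= tolerance
               for old, new in zip(previous, updated))


# Both intervals moved by less than 0.01, so iteration would stop.
small_change = em_converged([0.18, 0.18], [0.175, 0.186])

# The first interval moved by 0.03, so iteration would continue.
large_change = em_converged([0.18, 0.18], [0.15, 0.18])

print(small_change, large_change)
```

The same kind of per-fraction comparison, with a tolerance of 0.001, is what the text describes for the `set SA convergence' statement.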
The statement `use MCEM and SA for maximization' instructs lm_map to
attempt to refine its MCEM-based estimate of the MLE by performing additional SA
steps. The alternative is to `use MCEM only for maximization', with no
further refinement. Several statements allow additional control
of the SA algorithm. First, an estimate of the curvature of the likelihood
is needed to initiate the SA algorithm. The statement `set SA curvature iterations 10'
instructs lm_map to use at least 10 MCMC realizations to estimate the
curvature of the likelihood. Second, lm_map will not initiate the SA
algorithm with a step that decreases the likelihood. So, when the SA algorithm
is used to refine the estimate, the statement
`set SA ascent iterations 10' instructs lm_map to use at least 10
MCMC realizations to determine whether a proposed first step increases the
likelihood. Third, the SA algorithm requires an estimate of the gradient of the
likelihood at each SA step. The statement `set SA gradient iterations 10'
instructs lm_map to use at least 10 MCMC realizations to estimate the
gradient of the likelihood. Finally, the map estimate obtained from the final
step of the MCEM algorithm is used to seed the SA algorithm. The
`set SA convergence 0.001' statement instructs lm_map to terminate
the SA algorithm when the absolute change in successive map estimates is less
than 0.001 for each recombination fraction in the map.
Now we'll take a look at the remaining statements in `map_P.par':
    output maps gender averaged
    set map estimation model with mistyping
    set genotyping error rate .02
    use MCEM only for maximization
In this parameter file, a gender-averaged map is requested with the
`output maps gender averaged' statement. Unlike the previous
parameter file, `map_P.par' does not assume that the genotypes are recorded
without error; this is indicated by the statement
`set map estimation model with mistyping'. When `with mistyping'
is chosen, one has the option of specifying an estimate of the error rate
with the statement `set genotyping error rate E'. In this example,
the error rate is set at 0.02. Finally, the statement
`use MCEM only for maximization' instructs lm_map not to use the
SA algorithm to further refine the MCEM-based estimate of the MLE. Since the
SA algorithm will not be used, none of the `SA' statements appear in
`map_P.par'.