
7. Estimating a priori ibd Probabilities by Monte Carlo

7.1 Introduction to ibddrop  
7.2 Sample ibddrop parameter file  
7.3 Running ibddrop example and sample output  
7.4 ibddrop statements  

See Concept Index for: a priori ibd probabilities, identity by descent, ibd.



7.1 Introduction to ibddrop

ibddrop estimates probabilities of gene identity by descent (ibd), such as kinship, inbreeding, or multi-gene identities, by Monte Carlo in the absence of data. Given the pedigree and a genetic map, ibddrop simulates meiosis indicators and scores them to estimate the ibd probabilities among a set of gametes.

The simplest example of estimating ibd probabilities among a set of gametes is the computation of an individual's inbreeding coefficient. In this example, the set consists of the maternal and paternal gametes that make up the individual. A set of two gametes can be either ibd or not ibd. To keep track of ibd status among the gametes, we can label the paternal allele `1'. If the two alleles are ibd, the maternal allele is also labeled `1', and the resulting ibd pattern is `1 1'. If the two alleles are not ibd, the maternal allele is labeled `2' and the resulting pattern is `1 2'. The individual's inbreeding coefficient is the probability that the two alleles follow the `1 1' pattern.
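
The labeling idea above can be illustrated with a minimal gene-drop sketch in Python. This is not ibddrop's code: the pedigree (a child of full sibs), the function names, and the seed are all assumptions made for illustration. Each founder gamete receives a unique label, each meiosis passes on the parent's maternal (0) or paternal (1) gamete with probability 1/2, and the inbreeding coefficient is estimated as the fraction of iterations in which the child's two gametes carry the same founder label.

```python
import random

# Hypothetical pedigree: name -> (father, mother); founders have (None, None).
# Individuals are listed so that parents precede their offspring.
PEDIGREE = {
    "gf": (None, None),
    "gm": (None, None),
    "bro": ("gf", "gm"),
    "sis": ("gf", "gm"),
    "child": ("bro", "sis"),
}

def drop_genes(pedigree, rng):
    """Simulate one descent; return name -> (maternal label, paternal label)."""
    genes = {}
    next_label = 0
    for name, (father, mother) in pedigree.items():
        if father is None:
            # Founder: two distinct, new gamete labels.
            genes[name] = (next_label, next_label + 1)
            next_label += 2
        else:
            # Meiosis indicator: 0 copies the parent's maternal gamete,
            # 1 copies the parent's paternal gamete.
            maternal = genes[mother][rng.randint(0, 1)]
            paternal = genes[father][rng.randint(0, 1)]
            genes[name] = (maternal, paternal)
    return genes

def estimate_inbreeding(pedigree, name, iterations, seed=12345):
    """Fraction of simulated descents in which the two gametes of `name` are ibd."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        maternal, paternal = drop_genes(pedigree, rng)[name]
        hits += maternal == paternal
    return hits / iterations

estimate = estimate_inbreeding(PEDIGREE, "child", 20000)
print(f"estimated inbreeding coefficient: {estimate:.4f}")
```

For a child of full sibs the exact inbreeding coefficient is 1/4, so the estimate should lie close to 0.25, with the usual Monte Carlo variation from seed to seed.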

If there are three gametes in the set, there are five potential ibd patterns: `1 1 1' (all three gametes are ibd), `1 1 2' (the first two are ibd and the third is not), `1 2 1' (the first and third are ibd), `1 2 2' (the last two are ibd), and `1 2 3' (none are ibd). ibddrop can estimate probabilities of ibd patterns among up to 10 gametes in a set. ibddrop outputs a probability for each ibd pattern at each locus.
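
The pattern labels can be enumerated mechanically. The short Python sketch below is an illustration only (the function name is an assumption, and this is not how ibddrop stores patterns): the first gamete is labeled 1, and each later gamete either reuses a label already present or takes the next unused one.

```python
def ibd_patterns(n):
    """Return all ibd patterns for n gametes as tuples of labels.

    The first gamete is labeled 1; each later gamete either reuses an
    existing label or takes the next new one.
    """
    patterns = [(1,)]
    for _ in range(n - 1):
        extended = []
        for pattern in patterns:
            highest = max(pattern)
            for label in range(1, highest + 2):
                extended.append(pattern + (label,))
        patterns = extended
    return patterns

for pattern in ibd_patterns(3):
    print(" ".join(map(str, pattern)))
print(len(ibd_patterns(4)), "patterns for four gametes")
```

The count grows quickly with the number of gametes: five patterns for three gametes, fifteen for four (the fifteen rows of the sample output in section 7.3), and 115975 for the maximum of ten.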

Gene identity can be scored either for each locus separately, in which case patterns of identity among up to ten gametes can be scored, or jointly over a moving window of several loci. If the moving window option is selected, ibddrop calculates the probability that the specified pair of gametes are ibd at all loci in the window. It is then possible to determine the probability that the pair of gametes are ibd over all or some of the loci that make up a particular haplotype.

See Concept Index for: ibddrop introduction, ibd pattern, meiosis indicators, inheritance indicators.



7.2 Sample ibddrop parameter file

Files for ibddrop may be found in the `IBD' subdirectory of `MORGAN_Examples'. The sample parameter file for ibddrop is `jv_rep_ibd.par'.

 
set printlevel 5
input pedigree file 'jv_rep.ped'

simulate markers   
simulate tloc 1

map          markers   distances 44.6 44.6 11.2 11.2
map tlocs 1  marker 2  distances 22.3

set component 1  proband gametes 331 0 333 1
set component 2  proband gametes 541 0 541 1 341 0 343 1

input seed file '../sampler.seed'

set MC iterations 20000

The parameter file specifies the pedigree file name `jv_rep.ped' and then asks for five markers and one trait locus. `jv_rep.ped' contains data on 30 individuals, including gender and one trait. Since ibddrop uses no marker or trait data, the distinction between marker and trait has no meaning here: it is just a way to specify a set of loci, one of which may be unlinked. The loci are specified in this way so that the same specification may then be used in lm_auto, where simulation is conditional on marker and (optionally) trait data. See Estimating Conditional IBD Probabilities by MCMC.

The two `map' statements specify the genetic map. From the first statement, the genetic distances between adjacent markers are 44.6, 44.6, 11.2 and 11.2 centiMorgans. From the second statement, the trait locus lies between markers 2 and 3, 22.3 centiMorgans from marker 2.
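
As a worked illustration, the two `map' statements can be turned into positions along the chromosome and into recombination fractions between adjacent loci. The sketch below is in Python and is not part of MORGAN. It takes marker 1 as the origin and assumes the Haldane map function (listed in the Concept Index for this section); the variable and function names are hypothetical.

```python
import math

marker_distances_cm = [44.6, 44.6, 11.2, 11.2]  # from `map markers distances'
trait_offset_cm = 22.3                          # from `map tlocs 1 marker 2 distances'

# Cumulative marker positions, taking marker 1 as the origin.
positions = {"M1": 0.0}
for i, d in enumerate(marker_distances_cm, start=2):
    positions[f"M{i}"] = positions[f"M{i - 1}"] + d
# The trait locus sits 22.3 cM to the right of marker 2.
positions["T1"] = positions["M2"] + trait_offset_cm

def haldane_recombination(distance_cm):
    """Recombination fraction for a distance in cM under the Haldane map function."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))

# Walk along the map and report each interval between adjacent loci.
ordered = sorted(positions.items(), key=lambda item: item[1])
for (left, x_left), (right, x_right) in zip(ordered, ordered[1:]):
    gap = x_right - x_left
    print(f"{left}-{right}: {gap:5.1f} cM, r = {haldane_recombination(gap):.4f}")
```

The locus order that results is M1, M2, T1, M3, M4, M5, with the trait locus midway between markers 2 and 3.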

The `set proband gametes' statements tell ibddrop which gametes to score: that is, the gametes among which the ibd probabilities will be estimated. In this example, we selected, from component 1 (the first family in the data set), the maternal (0) gamete of `331' and the paternal (1) gamete of `333'. The next statement selected four gametes to score from family 2. Note that characters are allowed in the names of individuals.

The `input seed file' statement tells the program to use the seeds from the file `sampler.seed'. The `output overwrite seed file' statement allows the program to replace the contents of the seed file with the newly generated seeds. If this option were omitted, the new seeds would be appended to the end of the file when the program finished running. Seeds can also be set using the `set sampler seeds' statement (see ibddrop statements).

The number of Monte Carlo iterations is set to be 20,000 by the `set MC iterations' statement.
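
The number of iterations controls the precision of the estimates. As a rough guide, and assuming the iterations are independent realizations of descent (an assumption of this sketch, not a statement taken from the MORGAN documentation), the standard error of an estimated probability is the usual binomial one. The Python function below is illustrative; its name is hypothetical.

```python
import math

def mc_standard_error(p_hat, iterations):
    """Binomial standard error of a proportion estimated from independent draws."""
    return math.sqrt(p_hat * (1.0 - p_hat) / iterations)

iterations = 20000
for p_hat in (0.01, 0.05, 0.25):
    se = mc_standard_error(p_hat, iterations)
    print(f"p = {p_hat:.2f}: standard error about {se:.4f}")
```

Since the standard error falls only as the square root of the number of iterations, halving it requires four times as many iterations.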

To compute a multilocus ibd probability, use the statement `set locus window' to specify the number of loci to score jointly. ibddrop has limited functionality for computing multilocus probabilities: it can only examine two gametes to determine whether or not the two are ibd. For instructions on how to implement windows in this example, see the parameter file. For additional options, including specific patterns over two or more gametes, see Sample lm_auto parameter file: lm_auto has the option of scoring more general patterns of gene identity over multilocus windows.

See Concept Index for: ibddrop sample parameter file, Haldane map function, proband gametes, seeds for sampler, seed file.



7.3 Running ibddrop example and sample output

The syntax for running this MORGAN program is:
 
<./program> <parameter file> [ > <output file name> ]

where, optionally, `>' redirects the standard output (<stdout>) to an output file instead of to the screen.

The `ibddrop' example can be run under the subdirectory `IBD/' with the following command:

 
./ibddrop jv_rep_ibd.par > ibddrop.out

The genetic map specified by the statements `map markers distances' and `map tlocs 1 marker 2 distances' is below. Note the position of the trait locus (T1) with respect to the marker loci.

 
 Distances (cM):

              T1                       
 --------------+---------------------  
   44.6   22.3   22.3   11.2   11.2    
 +------+-------------+------+------+  
M1     M2            M3     M4     M5  

Since the parameter file contains two `set proband gametes' statements, ibddrop will produce two sets of results in the output file (here `ibddrop.out').

The exact probability estimates will, of course, depend on the random seed used. Some example results for the second component are detailed below.

 
Summary for component 2:

    Probabilities of IBD patterns

       Proband gamete set 1:  541 0  541 1  341 0  343 1

       pattern marker-1 marker-2   tloc-1 marker-3 marker-4 marker-5    label

       1 1 1 1    .0290    .0293    .0285    .0284    .0295    .0298        0
       1 1 1 2    .0271    .0298    .0285    .0294    .0288    .0283        1
       1 1 2 1    .0144    .0126    .0130    .0146    .0135    .0140        3
       1 1 2 2    .0095    .0107    .0106    .0093    .0092    .0089        4
       1 1 2 3    .0249    .0258    .0278    .0273    .0280    .0268        5
       1 2 1 1    .0693    .0644    .0664    .0654    .0659    .0633        6
       1 2 1 2    .0063    .0053    .0056    .0060    .0055    .0052        7
       1 2 1 3    .0599    .0605    .0585    .0585    .0597    .0585        8
       1 2 2 1    .0693    .0693    .0698    .0696    .0708    .0712        9
       1 2 2 2    .0495    .0479    .0489    .0490    .0490    .0471       10
       1 2 2 3    .1406    .1384    .1338    .1372    .1363    .1392       11
       1 2 3 1    .1376    .1368    .1401    .1364    .1374    .1391       12
       1 2 3 2    .0251    .0263    .0297    .0255    .0265    .0279       13
       1 2 3 3    .0956    .0958    .0961    .0976    .0954    .0958       14
       1 2 3 4    .2418    .2472    .2427    .2459    .2447    .2451       15

The probabilities are summarized by ibd pattern. Each integer in the pattern represents one of the gametes that ibddrop was asked to score, in the order listed in the `Proband gamete set' line. Identical numbers indicate gametes that are ibd. For instance, `1 1 1 1' means all four gametes are ibd; `1 2 1 1' means gametes 1, 3, and 4 are ibd, while gamete 2 is not ibd with the others; `1 2 3 4' means no two of the four gametes are ibd.

The ibd patterns are scored for each locus separately; there is a column for each of the five markers and one for the trait locus.
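
Pairwise probabilities can be recovered from the pattern table by summing over patterns. The Python sketch below is an illustration, not a MORGAN utility; the function name is hypothetical. It copies the marker-1 column of the sample output above and sums the probabilities of all patterns in which the first two gametes (`541 0' and `541 1') share a label, which gives an estimate of the inbreeding coefficient of individual 541 at that locus.

```python
# marker-1 column of the sample output above, keyed by ibd pattern.
marker1 = {
    (1, 1, 1, 1): 0.0290, (1, 1, 1, 2): 0.0271, (1, 1, 2, 1): 0.0144,
    (1, 1, 2, 2): 0.0095, (1, 1, 2, 3): 0.0249, (1, 2, 1, 1): 0.0693,
    (1, 2, 1, 2): 0.0063, (1, 2, 1, 3): 0.0599, (1, 2, 2, 1): 0.0693,
    (1, 2, 2, 2): 0.0495, (1, 2, 2, 3): 0.1406, (1, 2, 3, 1): 0.1376,
    (1, 2, 3, 2): 0.0251, (1, 2, 3, 3): 0.0956, (1, 2, 3, 4): 0.2418,
}

def pairwise_ibd(probabilities, i, j):
    """Sum the probabilities of all patterns in which gametes i and j (0-based) share a label."""
    return sum(p for pattern, p in probabilities.items() if pattern[i] == pattern[j])

total = sum(marker1.values())
inbreeding_541 = pairwise_ibd(marker1, 0, 1)   # gametes `541 0' and `541 1'
print(f"column total: {total:.4f}")
print(f"P(541 0 and 541 1 ibd) at marker-1: {inbreeding_541:.4f}")
```

For this column the five patterns beginning `1 1' sum to .1049, and the whole column sums to .9999, which differs from 1 only through rounding of the printed values.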

To compute multilocus ibd probabilities, say for 3 loci, follow the instructions to use `set locus window 3' in the parameter file and re-run the example using the same command line. The interesting part of the output is:

 
Summary for component 2:

    Probabilities of IBD patterns for windows of 3 loci

       Proband gamete set 1:  541 0  541 1

         IBD  wndw 1 wndw 2 wndw 3 wndw 4

       0 0 0   .7291  .7443  .7657  .7881
       0 0 1   .0698  .0655  .0482  .0478
       0 1 0   .0640  .0532  .0365  .0266
       0 1 1   .0279  .0252  .0369  .0284
       1 0 0   .0806  .0696  .0703  .0493
       1 0 1   .0087  .0080  .0067  .0049
       1 1 0   .0135  .0238  .0177  .0268
       1 1 1   .0063  .0105  .0180  .0281

This time, ibddrop was asked to compute ibd probabilities in windows of three loci at a time, using the `set locus window' statement. Since the trait locus is unlinked to the marker loci in this example, it is placed to the left of the five marker loci on the map. Thus the first window, `wndw 1' in the table above, includes the trait locus and the first two marker loci, `wndw 2' includes the first three marker loci, `wndw 3' includes marker loci 2, 3 and 4, and `wndw 4' includes marker loci 3, 4 and 5. The values in the `IBD' column at the left of the table represent ibd patterns across the loci of a window. The pattern `0 0 0' means that the selected gametes are not ibd at any of the three loci in the window. The pattern `0 0 1' means that the selected gametes are not ibd at the first two loci in the window, but are ibd at the third. The values in the remaining columns give the probability of the ibd pattern at the left for each of the four windows. For example, the probability that the maternal and paternal gametes of individual 541 are ibd at marker loci 3 and 5, but not at marker locus 4, is 0.0049 (pattern `1 0 1' in `wndw 4').
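
The way a window table is assembled can be sketched in Python. This is an illustration under stated assumptions, not ibddrop's implementation: the function name is hypothetical, and the four realizations are made-up 0/1 ibd indicators for a single pair of gametes at six loci (the unlinked trait locus first, then the five markers). Each window start gets its own tally of the patterns seen across iterations.

```python
from collections import Counter

def window_pattern_frequencies(realizations, window):
    """For each window start, return the relative frequency of each 0/1 pattern.

    realizations: list of tuples, one per Monte Carlo iteration, holding a
    0/1 ibd indicator for the gamete pair at each locus in map order.
    """
    n_loci = len(realizations[0])
    tables = []
    for start in range(n_loci - window + 1):
        counts = Counter(r[start:start + window] for r in realizations)
        tables.append({pattern: c / len(realizations) for pattern, c in counts.items()})
    return tables

# Hypothetical indicators for six loci (unlinked trait locus first, then five markers).
realizations = [
    (0, 0, 0, 0, 0, 0),
    (0, 1, 1, 1, 0, 0),
    (1, 0, 0, 1, 1, 1),
    (0, 0, 0, 0, 0, 0),
]
tables = window_pattern_frequencies(realizations, window=3)
print(len(tables), "windows")
print(tables[0])
```

With six loci and a window of three there are four windows, as in the table above. Patterns that never occur are simply absent from a window's tally in this sketch, whereas the ibddrop output lists every pattern.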

Note that there are two additional example parameter files in the `IBD/' subdirectory; these examples are not discussed in the tutorial but are there for the interested user.

See Concept Index for: running ibddrop example, ibddrop sample output, ibd pattern.



7.4 ibddrop statements

Note that ibddrop does not simulate or use marker or trait data. The statements are used only to specify the map of the loci at which descent is to be simulated and ibd scored. The locations of loci are specified in this way so that direct comparisons can be made between the output of ibddrop and that of lm_auto (see Running lm_auto example and sample output), where simulation is conditional on marker and trait data.

The additional ibddrop statements are:

simulate markers
This statement specifies that markers are to be simulated. The number of markers is inferred from the marker map.

simulate tloc L
This statement, which typically follows the simulate markers statement, establishes the trait locus to be simulated. Note that this trait locus must be mapped onto the chromosome selected for marker simulation.

map tlocs L1 ... unlinked
This statement specifies a trait to be simulated that is not linked to markers. Only one trait can be simulated and this trait will be placed to the left of all markers.

set [component M] proband gametes N1 K1 N2 K2...

In this statement, the user specifies which gametes ibddrop is to score. Each statement must contain gametes from a single component, as the components are assumed to be independent, i.e. the probability of ibd between gametes from different components is zero. Pairs consisting of an individual's name and a meiosis indicator are listed, with `0' indicating the individual's maternal gamete and `1' indicating their paternal gamete.

In the current version of MORGAN, the number of proband gametes in a set is limited to 10.
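
The structure of the statement, alternating names and meiosis indicators, can be made concrete with a small Python sketch. This is a hypothetical helper for checking one's own parameter files, not MORGAN's parser, and it handles only the fully spelled-out form of the statement shown above.

```python
def parse_proband_gametes(statement):
    """Parse a `set [component M] proband gametes N1 K1 N2 K2...' line.

    Returns (component, [(name, indicator), ...]); component is None if omitted.
    """
    tokens = statement.split()
    if not tokens or tokens[0] != "set":
        raise ValueError("statement must begin with 'set'")
    rest = tokens[1:]
    component = None
    if rest and rest[0] == "component":
        component = int(rest[1])
        rest = rest[2:]
    if rest[:2] != ["proband", "gametes"]:
        raise ValueError("expected 'proband gametes'")
    items = rest[2:]
    if len(items) == 0 or len(items) % 2 != 0:
        raise ValueError("gametes must be given as name/indicator pairs")
    gametes = []
    for name, indicator in zip(items[0::2], items[1::2]):
        # 0 = maternal gamete, 1 = paternal gamete.
        if indicator not in ("0", "1"):
            raise ValueError(f"meiosis indicator must be 0 or 1, got {indicator!r}")
        gametes.append((name, int(indicator)))
    if len(gametes) > 10:
        raise ValueError("at most 10 proband gametes per set")
    return component, gametes

print(parse_proband_gametes("set component 2  proband gametes 541 0 541 1 341 0 343 1"))
```

Names are kept as strings, since individual names need not be numeric.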

set [chromosome I] locus window K

This statement gives the window size (number of loci) for which the multilocus ibd probabilities are scored. If no size is given, each locus is scored separately.

set sampler seeds H1 H2

This statement initializes a pair of seeds for the random number generator. The seeds must be positive and no greater than `0xFFFFFFFF', with the first seed (congruential seed) odd, and the second seed (Tausworthe seed) nonzero. If no seeds are specified, default seeds are used.
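
The constraints on the seed pair can be checked before a run. The Python sketch below restates the rules from the paragraph above; the function name and the example seed values are hypothetical, and the check is not part of MORGAN.

```python
MAX_SEED = 0xFFFFFFFF

def check_sampler_seeds(congruential, tausworthe):
    """Return a list of problems with a seed pair; an empty list means the pair is acceptable."""
    problems = []
    for label, seed in (("congruential", congruential), ("Tausworthe", tausworthe)):
        # A positive seed is automatically nonzero, which covers the Tausworthe rule.
        if not 0 < seed <= MAX_SEED:
            problems.append(f"{label} seed must be positive and no greater than 0xFFFFFFFF")
    if congruential % 2 == 0:
        problems.append("congruential seed must be odd")
    return problems

print(check_sampler_seeds(0x1234567, 0x89ABCDEF))
print(check_sampler_seeds(10, 0))
```

The first pair passes every check; the second fails twice, because the first seed is even and the second is zero.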

set MC iterations I
Required. This statement specifies the total number of Monte Carlo iterations.

See Concept Index for: ibddrop statements, proband gametes, meiosis indicators, inheritance indicators, seeds for sampler.



This document was generated by Elizabeth Thompson on July 7, 2013 using texi2html