This lab will cover two MORGAN programs. The first is the small program, kin, which computes the kinship coefficient between specified pairs of individuals, and the inbreeding coefficient of specified individuals. The second (ibddrop) simulates the descent of DNA at linked loci down a pedigree to estimate the probabilities of more complicated patterns of ibd.
To run kin you will need a parameter file and a pedigree file. You may find these files here:
To run the program kin on the example file, type:
% kin jv_3rep_kin.par
or, if you would like to send your output to a file, such as
kin.out, type
% kin jv_3rep_kin.par > kin.out
or, if you decided to call the pedigree file something different
such as my-pedfile
% kin kin.par ped my-pedfile
(The ped key on the command line overrides whatever is in the
parameter file.)
Look at your output, or output file. It should be self-explanatory.
kin has been fixed to allow computation of the kinship coefficient of an individual with himself. However, kin is also supposed to ignore duplicate requests, which it seems it no longer does. OK, always something to be fixed.
Also note that for your single-component pedigrees you just leave out the component 1 bit of the statement. You can ask for any number of pairs for kinship in one statement (as the two pairs in the example), and any number of individuals for inbreeding (as the two in the example). You should have (for each component if you have more than 1), just one kinship request statement and just one inbreeding request statement in your parameter file.
Your output should include the following:
(i) Calculations of the kinship coefficients between members of three pairs of individuals in your pedigree. Try to find pairs with different kinship coefficients, or with different relationships having the same kinship coefficient. At most one of your pairs should have kinship coefficient 0, and at least one of the pairs should be a pair of bilateral relatives other than siblings.
(ii) Calculations of the inbreeding coefficients for three non-founder individuals in your pedigree. At most one should have inbreeding coefficient 0. Try to find individuals with different inbreeding coefficients if you have them.
Turn in a sheet of paper with your results, with brief explanation of the relationships between the individuals you have chosen (for kinship) or between their parents (for inbreeding).
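For checking kin output by hand on a small case, the usual recursive definition of kinship can be sketched in Python. This is only an illustration, not a description of how kin is implemented; the pedigree and all names below are hypothetical (a child of first cousins), and the code assumes parents are listed before their children.

```python
from functools import lru_cache

# Hypothetical pedigree: name -> (mother, father); founders have (None, None).
# Parents must appear before their children.
PEDIGREE = {
    "gma": (None, None), "gpa": (None, None),
    "in1": (None, None), "in2": (None, None),
    "sib1": ("gma", "gpa"), "sib2": ("gma", "gpa"),
    "cous1": ("sib1", "in1"), "cous2": ("in2", "sib2"),
    "kid": ("cous1", "cous2"),
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(a, b):
    """Kinship coefficient between a and b (None stands for a missing parent)."""
    if a is None or b is None:
        return 0.0
    if a == b:
        mother, father = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(mother, father))
    # Recurse on the parents of whichever individual comes later in the list,
    # so that we never step "through" a descendant of the other individual.
    if ORDER[a] < ORDER[b]:
        a, b = b, a
    mother, father = PEDIGREE[a]
    return 0.5 * (kinship(mother, b) + kinship(father, b))


def inbreeding(a):
    """Inbreeding coefficient of a: the kinship coefficient of a's parents."""
    mother, father = PEDIGREE[a]
    return kinship(mother, father)
```

For example, `kinship("sib1", "sib2")` gives the full-sib value 1/4, and `inbreeding("kid")` gives 1/16, the kinship coefficient of first cousins.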
This one takes me longer to write and explain than it will take you to do!
The MORGAN ibddrop program estimates ibd probabilities by simulating descent on pedigrees. You can find out more about ibddrop in Chapter 7 of the Tutorial.
One important thing is that ibddrop talks about marker loci and trait loci, but for ibddrop these are just locations on the chromosome -- it does not care about genetic markers or traits, or about genetic data for markers or traits on the individuals. We do it this way so that the same parameter files can be used for a later MORGAN program where we do Monte Carlo realizations of ibd conditional on marker and trait data.
We will use the same pedigree file jv_3rep.ped, which is 3 copies of the JV pedigree.
In addition to jv_3rep.ped download the following three files, to the directory from which you are running this week's lab programs:
Now we are ready to start on the first ibddrop example. Look at the parameter file ibddrop1.par. First we specify the pedigree file and seed files as described above. Next we tell it that it is to simulate markers (it will figure that there should be 5 markers from the marker map) and to simulate 1 tloc which it arbitrarily names "11". (Remember only the first 4 letters of each key word are significant.) A tloc is MORGAN's abbreviation for a trait locus. As noted above, to ibddrop, 5 markers and one tloc just means 6 loci, since there are no data, but it is convenient to use the same statements that we will use for other MORGAN programs.
You do not need to worry about details of maps, map specification etc. yet -- the information here is just for those who already are more familiar with genetic maps.
Next we specify the marker map, in terms of recombination fractions between adjacent markers. Since we don't give the map a gender, it assumes it is both the male and the female marker map. Then we tell it where to put the tloc (tloc 11) relative to the markers: "marker 2 recomb frac .112". This means put the tloc to the right of marker 2, at recombination fraction .112 to marker 2. (If we wanted it to the left of all the markers, we would say "marker 0 reco frac 0.112", and in this special case this would mean the recombination fraction to marker 1 is 0.112.) In fact, I have given a very slightly different recombination fraction for the male and female meioses, just to show we can!
Next, we have to tell ibddrop what ibd patterns to score. In this version, we give it some sets of gametes, and it will score all the possible patterns of ibd among them, scoring locus by locus. (This will probably be clearer when you look at the output.) We specify a gamete by the ID "name" of the individual, and by a 0 (for maternal) or 1 (for paternal) indicator. Thus grandma 0 3v3 1 means score the ibd between the maternal gamete of grandma and the paternal gamete of 3v1. To keep things manageable, we are here scoring ibd only on the second pedigree component. We score ibd between these two gametes, and then ibd among a set of 4 gametes consisting of the two of the final individual and these two. (Recall, there will be up to 15 possible patterns among 4 ordered gametes.) Finally, we tell ibddrop how many realizations to do: that is, how many times it will simulate descent at these 6 linked loci, down the pedigree. The number 40000 specified here does not take long, and is enough to get good estimates.
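The idea of simulating descent at linked loci and scoring ibd locus by locus can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not MORGAN's code: the pedigree (a child of first cousins), the recombination fractions, the seed, and all function names are hypothetical, and the same map is used for male and female meioses. Gamete index 0 is maternal and 1 is paternal, as in the parameter file.

```python
import random

# Hypothetical pedigree: name -> (mother, father); founders have (None, None).
# Parents must appear before their children.
PEDIGREE = {
    "gma": (None, None), "gpa": (None, None),
    "in1": (None, None), "in2": (None, None),
    "sib1": ("gma", "gpa"), "sib2": ("gma", "gpa"),
    "cous1": ("sib1", "in1"), "cous2": ("in2", "sib2"),
    "kid": ("cous1", "cous2"),
}
# Assumed recombination fractions between adjacent loci (6 loci, 5 intervals).
RECOMB = [0.1, 0.112, 0.08, 0.1, 0.1]
N_LOCI = len(RECOMB) + 1


def meiosis(parent_genome, rng):
    """Simulate one gamete from a parent's (maternal, paternal) genome.

    The meiosis indicator s says which parental gamete is copied at a locus;
    it switches between adjacent loci with the recombination fraction."""
    s = rng.randrange(2)
    gamete = [parent_genome[s][0]]
    for j, r in enumerate(RECOMB, start=1):
        if rng.random() < r:
            s = 1 - s
        gamete.append(parent_genome[s][j])
    return gamete


def gene_drop(rng):
    """One realization: give founders unique labels and drop them down."""
    genome = {}
    next_label = 0
    for name, (mother, father) in PEDIGREE.items():
        if mother is None:
            genome[name] = ([next_label] * N_LOCI, [next_label + 1] * N_LOCI)
            next_label += 2
        else:
            genome[name] = (meiosis(genome[mother], rng),
                            meiosis(genome[father], rng))
    return genome


def estimate_ibd(person, n_reps, seed=12345):
    """Per-locus Monte Carlo estimate that person's two gametes are ibd."""
    rng = random.Random(seed)
    counts = [0] * N_LOCI
    for _ in range(n_reps):
        maternal, paternal = gene_drop(rng)[person]
        for j in range(N_LOCI):
            counts[j] += (maternal[j] == paternal[j])
    return [c / n_reps for c in counts]
```

Calling `estimate_ibd("kid", 40000)` should give values close to 1/16 at every locus, with some Monte Carlo variation, because locus-by-locus scoring recovers the same marginal probability at each position.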
Now run ibddrop by typing
% ibddrop ibddrop1.par > ibd1.out
This may give you some warning messages to the screen -- maybe about seeds,
but will send the main output to ibd1.out. It generates quite a
bit of output, so it is probably easier to look at it in a file.
Now look at the output. A lot of MORGAN output consists of the program telling you what it understood you to have told it to do. This can be a bit tedious, but is well worth checking!! Most of the silly errors we have made in running MORGAN were because there was something a little bit wrong with a parameter file, and it did what the parameter file said.
First it prints little pictures of the maps, showing the markers and tlocs. It converts recombination fractions to centiMorgans (we will meet this very soon). Since we told it a slightly different male from female map it prints both. Next it tells us what proband gametes we told it to score, on each pedigree component -- and it will check that the specified individuals are indeed in the right component pedigree. Then it reminds us that we asked for 40000 realizations (it calls them MC iterations). This is the end of its checking of parameter statements -- and it tells us so!
Now it opens the pedigree file, and does the by now familiar checking and summary of the pedigree. Then it reopens it to start its simulations: this is because of the way pedcheck works, producing a reordered pedigree for other MORGAN programs, if necessary. Next it tells us how many meioses (remember the meiosis indicators are 0/1 "switches") it will simulate: 18 on each copy of the JV pedigree. You may find this a bit odd: there are 10 non-founders, each with a maternal and paternal meiosis. The reason there are only 18, not 20, is because it does not bother with one from any founder who has only one kid. (We may discuss this, some time.) It tells us its seeds (useful if we ever want to rerun the exact same thing), and its map again (this time in recombination fractions, which is what it actually uses to simulate). Since it has found we asked for sets of 2 and of 4 gametes, it reminds us that for 2 there are 2 "patterns" (ibd or not) while for 4 there are 15.
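The counts of 2 and 15 patterns can be checked by enumeration. The Python sketch below lists every way of grouping n ordered gametes into ibd classes, labelling the first gamete 1 and giving each new class the next integer. The function name is hypothetical, and the labelling is an assumption about how to write the patterns, not necessarily the order in which ibddrop prints them.

```python
def ibd_patterns(n):
    """All ibd patterns among n ordered gametes, as tuples of class labels.

    The first gamete is always labelled 1; each later gamete either joins an
    existing class or starts the next new one."""
    patterns = [(1,)]
    for _ in range(n - 1):
        patterns = [p + (k,) for p in patterns for k in range(1, max(p) + 2)]
    return patterns
```

With this, `ibd_patterns(2)` gives the two patterns (1, 1) and (1, 2), and `len(ibd_patterns(4))` is 15.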
Then, it does its stuff, and prints out the results, just counting up what it has simulated, locus by locus. The pair of individuals are cousins -- the true probability of ibd at each locus is 0.25. You will see some Monte Carlo variation, but you should see that at every locus the probability of ibd (the pattern 1 1) is about 0.25, and of non-ibd (the pattern 1 2) is about 0.75. Now look at the set of 15 patterns among the four gametes. Remember the inbreeding coefficient of the final individual is 7/64 -- how would you find the Monte Carlo estimate of this from this table? Which loci do you expect to have most similar results? Why?
This version of ibddrop shows a different way of scoring the multilocus ibd. In B1.1 we scored locus-by-locus, and so ended up with probabilities that were the same (on average) at each locus, and didn't really show the dependence between linked loci.
Look at the parameter file ibddrop2.par. Down to the scoring the information is the same, although the marker map information looks a bit different. I gave the male map in (Haldane) centiMorgans rather than recombination fractions, but in fact it's the same map (22.3 cM is a recombination fraction of 0.18, and 11.15 cM is recombination fraction 0.1). The request for 40000 Monte Carlo realizations at the bottom of the file is also as before.
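The two conversions quoted above can be checked with the Haldane map function, r = (1 - exp(-2d))/2 with d in Morgans. A minimal Python sketch follows; the function names are hypothetical and this is not part of MORGAN.

```python
import math


def haldane_cm_to_recomb(d_cm):
    """Recombination fraction for a Haldane map distance in centiMorgans."""
    d = d_cm / 100.0  # convert centiMorgans to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def haldane_recomb_to_cm(r):
    """Haldane map distance in centiMorgans for recombination fraction r < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * r)
```

Evaluating `haldane_cm_to_recomb(22.3)` and `haldane_cm_to_recomb(11.15)` gives values that round to 0.18 and 0.1 respectively.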
The difference is in the scoring patterns: now we request the program to score over overlapping windows of 3 consecutive loci. Since our loci are ordered on the chromosome as (M1,M2,T,M3,M4,M5) these windows will be (M1,M2,T), (M2,T,M3), (T,M3,M4), and (M3,M4,M5). What it will score will be Yes(1)/No(0) events, and so the simplest thing is just to give it pairs of proband gametes and ask it to score ibd (Yes=1) or not (No=0). For simplicity, we here give it just one pair of gametes, the two of the final individual fred.
This will be clearer on looking at the output, so
now run ibddrop on this second parameter file by typing
% ibddrop ibddrop2.par > ibd2.out
The output
consists of a list of possible ibd patterns for the three loci and four columns
of relative frequencies. Remember that we are looking at 3 loci at a time. The
ibd pattern 0 1 0, for example, is the event that the individual's two
alleles
are not ibd at the first locus, are ibd at the second, and aren't ibd at the
third, where first, second, and third refer to each set of 3 loci in a
window.
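The windowed scoring described above can be sketched in Python as follows. The function names and the two toy realizations in the usage note are hypothetical; the sketch only shows how per-locus 0/1 ibd indicators turn into one 3-locus pattern per window, and how those patterns are tallied over realizations.

```python
from collections import Counter

LOCI = ["M1", "M2", "T", "M3", "M4", "M5"]


def sliding_windows(values, width=3):
    """Overlapping windows of `width` consecutive entries."""
    return [tuple(values[i:i + width]) for i in range(len(values) - width + 1)]


def tally_patterns(realizations, width=3):
    """Count ibd patterns in each window.

    Each realization is a list of 0/1 flags, one per locus, saying whether
    the two gametes are ibd at that locus. Returns one Counter per window."""
    n_windows = len(realizations[0]) - width + 1
    counts = [Counter() for _ in range(n_windows)]
    for flags in realizations:
        for w, pattern in enumerate(sliding_windows(flags, width)):
            counts[w][pattern] += 1
    return counts
```

Here `sliding_windows(LOCI)` reproduces the four windows (M1,M2,T) through (M3,M4,M5). Given the two made-up realizations `[0, 1, 0, 0, 0, 0]` and `[0, 1, 1, 0, 0, 0]`, the first window is tallied once as 0 1 0 and once as 0 1 1; dividing each count by the number of realizations gives the relative frequencies in one column of the output table.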
Here are some things you could think about to see whether the output makes sense:
B2.1 Choose the two gametes of an inbred individual, and the four gametes of two bilateral relatives who are not sibs, and run the first version of ibddrop on these two sets of gametes (size 2 and size 4). Submit the estimated ibd probability tables from your output, and comment on whether the results make sense, given the relationships of the individuals you have given it.
B2.2 Choose just the two gametes of an inbred individual, and run the second version of ibddrop to find the probabilities of autozygosity/not over windows of three loci. Submit the estimated ibd probability table from your output, and comment on whether the results make sense, given the inbreeding coefficient of your chosen individual and the various distances or recombination probabilities among the 6 genetic loci.