Biology/Genome Sciences 414, Winter
Molecular Evolution Section
Tests for positive selection by dN/dS analysis.
PAML Exercise

In PAML, you change all parameters by editing the control file (codeml.ctl). The program reads this control file and writes the output files: one has the name you specify, the other is called "rst". Remember that the dN/dS rate ratio is abbreviated as omega (ω or w).

Download the mhcexample folder, which contains the codeml program, the aligned sequences, the tree file, and the control file. Use the Terminal application to work at the command line. To get to the folder, type:

    cd Desktop/mhcexample

Then, to execute the codeml program, type:

    ./codeml codeml.ctl

where codeml.ctl is the control file you have edited.

1. Perform a pairwise comparison of the sequences in the mhc.phy file to get ML estimates of dN and dS. Do this by setting runmode = -2 in the codeml.ctl file. To assess significance, calculate the likelihood from the same data file with the omega ratio fixed at 1 (fix_omega = 1, omega = 1).
   a. What are the estimates of dN, dS, and dN/dS?
   b. How many degrees of freedom separate these two models?
   c. Which model fits the data better, and why?
   d. From this analysis, would you conclude the genes have been subjected to positive selection? Why or why not?

2. Perform an analysis of variation in dN/dS between sites using the mhc.phy sequence file and the mhc.tre tree file. Set runmode = 0 and NSsites = 0 1 2 7 8. This will run models 0, 1, 2, 7, and 8. The parameter ncatG should be set automatically. Be sure to check convergence by repeating the analysis with different starting omega values (you probably will not have time for this). Compare the likelihoods of the different models (M1 vs. M2, M7 vs. M8).
   a. How many degrees of freedom are there for the comparisons M1 vs. M2 and M7 vs. M8?
   b. Which model fits the data better, M1 or M2? M7 or M8?
   c. What are the parameter estimates for the models with the highest likelihood?
   d. From this analysis, would you conclude the genes have been subjected to selection?
   e. Which sites, if any, have been subjected to positive selection in the comparison M1 vs. M2?
   f. Which sites, if any, have been subjected to positive selection in the comparison M7 vs. M8?

3. Perform an analysis of variation in dN/dS between lineages using the mhc.phy sequence file and the mhc.tre tree file. Set NSsites = 0 and run PAML with model = 1. This will estimate a different dN/dS ratio for each lineage. Compare the likelihood to that of model 0 above (a single dN/dS ratio shared by all lineages, no variation between sites).
   a. How many degrees of freedom separate these models?
   b. Which model fits the data better?
   c. From this analysis, would you conclude the gene has been subject to positive selection?
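Each model comparison in the exercise is a likelihood ratio test: twice the difference in log-likelihood between the nested models is compared to a chi-square distribution whose degrees of freedom equal the difference in the number of free parameters. The Python sketch below is one way to do that arithmetic by hand. It is not part of PAML; the function names `chi2_sf` and `likelihood_ratio_test` are made up for illustration, and the log-likelihood values in the usage lines are placeholders, not results from the MHC data. You supply the lnL values from your own codeml output and the degrees of freedom you worked out.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square distribution (integer df only)."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 0:
        # Even df: closed-form sum of Poisson-like terms.
        terms = (half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: start from the df = 1 tail and add the correction terms.
    tail = math.erfc(math.sqrt(half))
    terms = (
        half ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (df - 1) // 2 + 1)
    )
    return tail + math.exp(-half) * sum(terms)


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for two nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# Placeholder log-likelihoods (NOT real results); replace with the lnL
# values reported in your own codeml output file.
stat, p_value = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0, df=2)
print(f"2*delta_lnL = {stat:.3f}, p = {p_value:.4g}")
```

A small p-value means the more complex model fits significantly better than the simpler one. Whether that indicates positive selection still depends on the parameter estimates, for example whether the estimated omega is greater than 1.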
Send mail to: wswanson@gs.washington.edu
Last modified: 2/21/2007 11:32 AM