Author: Charles Y K Cheung

new changes in v1.06.1:
Because MORGAN's gl_auto version 3.2 uses a new output Inheritance Vectors (IV) file format, we have made changes so that GIGI is now compatible with this file format. The default behavior of GIGI v1.06 is to use this new Inheritance Vectors file format. In addition, because this new file format no longer requires us to provide GIGI with the meiosis indexes that we used to need from the console output of gl_auto, GIGI can now use the MORGAN pedigree file directly. For convenience, users can therefore supply either the pedigree file or the pedigree meiosis file parsed from the console output of gl_auto.

That said, we are aware of the importance of backward compatibility. If you intend to use gl_auto output from versions before v3.2 together with the pedigree meiosis file, you may continue to do so. GIGI will check which version of the IV file you are using.
Note: Users are required to use the pedigree meiosis file instead of the pedigree file if the old IV format is used.

new change in v1.05:
In version 1.05, I fixed a bug in the handling of inbred pedigrees: if the IV infers that an inbred individual receives a pair of the same FGL and the observed genotype for this individual is heterozygous, this IV must be inconsistent with the observed data. Thanks to Dr. Jae-Hoon Sul for identifying this bug!

new changes in v1.04:
Important new function:
- Now GIGI can read dense markers in long format (rows are markers and columns are individuals, similar to BEAGLE's genotype file format, except that there is no "I" column here). See the specification in the documentation. This change allows GIGI to handle very dense files in a memory-efficient manner.
  see: example/param_longFormat.txt and "dense.genotypes.t"
- I converted the original example dense marker file from the old format (rows are individuals) to the long format using the script in the utilities directory: convertGenotypesfromWideToLongFormat.R
- To tell GIGI that the dense genotype file is in the long format, use the -long flag: see documentation.

New changes in v1.03:
Bug fix:
- max ped size was limited to 160; the limit is now 5000.
- in the check that the provided allele frequencies of each marker sum to 1, if(sumAF==1) is replaced by if( (sumAF-1) > 0.0000001)
- if a line in the allele frequency file has only 1 allelic type (monomorphic marker), a dummy allelic type with frequency 0 is added to prevent the program from breaking.
- in main(), close the input streams before deallocating some of the variables to ensure output files get written first.
Other change:
- the call method in the example folder "param.txt" is now set to confidence-based calling (t1=0.8, t2=0.9) instead of the most likely genotype. See manuscript.
  Rationale: This change is to remind users that calls based on the most likely genotype may not be accurate. For example, if a parent has a rare allele, GIGI will correctly assign a 50% chance that the child has the rare allele IN THE SITUATION when we cannot figure out which chromosome is transmitted. The most likely genotype call method makes a call for every genotype, even when the genotype configuration is highly uncertain. Since calls made using the most likely genotypes may be dangerous to use, we changed the default call method to confidence-based calling. Analyses that account for the uncertainty in the imputed results may be more appropriate, e.g. use the imputed probabilities directly or use a summary of imputed probabilities such as dosage.
- A dosage file is generated if all markers to be imputed are di-allelic markers.
  Here, dosage is defined as the expected proportion of 1 alleles in a genotype:
  dosage of a genotype = 1*P(genotype is 1/1) + 0.5*P(genotype is 1/2)
- a binary GIGI file is included in the main uncompressed directory.

New changes in v1.02:
- warn the user when the Inheritance Vector file is empty, e.g. in trios.
  Rationale: Since we cannot infer recombination in trios, gl_auto generates an empty inheritance output. This is normal and is correctly stated in the pedigree meiosis file. GIGI will still run, but it will impute based only on the pedigree structure and minor allele frequencies. Hence, Linkage Disequilibrium-based methods can potentially be more powerful than GIGI for trios.
- include the perl script extractPedMeiosis.pl in the program to extract the pedigree meiosis file from gl_auto's output.
- expand the FAQ section in the documentation file

New changes in v1.01:
- make new example files
- improve the user interface
  :in main()
  :summarize relevant information about each input file after reading
  :print progress
- convert to a new format of parameter file
  :fewer lines
  :in the code: add readImputeParameterFile_GIGI_v1_01()
- implement some error checking routines on input files
- add license
- modify the documentation file
- bug fixes:
  :call method #1 now works again
  :fix callThreshold_multiAllelic()
  :the bug is in the if else statement of method==2. We want the if (method 1), else if (method 2), else ... instead
- add various flags (see documentation file)

Code changes:
readDenseMarkers_byComponent(): check that the number of columns is correct
readMarkerPos_v2(): ensure positions are in ascending order
readAF(): includes new changes
  - ensure each row sums to 1; deallocate variable at the end.
  - shorten the function because it duplicates what is done in readAllelicTypeCount()
readAllelicTypeCount(): deallocate variable at the end
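
Example (illustration only): the dosage definition given in the v1.03 notes can be written out as a small Python sketch. This is not GIGI's own code. The function name genotype_dosage, its arguments, and the tolerance check on the three probabilities are all assumptions made for this example; only the formula itself comes from the notes above.

```python
def genotype_dosage(p11, p12, p22, tol=1e-7):
    """Expected proportion of '1' alleles for one di-allelic genotype.

    p11, p12, p22 are the imputed probabilities of genotypes 1/1, 1/2 and 2/2.
    (Hypothetical helper: the name and signature are not taken from GIGI.)
    """
    total = p11 + p12 + p22
    # Assumption of this sketch: reject inputs that are not a probability
    # distribution, using a small tolerance rather than exact equality.
    if abs(total - 1.0) > tol:
        raise ValueError("genotype probabilities must sum to 1, got %r" % total)
    # Formula from the v1.03 notes:
    # dosage = 1*P(genotype is 1/1) + 0.5*P(genotype is 1/2)
    return 1.0 * p11 + 0.5 * p12


if __name__ == "__main__":
    # A child with a 50% chance of being 1/2 and a 50% chance of being 2/2.
    print(genotype_dosage(0.0, 0.5, 0.5))
```

A genotype that is certainly 1/1 has dosage 1, and one that is certainly 2/2 has dosage 0, so the value summarizes the imputed probabilities without forcing a hard genotype call.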