SNP genotype data from SAM files containing exome sequence data

It is sometimes desirable to extract the genotype information present in a whole-exome sequencing data set. For example, one may wish to filter sequence variants according to whether they lie within an autozygous region, even when SNP array genotyping has not previously been performed.

AgileGenotyper genotypes more than 0.5 million SNP sites in or close to protein-coding exons, using the data in an ordered SAM file derived from an exome-enriched sequencing run. These SNPs were previously identified by the 1000 Genomes Project. For AgileGenotyper to work correctly, the sequence entries in the SAM file must be ordered by chromosome number and map coordinate; if necessary, this can be done with AgileSamFileSorter.
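Before running AgileGenotyper, it can be useful to confirm that a SAM file is ordered. The following Python sketch is not part of AgileGenotyper or AgileSamFileSorter; the function name sam_is_coordinate_sorted is hypothetical. It assumes the standard tab-delimited SAM layout (reference name in column 3, 1-based position in column 4) and checks only that each chromosome's reads form one contiguous block with non-decreasing positions. It does not check the particular chromosome order that AgileGenotyper may expect.

```python
def sam_is_coordinate_sorted(lines):
    """Return True if mapped reads are grouped by reference name and
    non-decreasing in position within each reference.

    `lines` is any iterable of SAM text lines (for example, an open file).
    """
    seen = set()      # reference names whose block has already started
    current = None    # reference name of the block being read
    last_pos = 0      # last position seen in the current block
    for line in lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        rname, pos = fields[2], int(fields[3])
        if rname == "*":
            # Unmapped read with no reference name: ignored in this sketch.
            continue
        if rname != current:
            if rname in seen:
                # This chromosome reappears after another one: not grouped.
                return False
            seen.add(rname)
            current = rname
        elif pos < last_pos:
            # Position went backwards within the same chromosome.
            return False
        last_pos = pos
    return True
```

A file could be checked with `with open("sample.sam") as fh: print(sam_is_coordinate_sorted(fh))`, where sample.sam is a placeholder file name.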

AgileGenotyper performs its genotyping using an Access database containing information on 538,332 SNPs. (Microsoft Access does not need to be installed on the computer to use this database.) The exome sequencing reads should be aligned to the hg19 reference sequence. All the SNPs have been previously identified by the 1000 Genomes Project and are located either in protein-coding exons or within 50 bp of a splice junction. Positions that cannot be genotyped are called as “Nocalls”.

Guide to using AgileGenotyper

The AgileGenotyper user guide can be found here.


The AgileGenotyper program can be downloaded here.

