User guide

Introduction

AgileExomeFilter enables the filtering, screening and sorting of variants derived from an exome sequencing experiment, to allow the rapid detection of possibly deleterious variants. The variants must first be annotated using Alamut-HT, and users may find it useful to read about Alamut-HT before continuing.

Entering data

AgileExomeFilter Screenshot 1

Figure 1: AgileExomeFilter file selection user interface

The Alamut-HT annotated variant data file is imported by pressing the Select button (Figure 1) and selecting the required text data file.

The variant display interface

AgileExomeFilter displays the variant data as a grid of 40 columns containing the first 100 variants in the data set. If the filtered dataset includes more than 100 variants, they can be browsed as a series of pages (100 variants per page) using the Page number value in the Filter variants panel in the bottom right of the window (Figure 2).

AgileExomeFilter Screenshot 2

Figure 2: The variants are displayed as a grid.
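The paging arithmetic can be illustrated with a short Python sketch. This is not AgileExomeFilter's own code: the function names page_count and page_slice, and the assumption that page numbers start at 1, are made up for the example.

```python
import math

PAGE_SIZE = 100  # variants shown per page, as described in the text


def page_count(total_variants: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages needed to show every variant (never fewer than one)."""
    return max(1, math.ceil(total_variants / page_size))


def page_slice(variants: list, page_number: int, page_size: int = PAGE_SIZE) -> list:
    """Return the variants on one page; pages are assumed to be numbered from 1."""
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    start = (page_number - 1) * page_size
    return variants[start:start + page_size]
```

Under these assumptions, a filtered set of 250 variants spans three pages, and the third page holds the final 50 variants.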


The data for each variant are shown on a single row, with the columns containing the data from the Alamut-HT data file. To make the data easier to view, some of the Alamut-HT fields are combined into a single column, while others containing duplicated data are ignored. The variant data view is described in Table 1 below:

| Column name | Description | Example or possible values | Link to more information |
| --- | --- | --- | --- |
| ID | The variant’s RS number | RS6672356 | dbSNP |
| Gene (accession numbers) | Name of the gene plus the transcript and protein accession number | NOC2L (NM_015658.3 \| NP_056473.2) | HGNC and RefSeq |
| Chromosome | The chromosome containing the variant | 1-22,X,Y | |
| Position (bp) | Location of the variant in base pairs | 877831 | |
| Variant type | Type of mutation | deletion, duplication, insertion or substitution | |
| Variant effect | Effect of the variant | frameshift, in-frame, missense, synonymous | |
| Location | Location of the variant within the gene | downstream (Exon: 19) | |
| cDNA position bp (transcript length) | Location of the variant within the transcript/cDNA and its total length | 2243 (2781) | |
| Genomic nomenclature | Variant annotated relative to genome coordinates | g.877831T>C (GRCh37) | HGVS |
| cDNA nomenclature | Variant annotated relative to transcript/cDNA coordinates | c.2243A>G | HGVS |
| Codon change | Reference and variant codon | TCT>TCC | |
| Protein nomenclature | Description of variant in the protein sequence | p.Ser269Ser, p.Phe1182* | HGVS |
| OMIM ID | OMIM reference ID of the gene containing the variant | 610770 | OMIM |
| SSF splice site ratio (var/wt) | The ratio of the SSF splice site score for the nearest splice site with the variant sequence divided by the value of the reference sequence. Should be 1 for no effect. | -infinity to infinity | SSF |
| MaxEnt splice site ratio (var/wt) | The ratio of the MaxEnt splice site score for the nearest splice site with the variant sequence divided by the value of the reference sequence. Should be 1 for no effect. | -infinity to infinity | MaxEntScan for human 5' sites |
| NNS splice site ratio (var/wt) | The ratio of the NNS splice site score for the nearest splice site with the variant sequence divided by the value of the reference sequence. Should be 1 for no effect. | -infinity to infinity | |
| GS splice site ratio (var/wt) | The ratio of the GS splice site score for the nearest splice site with the variant sequence divided by the value of the reference sequence. Should be 1 for no effect. | -infinity to infinity | |
| HSF splice site ratio (var/wt) | The ratio of the HSF splice site score for the nearest splice site with the variant sequence divided by the value of the reference sequence. Should be 1 for no effect. | -infinity to infinity | HSF |
| Nearest splice site change | Location of nearest splice site to the variant. | | |
| Splicing effect in vicinity | Describes the possibility of the variant creating a splice site | cryptic acceptor strongly activated, cryptic donor strongly activated, new acceptor site, new donor site | |
| RS is validated | If the variant has an RS number, whether the variant has been validated | yes or no | dbSNP |
| Clinical Significance | The variant's clinical significance as stated by dbSNP | unknown, untested, non-pathogenic, probable-non-pathogenic, probable-pathogenic, pathogenic, drug-response, histocompatibility, other | dbSNP |
| Minor allele dbSNP frequency | Minor allele frequency if variant is in dbSNP (List of alleles in different populations) | e.g. 0.005 (C,C,C) | dbSNP |
| Minor allele dbSNP count | Number of times alleles found in dbSNP data | e.g. 2188 | dbSNP |
| African American allele data (ESP) | Allele frequency in African American population | 0 to 1 | NHLBI Exome Sequencing Project |
| European American allele data (ESP) | Allele frequency in European American population | 0 to 1 | NHLBI Exome Sequencing Project |
| All population allele data (ESP) | Allele frequency in all populations | 0 to 1 | NHLBI Exome Sequencing Project |
| Average read depth (ESP) | Average read depth of data used to identify SNP in ESP data set | e.g. 122 | NHLBI Exome Sequencing Project |
| PhastCons Score | PhastCons severity score of variant | e.g. 1 | PhastCons |
| PhyloP score | PhyloP severity score of variant | e.g. 1.981 | PhastCons |
| BLOSUM 45, 62 and 80 scores | BLOSUM 45, 62 and 80 matrix severity score of the variant based on the evolutionary rate of amino acid substitution | e.g. -3,-3,-6 | Wikipedia |
| SIFT prediction score | SIFT severity score for the variant | deleterious, tolerated | SIFT |
| SIFT weight | Variant's SIFT weight score | 0 to 1 | SIFT |
| SIFT median | SIFT scores median value | | SIFT |
| Grantham distance | The severity of the variant based on the Grantham distance | > 0 | Grantham distance |
| VCF quality score | The VCF quality score for the variant | > 0 | |
| Genotype | Genotype of the variant | homozygous, heterozygous | |
| Total read depth | Number of reads mapping to position | > 0 | |
| Allele read depths | The number of reads containing the variant and reference base | >0, >0 | |
| Pathogenicity class | User-defined classification of the variant if previously annotated. Value must start with 'Class ' followed by a value of 1 to 5. | Class 1, Class 2, Class 3, Class 4, Class 5 | |

Table 1: Description of the columns in the data grid.
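The format required for the Pathogenicity class column can be checked before a file is imported. The Python sketch below is only an illustration: the function name is hypothetical, and it assumes that nothing may follow the digit, which the table does not state.

```python
import re

# Assumed pattern: the literal prefix "Class " followed by one digit from 1 to 5.
# Whether trailing text is tolerated by AgileExomeFilter is not documented here.
CLASS_PATTERN = re.compile(r"Class [1-5]")


def is_valid_pathogenicity_class(value: str) -> bool:
    """Return True if the value looks like 'Class 1' ... 'Class 5'."""
    return CLASS_PATTERN.fullmatch(value) is not None
```

For example, "Class 3" is accepted by this check, while "Class 6" and "class 3" are rejected.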

Filtering variants based on their genomic location

It is possible to filter the variants based on their position in the genome and/or location in a gene.

If variants are filtered by both regions and genes, a variant is retained if it is present in either a region or a gene.
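The "either a region or a gene" rule can be written as a short Python sketch. The Region type, the function names and the inclusive end coordinate are assumptions made for illustration, and the sketch only covers the case where at least one region or gene has been entered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    chromosome: str
    start: int  # first base of the region (bp)
    end: int    # last base of the region (bp), assumed to be inclusive


def in_any_region(chromosome: str, position: int, regions: list[Region]) -> bool:
    """True if the position lies inside at least one of the regions."""
    return any(
        r.chromosome == chromosome and r.start <= position <= r.end
        for r in regions
    )


def passes_location_filter(chromosome: str, position: int, gene: str,
                           regions: list[Region], genes: set[str]) -> bool:
    """Keep a variant if it lies in a listed region OR in a listed gene."""
    return in_any_region(chromosome, position, regions) or gene in genes
```

With a single region on chromosome 1 and the gene NOC2L listed, a variant on chromosome 2 inside NOC2L is still retained, because matching the gene alone is sufficient.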

Filter by position

To filter a set of variants based on their location, enter a region in the Filter by position panel on the Filter by variant location tab at the bottom of the window (Figure 3). This is done by entering the region's chromosome, start and end positions in the text boxes and pressing Add. The sex chromosomes are entered as either X or Y, while the mitochondrial chromosome is entered as an M. To delete a region, select it from the Current regions list and press Delete.
These regions must not overlap.
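A list of regions can be checked for overlaps before it is entered. The following Python sketch is one way to do this; the function name, the tuple layout and the treatment of end positions as inclusive are assumptions, not part of AgileExomeFilter.

```python
def overlapping_regions(regions):
    """Return the regions that overlap an earlier region on the same chromosome.

    regions: iterable of (chromosome, start, end) tuples, with the end
    position assumed to be inclusive.
    """
    flagged = []
    furthest_end = {}  # chromosome -> largest end position seen so far
    for chromosome, start, end in sorted(regions):
        # After sorting by start, a region overlaps an earlier one exactly
        # when it starts at or before the furthest end seen so far.
        if chromosome in furthest_end and start <= furthest_end[chromosome]:
            flagged.append((chromosome, start, end))
        furthest_end[chromosome] = max(end, furthest_end.get(chromosome, end))
    return flagged
```

An empty result means that no two regions share a base, so the list is safe to enter.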

AgileExomeFilter Screenshot 3

Figure 3: The Filter by variant location tab


It is also possible to import regions saved in a text file with the format shown in Table 2, using File > Open interval list file.... For example, such a file can be used to import all the exons targeted in a pull-down experiment. Since the exon regions specified in such a file may not include flanking regions that are also of interest, the regions can be extended into the flanking sequence by up to 50 bp, using the Extend region intervals by: option (located above the Add button). Regions imported from a file do not appear in the Current regions list and so cannot be deleted; to remove them, first close the current window and then re-import the variant data file.

| Chromosome | Separator | Start point (bp) | Separator | End point (bp) |
| --- | --- | --- | --- | --- |
| chr1 | : | 105000000 | - | 153500000 |
| chr4 | : | 5000000 | - | 6500000 |
| chrX | : | 85000000 | - | 93500000 |

Table 2: The format of a file used to import region data.
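Joining the fields in Table 2 gives one region per line, for example chr1:105000000-153500000. The Python sketch below shows how such a line could be parsed and extended into the flanking sequence. It is an illustration only: the function name is hypothetical, and it assumes that the extension is applied to both ends, that the "chr" prefix is dropped, and that the start is never moved below 1.

```python
import re

# One region per line, e.g. "chr1:105000000-153500000" (see Table 2).
INTERVAL_PATTERN = re.compile(r"^chr(\w+):(\d+)-(\d+)$")
MAX_EXTENSION = 50  # regions may be extended by up to 50 bp


def parse_interval(line: str, extend_by: int = 0):
    """Parse one interval line into (chromosome, start, end)."""
    if not 0 <= extend_by <= MAX_EXTENSION:
        raise ValueError(f"extend_by must be between 0 and {MAX_EXTENSION}")
    match = INTERVAL_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Unrecognised interval: {line!r}")
    chromosome = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    # Assumption: both ends are extended and the start is clamped at 1.
    return chromosome, max(1, start - extend_by), end + extend_by
```

Under these assumptions, extending chrX:85000000-93500000 by 50 bp gives the region X:84999950-93500050.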


Filter by gene

To filter the variants by the name of the gene within which they are located, enter the gene name in the Gene name text box in the Filter by gene name panel on the Filter by variant location tab at the bottom of the window (Figure 3). To delete a gene name, select it from the Current genes list and press the Delete button. It is also possible to enter a list of gene names by using File > Open gene list file... and selecting a text file in which each line contains a single gene's name.
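A gene list file of this kind is simple to prepare or check with a script. The Python sketch below reads one gene name per line; the function name is hypothetical, and ignoring blank lines and surrounding whitespace is an assumption about how such files should be treated.

```python
def read_gene_list(lines):
    """Collect gene names from an iterable of text lines, one name per line."""
    genes = set()
    for line in lines:
        name = line.strip()
        if name:  # assumption: blank lines are ignored
            genes.add(name)
    return genes
```

The function accepts any iterable of lines, so an open file handle can be passed to it directly.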

Filter by variant characteristic

It is possible to filter variants against a number of different parameters, using the options in the Filter by variant properties options panel on the Filter by variant parameter tab at the bottom of the window (Figure 4).

AgileExomeFilter Screenshot 4

Figure 4: The Filter by variant parameter tab allows variants to be filtered against a number of different characteristics.


Each option is described below:

Filtering the data

Once the parameters have been selected, the variant data can be filtered by pressing the Filter button on the Filter variants panel in the bottom-right corner of the window (Figure 2). The number of variants remaining after filtering will be displayed in the window's title bar (Figure 5).

AgileExomeFilter Screenshot 5

Figure 5: The number of variants not excluded by the filtering process is shown in the title bar.


Sorting variants

The Sort variants by panel on the Order variants tab (Figure 6) allows variants to be sorted based on 12 different parameters, in either ascending or descending order. The Select the field to sort by: list contains the fields by which the variants may be sorted, while the Ascending and Descending options set the sorting order.

AgileExomeFilter Screenshot 6

Figure 6: The Order variants tab allows variants to be sorted on different parameters.
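Sorting on a single field in ascending or descending order can be sketched in Python as follows. The representation of a variant as a dictionary and the field name used in the example are hypothetical; the 12 sortable fields are those listed in the user interface.

```python
def sort_variants(variants, field, descending=False):
    """Return a new list of variants ordered by one field.

    variants: list of dictionaries, one per variant (an assumed layout).
    field: the key to sort by, e.g. "position" (hypothetical name).
    """
    return sorted(variants, key=lambda variant: variant[field], reverse=descending)
```

Because sorted returns a new list, the original order of the variants is left unchanged.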


Importing and exporting filtering parameters

AgileExomeFilter Screenshot 7

Figure 7: The Import/Export filter options tab allows filter options to be reused.


The Import/Export filter options tab (Figure 7) allows a set of variant filter options to be saved, so that the same parameters can be rapidly and consistently used to screen multiple datasets. Once the filtering options have been set, they can be saved by pressing the Save button. This appends the filter options to the selected file, allowing multiple filter criteria sets to be present in one file. If multiple filter criteria sets are imported from a single file, the variants are filtered against each set of criteria in turn, and a variant is retained if it passes at least one of the filter sets. This allows a clinical disorder such as deafness (which may be dominantly or recessively inherited) to be screened using multiple filtering sets: one set of parameters suited to variants in genes with a recessive mode of inheritance, and a second set tailored to variants in genes with a dominant inheritance pattern.
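The "retained if it passes at least one filter set" rule can be sketched in Python. This is an illustration only: the function name, the representation of a filter set as a list of test functions, the field names and the frequency thresholds are all hypothetical, and the sketch assumes that a variant must meet every criterion within a set to pass that set.

```python
def passes_any_filter_set(variant, filter_sets):
    """True if the variant meets every criterion of at least one filter set."""
    return any(
        all(criterion(variant) for criterion in filter_set)
        for filter_set in filter_sets
    )


# Hypothetical filter sets; the thresholds are examples, not recommendations.
RECESSIVE_SET = [
    lambda v: v["genotype"] == "homozygous",
    lambda v: v["frequency"] < 0.01,
]
DOMINANT_SET = [
    lambda v: v["genotype"] == "heterozygous",
    lambda v: v["frequency"] < 0.001,
]
```

With these example sets, a homozygous variant with a frequency of 0.005 is retained by the recessive set, whereas a heterozygous variant with the same frequency fails both sets and is excluded.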

Filter options saved to a text file are imported using File > Open filter set file.... When imported, these filter options remain distinct from any parameters specified using the user interface, which are ignored during this filtering process. Conversely, if the Filter button described in the Filtering the data section is used, the imported filter parameters are ignored. To refilter the variants using the imported values, press the Re-filter button in the Filter using a filter file panel (Figure 7).

Saving and exporting variants

AgileExomeFilter Screenshot 8

Figure 8: The Save and export variants tab allows variants to be saved and exported.


It is possible to save either all variants that have passed the filtering process, or a smaller user-selected set of variants. This can be done either via the Variants menu or using the Save and export variants tab (Figure 8). To save all variants, either press the Save button in the Save all variants panel or select the Variant > Save all menu option and enter the name of the target file.

To save a user-selected list of variants, select the variants using the data grid in the main window (in the same way that you would select data rows in Excel) and then either press the Add button in the Save selected variants panel or select the Variant > Add selected variants menu option. This process can be repeated multiple times to create a collection of variants. Once the variants have been added, it is possible to refilter the data using a different set of parameters and then add more variants to the saved variants.

The saved variants are then exported to a text file by either pressing Save in the Save selected variants panel or selecting the Variant > Save selected variants menu option. Once the variants have been saved, the user will be asked whether the stored variants should be cleared. To clear the stored variants without first saving them, either press Clear in the Save selected variants panel or select the Variant > Clear selected variants menu option.