AgileROH

The guide, source code and related files have all moved to a GitHub repository: https://github.com/msjimc/AgileROH

The text below is retained for legacy reasons only and should not be used. The download page has been edited so that only the data and Python script can be accessed.

Identification of autozygous regions in exome VCF data

The user guide etc for AgileMultiIdeogram is here.

AgileROHFinder and AgileROHFilterer

Note: While AgileROH is primarily designed to process whole genome exome data, it will also identify autozygous regions in microarray SNP genotype data formatted as either Affymetrix birdseed files or the older Affymetrix '*.xls' tab-delimited files. However, AgileROHFilterer will not filter these files.

Compiling from source code.

The source code for both programs relies on a common set of source code files, with only the 'main' function differing between the applications. Consequently, the AgileROH functionality can easily be included in new projects just by adding the "AffyEngine", "methods", "SNP" and "Region" *.cpp and *.h files. The "methods" class contains a number of generic functions that can be found in a wide range of C++ libraries, such as the Boost libraries. They are included here to reduce the number of dependencies and make the code more portable, so the same code will compile on Linux (CentOS 7 with g++) or on Windows with Visual Studio. To compile the applications on Linux, navigate to the folder containing the source code in the terminal and then enter:

g++ -g AffyEngine.cpp AgileROHFinder.cpp methods.cpp Region.cpp SNP.cpp -o AgileROHFinder.exe 2> errors.txt

to build AgileROHFinder or

g++ -g AffyEngine.cpp AgileROHFilterer.cpp methods.cpp Region.cpp SNP.cpp VCFFilter.cpp -o AgileROHFilterer.exe 2> errors.txt

to build AgileROHFilterer. If the build fails, the error messages from g++ will be saved in the errors.txt file.

To build a Windows application, create a new C++ Windows Console Application, disable the use of precompiled header files and add the header (*.h) and source code (*.cpp) files to the project. Then build the application. No modification of the code should be required to produce either 32-bit or 64-bit applications (tested with VS2010, VS2017 and VS2019).

Guide to use of AgileROHFinder:

AgileROHFinder is run using the following command:

$AgileROHFinder $VCF $Results $option

Where:
$AgileROHFinder is the filename and path of the AgileROHFinder program.

$VCF is the filename and path of the VCF to be processed.

$Results is the filename and path of the file to which the results will be saved.

$option can be -a, -b or -t. The -b option exports the autozygous regions as strings that can be pasted into a genome browser (e.g. chr12:10000000-20000000). The -t option produces a tab-delimited file with each region formatted as chromosome, region start, region end and region length. An option is required, but if the program doesn't recognise it, it defaults to -a, which produces a file containing both formats.
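The two output formats described above are simple enough to read back into a script. The following Python sketch is not part of AgileROH; the names Region, parse_browser_region and parse_tab_region are hypothetical, and it assumes that each tab-delimited line holds exactly the four columns listed above with no header line.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    """One autozygous region (hypothetical helper type, not part of AgileROH)."""
    chromosome: str
    start: int
    end: int


# Matches strings such as chr12:10000000-20000000 (the -b style output).
_BROWSER_PATTERN = re.compile(r"^(\S+):(\d+)-(\d+)$")


def parse_browser_region(text: str) -> Region:
    """Parse a genome-browser style string into a Region."""
    match = _BROWSER_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a browser-style region: {text!r}")
    return Region(match.group(1), int(match.group(2)), int(match.group(3)))


def parse_tab_region(line: str) -> Region:
    """Parse one tab-delimited line: chromosome, start, end, length.

    The length column is read past but not stored, since it can be
    derived from the start and end values.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 tab-separated fields: {line!r}")
    return Region(fields[0], int(fields[1]), int(fields[2]))
```

For example, parse_browser_region("chr12:10000000-20000000") and parse_tab_region("chr12\t10000000\t20000000\t10000000") both describe the same region.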

Guide to use of AgileROHFilterer:

AgileROHFilterer is run using the following command:

$AgileROHFilterer $VCF $FilteredVCF $Results $n $option

Where:
$AgileROHFilterer is the filename and path of the AgileROHFilterer program.

$VCF is the filename and path of the VCF to be processed.

$FilteredVCF is the filename and path of the filtered VCF file.

$Results is the filename and path of the file to which the results will be saved.

$n retains variants within n bp of a region (n = 0 to 1000000). Since exome variant data is error-prone, autozygous regions may be truncated. Use this value to extend the regions and so ensure that variants at the ends of a region are retained.

$option can be -a, -b or -t. The -b option exports the autozygous regions as strings that can be pasted into a genome browser (e.g. chr12:10000000-20000000). The -t option produces a tab-delimited file with each region formatted as chromosome, region start, region end and region length. An option is required, but if the program doesn't recognise it, it defaults to -a, which produces a file containing both formats.
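The effect of the $n value can be pictured with a short Python sketch. This is an illustration only and is not taken from the AgileROHFilterer source: the function names are hypothetical, and the inclusive boundaries and the clamping of the start position to 1 are assumptions.

```python
MAX_EXTENSION = 1_000_000  # upper limit for n given in the guide


def extend_region(start: int, end: int, n: int) -> tuple[int, int]:
    """Widen a region by n bp on each side (assumed behaviour)."""
    if not 0 <= n <= MAX_EXTENSION:
        raise ValueError("n must be between 0 and 1000000")
    # Assumption: positions are 1-based, so the start never drops below 1.
    return max(1, start - n), end + n


def is_retained(position: int, start: int, end: int, n: int) -> bool:
    """Return True if a variant position lies within n bp of the region."""
    extended_start, extended_end = extend_region(start, end, n)
    # Assumption: both ends of the extended region are inclusive.
    return extended_start <= position <= extended_end
```

With a region of 10000000 to 20000000, a variant at position 9960000 would be dropped with n = 0 but kept with n = 50000 under these assumptions.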

Manipulating data produced by AgileROH programs

Both AgileROHFinder and AgileROHFilterer can export the autozygous regions as a tab-delimited text file, which can then be used as part of a downstream variant filtering step. To show how this can be done, the Python script p_FindCommonRegions.py was written to read all the region text files (option -t) in a folder and identify the regions common to all the files, which are then saved to a text file.
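The idea of finding regions shared by every file can be sketched as an interval intersection. The Python below is a minimal illustration and is not the p_FindCommonRegions.py script itself; all function names and the "*.txt" file pattern are hypothetical. It assumes each file holds tab-delimited lines of chromosome, start, end and length, and that the regions within one file do not overlap each other.

```python
from collections import defaultdict
from functools import reduce
from pathlib import Path


def read_regions(path):
    """Read one -t style file into {chromosome: sorted [(start, end), ...]}."""
    regions = defaultdict(list)
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Skip blank or header-like lines (assumption: data lines
            # have numeric start and end columns).
            if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
                continue
            regions[fields[0]].append((int(fields[1]), int(fields[2])))
    return {chrom: sorted(spans) for chrom, spans in regions.items()}


def intersect_spans(a, b):
    """Intersect two sorted, non-overlapping lists of (start, end) spans."""
    i = j = 0
    shared = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start <= end:
            shared.append((start, end))
        # Advance whichever span finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def common_regions(samples):
    """Return the regions present in every sample's region dictionary."""
    def merge(left, right):
        merged = {}
        for chrom in left.keys() & right.keys():
            spans = intersect_spans(left[chrom], right[chrom])
            if spans:
                merged[chrom] = spans
        return merged

    return reduce(merge, samples) if samples else {}


def common_regions_in_folder(folder, pattern="*.txt"):
    """Intersect every matching region file in a folder."""
    files = sorted(Path(folder).glob(pattern))
    return common_regions([read_regions(path) for path in files])
```

Intersecting the files one at a time keeps the code short: the running result only ever shrinks, so a chromosome missing from any one file drops out of the final set.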

Download

The precompiled AgileROHFinder and AgileROHFilterer program binaries can be downloaded here.