QVregion

Quality Value for a sequence region of interest

QVregion measures the quality of DNA sequences produced on an Applied Biosystems (ABI) sequencer. It reads the *.phd.1 files generated by the ABI base-caller and calculates the average quality value (QV, defined in the same way as a Phred score) across the stretch of the read that matches a “region of interest”, which is defined by a user-supplied reference sequence. If the read cannot be aligned across the whole length of the reference sequence, the program reports the average QV of the part that could be aligned. A read that has a high QV but cannot be aligned to the entire region of interest may contain a heterozygous indel, or may diverge from the supplied sequence for some other reason.
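The calculation described above can be illustrated with a minimal Python sketch. It is not the program's own code: the function names are hypothetical, the QV is averaged as a plain arithmetic mean, and an exact substring search stands in for whatever alignment method QVregion actually uses. The only detail taken from the PHD format itself is that each line between BEGIN_DNA and END_DNA holds a base, its quality value and a peak position.

```python
def parse_phd(text):
    """Return (sequence, qualities) from the BEGIN_DNA ... END_DNA block."""
    bases, quals = [], []
    in_dna = False
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN_DNA":
            in_dna = True
        elif line == "END_DNA":
            break
        elif in_dna and line:
            fields = line.split()          # base, quality, peak position
            bases.append(fields[0].upper())
            quals.append(int(fields[1]))
    return "".join(bases), quals


def region_mean_qv(text, reference):
    """Mean QV over the part of the read matching `reference`.

    Assumption: an exact match replaces a real alignment, so a read with
    any mismatch or indel is simply reported as not found (None).
    """
    seq, quals = parse_phd(text)
    start = seq.find(reference.upper())
    if start < 0:
        return None
    region = quals[start:start + len(reference)]
    return sum(region) / len(region)


# Tiny made-up PHD fragment for illustration only.
EXAMPLE_PHD = """BEGIN_SEQUENCE example
BEGIN_DNA
a 10 5
c 30 17
g 40 29
t 50 41
a 20 53
END_DNA
END_SEQUENCE
"""

print(region_mean_qv(EXAMPLE_PHD, "CGT"))
```

A real implementation would need a proper alignment step so that partially matching reads can still be scored over the portion that aligns.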

The program asks for a user-defined cut-off value. If the average QV is lower than this cut-off, the read fails quality control. Similarly, if the program cannot align the query sequence in the *.phd.1 quality file to the whole region of interest, that read is deemed to have failed.
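The two failure conditions can be written as a single check. The sketch below is illustrative only: the function and parameter names are hypothetical, and it assumes that a read whose average QV equals the cut-off passes, since the text says only that a lower value fails.

```python
def passes_qc(mean_qv, aligned_length, reference_length, cutoff):
    """Return True only if the read meets both quality control conditions."""
    # Condition 1: the read aligns to the whole region of interest.
    fully_aligned = aligned_length == reference_length
    # Condition 2: the average QV is not lower than the user's cut-off.
    good_quality = mean_qv >= cutoff
    return fully_aligned and good_quality


# Hypothetical values: a 120-base region of interest and a cut-off of 30.
print(passes_qc(mean_qv=42.5, aligned_length=120, reference_length=120, cutoff=30))
print(passes_qc(mean_qv=42.5, aligned_length=95, reference_length=120, cutoff=30))
print(passes_qc(mean_qv=18.0, aligned_length=120, reference_length=120, cutoff=30))
```

The second call shows the case discussed earlier: a high-quality read that still fails because it does not span the whole region.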

The program analyses all the files in a folder and creates a tab-delimited output file that contains, for each read, the average QV across the alignment, the average QV of the 10 worst bases and a summary of whether the read passes the quality control steps. In use, the average QV of the 10 worst nucleotides in the sequence has proved to be a better indicator of possible false positives than the average QV across the alignment.
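As a rough illustration of the worst-base measure and the tab-delimited report, the following Python sketch averages the 10 lowest quality values and writes one row per read. The column names, the PASS/FAIL labels and the function names are assumptions made for this example; the actual layout of the QVregion output file is not described here.

```python
import csv
import io


def worst_n_mean(quals, n=10):
    """Mean of the n lowest quality values (all of them if fewer than n)."""
    if not quals:
        raise ValueError("no quality values supplied")
    worst = sorted(quals)[:n]
    return sum(worst) / len(worst)


def write_report(reads, handle):
    """Write one tab-delimited row per read.

    `reads` is an iterable of (file_name, quals, passed) tuples, where
    `quals` are the quality values of the aligned region.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # Hypothetical column headers.
    writer.writerow(["file", "mean_qv", "worst10_mean_qv", "result"])
    for name, quals, passed in reads:
        mean_qv = sum(quals) / len(quals)
        writer.writerow([
            name,
            f"{mean_qv:.1f}",
            f"{worst_n_mean(quals):.1f}",
            "PASS" if passed else "FAIL",
        ])


# Made-up quality values: 25 bases with QVs running from 20 to 44.
example_quals = list(range(20, 45))
buffer = io.StringIO()
write_report([("read1.phd.1", example_quals, True)], buffer)
print(buffer.getvalue())
```

Because a few very poor bases barely move the average of a long, otherwise clean alignment, the worst-base mean reacts far more strongly to a localised drop in quality.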

Download here
Note: this program requires the .NET Framework, version 1.1 or later, which is available through Windows Update.