User guide

Introduction

The user interface of AgileQualityFilter is shown in Figure 1, and consists of a number of user-selectable inputs. A choice must be made for each input, except the optional Filter by tags, before the data can be processed.

AgileQualityFilter Screenshot 1

Figure 1: AgileQualityFilter user interface

Selecting data

The upper panel Select the data file(s) allows the user to select the sequence data as either a single file (*_qseq.txt or *.fastq) or a folder of such files. If a folder is selected which contains both types of file, the program will prompt the user to choose which file type to use; AgileQualityFilter will not combine data from different file types. If a folder of sequence data is to be filtered, tick the Folder box next to the Select button (labelled A in Figure 1). Then select the data by pressing the Select button and navigate to the folder. If only a single file is to be filtered, untick the Folder box, press Select and choose the data file.

Setting the filter options

The Quality options field allows the user to set the filtering parameters used by AgileQualityFilter.

Processing data containing 5′ tags

If the data contains reads that are “tagged” with a 5′ sequence tag, tick the Filter by tags box (labelled H in Figure 1) to activate this option. Once activated, press the Select button (I in Figure 1) and choose a tab-delimited plain text file that contains the tag sequences and tag “IDs”. The tags can be between 3 and 16 bp in length, but a shorter tag must not be a suffix to a longer tag. If a read contains a tag, the tag is removed and the read is exported to a file named after the tag’s “ID” value. The tag file format is shown in Table 1; using this tag file would result in the creation of four files, named APC, SATBII, PXDN and ATOH7. All reads that started with the sequence TAA, TAGG or TATTT, would be placed in the appropriate APC, PXDN or ATOH7 file, respectively. Any sequence that started with TAAC, TACC , TAGC or TATC would be placed in the SATBII file.

Tag IDDelimiterTag sequence
APC<tab space>TAA
SATBII<tab space>TANC
PXDN<tab space>TAGG
ATOH7<tab space>TATTT

Export file location

The location to which exported data are saved is selected using the Export button (J in Figure 1), in the Export file options field. This field also contains the Save as fasta file and Save as fastq file options, which allow the format of the exported data to be selected.

Processing the data

Once the data file and export file have been entered, the Convert button (K in Figure 1) will be activated. Pressing this button will start the filtering process and the creation of the export file(s), along with two summary files. The DataLog<current date>(<current time>).xls file contains information on the files processed and the cut-off values used in the filtering. The ReadStats.xls file contains a summary of the number of reads at each position and their average quaility score before and after filtering. Examples of these files can be found here.