File conversion

Microarray SNP analysis can be performed using either Affymetrix or Illumina platforms. However, each creates output files in a different format, which can tie analysis software to a particular platform. To allow use of non-Affymetrix data with AutoSNPa, Sample, DominantMapper and IBDfinder, the conversion programs Illumina2Affy (versions 1 to 3), deCode2Affy and PLink2Affy were developed. They convert certain file formats to the Affymetrix format.

These programs were produced in response to requests from other labs wanting to use AutoSNPa or IBDfinder with Illumina data. If they do not meet your needs, feel free to contact us and we will try to help.

Illumina file conversion

These programs can be downloaded here.

Illumina2Affy v.1

This program converts Illumina files containing ONE individual’s genotype data into an Affymetrix-style file. In these files, the column order must be as shown below:

Col 1    Col 2   Col 3  Col 4     Col 5     Col 6     Col 7
SNP      Sample  Chr    Position  Allele 1  Allele 2  GC content
RS10938  400101  1      2016609   A         B         0.887
RS54678  400101  1      2503078   B         B         0.912

Table 1

Illumina2Affy v.2

This program converts Illumina files containing MULTIPLE individuals’ data, of the format shown below, into Affymetrix-style files, one file per person. The order of the first three columns is invariant; however, the program identifies genotype data by the presence of the suffix “.GType” in a column header. Therefore, the column order after Col 3 is not important:

Col 1     Col 2  Col 3     Col n       Col m
SNP Name  Chr    Position  0001.GType  0002.GType
SNP 1     1      1000000   AA          BB
SNP 2     1      2000000   AB          AA
SNP 3     1      3000000   BB          AA
SNP 4     1      4000000   BB          AB

Table 2
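The column detection described above can be illustrated with a short Python sketch. It is not the code used by Illumina2Affy v.2: the function names are hypothetical, and it assumes each line of the file has already been split into a list of fields.

```python
SUFFIX = ".GType"  # genotype columns are recognised by this header suffix


def genotype_columns(header):
    """Map each sample name to the index of its '.GType' column.

    Columns 1 to 3 (SNP name, chromosome, position) are fixed, so only
    columns after the third are searched.
    """
    return {
        name[: -len(SUFFIX)]: index
        for index, name in enumerate(header)
        if index >= 3 and name.endswith(SUFFIX)
    }


def split_by_sample(rows):
    """Return {sample: [(snp, chromosome, position, genotype), ...]}."""
    header, *records = rows
    columns = genotype_columns(header)
    per_sample = {sample: [] for sample in columns}
    for record in records:
        snp, chromosome, position = record[:3]
        for sample, index in columns.items():
            per_sample[sample].append((snp, chromosome, position, record[index]))
    return per_sample


# Illustrative rows only; real files hold one row per SNP on the array.
rows = [
    ["SNP Name", "Chr", "Position", "0001.GType", "0002.GType"],
    ["SNP 1", "1", "1000000", "AA", "BB"],
    ["SNP 2", "1", "2000000", "AB", "AA"],
]
per_sample = split_by_sample(rows)
```

Because the genotype columns are looked up by header name rather than by position, any extra columns after the third are simply ignored.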

Illumina2Affy v.3

This program converts Illumina data files when the positional data is stored in a separate file from the genotype data and the genotype file contains data for MULTIPLE individuals. The format of the genotype file is shown in Table 3 and the format of the SNP map file in Table 4.

Col 1            Col 2                Col 3        Col 4        Col 5
[Header]
BSGT Version     03/02/1932
Processing Date  6/16/2009 11:46 AM
Content          HumanHap550v3_A.bpm
Num SNPs         561466
Total SNPs       561466
Num Samples      4
Total Samples    4
[Data]
                 HG_WUE_NRAA          HG_WUE_NRFN  HG_WUE_NROR  HG_WUE_NR-Fet
MitoA10045G      AA                   AA           AA           AA
MitoA10551G      AA                   AA           AA           AA
MitoA11252G      BB                   BB           AA           BB
MitoA11468G      AA                   AA           AA           AA
MitoA11813G      --                   AA           AA           --

Table 3: Genotype data file

Col 1  Col 2        Col 3       Col 4     Col 5           Col 6  Col 7        Col 8            Col 9
Index  Name         Chromosome  Position  GenTrain Score  SNP    ILMN Strand  Customer Strand  NormID
1      MitoA10045G  M           10045     0.7355          [T/C]  Bot          Top              0
2      MitoA10551G  M           10551     0.7128          [A/G]  Top          Top              0
3      MitoA11252G  M           11252     0.7452          [T/C]  Bot          Top              0
4      MitoA11468G  M           11468     0.7345          [T/C]  Bot          Top              0

Table 4: SNP map file
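Combining the two files amounts to looking up each SNP name from the genotype data file (Table 3) in the SNP map file (Table 4). The following Python sketch shows one way this could be done. It is an illustration rather than the code used by Illumina2Affy v.3: the function names are hypothetical, and it assumes both files are tab-delimited and that the sample names follow directly after the "[Data]" line.

```python
def read_genotype_report(lines):
    """Return (samples, {snp: {sample: genotype}}) from a genotype data file."""
    rows = iter(lines)
    for line in rows:  # skip the [Header] block
        if line.strip() == "[Data]":
            break
    else:
        raise ValueError("no [Data] section found")
    header = next(rows, None)
    if header is None:
        raise ValueError("no sample names after [Data]")
    samples = header.rstrip("\n").split("\t")[1:]  # first cell is empty
    calls = {}
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) == len(samples) + 1:
            calls[fields[0]] = dict(zip(samples, fields[1:]))
    return samples, calls


def read_snp_map(lines):
    """Return {snp: (chromosome, position)} from a SNP map file."""
    rows = iter(lines)
    header = next(rows).rstrip("\n").split("\t")
    name_i, chrom_i, pos_i = (
        header.index(column) for column in ("Name", "Chromosome", "Position")
    )
    positions = {}
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > max(name_i, chrom_i, pos_i):
            positions[fields[name_i]] = (fields[chrom_i], int(fields[pos_i]))
    return positions


def merge_for_sample(sample, calls, positions):
    """List (snp, chromosome, position, genotype) for SNPs found in both files."""
    return [
        (snp, *positions[snp], by_sample[sample])
        for snp, by_sample in calls.items()
        if snp in positions
    ]


# Illustrative input with two hypothetical samples, S1 and S2.
report = ["[Header]\n", "Num Samples\t2\n", "[Data]\n", "\tS1\tS2\n",
          "MitoA10045G\tAA\tAA\n", "MitoA11252G\tBB\tAA\n"]
snp_map = ["Index\tName\tChromosome\tPosition\n",
           "1\tMitoA10045G\tM\t10045\n", "3\tMitoA11252G\tM\t11252\n"]
samples, calls = read_genotype_report(report)
positions = read_snp_map(snp_map)
merged = merge_for_sample("S1", calls, positions)
```

SNPs that appear in only one of the two files are dropped by this sketch; whether the conversion program does the same is not stated here.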

Download here
Note: these programs require the .NET framework version 2.0 to be installed.

deCode file conversion

This program converts deCode files containing ONE individual’s genotype data, in the format shown below, into an Affymetrix-style file. Because deCode data files describe each genotype using the SNP's alleles rather than 'A' or 'B', the program must identify the different alleles and then designate them as 'A' or 'B'. Consequently, all the files used in an analysis must be converted in the same batch; otherwise the genotype designation may vary between files converted in different batches. In these files, the column order must be as shown below:

Col 1    Col 2      Col 3       Col 4     Col 5   Col 6      Col 7
Name     Variation  Chromosome  Position  Strand  Your code
RS10938  A/G        1           2016609   +       AG
RS54678  C/T        1           2503078   +       CT

Table 5
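The reason files must be converted in one batch can be seen in a small Python sketch. The rule used here (sort the alleles seen for each SNP across the whole batch and label the first 'A' and the second 'B') is an assumption made for illustration, not necessarily the rule deCode2Affy applies. The function names and the "NoCall" placeholder are also hypothetical.

```python
def build_allele_map(batch):
    """batch: list of {snp: genotype} dicts, one per individual.

    Returns {snp: {allele: 'A' or 'B'}} based on every allele seen
    for that SNP anywhere in the batch.
    """
    seen = {}
    for genotypes in batch:
        for snp, call in genotypes.items():
            seen.setdefault(snp, set()).update(a for a in call if a in "ACGT")
    return {
        snp: dict(zip(sorted(alleles), "AB"))
        for snp, alleles in seen.items()
    }


def recode(genotypes, allele_map):
    """Translate allele-based genotypes (e.g. 'AG') into 'AA', 'AB' or 'BB'."""
    recoded = {}
    for snp, call in genotypes.items():
        labels = [allele_map[snp].get(allele) for allele in call]
        if len(labels) == 2 and all(labels):
            recoded[snp] = "".join(sorted(labels))
        else:
            recoded[snp] = "NoCall"  # placeholder for anything unrecognised
    return recoded


# Two hypothetical individuals typed at the same SNP.
person1 = {"RS10938": "AG"}
person2 = {"RS10938": "GG"}

together = recode(person2, build_allele_map([person1, person2]))
alone = recode(person2, build_allele_map([person2]))
```

Converted together, person2's genotype becomes 'BB' because allele G is the second of the two alleles seen in the batch. Converted alone, only G is seen, so the same genotype is labelled 'AA'. The two results are not comparable, which is why a single batch is needed.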

Download here
Note: this program requires the .NET framework version 2.0 to be installed.

PLink pedigree and map file conversion

This program converts PLink pedigree (*.ped) and map (*.map) files into Affymetrix-style files. While Affymetrix files label the alleles as 'A' or 'B', PLink pedigree files use either the numbers 1 and 2 or the actual alleles (A, C, G or T). If the pedigree file contains the actual alleles, PLink2Affy must first identify the different alleles and then designate them as 'A' or 'B'. Consequently, Affymetrix files originating from different pedigree files may have different genotype designations. The PLink *.ped and *.map files must be in the file format shown below:

More information about PLink files here

Col 1   Col 2       Col 3        Col 4        Col 5  Col 6      Col 7          Col 8          Col 9
Family  Individual  Paternal ID  Maternal ID  Sex    Phenotype  SNP1 Allele 1  SNP1 Allele 2  etc...
FAM001  1           0            0            1      2          A              A              ...
FAM001  2           0            0            1      2          A              A              ...

Table 6: PLink pedigree file (*.ped)

Col 1       Col 2     Col 3             Col 4
Chromosome  SNP ID    Genetic position  Physical position
1           rs123456  1.23              1234555
1           rs234567  1.27              1237793
1           rs224534  1.32              1237697
1           rs233556  1.35              1337456

Table 7: PLink map file (*.map)
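The layout of the two files can be made concrete with a short Python sketch that reads them into simple structures. This is an illustration only, not the code used by the conversion program: the function names are hypothetical, and the fields are assumed to be separated by whitespace.

```python
def parse_map(lines):
    """Return [(chromosome, snp_id, genetic_pos, physical_pos), ...]."""
    markers = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        chromosome, snp_id, genetic, physical = line.split()
        markers.append((chromosome, snp_id, float(genetic), int(physical)))
    return markers


def parse_ped_line(line, n_markers):
    """Split one pedigree line into its six ID fields and per-SNP genotypes."""
    fields = line.split()
    family, individual, father, mother, sex, phenotype = fields[:6]
    alleles = fields[6:]
    if len(alleles) != 2 * n_markers:
        raise ValueError("expected two alleles per marker in the map file")
    # Alleles come in pairs: columns 7 and 8 are SNP1, 9 and 10 are SNP2, etc.
    genotypes = [alleles[i] + alleles[i + 1] for i in range(0, len(alleles), 2)]
    return {
        "family": family,
        "individual": individual,
        "father": father,
        "mother": mother,
        "sex": sex,
        "phenotype": phenotype,
        "genotypes": genotypes,
    }


# Illustrative input: two markers and one individual.
map_lines = ["1 rs123456 1.23 1234555", "1 rs234567 1.27 1237793"]
ped_line = "FAM001 1 0 0 1 2 A A G T"

markers = parse_map(map_lines)
person = parse_ped_line(ped_line, len(markers))
```

The n-th genotype in a pedigree line belongs to the n-th line of the map file, so the two files must be read together to attach a chromosome and position to each genotype.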

Download here (This is version 4; updated on 13th March 2013)
Note: this program requires the .NET framework version 2.0 to be installed.

23andMe file conversion

This program converts 23andMe files containing ONE individual’s genotype data, in the format shown below, into an Affymetrix-style file. Because 23andMe data files describe each genotype using the SNP's alleles rather than 'A' or 'B', the program must identify the different alleles and then designate them as 'A' or 'B'. Consequently, all the files used in an analysis must be converted in the same batch; otherwise the genotype designation may vary between files converted in different batches. In these files, the column order must be as shown below:

Col 1    Col 2       Col 3     Col 4
Name     Chromosome  Position  Genotype
RS10938  1           2016609   AG
RS54678  1           2503078   CT

Table 8. The start of the file contains a data description where each line begins with a '#' symbol.
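Reading such a file means skipping the description lines that begin with '#' and splitting the remaining lines into four fields. A minimal Python sketch follows. It assumes whitespace-separated fields, the function name is hypothetical, and it is not the code used by the conversion program.

```python
def read_23andme(lines):
    """Yield (name, chromosome, position, genotype) for each data line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # description lines and blank lines carry no genotypes
        name, chromosome, position, genotype = line.split()
        yield name, chromosome, int(position), genotype


# Illustrative input; the comment text is made up for the example.
example = [
    "# example description line",
    "RS10938\t1\t2016609\tAG",
    "RS54678\t1\t2503078\tCT",
]
records = list(read_23andme(example))
```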

Download here (This is version 1; updated on 21st June 2013)
Note: this program requires the .NET framework version 2.0 to be installed.