Summary of the genetic information used in the genotype–phenotype association analyses
Information | Data | Format |
Germline genetic sequencing data | Raw sequencing data | FASTQ or BAM files |
Germline genetic sequencing data | Identified genetic variants | VCF files |
Germline genetic sequencing data | Documentation of sequencing procedure and analyses performed to allow trackability of downstream analyses | CSV or TXT files to allow long-term readability |
Germline genetic sequencing data | Quality measures | Read depth, score to evaluate correct variant calling, etc. |
Underlying genetic condition | Heritable underlying condition | Description of underlying condition |
Underlying genetic condition | Gene name of affected gene | HGVS name |
Underlying genetic condition | Identified causal variant | HGVS notation: genomic location, coding DNA and expected effect on protein; rs number if available |
BAM, compressed binary file used to represent aligned sequences; CSV, comma-separated values; DNA, deoxyribonucleic acid; FASTQ, text file containing the sequence data from the clusters that pass filter on a flow cell; HGVS, Human Genome Variation Society; rs, reference single nucleotide polymorphism identifier; TXT, standard unformatted text document; VCF, variant call format.
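
As an illustration of the "Quality measures" entry in the table, the following minimal Python sketch shows how read depth and the variant-calling score might be pulled from a single VCF data line. It assumes the score sits in the standard QUAL column and the read depth under the INFO key DP, as defined by the VCF format. The names `VariantQuality` and `parse_vcf_quality` and the example line are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariantQuality:
    """Quality-related fields of one VCF data line (hypothetical helper type)."""
    chrom: str
    pos: int
    rs_id: Optional[str]   # None when the ID column holds no rs identifier
    qual: Optional[float]  # None when QUAL is the missing value "."
    depth: Optional[int]   # None when INFO carries no DP value


def parse_vcf_quality(line: str) -> VariantQuality:
    """Extract position, rs identifier, QUAL and DP from one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 tab-separated columns")
    chrom, pos, var_id, _ref, _alt, qual, _filter, info = fields[:8]

    # INFO is a semicolon-separated list of KEY=VALUE pairs (or bare flags).
    info_map = {}
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        info_map[key] = value

    # The ID column may list several identifiers; keep the first rs number.
    rs_id = next((i for i in var_id.split(";") if i.startswith("rs")), None)
    depth = int(info_map["DP"]) if info_map.get("DP") else None

    return VariantQuality(
        chrom=chrom,
        pos=int(pos),
        rs_id=rs_id,
        qual=None if qual == "." else float(qual),
        depth=depth,
    )


if __name__ == "__main__":
    # Made-up example line, for illustration only.
    example = "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=30;AF=0.5"
    print(parse_vcf_quality(example))
```

Storing these values next to the documentation of the sequencing procedure, in a plain CSV or TXT file, would keep them readable without specialised software.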