Table 2

Summary of the genetic information used in the genotype–phenotype association analyses

| Information | Data | Format |
| --- | --- | --- |
| Germline genetic sequencing data | Raw sequencing data | FASTQ or BAM files |
| | Identified genetic variants | VCF files |
| | Documentation of sequencing procedure and analyses performed to allow trackability of downstream analyses | CSV or TXT files to allow long-term readability |
| | Quality measures | Read depth, score to evaluate correct variant calling, etc. |
| Underlying genetic condition | Heritable underlying condition | Description of underlying condition |
| | Gene name of affected gene | HGVS name |
| | Identified causal variant | HGVS notation: genomic location, coding DNA and expected effect on protein; rs number if available |
Abbreviations: BAM, compressed binary file used to represent aligned sequences; CSV, comma-separated values; DNA, deoxyribonucleic acid; FASTQ, text file containing the sequence data from the clusters that pass filter on a flow cell; HGVS, Human Genome Variation Society; rs, reference single nucleotide polymorphism identifier; TXT, standard unformatted text document; VCF, variant call format.
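
As an illustration of how the variant and quality fields listed in Table 2 might be read from a VCF file, the following minimal Python sketch parses a single tab-separated VCF data line. It assumes the standard eight fixed VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) and that read depth is stored under the reserved `DP` key of the INFO column. The function name `parse_vcf_line`, the returned field names and the example record (including its rs number) are hypothetical and are not taken from the study; a dedicated VCF library would normally be preferable for real data.

```python
def parse_vcf_line(line):
    """Extract variant and quality fields from one VCF data line (sketch)."""
    # The first eight tab-separated columns of a VCF data line are fixed.
    chrom, pos, var_id, ref, alt, qual, _filter, info = (
        line.rstrip("\n").split("\t")[:8]
    )

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_map = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        info_map[key] = value

    depth = info_map.get("DP")
    return {
        "chrom": chrom,
        "pos": int(pos),
        # Report the ID only if it looks like an rs number.
        "rs_id": var_id if var_id.startswith("rs") else None,
        "ref": ref,
        "alt": alt.split(","),  # several ALT alleles are comma-separated
        "qual": None if qual == "." else float(qual),  # "." means missing
        "depth": int(depth) if depth else None,
    }


# Hypothetical record, for illustration only.
example = "1\t12345\trs0000001\tA\tG\t50.0\tPASS\tDP=35;AF=0.5"
record = parse_vcf_line(example)
```

Here `record["qual"]` and `record["depth"]` correspond to the "Quality measures" row of the table, and `record["rs_id"]` to the optional rs number of the identified variant.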