Best/Optimal Method

  • Receive FastQ files from sequencing machines.
  • Use the FastQ files to perform the alignment, mark PCR duplicates, and mark low-quality reads.
  • This generates BAM files, one per chromosome, processed in parallel on the CCR cluster.
  • Use the resulting BAM files with GATK to call variants and generate the VCF files.
  • The VCF files are then annotated and loaded into the GDW.
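
The per-chromosome, parallel part of the steps above can be sketched as a job plan. This is only an illustration, not CCR's actual pipeline: the source names GATK but not the specific tools or options, so the use of `samtools view` to split by chromosome, GATK `HaplotypeCaller` as the caller, the chromosome list, and all file names are assumptions. The code only builds the command lists and does not run anything.

```python
from typing import Dict, List

# Assumed chromosome names; the real list depends on the reference genome.
CHROMOSOMES = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def per_chromosome_commands(sample: str, reference: str, chrom: str) -> List[List[str]]:
    """Build the commands for one chromosome: extract its reads, index, call variants."""
    source_bam = f"{sample}.markdup.bam"  # hypothetical sorted, indexed, duplicate-marked BAM
    chrom_bam = f"{sample}.{chrom}.bam"
    chrom_vcf = f"{sample}.{chrom}.vcf.gz"
    return [
        # Extract the reads mapped to this chromosome into their own BAM file.
        ["samtools", "view", "-b", "-o", chrom_bam, source_bam, chrom],
        ["samtools", "index", chrom_bam],
        # Call variants with GATK, restricted to this chromosome.
        ["gatk", "HaplotypeCaller", "-R", reference, "-I", chrom_bam,
         "-O", chrom_vcf, "-L", chrom],
    ]


def plan_jobs(sample: str, reference: str) -> Dict[str, List[List[str]]]:
    """Map each chromosome to its independent command list (one cluster job each)."""
    return {c: per_chromosome_commands(sample, reference, c) for c in CHROMOSOMES}
```

Because each chromosome's command list is independent of the others, each entry in the plan could be submitted as a separate cluster job.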

Minimum needed

  • VCF files must be provided; CCR can perform the annotations.
  • Annotations can be done with ANNOVAR or the Ensembl Variant Effect Predictor (VEP); support for other tools is in development.
  • If the user would like to use the IGV-lite or PyBamView features of the GDW, BAM files must also be provided.

Once CCR receives the required files, they are parsed and indexed.  This typically takes 2 business days, unless the VCF files contain unrecognized fields.
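
Before submitting, it may help to see which INFO fields a VCF file declares. The sketch below reads the `##INFO` header lines, which are part of the VCF format, and reports any IDs that are not in a known set. Which fields CCR actually recognizes is not stated here, so the `known` set passed in is purely a placeholder assumption.

```python
import re
from typing import Iterable, Set

# Matches VCF meta-information lines such as: ##INFO=<ID=DP,Number=1,...>
INFO_ID = re.compile(r"^##INFO=<ID=([^,>]+)")


def declared_info_fields(lines: Iterable[str]) -> Set[str]:
    """Collect the INFO IDs declared in the '##' header lines of a VCF file."""
    ids: Set[str] = set()
    for line in lines:
        if not line.startswith("##"):
            break  # the '#CHROM' column line ends the meta-information block
        match = INFO_ID.match(line)
        if match:
            ids.add(match.group(1))
    return ids


def unrecognized_fields(lines: Iterable[str], known: Iterable[str]) -> Set[str]:
    """Return declared INFO IDs that are absent from the (assumed) known set."""
    return declared_info_fields(lines) - set(known)
```

For an uncompressed file, an open text file handle can be passed directly as `lines`, since iterating over it yields one line at a time.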

VCF files are uploaded via SFTP to CCR’s cloud environment, Lake Effect.