Advertisement

Validation of NGS for mitochondrial DNA casework at the FBI Laboratory

Open AccessPublished:October 06, 2019DOI:https://doi.org/10.1016/j.fsigen.2019.102151

      Highlights

      • Validation studies were performed in accordance with SWGDAM guidelines for NGS.
      • Results show the efficacy, accuracy, reproducibility and reliability of the assay.
      • Recovery of full CR data increased by 20-60x compared to current Sanger protocols.
      • Overall, the NGS assay simplifies and improves the efficiency of the mtDNA workflow.
      • The FBI Laboratory plans near-term implementation of NGS for all mtDNA casework.

      Abstract

      As a first step towards integrating next generation sequencing (NGS) technology into the FBI Laboratory’s operational casework, the PowerSeq™ CRM Nested System, an NGS-based mitochondrial DNA (mtDNA) control region assay, was developmentally and internally validated. The validation studies were conducted in accordance with the Scientific Working Group on DNA Analysis Methods (SWGDAM) Validation Guidelines for Forensic DNA Analysis Methods, and the FBI’s Quality Assurance Standards (QAS) for Forensic DNA Testing Laboratories. The assay was shown to be highly reproducible, with variant frequencies across intra and inter-run replicates of the same sample differing, on average, by just 0.3% for substitutions and point heteroplasmies and 1.5% for insertions and deletions. The assay was also shown to be extremely sensitive, yielding complete control region sequence data from as few as 2000 copies of mtDNA. This is a more than 20-fold increase in sensitivity when compared to the FBI Laboratory’s current Sanger sequencing-based protocols and, based on mtDNA quantitation values of samples routinely encountered in mtDNA casework, suggests that the percentage of questioned samples from which full control region data can be recovered will increase from our current 20% to approximately 90% success with NGS technology. In addition, the assay requires on average only 30% of the extract volume typically required to develop control region profiles from degraded samples via Sanger sequencing. Overall, these studies establish the reliability of the PowerSeq™ CRM Nested System for accurate mtDNA control region typing and can serve as a model for laboratories seeking to validate NGS protocols for forensic mtDNA analysis.

      Keywords

      1. Introduction

      Mitochondrial DNA (mtDNA) testing in forensic casework is generally performed with amplicons targeting the non-coding control region (CR) and capillary electrophoresis-based Sanger sequencing. However, over the past five or so years, next generation sequencing (NGS) has been shown to be an equally robust technology for the development of forensic quality mitochondrial DNA sequence data [
      • McElhoe J.A.
      • Holland M.M.
      • Makova K.D.
      • Su M.S.
      • Paul I.M.
      • Baker C.H.
      • et al.
      Development and assessment of an optimized next-generation DNA sequencing approach for the mtgenome using the Illumina MiSeq.
      ,
      • King J.L.
      • LaRue B.L.
      • Novroski N.
      • Stoljarova M.
      • Seo S.B.
      • Zeng X.
      • et al.
      High-quality and high-throughput massively parallel sequencing of the human mitochondrial genome using the Illumina MiSeq.
      ,
      • Parson W.
      • Huber G.
      • Moreno L.
      • Madel M.B.
      • Brandhagen M.D.
      • Nagl S.
      • et al.
      Massively parallel sequencing of complete mitochondrial genomes from hair shaft samples.
      ,
      • Bodner M.
      • Iuvaro A.
      • Strobl C.
      • Nagl S.
      • Huber G.
      • Pelotti S.
      • et al.
      Helena, the hidden beauty: resolving the most common West Eurasian mtDNA control region haplotype by massively parallel sequencing an Italian population sample.
      ,
      • Peck M.A.
      • Brandhagen M.D.
      • Marshall C.
      • Diegoli T.M.
      • Irwin J.A.
      • Sturk-Andreaggi K.
      Concordance and reproducibility of a next generation mtGenome sequencing method for high-quality samples using the Illumina MiSeq.
      ,
      • Zhou Y.
      • Guo F.
      • Yu J.
      • Liu F.
      • Zhao J.
      • Shen H.
      • et al.
      Strategies for complete mitochondrial genome sequencing on Ion Torrent PGM platform in forensic sciences.
      ,
      • Holland M.M.
      • Pack E.D.
      • McElhoe J.A.
      Evaluation of GeneMarker((R)) HTS for improved alignment of mtDNA MPS data, haplotype determination, and heteroplasmy assessment.
      ,
      • Eduardoff M.
      • Xavier C.
      • Strobl C.
      • Casas-Vargas A.
      • Parson W.
      Optimized mtDNA control region primer extension capture analysis for forensically relevant samples and highly compromised mtDNA of different age and origin.
      ,
      • Marshall C.
      • Sturk-Andreaggi K.
      • Daniels-Higginbotham J.
      • Oliver R.S.
      • Barritt-Ross S.
      • McMahon T.P.
      Performance evaluation of a mitogenome capture and Illumina sequencing protocol using non-probative, case-type skeletal samples: Implications for the use of a positive control in a next-generation sequencing procedure.
      ,
      • Peck M.A.
      • Sturk-Andreaggi K.
      • Thomas J.T.
      • Oliver R.S.
      • Barritt-Ross S.
      • Marshall C.
      Developmental validation of a Nextera XT mitogenome Illumina MiSeq sequencing method for high-quality samples.
      ,
      • Ma K.
      • Zhao X.
      • Li H.
      • Cao Y.
      • Li W.
      • Ouyang J.
      • et al.
      Massive parallel sequencing of mitochondrial DNA genomes from mother-child pairs using the ion torrent personal genome machine (PGM).
      ]. NGS not only offers the capability to reduce workflows, but it also allows for the generation of larger, more informative genetic data sets at higher throughput and overall lower cost per nucleotide than capillary electrophoresis-based methods [
      • Mardis E.R.
      A decade’s perspective on DNA sequencing technology.
      ,
      • Shendure J.
      • Balasubramanian S.
      • Church G.M.
      • Gilbert W.
      • Rogers J.
      • Schloss J.A.
      • et al.
      DNA sequencing at 40: past, present and future.
      ]. In the forensic context, mitochondrial DNA testing presents a logical first step in the routine casework implementation of NGS technology because mtDNA analysis is already sequence-based. As a result, technical concepts, laboratory workflows and final data formats are familiar and require little modification. Essentially, only the sequencing step, the last stage of the overall process, materially changes.
      The recovery of entire mitochondrial genome (mtGenome) data and the additional discriminatory power those data offer is arguably the greatest benefit that NGS brings to forensic mtDNA analysis. MtGenome data development via NGS is technically feasible [
      • McElhoe J.A.
      • Holland M.M.
      • Makova K.D.
      • Su M.S.
      • Paul I.M.
      • Baker C.H.
      • et al.
      Development and assessment of an optimized next-generation DNA sequencing approach for the mtgenome using the Illumina MiSeq.
      ,
      • King J.L.
      • LaRue B.L.
      • Novroski N.
      • Stoljarova M.
      • Seo S.B.
      • Zeng X.
      • et al.
      High-quality and high-throughput massively parallel sequencing of the human mitochondrial genome using the Illumina MiSeq.
      ,
      • Parson W.
      • Huber G.
      • Moreno L.
      • Madel M.B.
      • Brandhagen M.D.
      • Nagl S.
      • et al.
      Massively parallel sequencing of complete mitochondrial genomes from hair shaft samples.
      ,
      • Peck M.A.
      • Brandhagen M.D.
      • Marshall C.
      • Diegoli T.M.
      • Irwin J.A.
      • Sturk-Andreaggi K.
      Concordance and reproducibility of a next generation mtGenome sequencing method for high-quality samples using the Illumina MiSeq.
      ,
      • Marshall C.
      • Sturk-Andreaggi K.
      • Daniels-Higginbotham J.
      • Oliver R.S.
      • Barritt-Ross S.
      • McMahon T.P.
      Performance evaluation of a mitogenome capture and Illumina sequencing protocol using non-probative, case-type skeletal samples: Implications for the use of a positive control in a next-generation sequencing procedure.
      ,
      • Peck M.A.
      • Sturk-Andreaggi K.
      • Thomas J.T.
      • Oliver R.S.
      • Barritt-Ross S.
      • Marshall C.
      Developmental validation of a Nextera XT mitogenome Illumina MiSeq sequencing method for high-quality samples.
      ,
      • Chaitanya L.
      • Ralf A.
      • van Oven M.
      • Kupiec T.
      • Chang J.
      • Lagace R.
      • et al.
      Simultaneous whole mitochondrial genome sequencing with short overlapping amplicons suitable for degraded DNA using the ion torrent personal genome machine.
      ,
      • Park S.
      • Cho S.
      • Seo H.J.
      • Lee J.H.
      • Kim M.Y.
      • Lee S.D.
      Entire mitochondrial DNA sequencing on massively parallel sequencing for the korean population.
      ,
      • Strobl C.
      • Eduardoff M.
      • Bus M.M.
      • Allen M.
      • Parson W.
      Evaluation of the precision ID whole MtDNA genome panel for forensic analyses.
      ], and may be more cost-effective than current Sanger-based control region sequencing. As a result, many laboratories, including the FBI Laboratory, are evaluating mtGenome assays for casework implementation. At this time, however, high-quality mtGenome reference population datasets of size similar to those available for the control region, while under development, are not yet available [
      • Huber N.
      • Parson W.
      • Dur A.
      Next generation database search algorithm for forensic mitogenome analyses.
      ]. In addition, many law enforcement databases, such as the Combined DNA Index System used in the United States and 51 other countries, do not currently support complete mtGenome storage and searches. As a result, the FBI Laboratory elected to validate a CR assay as a first step towards broader NGS implementation.
      Fortunately, NGS brings substantial benefit to the analysis of CR data as well, with one of the greatest being a substantial simplification of current laboratory workflows and overall improved efficiency of the mtDNA analysis process. Currently, a number of workflows defined by the size, and therefore number of amplicons, required to recover as much CR data as possible are employed by labs performing Sanger-based mtDNA analysis. While reference, or high-quality samples can often be handled uniformly, questioned or low-quality samples generally cannot. In these cases, as many as eight separate amplification reactions and 16 separate Sanger sequencing reactions may be needed to generate even partial CR data [
      • Gabriel M.N.
      • Huffine E.F.
      • Ryan J.H.
      • Holland M.M.
      • Parsons T.J.
      Improved MtDNA sequence analysis of forensic remains using a "mini-primer set" amplification strategy.
      ,
      • Edson S.M.
      • Ross J.P.
      • Coble M.D.
      • Parsons T.J.
      • Barritt S.M.
      Naming the dead confronting the realities of rapid identification of degraded skeletal remains.
      ]. For the most degraded samples, even the best scenario requires two separate amplifications and 16 sequencing reactions [
      • Eichmann C.
      • Parson W.
      ’Mitominis’: multiplex PCR analysis of reduced size amplicons for compound sequence analysis of the entire mtDNA control region in highly degraded samples.
      ]. Without a uniform process for low-quality samples and because of the complexity of automating these types of dynamic, sample-dependent and scientist-driven (i.e. subjective) processes, mtDNA testing for these specimen types is generally performed manually, can be labor intensive, and may consume most or all of the DNA extract.
      The PowerSeq™ CRM Nested System (Promega Corporation, Madison, WI, USA) employs a PCR strategy that amplifies the entire mtDNA CR in a single multiplex of 10 overlapping amplicons. The amplicons, based on the Eichmann and Parson design [
      • Eichmann C.
      • Parson W.
      ’Mitominis’: multiplex PCR analysis of reduced size amplicons for compound sequence analysis of the entire mtDNA control region in highly degraded samples.
      ], range in size from 147 to 237 base pairs (bp). For NGS-based typing, the PowerSeq CRM Nested System utilizes a nested amplification protocol consisting of a single PCR to both amplify targets and incorporate indexed adapters for Illumina MiSeq™ (Illumina, Inc., San Diego, CA, USA) sequencing. This design reduces the amount of DNA extract used to target the full CR, reduces the number of laboratory steps and time required to generate NGS libraries ready for sequencing, and is amenable to automation. As these features address the needs for adoption and routine use of an NGS-based mtDNA typing method in a forensic laboratory, we evaluated the assay for its operational utility.
      The developmental and internal validation studies described herein combined the PowerSeq CRM Nested System with the Illumina MiSeq instrument for sequencing, and the GeneMarker® HTS software package (SoftGenetics, State College, PA) for mtDNA data analysis. The studies were performed according to the Scientific Working Group on DNA Analysis Methods (SWGDAM) Validation Guidelines for Forensic DNA Analysis Methods, and in accordance with the FBI’s Quality Assurance Standards (QAS) for Forensic DNA Testing Laboratories [
      • Scientific Working Group on DNA Analysis Methods (SWGDAM)
      Validation Guidelines for DNA Analysis Methods.
      ,
      • Federal Bureau of Investigation
      Quality Assurance Standards for Forensic DNA Testing Laboratories.
      ]. CR typing by use of the PowerSeq system was assessed in comparison to existing, commonly employed mtDNA typing protocols to gauge results and success rates. Overall, these studies serve to establish the reliability of the PowerSeq CRM Nested System for accurate mtDNA CR typing. Additionally, these studies provide a model for laboratories seeking to validate NGS protocols for mtDNA typing according to current SWGDAM guidelines, and aid in identifying procedures that can streamline data analysis in the production of accurate mtDNA haplotypes via NGS.

      2. Materials and methods

      2.1 DNA samples

      Accuracy and precision studies used 31 samples (28 buccal swabs and three DNA controls) with known mtDNA CR profiles previously developed by Sanger sequencing, and which represented 26 distinct CR haplotypes (Table S1). All samples were collected and typed with informed consent. For mixture testing, two of these samples (54 and 57B) were combined in the ratios of 1:1, 1:2, 1:5, 1:10, and 1:20, such that each sample served as both the major and minor contributor. Sensitivity was evaluated using three sample types, each with a known CR sequence previously developed by Sanger sequencing: a positive control DNA (2800 M), a reference DNA extract (buccal swab), and a casework-type DNA extract (hair). A total of 21 non-probative case-type samples were tested, and included 11 hair and 10 calcified tissue DNA extracts. For these samples, the available Sanger-based sequence data ranged from hypervariable regions 1 and 2 (HV1 and HV2) only to the full CR. For seven of the samples, Sanger and NGS typing were attempted from the same DNA extract (Table S2). Contamination studies utilized positive control DNA (2800 M) and water in place of extract for PCR negative controls.

      2.2 DNA extraction and quantitation

      Hair samples were extracted according to Ref [
      • Brandhagen M.D.
      • Loreille O.
      • Irwin J.A.
      Fragmented nuclear DNA is the predominant genetic material in human hair shafts.
      ], while bone samples were extracted using a variation of the protocol described in Ref [
      • Loreille O.M.
      • Diegoli T.M.
      • Irwin J.A.
      • Coble M.D.
      • Parsons T.J.
      High efficiency DNA extraction from bone by total demineralization.
      ]. Extracts were quantified using either the Quantifiler Duo assay (Thermo Fisher Scientific, Waltham, MA), an internally validated mtDNA-specific qPCR assay [
      • Kavlick M.F.
      Development of a triplex mtDNA qPCR assay to assess quantification, degradation, inhibition, and amplification target copy numbers.
      ], or both.

      2.3 PowerSeq amplification and library preparation

      Amplification and library preparation generally followed the protocol described in Promega Application Note #AN322 for the PowerSeq CRM Nested System, using the Agencourt AMPure XP bead purification option and the PowerSeq™ Quant MS System for library quantification [

      Promega Corporation, Massively Parallel Sequencing of Mitochondrial Control Region using the PowerSeq™ CRM Nested System, Custom, https://www.promega.com/-/media/files/resources/protocols/custom-systems/an322-powerseq-crm-system-custom.pdf?la=en.

      ,]. However, based on results from early experimentation and prior testing using a prototype version of the PowerSeq assay, an additional (fourth) round of purification was added to the amplification purification portion of the protocol.
      In addition, most experiments did not target the manufacturer’s recommended input of 0.5 ng genomic DNA. Because the samples processed in our laboratory for mtDNA are often not suitable for nuclear DNA testing, nuclear DNA quantification is generally not performed. While accuracy and precision testing of the 31 known specimens used the recommended 0.5 ng genomic DNA input (as determined using Quantifiler Duo), the mtDNA-specific quantitation values of these sample extracts revealed widely varying mtDNA quantities within 0.5 ng of genomic DNA (Table S1). As a result, sensitivity testing in these studies used dilution series based upon mtDNA quantities (rather than genomic DNA quantities), with PCR inputs ranging from 50 to 500,000 mtDNA copies. Similarly, the PCR input for non-probative specimens was assessed on the basis of mtDNA quantities, and ranged from ∼2,000–100,000 mtGenomes for hair extracts, and from ∼5,000–800,000 mtGenomes for calcified tissue extracts (Table S2). This corresponded to 10 μl of sample DNA extract to keep input volumes consistent with those used for the laboratory’s Sanger sequencing protocols. DNA inputs for mixture testing ranged from 5,250 to 10,000 total mtGenomes. Across all studies, extract volumes were maximized at 12.5 μl for extraction blanks and PCR negative controls, and all positive controls used 0.5 ng genomic DNA.
      Contamination studies were performed using 1) a single positive control amplified alongside 47 negative controls to assess environmental and reagent contamination, and 2) a checkerboard setup, with 24 negative and 24 positive control replicates positioned in alternating wells, to assess well-to-well contamination in a 96-well plate.
      Finally, sensitivity testing was performed using three different cycle numbers for the PCR1 step: the manufacturer’s recommended 30 cycles, as well as 36 cycles and 40 cycles.

      2.4 Illumina sequencing

      Sequencing on Illumina MiSeq FGx instruments was performed as described in Promega Application Note #AN322 for the PowerSeq CRM Nested System, using a standard flow cell and 600-cycle v3 kit with 2 × 300 bp reads, plus 2 × 8 cycles to read sample-specific indices. For the first seven of the 13 total sequencing runs, 32 normalized PowerSeq libraries were pooled for sequencing. Subsequently, the non-probative samples were pooled by type for sequencing, resulting in sequencing runs of 16 libraries each, and the mixture study samples were sequenced in a single pool of 24 libraries. To assess any differences in data quality and the accuracy of haplotype determination that could arise when fewer libraries are pooled for sequencing, one sequencing run was performed with only eight libraries. Finally, two contamination studies pooled 48 libraries each. A summary of the 13 sequencing runs is provided in Table S3.
      To further evaluate potential sources of contamination, the Run 1 (Table S3) MiSeq sample sheet included 64 unused index combinations that served as bioinformatic negative controls during data analysis. Additionally, Run 2 was performed immediately following Run 1, but used 32 different index combinations. The 32 index combinations used for Run 1 were included in the Run 2 MiSeq sample sheet to assess run-to-run carryover.

      2.5 Data analysis

      PowerSeq data were analyzed with GeneMarker HTS version 1.2.2. Sequences were aligned to the revised Cambridge reference sequence (rCRS) with the following alignment options selected: 1) consensus alignment (in which local insertion and deletion realignment is based on the consensus sequence), 2) motifs using the default GeneMarker HTS motif file, 3) an identity score of 85 (reads that are less similar than 85% to the reference sequence are not aligned), and 4) soft clipping at the 3′ end using Q ≤ 25 (base calls with quality scores ≤25 are trimmed from the 3′ end of the reads).
      Additional GeneMarker HTS analysis options employed included filter settings (both region filters and variant filters) and amplicon settings. Settings were based on vendor and published paper recommendations [
      • Peck M.A.
      • Brandhagen M.D.
      • Marshall C.
      • Diegoli T.M.
      • Irwin J.A.
      • Sturk-Andreaggi K.
      Concordance and reproducibility of a next generation mtGenome sequencing method for high-quality samples using the Illumina MiSeq.
      ,
      • Holland M.M.
      • Pack E.D.
      • McElhoe J.A.
      Evaluation of GeneMarker((R)) HTS for improved alignment of mtDNA MPS data, haplotype determination, and heteroplasmy assessment.
      ,
      • Holland M.M.
      • Makova K.D.
      • McElhoe J.A.
      Deep-coverage MPS analysis of heteroplasmic variants within the mtGenome allows for frequent differentiation of maternal relatives.
      ,
      • Gallimore J.M.
      • McElhoe J.A.
      • Holland M.M.
      Assessing heteroplasmic variant drift in the mtDNA control region of human hairs using an MPS approach.
      ] as well as preliminary evaluations of the PowerSeq assay. The region filters, which define the range of mtDNA sequence to be analyzed, were set to include nucleotide positions 16013-592 of the mtDNA CR. The variant filter settings were set to require: a minimum total read depth of 100 reads, a minimum variant allele read depth of 40 reads, and a minimum of 5% or 10% for the minor variant frequency percentage. While a 5% minor variant frequency threshold was initially applied for interpretation of known samples typed for the accuracy and precision experiments, this threshold was subsequently raised to 10%, primarily to minimize the chance of falsely including sequence background noise and/or contamination as heteroplasmic sequence.
      Analysis of negative controls followed the same logic, and utilized the same minimum read depth thresholds (100 reads and 40 reads) to ensure that any contamination that could affect a sample haplotype would be detected. That is, if sequence read depth in a negative control exceeded the 100 read minimum calling threshold used for samples, the negative control signal was deemed high enough to affect the major variants called for an associated sample; and if sequence read depth in a negative control exceeded the 40 read minimum variant read depth threshold used for samples, the negative control signal was considered high enough to potentially impact any minor variants called for an associated sample. Due to the increased dynamic range of NGS signal and the proportionality of signal to noise, the same 10% threshold used for minor variant detection in samples was also applied to the negative controls. This approach ensured that high background signal in sequencing runs that also had high sample signal would not mistakenly impact sample variant calling.
      Additional parameter settings in the GeneMarker HTS software included ≤ 10 for the allele score difference (a measure that uses base quality scores to determine if a minor allele is a true allele), ≤2.5 for the SNP balance ratio, and ≤5.0 for the indel balance ratio (a measure that uses the percentages of forward and reverse reads that include the variant). The amplicon settings, which define the data ranges for each of the assay amplicons, were initially set at 16013–16126, 16116–16225, 16224–16408, 16387–16486, 16474-30, 16555-152, 136–257, 246–364, 342–436, and 429–592, based on consultation with the vendor. These ranges sit just inside the primer regions for each of the 10 amplicons of the PowerSeq mtDNA CR Nested System, as analyzing data within the PCR primer binding regions can produce false point heteroplasmies [
      • Huszar T.I.
      • Wetton J.H.
      • Jobling M.A.
      Mitigating the effects of reference sequence bias in single-multiplex massively parallel sequencing of the mitochondrial DNA control region.
      ]. Upon examination of the mixture data, the amplicon 3 settings were updated to 16223–16408 (for the reasons described in Section 3.3, below). The GeneMarker HTS analysis settings are summarized in Table S4.
      Amplicon balance was assessed using the GeneMarker HTS Amplicon output file for each of the 31 known samples from Runs 1–3. The Amplicon output file supplies a single read depth value for each amplicon that reflects the number of sequence reads that bin to the amplicon based on alignment positions (SoftGenetics, personal communication). The Amplicon out file also includes the count of reads per sample that were not binned to an amplicon, and as these counts were generally less than 0.5% of the total aligned read count per sample (max 0.53% for the samples in Runs 1–3), the read count values for each amplicon were used directly to examine relative amplicon read depth. For each sample, amplicon read depth was assessed as the percentage of the read count across all amplicons (such that the percentages for amplicons 1–10 would sum to 100%), and the percentages from Runs 1–3 were averaged to produce a single value per amplicon per sample.
      True positive rates for minor variant detection at various minimum read depths and variant frequency thresholds were assessed using data from samples with known minor variants. These included three known, two hair, and three calcified tissue samples with point heteroplasmy previously detected by Sanger sequencing, and all mixture samples (n = 420 mixed positions). The minimum variant read depth thresholds assessed were 10, 25, 40, 55, 70, 85 and 100 reads, while the minimum variant frequency thresholds assessed included 2.5%, 5% and 10%. The analysis matrix was constructed to require a minimum of 100 reads (to equal the minimum total read depth for analysis) and a maximum of 1000 reads for point heteroplasmy detection. Thus, analyses pairing a 10 read threshold with either 2.5% or 5% variant frequency, and a 25 read threshold with a 2.5% frequency were not performed.
      False positive rates at the same seven read depth thresholds were assessed using all 128,688 possible nucleotide positions from the 112 total negative controls. For these data, detection of sequence above the read depth threshold was counted as a false positive result.

      3. Results

      3.1 Accuracy and precision

      The accuracy and precision of the PowerSeq system was assessed with four experiments in which the same known samples (Table S1) were repeatedly typed by two different scientists on the same Illumina MiSeq instrument and by the same scientist on different MiSeq instruments. Following manual review of the data, the PowerSeq haplotypes for all 31 known specimens were consistent with their Sanger profiles, and consistent with one another across the four experiments. Overall, of the more than 140,000 total bp of CR data analyzed, all of the 1190 variants expected based on the known Sanger CR haplotypes were accurately detected in the PowerSeq haplotypes. The only discrepancies between the Sanger and PowerSeq haplotypes pertained to 1) a software alignment error for a single sample that, prior to manual review of the data, resulted in an incorrect 310.2C variant in all four experiments, and 2) heteroplasmies detected via PowerSeq typing that were not apparent in the Sanger data.
      The 310.2C alignment error was easily detected in the first run of the sample in question, as well as all subsequent runs of that particular sample (sample 16; see Fig. S1 and Table S5). The error was identified following initial application of the GeneMarker HTS analysis parameters, which resulted in an alignment that yielded an incorrect variant in the cytosine stretch of HV2. The 310.2C insertion was not only identified as unusual in the variant table, but the low frequency of the variant (7.55–24.75% depending on the experiment) also stood out as atypical. In each case, the low frequency of the 310.2C insertion corresponded to a reduction in the expected variant frequency for the 315.1C insertion - which in the other 30 known samples always exceeded 97% (Table S5). Review of the alignment for sample 16 within the GeneMarker HTS software indicated that several sequence reads had the common 315.1 C insertion placed at the 5′ end of the C-stretch, instead of 3′ as is common practice and recommended by the SWGDAM Interpretation Guidelines for Mitochondrial DNA Analysis by Forensic DNA Testing Laboratories [
      • Scientific Working Group on DNA Analysis Methods (SWGDAM)
      Interpretation Guidelines for Mitochondrial DNA Analysis by Forensic DNA Testing Laboratories.
      ]. Further review of the data indicated that the misalignment of the particular reads was likely due to sequencing errors in those reads. Those errors may, in turn, have been due to the specific haplotype for sample 16, which includes the variants 295 T, 310.1 T and 319C within the misaligned region. Manual correction of the misaligned reads by the reviewing scientist (movement of the C insertion to the 3′ end of the C-stretch) produced a PowerSeq haplotype consistent with the Sanger haplotype.
      The only other discrepancies between the PowerSeq and Sanger haplotypes related to the detection of heteroplasmies (Table S6). Six heteroplasmies that went undetected in the Sanger data were detected in the PowerSeq data. They corresponded to three length heteroplasmies and three point heteroplasmies, and they were consistently detected in all NGS replicates of the affected samples. Given that in all six cases the heteroplasmies exhibited variant frequencies of less than 20%, the differences can primarily be ascribed to the increased sensitivity of the NGS assay as compared to Sanger sequencing. However, the greater resolution of NGS data is also likely a factor in the increased detection. Additionally, in the two samples with a 16189 T to C transition that produced a homopolymer stretch longer than 7 cytosines (samples 39 and 115), length variants at position 16193 were included in the PowerSeq haplotypes that were not recorded in the Sanger haplotypes. However, this difference was due solely to current FBI Laboratory guidelines for the trimming and reporting of HV1 C-stretch length variants in Sanger data when a 16189C is present.
      To get a finer look at the performance of the assay across scientists and instruments, a pairwise comparison of the PowerSeq data from the accuracy and precision experiments was performed to determine the maximum difference in variant frequency across the replicates of each sample. That is, the frequency for a given variant in a given sample was directly compared across all four replicates of that sample to identify the greatest pairwise difference in frequency for each variant. Though overall average variant frequency differences across the four runs were also calculated, the maximum difference was selectively plotted (Fig. 1) to better characterize the maximum variation that may be seen across instruments and platforms. Regardless of whether the PowerSeq typing was repeated by the same individual on the same MiSeq instrument, performed by a different individual on the same MiSeq instrument, or repeated on a different MiSeq instrument, the variant frequency percentages were highly consistent. For substitutions and point heteroplasmies, the overall average difference in variant frequency across the four runs was only 0.3%. For insertions/deletions (indels) and length heteroplasmies, the overall average difference was slightly higher, at 1.5%. This was not a surprising result, given the sequencing and alignment issues associated with these types of variants [
      • Parson W.
      • Strobl C.
      • Huber G.
      • Zimmermann B.
      • Gomes S.M.
      • Souto L.
      • et al.
      Evaluation of next generation mtGenome sequencing using the Ion Torrent Personal Genome Machine (PGM).
      ,
      • Just R.S.
      • Irwin J.A.
      • Parson W.
      Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing.
      ,
      • Sturk-Andreaggi K.
      • Peck M.A.
      • Boysen C.
      • Dekker P.
      • McMahon T.P.
      • Marshall C.K.
      AQME: A forensic mitochondrial DNA analysis tool for next-generation sequencing data.
      ]. When only the maximum differences between these various pairwise comparisons were considered, substitutions and point heteroplasmies (N = 1068) had an average maximum difference of just 0.67%, with the greatest difference only 4.53%. Again, variant frequencies were generally less consistent for indels and length heteroplasmies. Nevertheless, the maximum difference for indels and length heteroplasmies (N = 153) between any two replicates of the same sample was only 7.9%, and averaged 2.97%.
      Fig. 1
      Fig. 1Precision: Maximum variant frequency differences.
      The box and whisker plots display the maximum difference in the variant frequencies determined for each expected sample variant considering four replicates per sample. The data are derived from four runs of each of 31 known samples typed as part of the precision and accuracy experiments. Differences in variant frequencies were generally lower for substitutions and point heteroplasmies as compared to indels and length heteroplasmies, and thus the variant types are considered separately.
      Data from three of these runs (Runs 1–3, which were independent amplifications) were also used to examine more general features of assay performance, including amplicon balance and sample throughput potential. The average percentage of sample reads per amplicon is plotted in Fig. S2. When displayed by sample, the average values show several dips and one notable spike. The haplotypes for each of the known samples were reviewed to determine whether these differences in amplicon read depth could be due to sequence variants within primer binding regions. Considering the 25 bp up and downstream from the end of the amplicon ranges defined in the GeneMapper HTS analysis settings (Table S4), a total of 14 potential primer binding site variants were identified. Of these, nine were associated with dips in amplicon read depth, and one (239C) was associated with the spike for Amplicon 8 in sample 31. When these data points were removed as outliers, eight of the ten amplicons were relatively balanced with one another, with read depth averages ranging from 7.1 to 11.7% of the reads per sample. Amplicon 9 had notably higher coverage, averaging 19.1% of the reads per sample, while Amplicon 1 exhibited the lowest coverage at 5.4% of sample reads.
      The absolute number of reads aligned per sample varied by both run and by sample in these studies, which included 31 knowns and 1 negative control per run. On average 286,860 reads were aligned per sample, but the standard deviation was 181,941 reads, and the minimum read count was 30,807 (with a 9948 positive control). Yet, these read counts were sufficient to obtain complete and correct CR sequences for all samples. Excepting the amplicons likely impacted by primer binding site variants (detailed in Fig. S2), the lowest read depth produced for any sample was 1667 reads in Amplicon 1. These data indicate that pooling three times as many known samples – for instance, from a full 96-well plate – would produce a minimum depth greater than 500 reads even for the amplicon with the lowest efficiency (Amplicon 1) in a low coverage sample. Prior work has shown that the same sequencing instrumentation and kit can produce complete mtGenome sequences for at least 90 samples [
      • Peck M.A.
      • Brandhagen M.D.
      • Marshall C.
      • Diegoli T.M.
      • Irwin J.A.
      • Sturk-Andreaggi K.
      Concordance and reproducibility of a next generation mtGenome sequencing method for high-quality samples using the Illumina MiSeq.
      ].

      3.2 Sensitivity and case-type samples

      Sensitivity of the PowerSeq system was evaluated using three sample types (a positive control DNA, a buccal swab extract and a hair extract), with two amplifications performed per DNA quantity using each of the three PCR cycle numbers. For each sample, a successful PowerSeq typing result was defined as one that both a) had a PowerSeq haplotype concordant with the known mtDNA CR sequence, and b) exhibited read depths for all 1149 bp of the CR equal to or above the 100 read threshold defined in the GeneMarker HTS analysis settings (Table S4).
      Both replicates of all three sample types were successfully typed with the PowerSeq system at all PCR cycle numbers when the total mtDNA copy number input for amplification was ∼50,000 or greater (Fig. 2). Aside from one complete failure (likely due to pipetting error during amplification setup), the positive control and buccal swab extracts were also successfully typed at all PCR cycle numbers when 5000 mtDNA copies were used for amplification. While both hair replicates were successfully typed from 5000 mtDNA copies when 30 PCR cycles were used, 36 and 40 PCR cycles resulted in incomplete CR coverage (ranging from 49 to 94%) and exhibited some stochastic variation.
      Fig. 2
      Fig. 2Sensitivity: Fraction of CR sequenced for dilution series of three sample types across varying PCR cycle numbers.
      Values reflect the percentage of the CR with sequence read depth greater than the 100 read minimum coverage threshold. One positive control (2800M) replicate that failed completely using 5000 mtDNA copies and 30 PCR cycles was most likely due to a pipetting error during amplification setup.
      PCR replicates at the lower DNA inputs tested (500 and 50 mtDNA copies) also exhibited stochastic variation. With a DNA input of 500 mtDNA copies, nearly all samples resulted in at least some portion of the CR with read depth ≥ 100 reads, but only one (a positive control replicate) met both criteria for successful PowerSeq typing. No samples were successfully typed when 50 mtDNA copies were used as the total DNA input for amplification. However, a few samples resulted in segments of the CR with read depths ≥ 100. Three of the positive control samples had from 8 to 14% of the CR with sufficient sequence read depth when 36 or 40 PCR cycles were used, and one attempt at typing the buccal swab extract with 40 PCR cycles resulted in 108 bp of CR data (3% of the CR) with read depth exceeding 100.
      Regardless of the sample type, no clear patterns in the fraction of the CR with read depth ≥100 reads were apparent across the three different PCR cycle numbers (Fig. 2). This result may variously be due to the manner in which sequence read depth was assessed, the mtDNA copy numbers used as PCR input, and the sample types tested. Nevertheless, the data clearly indicated at least some variability in the total fraction of the CR with sufficient sequence read depth at lower DNA inputs, and fully successful PowerSeq typing at these low DNA inputs was rare (1 in 36 instances at 500 mtDNA copies or fewer). In all instances in which any CR region had read depths equal to or greater than the 100 read threshold, however, the PowerSeq typing results matched the known Sanger haplotype for the sample.
      To investigate the performance of the PowerSeq assay on the sample types regularly encountered in the FBI Laboratory’s mtDNA casework, 11 non-probative hair and 10 non-probative calcified tissue DNA extracts were also sequenced using the PowerSeq system and compared to Sanger-based mtDNA data. The Sanger data consisted of either HV1 and HV2, or the full CR (Table S2). For seven of the samples (four hair and three calcified tissue), a direct side-by-side comparison of Sanger and PowerSeq typing using the same extracts and mtDNA inputs was performed. For all testing, the mtDNA copy numbers used for amplification ranged from approximately 2,200 to 820,000, and 30 cycles were used for PCR. These non-probative sample studies also served to test the stability of the assay with respect to the sample types to which it will be applied, and therefore the various chemical and environmental insults that could potentially impact sample analysis.
      Sequence read depth ≥100 reads across the full CR was achieved for 20 of the 21 samples. These 20 included all four hair DNA extracts and two of the three calcified tissue DNA extracts used for the Sanger/PowerSeq side-by-side comparisons. In contrast, complete CR data by Sanger sequencing was achieved for only two of the four hair DNA extracts from the side-by-side comparison, and for none of the three calcified tissue DNA extracts (Table S2).
      With the exception of some heteroplasmic/mixed positions, the haplotypes obtained via PowerSeq typing were consistent with the available Sanger data for each sample (Table S6). For four samples, the PowerSeq haplotype included a 309.2C variant with a frequency around or below 20% that was not called from the Sanger data. Additionally, Tooth MT2 had length variants at position 16193 in the PowerSeq haplotype that were not recorded in the Sanger haplotype. However, as was described for two known samples typed for the accuracy and precision studies (Section 3.1), this difference was due only to current FBI Laboratory guidelines regarding the trimming and reporting of HV1 C-stretch length variants in Sanger data when a 16189C is present. Bone TB1 × 3 was previously deemed a mixture from Sanger sequence data developed from a different extract, and again appeared mixed by PowerSeq typing (minor variants were detected at 8 CR positions), but some positions that appeared mixed in the Sanger data were not reproduced above variant calling thresholds in the PowerSeq typing results. This result is consistent with expectations for a different extract and amplification from a sample with DNA damage.
      The only noteworthy differences between the Sanger and PowerSeq results occurred with Tooth MT1 and Bone 1A-Cut 2. The Sanger data for Tooth MT1 exhibited a low-level point heteroplasmy at position 16126. The PowerSeq data developed from a different extract of Tooth MT1 exhibited the minor 16126 variant in only 4.77% of reads, and thus the heteroplasmy fell below the 10% variant frequency threshold. For Bone 1A-Cut 2, two additional mixed positions within the HV1 C-stretch region were detected in the PowerSeq data as compared to the Sanger haplotype. One of these (16189Y) could be seen in the Sanger data, however it was observed in only a single sequence direction and thus was not called. In addition, the presence of length heteroplasmy around position 16193 was clear (though not called) in the Sanger data, but was observed in the PowerSeq data only below the variant calling thresholds. Of note, the HV1 C-stretch region for Bone 1A-Cut 2 had very low read depth: Amplicon 2, which includes the region, had an average depth of less than 150 reads, which was more than 10-fold lower than the read depth of the next lowest amplicon for this sample (Amplicon 8 with an average read depth of 1975 reads).
      A second extract for Bone 1A was the single case-type sample that did not have read depth ≥ 100 reads across the full CR: PowerSeq typing of the Bone 1A-Cut 1 extract generated sufficient read depth for only 793 bp (69%) of the CR. The portions of the CR with <100 reads occurred in the regions targeted by the two largest PowerSeq amplicons (237 and 217 bp), as well as in the HV1 C-stretch region. Read depths for the larger target fragments was low overall for Bone 1A-Cut 1, a result that could be due to a variety of factors, including that the Bone 1A-Cut 1 extract used for PowerSeq typing may be composed mostly or only of fragments smaller than these amplicon sizes. Poor yields with Bone 1A-Cut 1, due to stochastic sampling and/or PCR product loss during clean-up, was indicated by a low PowerSeq Quant MS concentration (11.6 nM) for the cleaned amplification product, as compared to the other cutting from the same bone (Bone 1A-Cut 2). With approximately the same mtGenome copy number used for PCR input, Bone 1A-Cut 2 was quantified at 76.15 nM following amplification and resulted in read depths ≥100 reads across the full CR.
      When considered together, the PowerSeq typing results obtained for the non-probative samples were consistent with those obtained for the sensitivity samples amplified using the same number of PCR cycles (Fig. 3). With the exception of the Bone 1A-Cut 1 extract, all samples amplified using 30 PCR cycles from ∼2000 or more mtDNA copies produced complete CR haplotypes. We note that the studies described here did not include any samples with mtDNA copy inputs between 500 and ∼2,000, and no calcified tissue samples were amplified from less than ∼5000 mtDNA copies. However, the sensitivity and case-type sample results considered in combination indicated that the sensitivity of the PowerSeq assay using the manufacturer’s protocol is likely in the range of ∼1,000–5,000 mtDNA copies when DNA degradation does not preclude successful amplification.
      Fig. 3
      Fig. 3Sensitivity: Fraction of CR sequenced for dilution series and case-type samples amplified using 30 PCR cycles.
      The plot includes both sensitivity data and non-probative case-type specimens amplified using 30 PCR cycles. The data points with specific values indicate a) the minimum mtDNA copy number for which a full CR sequence was obtained (2,169 mtDNA copies, from hair sample 186-1), and b) the only sample with a higher mtDNA copy number input for which a full CR sequence was not obtained (bone sample 1A-Cut1, with 19,885 copies).

      3.3 Species specificity

      The specificity of the assay to human DNA was tested using DNA extracts from seven different species, including dog, cat, pig, cow, chicken, E. coli, and the closest genetic relative of the human - the chimpanzee. All extracts were amplified in duplicate using the maximum possible input volume. Some aligned DNA sequences were observed with the dog, cat, pig, chicken, and cow samples, but the extent was consistent with those seen in NCs from the contamination studies (i.e. zero to four reads) and thus can be considered low-level contamination. For these samples there were no instances of read depth above the 100 read threshold. Of two E. coli replicates, one had 6 bp of analyzable data (each base was exactly at the 100 read threshold), while the second had 91 bp of data above the 100 read threshold (max read depth was 108 reads). There were no differences from the rCRS in the analyzable sequence data. A NCBI GenBank BLASTn search against E. coli sequence did not produce any matches. Thus the result was most likely due to human DNA contamination as opposed to E. coli DNA being amplified by the PowerSeq primers.
      The chimpanzee DNA extract produced ∼506 bp of CR sequence with read depth above the 100 read threshold. These data were in four amplicons and were not due to contamination. A NCBI GenBank BLASTn search of the sequence resulted in a 99% match to chimpanzee mtDNA. There were 37 differences from the rCRS in the 506 bp. Most of the variant positions were ones that a) are not usually found in human profiles, and b) are not consistent with any currently recorded human mtDNA haplogroups [
      • van Oven M.
      • Kayser M.
      Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation.
      ], PhyloTree rCRS oriented version of build 17), making the sequence readily identifiable as non-human. Overall, the PowerSeq™ System exhibited the expected degree of specificity to human DNA.

      3.4 Mixtures

      To assess the sensitivity of the system to mtDNA mixtures and characterize the features of the mixed sequences, two known samples (54 and 57B; mtDNA haplogroups K1b2a1a1 and L1c5, respectively) were mixed in the ratios of 1:1, 1:2, 1:5, 1:10, and 1:20 and amplified in duplicate. The samples were selected to maximize the number of mixed positions to assess, and each sample served as both the major and minor contributor. Given the haplotypes of samples 54 and 57B, 21 mixed positions were possible when length differences in the 523–524 AC repeat region were ignored.
      Initial review of the mixture data indicated that the variant percentages of the two contributors generally mirrored expectations at all but one of the 21 positions. At position 16223, the minor contributor variant percentage was notably lower than expected in all mixture replicates in which sample 57B was the major contributor, and in both of the 1:2 mixture replicates in which sample 54 was the major contributor (Table S7). The position 16223 variant percentages were also highly imbalanced in the 1:1 mixtures. Across the four 1:1 replicates, the variant percentage for the sample 57B variant (16223 T) averaged 71.66% (s.d. 0.65%) while the sample 54 variant (16223C) averaged 28.00% (s.d. 0.58%). Given the very limited overlap between the Amplicon 2 and 3 data ranges (2 bp, excluding the primer regions) and the fact that the haplotypes differed at two positions within and around the overlap region (16223 and 16224), it was assumed that the most likely cause of the discrepancy was bioinformatics. Accordingly, adjustments were made to the Amplicon 3 data analysis range. The Amplicon 3 range was adjusted by a single base pair to increase the overlap to 3 bp so that both amplicon ranges capture both variant positions (Table S4). When the mixture data were reanalyzed in GeneMarker HTS with the adjusted Amplicon 3 data range, the variant percentages at position 16223 were consistent with the other 20 mixed positions (Tables 1 and S8).
      Table 1Average variant frequencies across 21 mixed positions.
      Average Variant Frequency (s.d.)
      1:149.79% (5.69%)
      MajorMinor
      1:267.40% (4.26%)32.14% (4.24%)
      1:583.77% (3.23%)15.80% (3.17%)
      1:1091.49% (2.29%)8.04% (2.15%)
      1:2094.83% (1.75%)4.76% (1.64%)
      With a 10% variant calling threshold and a 40-read variant allele read depth threshold, mixed sequence data representing the profiles of the two component knowns was observed in the majority of the mixed extracts (Table S8). In addition, when using a criterion of 3 or more positions of mixed data across the CR to distinguish a mixture of two individuals from multiple points of heteroplasmy within a single individual [
      • Scientific Working Group on DNA Analysis Methods (SWGDAM)
      Interpretation Guidelines for Mitochondrial DNA Analysis by Forensic DNA Testing Laboratories.
      ,
      • Parson W.
      • Gusmao L.
      • Hares D.R.
      • Irwin J.A.
      • Mayr W.R.
      • Morling N.
      • et al.
      DNA Commission of the International Society for Forensic Genetics: revised and extended guidelines for mitochondrial DNA typing.
      ], “mixtures” were identified in each of the 1:1, 1:2 and 1:5 mixture replicates, and in half of the 1:10 mixture replicates. Depending on the mixture and the proportion of the minor component, mixed positions detected above the thresholds were, as expected, observed at variable numbers of positions. For the 1:1, 1:2 and 1:5 mixtures, all 21 expected positions were detected at or above 10% when either sample 54 or sample 57B was the major contributor. For the 1:10 mixture replicates in which sample 54 was the major contributor, seven and eight out of the possible 21 mixed positions were detected, respectively; whereas for the 1:20 mixture replicates, one and zero mixed positions were detected. For the 1:10 and the 1:20 mixtures in which known sample 57B was the major contributor, only one mixed position in one 1:10 mixture was detected above the variant calling thresholds.
      For the 1:1 mixtures, the variant percentages of the two contributors were within 20% of each other in all but one instance (the maximum difference of 20.52% was observed at position 16114), and averaged 10.03% (range of 0.15–20.52%, s.d. 5.31%). Similar patterns were observed with the 1:2, 1:5 and 1:10 mixtures. For example, at SNP positions in the 1:2 mixtures, the minor component exhibited variant percentages hovering around the expected 33%, with values between 23.37 and 41.47% (s.d. 4.24%). Though the detection of variants at frequencies below 10% was not formally “validated” due to the levels of noise observed, the frequencies of the minor component variants in the 1:10 and 1:20 mixtures were reviewed to assess how well the data reflected expectation. With an expected minor component frequency of approximately 4.8% in the 1:20 mixtures, variants were observed at percentages between 1.98 and 10.09% (s.d. 1.64%) and averaged 4.76%. Overall, and across all mixtures tested, observed variant frequencies were within 10.42% of expected variant frequencies and, on average, differed by 3.01%. Of course, the observed values depend directly on the accuracy of the sample quantitation and, in turn, the constructed mixture ratios. In all mixture replicates in this study, when sample 54 was the minor contributor, the variant frequencies tended to be slightly lower than when sample 57B was the minor contributor. For instance, in the 1:1 mixture replicates, the sample 54 variant frequencies averaged 44.95% (s.d. 3.01%) whereas the sample 57B variant frequencies averaged 54.64% (s.d. 2.63%), indicating a mixture ratio of 1:1.2 rather than 1:1. This apparent disparity in starting DNA quantities between the two samples indicates that some of the results here may be a conservative estimate of the consistency that can be expected between actual and observed variant frequencies at heteroplasmic SNP positions or in mixed DNA samples.

      3.5 Contamination

      Detection of exogenous DNA originating from reagents, consumables, the operator and/or the laboratory environment was assessed based on all amplification negative controls sequenced throughout the validation, including an experiment comprised solely of negative controls. The experiment (Run 11; Table S3) consisted of one positive control sample and 47 total negative controls amplified and sequenced in a single library pool. In this experiment, none of the 47 negative controls exhibited ≥ 100 reads at any CR position. The 2800 M positive control sequenced in the same experiment produced a full mtDNA CR sequence that matched the expected haplotype.
      To assess the potential for well-to-well contamination when PowerSeq sample processing is performed in 96-well plates, a second study utilized a ‘checkerboard’ plate setup for the amplification and library preparation of 24 positive control samples and 24 negative controls. All 24 of the positive control replicates produced full CR haplotypes. While 20 of the 24 negative controls did not result in any CR positions with read depths exceeding the 100 read threshold, the remaining four negatives had sequence read depths ≥100 in portions of the CR that corresponded to the small regions of overlap among the PowerSeq amplicons (Fig. 4A).
      Fig. 4
      Fig. 4Contamination: Effect of removing sequence reads smaller than 90 bp.
      Panel A displays the read coverage across the CR obtained for negative control NC-20 when no sequence read length filtering was employed. Read depth exceeding the 100 read minimum coverage threshold can be seen in the small regions of overlap between Amplicons 3 and 4, Amplicons 5 and 6, and Amplicons 9 and 10. Following removal of sequences smaller than 90 bp (panel B), negative control NC-20 had no CR positions at which read depth exceeded the 100 read threshold, and positions at which depth exceeded the 40 read variant calling threshold (used for minor variant comparisons only) occurred exclusively within the smallest amplicon (Amplicon 9, at 144 bp).
      These regions of higher read depth appeared to result at least in part from non-target “micro-amplicons”, created by the forward primer of one targeted amplicon and the reverse primer of a second, overlapping amplicon, in this multiplexed system. At less than 90 bp, these micro-amplicons are smaller than all of the target PowerSeq amplicons (the smallest of which is 144 bp), and thus may be more likely to be amplified when low levels of contaminating DNA are present. To eliminate any such non-target sequences, the 48 FASTQ files from the checkerboard study were pre-processed using the CLC Genomics Workbench software (Qiagen, Germantown, MD, USA) to remove sequence reads ≤ 90 bp. This read filtering step, performed prior to a re-analysis within GeneMarker HTS, successfully removed the micro-amplicon sequences (Fig. 4B). With the addition of this step, all 24 of the positive control replicates still produced full CR haplotypes.
      To address the possibility of sample crosstalk when low numbers of samples are sequenced and/or samples of varying quantity and quality are sequenced together [
      • Marshall C.
      • Sturk-Andreaggi K.
      • Daniels-Higginbotham J.
      • Oliver R.S.
      • Barritt-Ross S.
      • McMahon T.P.
      Performance evaluation of a mitogenome capture and Illumina sequencing protocol using non-probative, case-type skeletal samples: Implications for the use of a positive control in a next-generation sequencing procedure.
      ,
      • Sleep J.A.
      • Schreiber A.W.
      • Baumann U.
      Sequencing error correction without a reference genome.
      ,
      • Wang B.
      • Wan L.
      • Wang A.
      • Li L.M.
      An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters.
      ], an additional test was constructed to mimic the minimum number of samples (8) that will be sequenced given the laboratory’s internal workflow. For this study, an eight-sample set consisting of four positive controls, one calcified tissue extract with a known CR sequence, and three negative controls were amplified and their libraries pooled for sequencing. As expected, the four positive control replicates and one calcified tissue extract all produced full PowerSeq haplotypes that were concordant when compared to the corresponding known (Sanger-based) mtDNA profiles. None of the three negative controls had sequence read depths above the 100 read threshold at any position. These findings were consistent with results from larger library pools (e.g., 32 or 48 samples), and indicated that for the PowerSeq CRM Nested System the size of the sample pool does not affect the recovery of sample sequence or increase/decrease read depths for negative controls with the analysis parameters used.
      Considering the 112 total amplification negative controls processed across all studies in this validation, nine (8.18%), had read depths exceeding the 100 read threshold in one or more portions of the CR. In one such negative control, these data occurred exclusively in the amplicon overlap regions and were resolved by the CLC Genomics Workbench filtering of reads ≤90 bp, as described above. The remaining eight negative controls still had regions above the 100 read threshold following removal of the micro-amplicon reads. Most occurred either within the 8 bp overlap of Amplicons 9 and 10 (four negative controls), or in both the same 8 bp overlap region and the smallest amplicon of the assay (amplicon 9, at 144bp). Only one amplification negative had regions above the 100 read threshold that also exceeded 10% of the read depth for the corresponding regions in a test sample within the same sequencing run. These negative control data were concordant with the sample (Bone 1A-Cut 1) in the relevant region. A further 22 negative controls had regions above the 40 read threshold used for minor variant detection, however the only sample with minor variants impacted by these findings was a bone sample previously identified as mixed (TB1 × 3). With this sample, the minor variants at two positions were consistent with the negative control sequence for the region (which also matched the positive control).
      To further assess the potential to recover sequence above threshold in negative controls that could impact variant calls in samples, data for all 128,688 possible positions of the 112 negative controls were analyzed. The average read depth per position was 9.00 reads with a s.d. of 20.65, which resulted in an average +3 s.d. of 70.65 reads. The false positive rate (detection of sequence above threshold at any position) ranged from 0.95% at a 100 read threshold to 24.96% when using a 10 read threshold (Table S9). For comparison to a true positive rate for minor variant detection, the negative control results were weighed against the detection of known point heteroplasmies from the known and case-type samples, and known mixed positions in samples from the mixture study, at the same read count thresholds (Table S9). Regardless of the variant detection percentage applied, increasing the read count threshold had little negative impact on minor variant detection sensitivity.
      Bioinformatic negatives (the use of index combinations that do not correspond to any sample in the run) were used to monitor the incidence of sequencing or demultiplexing errors, and their potential impact on achieving accurate sequencing results. For the 64 bioinformatic negatives in Run 1, the total number of reads assigned to each index combination ranged from zero to four. Bioinformatic negatives were also used to detect sample contamination/carryover from the preceding sequencing run. As described in Section 2.4, Run 2 was performed immediately following Run 1, and included the set of 32 index combinations assigned to Run 1 libraries but not used for Run 2 samples. For these 32 bioinformatic negatives, the total number of reads assigned to each index combination again ranged from zero to four. In both tests, then, the recovered reads fell well below the minimum calling threshold of 100 reads, as well as the variant allele threshold of 40 reads, and thus would not alter the haplotype for a sample with read depths above these thresholds.

      4. Discussion

      Overall, the data produced in this study indicate that 100% accuracy can be achieved with the PowerSeq mtDNA CR NGS assay. The only inaccurate results occurred prior to analyst review of the data and were the result of a misalignment of a C insertion at the 5′ end of a HV2 C-stretch, rather than at the 3′ end. This error occurred with one sample only, and consistently across all four replicates, indicating the misalignment was related to the sample haplotype. While similar misalignment could potentially occur with CR haplotypes not represented in this study, the 5′ misalignment of a C insertion in some sequence reads was easy to both recognize and manually correct. Following review of the data and manual correction, all expected variants for the 31 known samples were accurately represented in each of the replicate PowerSeq profiles. The PowerSeq System also showed some increased sensitivity as compared to Sanger sequencing, exhibiting lower-level heteroplasmies (both point and length) not apparent in the Sanger typing results.
      High precision, in terms of both repeatability and reproducibility was also observed. Regardless of the scientist who performed the typing or the MiSeq instrument used for sequencing, variant frequency values were quite consistent across sample replicates. Across runs, variant frequencies differed by less than 0.5% more than 75% of the time for substitutions and point heteroplasmies, and averaged 0.3% for these variant types. Frequencies were generally more variable for length heteroplasmies and insertion/deletion (indel) polymorphisms, as was expected due to sequencing and alignment issues generally associated with these types of variants. However, differences in these frequency values across runs still averaged less than 1.5%.
      Based on data from more than two years of mtDNA casework, the FBI currently follows guidelines that recommend inputs of 40,000 mtDNA copies from hair extracts and 300,000 mtDNA copies from calcified tissue extracts for successful amplification of the entire CR using a single primer set. Given the smaller amplicons in the PowerSeq assay, it was expected that these input thresholds could be reduced while still achieving complete CR data. Indeed, data from the sensitivity and non-probative case sample experiments demonstrated that complete CR haplotypes could be routinely obtained from total inputs as low as 2,000–5,000 total mtDNA copies when 30 cycles were used for PCR. This was regardless of sample type. Though success in this study with inputs of ∼500 mtDNA copies was highly variable, with the fraction of the CR sequenced ranging from 1 to 100% in any given case, mtDNA inputs between 500 and 2000 mtDNA copies were not tested. It is likely that complete control region haplotypes can routinely be recovered with DNA inputs less than 2000, but greater than 500 mtDNA copies. The degradation state of any particular sample will also, of course, affect the recovery of complete control region data, with some samples yielding incomplete typing results when severe degradation is a factor. In the 21 non-probative case samples tested here, which broadly represented the quantity and quality of specimen routinely encountered in casework, this scenario was encountered only once. For this particular bone sample (Bone 1A-Cut 1), and as would be expected from a more degraded specimen, no data were generated for the two largest (>200 bp) amplicons. The amplicons were recovered from a replicate sampling, however. This result and results from the sensitivity testing may suggest that a workflow with duplicate amplification could be used with samples that have lower total mtGenome inputs to reduce the need for later reprocessing.
      Interestingly, no consistent improvements in sequencing success were observed with increased PCR cycle numbers. In fact, some results indicated that with lower DNA inputs additional PCR cycles could have a detrimental effect. This is likely the result of increased primer and adapter dimers relative to target amplicons, which ultimately outcompete target reads at the sequencing stage. The design of this particular Promega assay, which combines the target marker amplification and adapter/index incorporation amplification into a single PCR probably offers greater opportunity for dimer formation than other assay designs that separate the target amplification and library preparation steps. Experiments with an earlier version of the PowerSeq assay that separated the two amplification steps indicated that such a design was more resistant to dimer interference overall, and with increasing target amplification cycles in particular. However, the potential increased sensitivity of an assay with separate amplification steps comes at the expense of workflow simplicity. And, with the 30 cycle “nested” protocol yielding complete mtDNA control region data from as few as 2000 mtDNA copies, sensitivity is already substantially increased relative to current Sanger workflows. In fact, the results from the sensitivity and non-probative casework experiments in the current study suggest an overall increase in sensitivity of 20-60x as compared to the FBI Laboratory’s current Sanger CR workflow (i.e. 2,000 mtDNA copies needed for complete CR recovery, versus 40,000 or 300,000).
      Further evidence of this increased sensitivity was seen in the side-by-side comparison of Sanger and PowerSeq CR sequencing using the same DNA extracts from non-probative casework type specimens. Of the four hair DNA extracts tested with both systems, only the two samples with mtDNA copy inputs >40,000 produced a full mtDNA CR haplotype using the Sanger system, while PowerSeq typing produced complete results for all four extracts. For the three calcified tissue DNA extracts tested in parallel, all failed to produce full CR haplotypes by Sanger typing, despite mtDNA copy inputs >300,000 mtDNA for two of the samples. Yet, these same two calcified tissue extracts resulted in full CR results using the PowerSeq system.
      Mixed mitochondrial sequence data have historically not been interpreted in forensic casework due to the fact that Sanger sequencing is not quantitative. Inconsistencies in dye terminator base incorporation [
      • Parker L.T.
      • Deng Q.
      • Zakeri H.
      • Carlson C.
      • Nickerson D.A.
      • Kwok P.Y.
      Peak height variations in automated sequencing of PCR products using Taq dye-terminator chemistry.
      ] make it challenging to associate major/minor component variants with each other. And though the phasing of variants can sometimes be inferred based on phylogenetic signatures [
      • Bandelt H.J.
      • Lahermo P.
      • Richards M.
      • Macaulay V.
      Detecting errors in mtDNA data by phylogenetic analysis.
      ,
      • Bandelt H.J.
      • Quintana-Murci L.
      • Salas A.
      • Macaulay V.
      The fingerprint of phantom mutations in mitochondrial DNA data.
      ,
      • Just R.S.
      • Irwin J.A.
      • Parson W.
      Questioning the prevalence and reliability of mitochondrial DNA heteroplasmy from massively parallel sequencing data.
      ], it is difficult to definitively establish the sequences of the component haplotypes [
      • Irwin J.A.
      • Just R.S.
      • Parson W.
      Massively parallel mitochondrial DNA sequencing in forensic genetics: principles and opportunities.
      ]. NGS is likely to change the feasibility of mtDNA mixture analysis because NGS data are by their very nature quantitative, and the phasing of variants can generally be easily discerned from the data. This is particularly true when variants are present within a single amplicon or sequence read. Previous studies have demonstrated the utility of NGS for these purposes [
      • Holland M.M.
      • McQuillan M.R.
      • O’Hanlon K.A.
      Second generation sequencing allows for mtDNA mixture deconvolution and high resolution detection of heteroplasmy.
      ,
      • Kim H.
      • Erlich H.A.
      • Calloway C.D.
      Analysis of mixtures using next generation sequencing of mitochondrial DNA hypervariable regions.
      ,
      • Vohr S.H.
      • Gordon R.
      • Eizenga J.M.
      • Erlich H.A.
      • Calloway C.D.
      • Green R.E.
      A phylogenetic approach for haplotype analysis of sequence data from complex mitochondrial mixtures.
      ,
      • Churchill J.D.
      • Stoljarova M.
      • King J.L.
      • Budowle B.
      Massively parallel sequencing-enabled mixture analysis of mitochondrial DNA samples.
      ], and though further in depth studies that fully characterize the limitations of the PowerSeq assay for this application are necessary, the results presented here offer additional support.
      Despite this clear promise, the benefits of mtDNA mixture deconvolution can likely only be realized in routine practice if mtDNA analysis is applied to a broader range of sample types. Forensic laboratories performing mtDNA analysis employ it primarily, if not exclusively, in cases involving calcified tissue and shed hair. These sample types are generally considered to be single-source, and are therefore processed/decontaminated in such a way that mixtures are generally avoided. Bones are routinely sanded and hairs are generally subjected to detergent and/or bleach washes. Because of the effort devoted to eliminating exogenous DNA, true mixtures simply are not encountered that often (e.g. only 8.7% in Ref [
      • Melton T.
      • Dimick G.
      • Higgins B.
      • Lindstrom L.
      • Nelson K.
      Forensic mitochondrial DNA analysis of 691 casework hairs.
      ]). A retrospective review of data from ∼100 mtDNA hair cases in our laboratory showed that only 2 or 3 were reported as mixtures. In addition, and because of the effort devoted to decontamination, it is likely that the mixed data are actually the result of DNA damage rather than true mixtures. Not only does the incidence of mixture tend to increase with hair age, but also replicate analyses of the same hair often produce different mixed sites [
      • Melton T.
      • Dimick G.
      • Higgins B.
      • Lindstrom L.
      • Nelson K.
      Forensic mitochondrial DNA analysis of 691 casework hairs.
      ] – a hallmark of DNA damage [
      • Willerslev E.
      • Cooper A.
      Ancient DNA.
      ].
      With such a low incidence of true mixture in current mtDNA casework, it would seem that the benefit of NGS-based mtDNA mixture analysis is not that it can finally be done. Instead, because it can be done, the range of samples, and therefore cases, for which mtDNA analysis may be useful can expand. For example, mtDNA information may prove useful in cases for which contributor number estimates based on STR data alone are inconclusive, or in the analysis of low template (e.g. touch and/or mixed) samples that yield only poor, partial STR profiles.
      Three data analysis parameters were adjusted during the course of the validation studies that improved the recovery of accurate information for known specimens while minimizing potentially contaminating data in amplification negative controls. One of these was the amplicon data ranges, which define the portions of each sequence that will be used for variant calling. The inaccurate variant frequencies in the mixture samples at position 16223 that resulted in part from the limited overlap between the Amplicon 2 and Amplicon 3 ranges were corrected by revision of the Amplicon 3 range by a single base pair to create a larger overlap and capture position 16223 data from both amplicons. While the original amplicon ranges did not negatively affect any single-source samples, and the correct variants were detected even in mixed samples, the amplicon range revision for Amplicon 3 will ensure accurate variant frequencies if heteroplasmy is encountered at position 16223 and should benefit any future studies that aim to develop guidelines for mtDNA mixture interpretation.
      A second modification that proved useful was the removal of sequences smaller than 90 bp. The overlapping amplicons of the PowerSeq CRM Nested System create the potential for non-target primer pairing during PCR. Given amplicon overlaps ranging in size from 52 to 89 bp (including primer binding regions), as compared to the target amplicon size range of 147–237 bp, it is clear that preferential amplification of the smaller overlap regions might occur in both samples and controls when contaminating DNA fragments are present. Filtering of sequences smaller than 90 bp removes any potential impact of small DNA fragments on the sample consensus sequence within the amplicon overlap regions. In this study, removal of these sequences also reduced the incidence of sequence read depths above threshold in amplification negative controls (see Fig. 4), with no impact on successful CR sequencing of the target samples. The length-based read removal is currently carried out in our laboratory using the CLCbio Genomics Workbench software prior to GeneMarker HTS analysis. However, an upcoming version of GeneMarker® HTS will enable read filtering based on sequence length (SoftGenetics, personal communication).
      The third data analysis parameter changed during the course of the validation was an increase in the minor variant frequency threshold from 5% (used for Runs 1–4) to 10%. Though the initial use of a 5% threshold did not result in any incorrect variants (excepting the sample 16 replicates that were due to misalignment), we ultimately elected to approach variant detection with more conservativism. The use of a 10% threshold clearly further reduces the chance of false detection of heteroplasmy due to contamination and/or noise. But, use of a threshold more similar to the minor variant detection levels expected with Sanger sequence data (roughly 10–20%; [
      • Parson W.
      • Gusmao L.
      • Hares D.R.
      • Irwin J.A.
      • Mayr W.R.
      • Morling N.
      • et al.
      DNA Commission of the International Society for Forensic Genetics: revised and extended guidelines for mitochondrial DNA typing.
      ,
      • Irwin J.A.
      • Saunier J.L.
      • Niederstatter H.
      • Strouss K.M.
      • Sturk K.A.
      • Diegoli T.M.
      • et al.
      Investigation of heteroplasmy in the human mitochondrial DNA control region: a synthesis of observations from more than 5000 global population samples.
      ]) also 1) enables us to maintain the use of current laboratory guidelines for mtDNA data interpretation, which are based on Sanger-level sensitivity to heteroplasmy detection, and 2) mitigates issues that might arise in sequence comparisons between previous Sanger and new NGS-based haplotypes that are due primarily to the increased sensitivity of NGS. At present, casework laboratories may use the number of mixed positions observed in a profile to distinguish between intraindividual (heteroplasmic) variation and a mtDNA mixture – the latter of which will not be interpreted. Although some studies have examined rates of heteroplasmy using detection levels below 10% [
      • Holland M.M.
      • Makova K.D.
      • McElhoe J.A.
      Deep-coverage MPS analysis of heteroplasmic variants within the mtGenome allows for frequent differentiation of maternal relatives.
      ,
      • Rebolledo-Jaramillo B.
      • Shu-Wei Su M.
      • Stoler N.
      • McElhoe J.A.
      • Dickens B.
      • Blankenberg D.
      • et al.
      Maternal age effect and severe germline bottleneck in the inheritance of human mitochondrial DNA.
      ,
      • Skonieczna K.
      • Malyarchuk B.
      • Jawien A.
      • Marszalek A.
      • Banaszkiewicz Z.
      • Jarmocik P.
      • et al.
      Heteroplasmic substitutions in the entire mitochondrial genomes of human colon cells detected by ultra-deep 454 sequencing.
      ,
      • Li M.
      • Schroder R.
      • Ni S.
      • Madea B.
      • Stoneking M.
      Extensive tissue-related and allele-related mtDNA heteroplasmy suggests positive selection for somatic mutations.
      ,
      • Cho S.
      • Kim M.Y.
      • Lee J.H.
      • Lee S.D.
      Assessment of mitochondrial DNA heteroplasmy detected on commercial panel using MPS system with artificial mixture samples.
      ], the quality of such studies has varied [
      • Just R.S.
      • Irwin J.A.
      • Parson W.
      Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing.
      ]. Further basic research is still needed to better characterize the incidence and patterns of heteroplasmy and the number of heteroplasmic positions that may be expected per individual (which may vary by sequence range, by tissue type, by age of the individual, etc.) at the levels of detection possible with NGS. Similarly, while the increased sensitivity of NGS may have the potential to enhance the value of heteroplasmic variation in standard sequence comparisons or for specific casework purposes [
      • Holland M.M.
      • Makova K.D.
      • McElhoe J.A.
      Deep-coverage MPS analysis of heteroplasmic variants within the mtGenome allows for frequent differentiation of maternal relatives.
      ], further work will be needed to develop confident match interpretation protocols that incorporate heteroplasmy.
      To ensure that these three changes in the analysis settings did not produce substantially different results, all of the data from the known samples (Runs 1–4) as well as the data from a set of the case-type samples (hairs) were re-analyzed using the updated GeneMarker HTS analysis settings. On average, the variant frequencies differed by just 0.05% and 0.30% for the known and case-type samples, respectively. The reanalysis resulted in no changes to the case-type sample haplotypes, and only a few changes to the known sample haplotypes (Tables S5 and S6). Specifically: 1) As expected, heteroplasmies detected in the known samples below 10% frequency (including the 310.2C alignment error, PHPs, and a 16193del) were no longer included in the haplotypes, and 2) a 309.2C variant not reported in the Sanger haplotypes was eliminated for some samples from some runs. This latter result was due to the removal of the reads smaller than 90 bp, which altered the allele balance ratio that is a component of variant detection. Ultimately, the use of these updated GeneMarker HTS settings for the samples produced PowerSeq haplotypes more similar to their comparative Sanger haplotypes.
      The specific strategy we developed to interpret negative controls aimed to mimic current Sanger-based guidelines that dictate the features of negative controls and reagent blanks that must exist for an associated sample to be interpreted (i.e. the blank/control does not analyze; the blank/control analyzes but is not of requisite quality for comparison purposes; or the blank/control analyzes, but the sequence is not in concordance with the sample). However, the approach also aimed to incorporate the quantitative information that is produced via NGS, and take into consideration the relative – rather than absolute – number of reads in negative controls and reagent blanks. All negative control data at or above 100 reads was considered potentially contaminating with respect to the associated samples, whereas negative control data between 40 and 99 reads was considered potentially contaminating with respect to minor variants only. But, because the noise in any given sample is proportional to signal, and because signal with this (or any) NGS assay can vary depending on workflow, a percentage-based threshold was used in addition to these static read count-based thresholds. For these studies, a percentage-based threshold for negative controls that mirrored the variant calling threshold (10%) was used. This approach ensured that any noise or contamination that may be present in the sample, but would not exceed the variant calling thresholds and thus would not impact variant calling, would not unnecessarily prevent use of the data.
      The negative control interpretation rules applied were as follows:
      If the negative control sequence is concordant with the associated sample sequence for any portion of the CR, the results in those regions are not used for comparison purposes. The sample sequence can only be used under the following conditions:
      • a
        the negative control read depth does not exceed the minor variant calling threshold (40 reads) in any portion of the CR;
      • b
        the negative control read depth exceeds the minor variant calling threshold (40 reads), but does not exceed the minimum read threshold (100 reads), and no minor variants were detected in the associated sample OR the sequence is not in concordance with the sample minor variant;
      • c
        the negative control read depth exceeds the minimum read threshold (100 reads), but the depth is less than 10% (minimum variant frequency) of the read depth for the corresponding portion of the associated sample, and therefore not of requisite quantity for comparison purposes;
      • d
        the negative control read depth exceeds the minimum read threshold (100 reads) and is greater than 10% (minimum variant frequency) of the read depth for the corresponding portion of the associated sample, but the sequence is not in concordance with the sample.
      Some amplification negative controls processed in the presence of multiple samples in the validation studies exhibited evidence of low-level contamination. However, for all but one of 112 total negative controls considered across many experiments, the contamination did not rise to a level that would preclude sample comparisons according to the data analysis and interpretation procedures described above. For the one negative control that exhibited contamination above defined thresholds and was consistent with the associated sample, the relevant region of the associated sample would not be usable for comparison purposes. This single instance of a negative control exhibiting contamination that would prevent comparison (0.91% of amplification negatives) is comparable to, or lower than, the frequency of contamination observed with current FBI mtDNA protocols.
      Though the results of this study support the use of a 40 read variant detection threshold, the incidence of negative controls with data above the variant detection threshold but below the minimum threshold of 100 reads (19.6% of negative controls) necessitated multiple comparisons to determine whether the negative control sequence was in concordance with a sample minor variant. Given that the number of required negative-to-sample comparisons could be substantially reduced by raising the variant detection threshold to match the minimum read depth threshold (100 reads), we re-examined all of the study data. With a 100 read variant detection threshold, fewer mixed positions were detected in four mixture study samples (3.7% fewer mixed positions overall), as well as a bone previously identified as mixed (TB1 × 3). Additionally, for the two samplings from Bone 1A, a few very low read depth positions (<150 reads) at which both observed sequence variants had depths below 100 reads resulted in an inability to call the position (a 1–2 bp gap in sequence coverage). For Bone 1A-Cut 2, this eliminated the HVI C-stretch region point heteroplasmies that differed from the Sanger haplotype. There were no changes to the known sample haplotypes (Table S5). While this re-analysis demonstrated that use of a higher variant calling threshold resulted in some decrease in sensitivity, that decrease was balanced by a substantial reduction (nearly 75% in this study) in the number of negative controls exhibiting above-threshold data that required consideration. Further, for Bone 1A-Cut 2, which had low overall read depths in the HV1 C-stretch, the change to a 100 read variant detection threshold produced a more conservative PowerSeq typing result.
      Overall, the analyses of 96 total bioinformatics controls indicated an extremely low incidence of index sequencing errors, demultiplexing errors and run-to-run sample carryover. While a few sequence reads were detected, in no case did the total number of sequencing reads for a bioinformatic control exceed four. These low read levels can be considered the noise inherent to the sequencing technology, and did not interfere with data interpretation of a sample when using the stated data analysis parameters.
      Signal cross-talk during the sequencing process has been previously reported with the Illumina chemistry [
      • Erlich Y.
      • Mitra P.P.
      • dela Bastide M.
      • Mc Combie W.R.
      • Hannon G.J.
      Alta-Cyclic: a self-optimizing base caller for next-generation sequencing.
      ]. However, no occurrences of cross-talk were observed in this study. Even when we tried to force signal cross-talk by pooling four negative controls with three positive controls and a sample, we did not observe any such occurrences. It is worth noting here, however, that the overall workflow, to include the quality of the samples being tested, the assays in use and the data analysis and interpretation parameters employed, has a clear and direct impact on the potential for signal cross-talk. Particular high-sensitivity applications for which read depths can be quite low due to poor sample quality, and for which analytical thresholds are correspondingly and necessarily low, are more susceptible [
      • Marshall C.
      • Sturk-Andreaggi K.
      • Daniels-Higginbotham J.
      • Oliver R.S.
      • Barritt-Ross S.
      • McMahon T.P.
      Performance evaluation of a mitogenome capture and Illumina sequencing protocol using non-probative, case-type skeletal samples: Implications for the use of a positive control in a next-generation sequencing procedure.
      ].
      For practical operational purposes, a 100 read variant detection threshold will be implemented in routine casework. Given the granularity of NGS data, it is quite possible that analytical and variant detection thresholds could be lowered if noise signals of various types (e.g. sequencing error, PCR error, DNA damage, pseudogenes) could be consistently distinguished from authentic signal. We have employed methods for doing exactly that in other NGS studies [
      • Brandhagen M.D.
      • Loreille O.
      • Irwin J.A.
      Fragmented nuclear DNA is the predominant genetic material in human hair shafts.
      ,
      • Loreille O.
      • Ratnayake S.
      • Bazinet A.L.
      • Stockwell T.B.
      • Sommer D.D.
      • Rohland N.
      • et al.
      Biological sexing of a 4000-Year-Old egyptian mummy head to assess the potential of nuclear DNA recovery from the most damaged and limited forensic specimens.
      ,
      • Moreno L.I.
      • Galusha M.B.
      • Just R.
      A closer look at Verogen’s Forenseq DNA Signature Prep kit autosomal and Y-STR data for streamlined analysis of routine reference samples.
      ]. However, these analyses have been largely manual, which is not suitable to a production environment. As bioinformatic approaches (for example, [
      • Ring J.D.
      • Sturk-Andreaggi K.
      • Alyse Peck M.
      • Marshall C.
      Bioinformatic removal of NUMT-associated variants in mitotiling next-generation sequencing data from whole blood samples.
      ]) and particularly software packages with filtering capabilities (e.g. Converge™ Software - compatible with Thermo Fisher assays only, Thermo Fisher Scientific) continue to become available, and as the efficacy of such filters is clearly established via rigorous testing and validation, it is possible that analysis parameters can be refined and assay sensitivity improved even further.
      Considering all 30+ mtDNA haplotypes, representing 9 major haplogroups (B, D, F, H, J, K, L, T and U), that were tested across the validation studies, only a single sample haplotype (sample 16, discussed above) required manual realignment upon analyst review to achieve a result consistent with the Sanger haplotype. As both the initial results for sample 16 and the mixture results at position 16223 make clear, however, review of the NGS results by a trained analyst is still critical to ensure that both sequence alignment and variant nomenclature are correct and consistent with the SWGDAM Interpretation Guidelines for Mitochondrial DNA Analysis by Forensic DNA Testing Laboratories [
      • Scientific Working Group on DNA Analysis Methods (SWGDAM)
      Interpretation Guidelines for Mitochondrial DNA Analysis by Forensic DNA Testing Laboratories.
      ]. Given sequence read depths that will range from the hundreds to the thousands, or even tens of thousands, with NGS typing, analyst examinations of mtDNA sequence alignments will need to shift from the historically qualitative visual review of Sanger sequence traces to a mix of a) qualitative alignment review and b) quantitative review of read depth values and variant frequencies. Even when the sample data meet all software thresholds, atypical features such as read depth spikes or dips, unusual variants or variant frequencies that fall outside of expected ranges (as occurred with sample 16) will alert forensic analysts to the need to further investigate the sample or sample haplotype.

      5. Conclusion

      The experiments performed here demonstrate the efficacy, accuracy, reproducibility, sensitivity, and reliability of the PowerSeq CRM Nested System chemistry. In addition, they serve to define the conditions under which reliable and reproducible mtDNA sequence data can be obtained from the range of sample types, quantities and qualities typically encountered in mtDNA casework. Based on the sensitivity of the PowerSeq assay considered together with mtDNA quantitation values routinely encountered in the FBI Laboratory’s mtDNA casework, a substantial increase in the number of questioned samples yielding complete CR haplotypes is expected. We anticipate that complete control region data can be obtained from approximately 90% of routine FBI casework samples with the PowerSeq assay, rather than from only 20% of samples with our current Sanger protocols.
      In addition to the clear benefits in overall data recovery, the PowerSeq assay offers substantial advantages when it comes to laboratory workflow. Existing Sanger procedures for low-quality samples are dynamic, sample quality-driven, labor-intensive and difficult to automate. The PowerSeq assay, on the other hand, can be uniformly applied to all sample types without any impact on workflow or level of effort. It therefore simplifies, and improves the efficiency of, the mtDNA analysis process. Due to this workflow standardization, automated protocols that uniformly accommodate all sample types can be relatively easily developed and implemented.
      Indeed, automation is where our internal efforts are currently focused now that the PowerSeq mtDNA chemistry is validated for operational use at the FBI Laboratory. Additionally, efforts are ongoing to put in place the infrastructure changes that will be required for implementation - such as incorporating the necessary PowerSeq chemistry, laboratory processing and data analysis/review/reporting features in the laboratory’s information management system – and to develop the training modules that will be needed for the transition to NGS-based mtDNA typing. Finally, the validation studies were submitted for NDIS acceptance, and as of May 2019, mtDNA CR data generated using the PowerSeq assay are approved for CODIS upload.
      Though an assay targeting the CR only does not take full advantage of the benefits of NGS for mtDNA testing, it offers an opportunity to implement NGS technology without the added complexities that full mtGenome data present. This straightforward approach that restricts focus of NGS technology application to mtDNA markers already accepted/supported by the legal community should not only streamline the implementation of NGS in forensic DNA casework but also, by establishing the reliability of NGS technology using a familiar application, promote general acceptance by the scientific community for broader forensic application. In addition to reaping the benefits of improved data recovery and streamlined workflows for FBI Laboratory mtDNA casework in the very near term, the lessons learned and comfort achieved with NGS via implementation of the CR assay will set the stage for the transition of additional NGS assays to casework over the coming years.

      Acknowledgements

      The authors thank Odile Loreille, Brandon Letts, Danielle Daniels and Michelle Galusha for technical assistance, and Susannah Kehl for scientific discussion. The authors additionally thank Susannah Kehl, Thomas Callaghan and Odile Loreille for critical review of the manuscript, and Anthony Onorato and Tina Delgado for administrative support. Chimpanzee DNA was obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research (Camden, NJ): # NS06006. This research was supported in part through the FBI’s Visiting Scientist Program, an educational opportunity administered by the Oak Ridge Institute for Science and Education (ORISE) . This is publication number 19-011 of the Laboratory Division of the Federal Bureau of Investigation. Names of commercial manufacturers are provided for information only and inclusion does not imply endorsement by the FBI or the U.S. Government. The views expressed are those of the authors and do not necessarily reflect the official policy or position of the FBI or the U.S. Government.

      Appendix A. Supplementary data

      The following is Supplementary data to this article:

      References

        • McElhoe J.A.
        • Holland M.M.
        • Makova K.D.
        • Su M.S.
        • Paul I.M.
        • Baker C.H.
        • et al.
        Development and assessment of an optimized next-generation DNA sequencing approach for the mtgenome using the Illumina MiSeq.
        Forensic. Sci. Int. Genet. 2014; 13: 20-29
        • King J.L.
        • LaRue B.L.
        • Novroski N.
        • Stoljarova M.
        • Seo S.B.
        • Zeng X.
        • et al.
        High-quality and high-throughput massively parallel sequencing of the human mitochondrial genome using the Illumina MiSeq.
        Forensic Sci. Int. Genet. 2014; 12: 128-135
        • Parson W.
        • Huber G.
        • Moreno L.
        • Madel M.B.
        • Brandhagen M.D.
        • Nagl S.
        • et al.
        Massively parallel sequencing of complete mitochondrial genomes from hair shaft samples.
        Forensic Sci. Int. Genet. 2015; 15: 8-15
        • Bodner M.
        • Iuvaro A.
        • Strobl C.
        • Nagl S.
        • Huber G.
        • Pelotti S.
        • et al.
        Helena, the hidden beauty: resolving the most common West Eurasian mtDNA control region haplotype by massively parallel sequencing an Italian population sample.
        Forensic Sci. Int. Genet. 2015; 15: 21-26
        • Peck M.A.
        • Brandhagen M.D.
        • Marshall C.
        • Diegoli T.M.
        • Irwin J.A.
        • Sturk-Andreaggi K.
        Concordance and reproducibility of a next generation mtGenome sequencing method for high-quality samples using the Illumina MiSeq.
        Forensic Sci. Int. Genet. 2016; 24: 103-111
        • Zhou Y.
        • Guo F.
        • Yu J.
        • Liu F.
        • Zhao J.
        • Shen H.
        • et al.
        Strategies for complete mitochondrial genome sequencing on Ion Torrent PGM platform in forensic sciences.
        Forensic Sci. Int. Genet. 2016; 22: 11-21
        • Holland M.M.
        • Pack E.D.
        • McElhoe J.A.
        Evaluation of GeneMarker((R)) HTS for improved alignment of mtDNA MPS data, haplotype determination, and heteroplasmy assessment.
        Forensic Sci. Int. Genet. 2017; 28: 90-98
        • Eduardoff M.
        • Xavier C.
        • Strobl C.
        • Casas-Vargas A.
        • Parson W.
        Optimized mtDNA control region primer extension capture analysis for forensically relevant samples and highly compromised mtDNA of different age and origin.
        Genes (Basel). 2017; 8https://doi.org/10.3390/genes8100237
        • Marshall C.
        • Sturk-Andreaggi K.
        • Daniels-Higginbotham J.
        • Oliver R.S.
        • Barritt-Ross S.
        • McMahon T.P.
        Performance evaluation of a mitogenome capture and Illumina sequencing protocol using non-probative, case-type skeletal samples: Implications for the use of a positive control in a next-generation sequencing procedure.
        Forensic Sci. Int. Genet. 2017; 31: 198-206
        • Peck M.A.
        • Sturk-Andreaggi K.
        • Thomas J.T.
        • Oliver R.S.
        • Barritt-Ross S.
        • Marshall C.
        Developmental validation of a Nextera XT mitogenome Illumina MiSeq sequencing method for high-quality samples.
        Forensic Sci. Int. Genet. 2018; 34: 25-36
        • Ma K.
        • Zhao X.
        • Li H.
        • Cao Y.
        • Li W.
        • Ouyang J.
        • et al.
        Massive parallel sequencing of mitochondrial DNA genomes from mother-child pairs using the ion torrent personal genome machine (PGM).
        Forensic Sci. Int. Genet. 2018; 32: 88-93
        • Mardis E.R.
        A decade’s perspective on DNA sequencing technology.
        Nature. 2011; 470: 198-203
        • Shendure J.
        • Balasubramanian S.
        • Church G.M.
        • Gilbert W.
        • Rogers J.
        • Schloss J.A.
        • et al.
        DNA sequencing at 40: past, present and future.
        Nature. 2017; 550: 345-353
        • Chaitanya L.
        • Ralf A.
        • van Oven M.
        • Kupiec T.
        • Chang J.
        • Lagace R.
        • et al.
        Simultaneous whole mitochondrial genome sequencing with short overlapping amplicons suitable for degraded DNA using the ion torrent personal genome machine.
        Hum. Mutat. 2015; 36: 1236-1247
        • Park S.
        • Cho S.
        • Seo H.J.
        • Lee J.H.
        • Kim M.Y.
        • Lee S.D.
        Entire mitochondrial DNA sequencing on massively parallel sequencing for the korean population.
        J. Korean Med. Sci. 2017; 32: 587-592
        • Strobl C.
        • Eduardoff M.
        • Bus M.M.
        • Allen M.
        • Parson W.
        Evaluation of the precision ID whole MtDNA genome panel for forensic analyses.
        Forensic Sci. Int. Genet. 2018; 35: 21-25
        • Huber N.
        • Parson W.
        • Dur A.
        Next generation database search algorithm for forensic mitogenome analyses.
        Forensic Sci. Int. Genet. 2018; 37: 204-214
        • Gabriel M.N.
        • Huffine E.F.
        • Ryan J.H.
        • Holland M.M.
        • Parsons T.J.
        Improved MtDNA sequence analysis of forensic remains using a "mini-primer set" amplification strategy.
        J. Forensic Sci. 2001; 46: 247-253
        • Edson S.M.
        • Ross J.P.
        • Coble M.D.
        • Parsons T.J.
        • Barritt S.M.
        Naming the dead confronting the realities of rapid identification of degraded skeletal remains.
        Forensic sci. Rev. 2004; 16: 63-90
        • Eichmann C.
        • Parson W.
        ’Mitominis’: multiplex PCR analysis of reduced size amplicons for compound sequence analysis of the entire mtDNA control region in highly degraded samples.
        Int. J. Legal Med. 2008; 122: 385-388
        • Scientific Working Group on DNA Analysis Methods (SWGDAM)
        Validation Guidelines for DNA Analysis Methods.
        2016
        • Federal Bureau of Investigation
        Quality Assurance Standards for Forensic DNA Testing Laboratories.
        2011
        • Brandhagen M.D.
        • Loreille O.
        • Irwin J.A.
        Fragmented nuclear DNA is the predominant genetic material in human hair shafts.
        Genes (Basel). 2018; 9https://doi.org/10.3390/genes9120640
        • Loreille O.M.
        • Diegoli T.M.
        • Irwin J.A.
        • Coble M.D.
        • Parsons T.J.
        High efficiency DNA extraction from bone by total demineralization.
        Forensic Sci. Int. Genet. 2007; 1: 191-195
        • Kavlick M.F.
        Development of a triplex mtDNA qPCR assay to assess quantification, degradation, inhibition, and amplification target copy numbers.
        Mitochondrion. 2018;
      1. Promega Corporation, Massively Parallel Sequencing of Mitochondrial Control Region using the PowerSeq™ CRM Nested System, Custom, https://www.promega.com/-/media/files/resources/protocols/custom-systems/an322-powerseq-crm-system-custom.pdf?la=en.

      2. Promega Corporation, PowerSeq™ Quant MS System Technical Manual, #TM511, https://www.promega.com/-/media/files/resources/protocols/technical-manuals/500/tm511-powerseq-quant-ms-system-protocol.pdf?la=en.

        • Holland M.M.
        • Makova K.D.
        • McElhoe J.A.
        Deep-coverage MPS analysis of heteroplasmic variants within the mtGenome allows for frequent differentiation of maternal relatives.
        Genes (Basel). 2018; 9https://doi.org/10.3390/genes9030124
        • Gallimore J.M.
        • McElhoe J.A.
        • Holland M.M.
        Assessing heteroplasmic variant drift in the mtDNA control region of human hairs using an MPS approach.
        Forensic Sci. Int. Genet. 2018; 32: 7-17
        • Huszar T.I.
        • Wetton J.H.
        • Jobling M.A.
        Mitigating the effects of reference sequence bias in single-multiplex massively parallel sequencing of the mitochondrial DNA control region.
        Forensic Sci. Int. Genet. 2019; 40: 9-17
        • Scientific Working Group on DNA Analysis Methods (SWGDAM)
        Interpretation Guidelines for Mitochondrial DNA Analysis by Forensic DNA Testing Laboratories.
        2013
        • Parson W.
        • Strobl C.
        • Huber G.
        • Zimmermann B.
        • Gomes S.M.
        • Souto L.
        • et al.
        Evaluation of next generation mtGenome sequencing using the Ion Torrent Personal Genome Machine (PGM).
        Forensic Sci. Int. Genet. 2013; 7: 543-549
        • Just R.S.
        • Irwin J.A.
        • Parson W.
        Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing.
        Forensic Sci. Int. Genet. 2015; 18: 131-139
        • Sturk-Andreaggi K.
        • Peck M.A.
        • Boysen C.
        • Dekker P.
        • McMahon T.P.
        • Marshall C.K.
        AQME: A forensic mitochondrial DNA analysis tool for next-generation sequencing data.
        Forensic Sci. Int. Genet. 2017; 31: 189-197
        • van Oven M.
        • Kayser M.
        Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation.
        Hum. Mutat. 2009; 30: E386-94
        • Parson W.
        • Gusmao L.
        • Hares D.R.
        • Irwin J.A.
        • Mayr W.R.
        • Morling N.
        • et al.
        DNA Commission of the International Society for Forensic Genetics: revised and extended guidelines for mitochondrial DNA typing.
        Forensic Sci. Int. Genet. 2014; 13: 134-142
        • Sleep J.A.
        • Schreiber A.W.
        • Baumann U.
        Sequencing error correction without a reference genome.
        BMC Bioinformatics. 2013; 14 (367-2105-14-367)
        • Wang B.
        • Wan L.
        • Wang A.
        • Li L.M.
        An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters.
        Sci. Rep. 2017; 7: 41348
        • Parker L.T.
        • Deng Q.
        • Zakeri H.
        • Carlson C.
        • Nickerson D.A.
        • Kwok P.Y.
        Peak height variations in automated sequencing of PCR products using Taq dye-terminator chemistry.
        BioTechniques. 1995; 19: 116-121
        • Bandelt H.J.
        • Lahermo P.
        • Richards M.
        • Macaulay V.
        Detecting errors in mtDNA data by phylogenetic analysis.
        Int. J. Legal Med. 2001; 115: 64-69
        • Bandelt H.J.
        • Quintana-Murci L.
        • Salas A.
        • Macaulay V.
        The fingerprint of phantom mutations in mitochondrial DNA data.
        Am. J. Hum. Genet. 2002; 71: 1150-1160
        • Just R.S.
        • Irwin J.A.
        • Parson W.
        Questioning the prevalence and reliability of mitochondrial DNA heteroplasmy from massively parallel sequencing data.
        PNAS. 2014; (pii: 201413478)
        • Irwin J.A.
        • Just R.S.
        • Parson W.
        Massively parallel mitochondrial DNA sequencing in forensic genetics: principles and opportunities.
        in: Amorim A. Budowle B. Forensic Genetics: Biodiversity and Heredity in Civil and Criminal Investigation. Imperial College Press, 2016
        • Holland M.M.
        • McQuillan M.R.
        • O’Hanlon K.A.
        Second generation sequencing allows for mtDNA mixture deconvolution and high resolution detection of heteroplasmy.
        Croat. Med. J. 2011; 52: 299-313
        • Kim H.
        • Erlich H.A.
        • Calloway C.D.
        Analysis of mixtures using next generation sequencing of mitochondrial DNA hypervariable regions.
        Croat. Med. J. 2015; 56: 208-217
        • Vohr S.H.
        • Gordon R.
        • Eizenga J.M.
        • Erlich H.A.
        • Calloway C.D.
        • Green R.E.
        A phylogenetic approach for haplotype analysis of sequence data from complex mitochondrial mixtures.
        Forensic Sci. Int. Genet. 2017; 30: 93-105
        • Churchill J.D.
        • Stoljarova M.
        • King J.L.
        • Budowle B.
        Massively parallel sequencing-enabled mixture analysis of mitochondrial DNA samples.
        Int. J. Legal Med. 2018; 132: 1263-1272
        • Melton T.
        • Dimick G.
        • Higgins B.
        • Lindstrom L.
        • Nelson K.
        Forensic mitochondrial DNA analysis of 691 casework hairs.
        J. Forensic Sci. 2005; 50: 73-80
        • Willerslev E.
        • Cooper A.
        Ancient DNA.
        Proc. Biol. Sci. 2005; 272: 3-16
        • Irwin J.A.
        • Saunier J.L.
        • Niederstatter H.
        • Strouss K.M.
        • Sturk K.A.
        • Diegoli T.M.
        • et al.
        Investigation of heteroplasmy in the human mitochondrial DNA control region: a synthesis of observations from more than 5000 global population samples.
        J. Mol. Evol. 2009; 68: 516-527
        • Rebolledo-Jaramillo B.
        • Shu-Wei Su M.
        • Stoler N.
        • McElhoe J.A.
        • Dickens B.
        • Blankenberg D.
        • et al.
        Maternal age effect and severe germline bottleneck in the inheritance of human mitochondrial DNA.
        PNAS. 2014; https://doi.org/10.1073/pnas.1409328111
        • Skonieczna K.
        • Malyarchuk B.
        • Jawien A.
        • Marszalek A.
        • Banaszkiewicz Z.
        • Jarmocik P.
        • et al.
        Heteroplasmic substitutions in the entire mitochondrial genomes of human colon cells detected by ultra-deep 454 sequencing.
        Forensic Sci. Int. Genet. 2015; 15: 16-20
        • Li M.
        • Schroder R.
        • Ni S.
        • Madea B.
        • Stoneking M.
        Extensive tissue-related and allele-related mtDNA heteroplasmy suggests positive selection for somatic mutations.
        Proc. Natl. Acad. Sci. U. S. A. 2015; 112: 2491-2496
        • Cho S.
        • Kim M.Y.
        • Lee J.H.
        • Lee S.D.
        Assessment of mitochondrial DNA heteroplasmy detected on commercial panel using MPS system with artificial mixture samples.
        Int. J. Legal Med. 2018; 132: 1049-1056
        • Erlich Y.
        • Mitra P.P.
        • dela Bastide M.
        • Mc Combie W.R.
        • Hannon G.J.
        Alta-Cyclic: a self-optimizing base caller for next-generation sequencing.
        Nat. Methods. 2008; 5: 679-682
        • Loreille O.
        • Ratnayake S.
        • Bazinet A.L.
        • Stockwell T.B.
        • Sommer D.D.
        • Rohland N.
        • et al.
        Biological sexing of a 4000-Year-Old egyptian mummy head to assess the potential of nuclear DNA recovery from the most damaged and limited forensic specimens.
        Genes (Basel). 2018; 9https://doi.org/10.3390/genes9030135
        • Moreno L.I.
        • Galusha M.B.
        • Just R.
        A closer look at Verogen’s Forenseq DNA Signature Prep kit autosomal and Y-STR data for streamlined analysis of routine reference samples.
        Electrophoresis. 2018; 39: 2685-2693
        • Ring J.D.
        • Sturk-Andreaggi K.
        • Alyse Peck M.
        • Marshall C.
        Bioinformatic removal of NUMT-associated variants in mitotiling next-generation sequencing data from whole blood samples.
        Electrophoresis. 2018; 39: 2785-2797

      CHORUS Manuscript

      View Open Manuscript