The AmpFℓSTR® NGM SElect™ PCR Amplification Kit is a new 17-plex STR genotyping kit designed for use primarily in forensic casework analysis. The kit was designed to be a counterpart to the AmpFℓSTR® NGM™ Kit for laboratories wishing to add the SE33 locus to the new European Standard Set of STR loci. The NGM SElect Kit shares the same primer sets for 16 common loci with the NGM Kit (D10S1248, D3S1358, vWA, D16S539, D2S1338, amelogenin, D8S1179, D21S11, D18S51, D19S433, TH01, FGA, D22S1045, D2S441, D1S1656 and D12S391), with additional primers for the SE33 locus. Developmental validation studies followed the Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines for STR kit manufacturers and tested several critical areas of kit performance including a sensitivity series, DNA mixtures and inhibited samples. The studies demonstrated that the NGM SElect Kit provides equivalent overall performance to the NGM Kit, but with even greater discriminatory power due to the inclusion of the highly informative SE33 locus.
1. Introduction
The efficacy of forensic genotyping as an investigative tool has led to the continual expansion of national DNA databases as genotype profiles are added at an ever-increasing rate. Together with recent cross-border data-sharing initiatives such as the Prüm Treaty, this growth made apparent the need for a standardized set of STR loci with a higher power of discrimination to minimize the occurrence of adventitious matches when searching very large databases. The European Network of Forensic Science Institutes (ENFSI) and the European DNA Profiling Group (EDNAP) published their recommendations for improved next-generation multiplexes that would expand the number of STR loci for greater statistical power while also providing more robust amplification of DNA target loci. These recommendations specifically called for the inclusion of loci with shorter amplicons, which amplify more reliably from degraded or otherwise compromised sample types, as well as highly polymorphic loci with better discriminatory power.
Following the ENFSI/EDNAP recommendations, we previously developed the AmpFℓSTR NGM (Next Generation Multiplex) PCR Amplification Kit, a 16-plex STR kit that became available in 2010. Since maintaining genotype concordance within existing national databases was an important performance criterion, the NGM Kit retained the same primer sequences for the 10 STR loci it shares with the earlier-generation AmpFℓSTR® SGM Plus® Kit, while adding five new loci recommended by ENFSI/EDNAP, plus the amelogenin sex-determining locus. The NGM Kit also benefitted from significant improvements in PCR chemistry and a re-optimized thermal cycling protocol to provide more sensitive and robust amplification in less time compared to earlier-generation kits.
We have developed a new STR kit that adds the SE33 locus to the existing 16 loci of the NGM Kit multiplex to further improve the discriminatory power of the kit, as well as to provide for continuity with the databases of certain Central European countries that have traditionally favored the use of the highly polymorphic SE33 locus. The AmpFℓSTR NGM SElect PCR Amplification Kit shares the same PCR primer sequences with the NGM Kit to amplify their 16 common loci (D10S1248, D3S1358, vWA, D16S539, D2S1338, amelogenin, D8S1179, D21S11, D18S51, D19S433, TH01, FGA, D22S1045, D2S441, D1S1656 and D12S391), plus SE33. The NGM SElect Kit provides the same improvements in PCR chemistry and thermal cycling as the NGM Kit, with the same level of performance, for forensic laboratories wishing to include SE33 in their genotype data.
SE33 is one of the most highly polymorphic and informative STR loci currently in use for forensic genotyping. The amplicon size constraints imposed by the existing NGM Kit loci meant that new SE33 primers had to be developed to fit such a large and extensive locus into the NGM SElect Kit locus configuration. This created a further challenge of ensuring that maximum concordance was maintained between SE33 genotypes from the NGM SElect Kit and earlier-generation kits such as the AmpFℓSTR® SEfiler Plus™ Kit.
This article describes the manufacturer's developmental validation of the NGM SElect™ Kit. The experiments were performed according to the guidelines issued by the Scientific Working Group on DNA Analysis Methods (SWGDAM). The results confirm the reliability of the NGM SElect™ Kit as required for forensic casework analysis.
2. Materials and methods
2.1 DNA samples
AmpFℓSTR Control DNA 007 was obtained from Life Technologies (Carlsbad, CA), and Raji DNA was purchased from Biochain Institute (Hayward, CA). Control DNAs 9947A and 9948 were purchased from Marligen Biosciences (Ijamsville, MD) and Coriell Cell Repositories (Camden, NJ). The DNA was quantified using the Quantifiler® Human DNA Quantification Kit on the 7500 Sequence Detection System (Life Technologies).
Human DNA population studies were performed to determine allele frequencies, stutter ratios and heterozygote ratios using a set of approximately 1000 different individual DNAs. Whole blood samples from anonymous donors of self-reported ethnicity were obtained from the Interstate Blood Bank, Inc. (Memphis, TN), Boca Biolistics (Coconut Creek, FL) or HemaCare Corporation (Van Nuys, CA), and purified on the 6100 Nucleic Acid PrepStation (Life Technologies).
The DNA Profiling Standard SRM 2391b, produced by the National Institute of Standards and Technology (Gaithersburg, MD), as well as genomic DNA sets from three of the Utah pedigree CEPH families (Centre d’Etude du Polymorphisme Humain, obtained through the Coriell Institute), were used to test the accuracy of allele calls against individuals of known genotype, and trace inheritance patterns of alleles through multiple generations.
Whole blood samples of several common animal species (bovine, horse, sheep, pig, rat, hamster, rabbit, chicken, dog and cat) were obtained from Pel-Freez Biologicals (Rogers, AR) and genomic DNA was prepared using the 6100 Nucleic Acid PrepStation. Samples of chimpanzee, gorilla and macaque were obtained as purified genomic DNAs from BIOS Laboratories (New Haven, CT) or from the Oregon State Fish and Game Laboratory. Pooled genomic DNAs from several human-associated microbial species (Candida albicans, Escherichia coli and Lactobacillus rhamnosus) were prepared from cultures grown and purified in-house with the Iso-Quick Nucleic Acid Extraction Kit (Orca Research, Inc., Bothell, WA). The concentration of non-human DNA was determined by measuring the absorbance of the sample at 260 nm.
2.2 Primer set concentration and master mix components
The NGM SElect Kit multiplex contains PCR primer sets for 17 loci: D10S1248, D3S1358, vWA, D16S539, D2S1338, amelogenin, D8S1179, D21S11, D18S51, D19S433, TH01, FGA, D22S1045, D2S441, D1S1656, D12S391 and SE33. All of the primers except those for SE33 had been characterized and optimized together during the previous development of the AmpFℓSTR NGM Kit, thus the development of the NGM SElect Kit focused on integrating the SE33 primer set with the rest of the multiplex.
Other than the SE33 primers, the multiplexes for the NGM and NGM SElect kits are identical. After the initial commercial release of the NGM Kit in 2010, certain rare, population-specific variant alleles of amelogenin, D2S441 and D22S1045 were discovered that caused the dropout of affected alleles in the earliest version of the kit. New primers were added to the current NGM Kit multiplex specifically to recover these variant alleles. The same primers are present in the NGM SElect Kit.
The NGM and NGM SElect kits share the same master mix containing the “hot-start” DNA polymerase, AmpliTaq Gold® DNA polymerase. Since PCR chemistry and thermal cycling conditions had been extensively optimized during the development of the NGM Kit, and also for the sake of consistent performance between the kits, no changes were made in the development of the NGM SElect Kit.
2.3 PCR amplification
Reaction setup and thermal cycling were performed according to the NGM SElect Kit User's Guide. The PCR amplification reaction was prepared by combining 5 μL of Primer Mix with 10 μL of Master Mix, to which 10 μL of sample was added to give a total reaction volume of 25 μL. DNA Suspension Buffer (10 mM Tris–Cl pH 8.0 and 0.1 mM EDTA) (Teknova, Hollister, CA) was used as a sample diluent when necessary. Thermal cycling was performed on the GeneAmp® PCR System 9700 (Life Technologies) with a 96-well silver or gold-plated silver block. Standard thermal cycling conditions used the 9600 emulation mode setting on the 9700 thermal cycler, and consisted of enzyme activation at 95 °C for 11 min, followed by 29 or 30 cycles of denaturation at 94 °C for 20 s and annealing/extension at 59 °C for 3 min. A final extension step was performed at 60 °C for 10 min, followed by a final hold at 4 °C if the PCR products were to remain in the thermal cycler for an extended time.
The standard amounts of human genomic DNA per PCR were 1.0 ng for 29 cycles of amplification and 0.5 ng for 30 cycles of amplification. Thermal cycling was performed in MicroAmp® Optical 96-well plates covered with either MicroAmp® 8-cap strips or MicroAmp® Clear Adhesive Film (when using the Clear Adhesive Film, the accessory compression pad was used to ensure a tight seal around reaction wells).
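The reaction setup and cycling parameters above lend themselves to a quick arithmetic check. The following is a minimal Python sketch, not part of the kit documentation; the names `CyclingProtocol`, `programmed_hold_min`, `reaction_volume_ul` and `sample_concentration_ng_per_ul` are hypothetical, and the computed time covers the programmed hold steps only (ramp times and the optional 4 °C hold are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProtocol:
    """Programmed hold times taken from the text; ramping is ignored."""
    cycles: int = 29                  # 29 cycles for 1.0 ng, 30 cycles for 0.5 ng
    activation_s: int = 11 * 60       # enzyme activation at 95 °C
    denature_s: int = 20              # denaturation at 94 °C, per cycle
    anneal_extend_s: int = 3 * 60     # annealing/extension at 59 °C, per cycle
    final_extension_s: int = 10 * 60  # final extension at 60 °C

    def programmed_hold_min(self) -> float:
        """Sum of all programmed hold steps, in minutes."""
        per_cycle_s = self.denature_s + self.anneal_extend_s
        total_s = (self.activation_s
                   + self.cycles * per_cycle_s
                   + self.final_extension_s)
        return total_s / 60


def reaction_volume_ul(primer_mix: float = 5.0, master_mix: float = 10.0,
                       sample: float = 10.0) -> float:
    """Total PCR volume from the three components described in the text."""
    return primer_mix + master_mix + sample


def sample_concentration_ng_per_ul(input_ng: float,
                                   sample_ul: float = 10.0) -> float:
    """DNA concentration needed so that sample_ul delivers input_ng."""
    return input_ng / sample_ul
```

Under these assumptions, a 1.0 ng input delivered in 10 μL corresponds to a working concentration of 0.1 ng/μL, and a 30-cycle run has 121 min of programmed hold time.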
Several validation experiments, such as the sensitivity, inhibition and degradation experiments, were performed in parallel with the NGM SElect Kit, the NGM Kit and the AmpFℓSTR® SEfiler Plus™ Kit. The same samples were added to PCRs at the same volume (10 μL) in a total reaction volume of 25 μL. The NGM SElect and NGM kits used the same thermal cycling protocol, and amplification was done for both 29 and 30 cycles. The SEfiler Plus Kit used the thermal cycling parameters specified in its user guide for 30 amplification cycles.
2.4 Capillary electrophoresis and data analysis
Following thermal cycling, reactions were prepared for capillary electrophoresis (CE) by combining 9.5 μL of Hi-Di™ Formamide, 0.5 μL of GeneScan™-600 LIZ® size standard v2.0 (both from Life Technologies) and 1.0 μL of sample. Samples were heat-denatured for 3 min in a 9700 thermal cycler set to 95 °C, then immediately quenched on ice.
The bulk of the experiments were performed on the 3130xl (16-capillary) genetic analyzer (Life Technologies). Verification and concordance experiments were performed on the 3130 (4-capillary), 3100 (16-capillary), 3500 (8-capillary) and 3500xL (24-capillary) Genetic Analyzers (Life Technologies) using the specified G5 variable binning modules. All instruments used 36 cm capillary arrays. Standard run conditions on the 3130xl Genetic Analyzer involved the following parameters: sample injection for 10 s at 3 kV and electrophoresis at 15 kV for 1500 s in Performance Optimized Polymer (POP-4® polymer) with a run temperature of 60 °C as indicated in the HIDFragmentAnalysis36_POP4_1 module. Some run parameters, e.g. injection conditions, were different on other CE instruments such as the 3500 or the 4-capillary 3130. The different instruments used various versions of Data Collection software: DC v2.0 for the 3100, DC v3.0 for the 3130 series instruments and 3500 Data Collection Software v.1.0 for the 3500xL instrument.
The electrophoresis results were analyzed using GeneMapper® ID-X v1.2 genotyping software (Life Technologies), using the analysis settings specified in the NGM SElect Kit User's Guide. A peak amplitude of 50 RFU (relative fluorescence units) was used as the peak detection threshold when analyzing data from all CE instruments except the 3500 and 3500xL. Run files from the 3500-series instruments required an increase of the peak amplitude threshold from 50 RFU to 175 RFU to adjust for their different peak height scale.
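The instrument-dependent detection threshold can be sketched as a simple lookup. This is an illustrative Python sketch only and does not reproduce GeneMapper ID-X behavior; the names `PEAK_THRESHOLD_RFU` and `detected_peaks` are hypothetical, and treating a peak exactly at the threshold as detected is an assumption.

```python
# Peak detection thresholds (RFU) per CE instrument, as stated in the text.
PEAK_THRESHOLD_RFU = {
    "3100": 50,
    "3130": 50,
    "3130xl": 50,
    "3500": 175,
    "3500xL": 175,
}


def detected_peaks(peak_heights_rfu, instrument):
    """Keep peaks at or above the instrument's threshold.

    Assumption: a peak exactly at the threshold counts as detected.
    """
    threshold = PEAK_THRESHOLD_RFU[instrument]
    return [h for h in peak_heights_rfu if h >= threshold]
```

For example, the hypothetical peak heights `[40, 50, 120, 180]` would yield three detected peaks on a 3130xl but only one on a 3500xL.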
2.5 Electrophoresis sizing precision, accuracy and stutter calculations
The sizing accuracy and precision of STR allele peaks run on the 3130xl instrument were assessed in two ways:
(a) The NGM SElect Allelic Ladder was injected in all 16 capillaries of a 3130xl instrument for five successive runs (80 individual capillary injections in all) to calculate the sizing precision. The data were analyzed with GeneMapper® ID-X software, and the mean size (nt) and standard deviation for each allele peak in the ladder were calculated.
(b) Forty-two different human genomic DNA population samples were amplified with the NGM SElect Kit, and CE was performed on the 3130xl instrument along with the NGM SElect Allelic Ladder to calculate the accuracy of sizing. The sizing deviation of sample alleles from the corresponding allelic ladder alleles was calculated.
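The two sizing calculations in (a) and (b) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the analysis pipeline used in the study: the function names are hypothetical, alleles are keyed by an arbitrary label, and the sample standard deviation (n − 1 denominator) is assumed because the text does not say which estimator was used.

```python
from statistics import mean, stdev


def sizing_precision(sizes_nt):
    """Mean and sample SD of one ladder allele's size across injections."""
    return mean(sizes_nt), stdev(sizes_nt)


def sizing_deviations(sample_sizes_nt, ladder_sizes_nt):
    """Signed deviation (nt) of each sample allele from its ladder allele.

    Both arguments map an allele label to a size in nucleotides.
    """
    return {
        allele: size - ladder_sizes_nt[allele]
        for allele, size in sample_sizes_nt.items()
    }
```

A call such as `sizing_precision([100.0, 100.1, 99.9])` returns a mean near 100.0 nt and a standard deviation near 0.1 nt; in the study, each ladder allele would contribute 80 such size values.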
The proportion of the stutter product relative to the main allele (percent stutter) was measured by dividing the height of the stutter peak by the height of the associated allele peak. Peak heights were measured from profiles of 1080 human population sample genomic DNAs. These measurements were then used to determine stutter filter recommendations to use in genotyping analyses of CE profiles.
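The percent stutter definition above reduces to a single ratio. A minimal Python sketch, with the hypothetical function name `percent_stutter` and invented peak heights in the example:

```python
def percent_stutter(stutter_height_rfu, allele_height_rfu):
    """Stutter peak height as a percentage of the associated allele peak."""
    if allele_height_rfu <= 0:
        raise ValueError("allele peak height must be positive")
    return 100.0 * stutter_height_rfu / allele_height_rfu
```

For instance, a 90 RFU stutter peak next to a 1000 RFU allele peak corresponds to 9% stutter.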
2.6 Sensitivity
Male 007 and female 9947A human genomic DNA stock solutions were serially diluted in two-fold increments to give the following amounts when added at 10 μL per PCR: 1 ng, 500 pg, 250 pg, 125 pg, 62.5 pg and 31.25 pg total per reaction. Four replicate PCRs were prepared for the NGM SElect Kit and also for comparison to the NGM and SEfiler Plus kits. PCR amplification was performed for both 29 and 30 cycles for the NGM SElect and NGM kits, and for 30 cycles for the SEfiler Plus Kit.
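The dilution series can be reproduced with a short calculation. This Python sketch is illustrative only; the function names are hypothetical, and the per-microliter concentrations simply follow from the stated 10 μL sample volume.

```python
def twofold_dilution_series(top_pg=1000.0, levels=6):
    """DNA amount per reaction (pg) at each two-fold dilution level."""
    return [top_pg / (2 ** i) for i in range(levels)]


def stock_concentration_pg_per_ul(amount_pg, sample_volume_ul=10.0):
    """Concentration needed so that sample_volume_ul delivers amount_pg."""
    return amount_pg / sample_volume_ul
```

`twofold_dilution_series()` returns `[1000.0, 500.0, 250.0, 125.0, 62.5, 31.25]`, matching the amounts listed above, and the lowest level corresponds to 3.125 pg/μL in the diluted sample.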
The NGM SElect Kit and comparison kit results for the sensitivity study dilution series were assessed for the number of alleles detected per reaction as well as allele peak heights.
The results of 1 ng amplifications were used to calculate peak height ratios for intracolor balance and heterozygote balance. Intracolor balance was calculated within each group of loci sharing the same dye (e.g. the 4 loci labeled with 6FAM in the “blue” dye group) by first calculating the average peak height within each locus, then dividing the lowest locus average peak height by the highest locus average peak height, expressed as a percent.
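The intracolor balance calculation described above can be sketched in a few lines of Python. The function name `intracolor_balance` and the locus labels and peak heights in the example are hypothetical; only the calculation steps come from the text.

```python
from statistics import mean


def intracolor_balance(peak_heights_by_locus):
    """Lowest locus mean peak height over the highest, as a percent.

    peak_heights_by_locus maps each locus in one dye group to the list
    of its allele peak heights (RFU).
    """
    locus_means = [mean(h) for h in peak_heights_by_locus.values() if h]
    if not locus_means:
        raise ValueError("no peak heights supplied")
    return 100.0 * min(locus_means) / max(locus_means)
```

With invented heights `{"locus_a": [1000, 1000], "locus_b": [800, 600], "locus_c": [900, 900]}`, the locus means are 1000, 700 and 900 RFU, giving an intracolor balance of 70%.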
2.7 Concordance study on multiple CE instrument platforms
A subset of 42 human genomic DNA samples from the set of population samples was amplified with the NGM SElect Kit and analyzed on several different capillary electrophoresis instruments to check for genotype concordance. PCR products were run on the 3130xl (16-capillary), 3130 (4-capillary), 3100 (16-capillary) and 3500xL instruments. Allele calls for all samples on all instruments were made using GeneMapper ID-X v1.2.
2.8 Inhibited and degraded model samples
The robustness of the NGM SElect Kit was challenged using model systems to simulate the kind of compromised sample types that may be encountered in forensic casework analysis. Two models of PCR inhibition, hematin [“Identification of the heme compound copurified with deoxyribonucleic acid (DNA) from bloodstains, a major inhibitor of polymerase chain reaction (PCR) amplification”] and humic acid, were used to represent inhibitory compounds from degraded blood and soil, respectively. Hematin and humic acid were obtained in powder form from Sigma-Aldrich (St. Louis, MO), and stock solutions were made in 0.1 N NaOH (hematin) or water (humic acid). PCR samples were prepared with 1.0 ng of male 007 DNA, containing a range of inhibitor concentrations: 0 μM, 50 μM, 150 μM and 300 μM of hematin, and 0 ng/μL, 25 ng/μL, 50 ng/μL and 100 ng/μL of humic acid (each condition was tested in triplicate PCRs).
Degraded DNA was prepared by first subjecting Raji human male genomic DNA (purified from cell culture) to sonication to randomly shear the DNA strands, followed by treatment with different levels of DNase I enzyme (Ambion Inc., Austin, TX) to further degrade the DNA. The reduction in the size of DNA fragments with increasing times of sonication and increasing levels of DNase I was confirmed by electrophoresis on a 4% agarose gel. The extent of degradation was proportional to the units of DNase I that were added, with the “0 u” sample being an untreated control, and the 3 u, 4 u, 5 u and 6 u samples being progressively more degraded. Each degraded DNA sample was tested in triplicate PCRs.
The inhibited and degraded sample testing was performed in parallel with the NGM SElect, NGM and SEfiler Plus kits. NGM SElect and NGM kits were tested with amplification for both 29 and 30 cycles (30 cycle amplification was done only for degraded DNA tests), while the SEfiler Plus Kit was amplified for 30 cycles only, according to its standard protocol. Results were assessed for the number of alleles detected per profile.
2.9 DNA mixture study
Forensic casework analysis often encounters mixture samples that contain the DNA of more than one individual. To test the performance of the NGM SElect Kit with DNA mixture samples, a series of mixtures of two different human genomic DNAs was made in different mixing ratios. The major contributor DNA (male sample IB 0079) was always present in equal or greater amounts than the minor contributor DNA (female sample IB1060). The DNAs were chosen to have relatively few of the same STR alleles to facilitate data analysis. Mixtures were formulated to always give 1 ng of total DNA per 10 μL volume, with relative ratios of minor:major DNAs being 1:1, 1:3, 1:7, 1:10 and 1:15. For example, the 1:7 mixture contained 125 pg minor and 875 pg major DNA per reaction. Each mixture sample was tested in replicates of four PCRs with the NGM SElect Kit and amplified for both 29 and 30 cycles.
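The per-contributor DNA amounts follow directly from the mixing ratio and the fixed 1 ng total. A minimal Python sketch, with the hypothetical function name `mixture_amounts_pg`:

```python
def mixture_amounts_pg(minor_parts, major_parts, total_pg=1000.0):
    """Split a fixed total DNA input between minor and major contributors."""
    whole = minor_parts + major_parts
    minor_pg = total_pg * minor_parts / whole
    major_pg = total_pg * major_parts / whole
    return minor_pg, major_pg
```

`mixture_amounts_pg(1, 7)` returns `(125.0, 875.0)`, matching the example in the text. By the same arithmetic, the 1:10 and 1:15 mixtures contain roughly 91 pg and 62.5 pg of minor contributor DNA, respectively.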
2.10 Population study
A population study of over 1000 human genomic DNAs was performed to determine allele frequencies, stutter ratios and heterozygote ratios, and to obtain concordance information with other STR kits. The population sample set contained approximately equal numbers of individuals of African-American, Caucasian and Hispanic ethnicity. Additional population samples of Asian (Korean) ethnicity were also tested to verify the accurate genotyping of individuals with a rare, population-specific primer site mutation at the D2S441 locus.
3. Results
3.1 Thermal cycling studies
A representative NGM SElect Kit profile generated by using 1 ng of control DNA 007 amplified for 29 cycles is shown in Fig. 1.
Fig. 1. Representative NGM SElect Kit electropherogram for 1.0 ng of 007 human genomic DNA, amplified for 29 cycles. Panels (top to bottom) show the 6-FAM™, VIC®, NED™ and PET® dye channels. Green panel headings indicate the name and marker range for each locus, while the genotype is shown with the allele number displayed underneath each peak.
Thermal cycling developmental validation studies for the NGM SElect Kit used a guard band study approach in which the effects of incremental changes in each parameter were examined. Results were assessed for basic performance criteria such as peak height, allele counts per profile and intracolor balance. The cycle number experiment employed replicate assay plates amplified for 27, 28, 29, 30 and 31 cycles (29 cycles is the recommended condition for 1.0 ng of DNA). Experimental results showed that full allele profiles were obtained at all cycle numbers, with mean allele peak heights on the 3130xl instrument for 007 DNA (heterozygous for all NGM SElect Kit loci) ranging from 718 RFU (SD = 109) for 27 cycles to 6221 RFU (SD = 1660) for 31 cycles. Off-scale peaks occurred for all of the DNA samples at 30 and 31 cycles, and the 31-cycle data contained some off-scale allele peaks with artificially truncated peak heights, as well as several split peaks that likely resulted from incomplete non-template adenylation (“minus-A”). The 31-cycle split-peak phenomenon generated a cluster of data points with disproportionately low apparent peak heights of less than 4500 RFU. The graph in Fig. 2 shows the mean peak heights for 007 DNA amplified for different cycle numbers.
Fig. 2. NGM SElect Kit average peak height per dye channel vs. number of cycles for PCRs with 1 ng of 007 genomic DNA (grouped by dye color). Individual data points (extracted from N = 3 replicate PCRs per cycle number) show the mean allele peak height of loci sharing the same dye color, for a range of PCR cycles from 27 to 31. Data for 31 cycles contained some off-scale allele peaks with artificially truncated peak heights, as well as split allele peaks due to incomplete adenylation (“minus-A”) that generated the separate cluster of data points with peak heights below 4500 RFU. Dye colors are FAM (“F”), VIC (“V”), NED (“N”) and PET (“P”).
The anneal-extend step temperature was tested in increments of 2 °C: 55 °C, 57 °C, 59 °C (recommended), 61 °C and 63 °C. Overall mean allele peak heights for profiles were relatively unaffected by different test temperatures within the range of 55 °C to 59 °C, but low peak heights were observed for two specific loci at the lowest temperature: D21S11 (in the VIC-dye, or green channel) and D2S441 (in the PET-dye, or red channel). Overall peak heights began to drop at 61 °C and showed a marked decrease at 63 °C, where the mean peak height was 940 RFU (SD = 907), less than half the peak height observed for the recommended anneal-extend temperature of 59 °C (data were collected on the 3130xl instrument). Incomplete profiles with multiple allele dropouts were seen for all replicates at 63 °C. Fig. 3 shows example electropherograms of 007 DNA from this experiment. Intracolor balance (calculated as described in Section 2.6) was best at the recommended 59 °C setting, with FAM, VIC, NED and PET (i.e. “blue”, “green”, “yellow” and “red”) dye channels yielding 81%, 90%, 73% and 71% balance, respectively. Intracolor balance fell below the 40% level in the VIC (“green”) and PET (“red”) dye channels at 55 °C, and in the FAM (“blue”) and PET (“red”) dye channels at 61 °C. Intracolor balance could not be calculated at 63 °C because alleles in each dye group failed to amplify.
Fig. 3. Representative electropherograms from the PCR anneal-extend temperature study. 1 ng of 007 DNA was amplified for 29 cycles at the indicated temperatures.
The temperature of the 20 s denaturation step, tested at 92.5 °C, 94 °C (recommended setting) and 95.5 °C, had no statistically significant effects on kit performance, with overall peak heights and intracolor balance ratios remaining at optimal levels (data not shown).
One property of Taq polymerase is its tendency to add an extra, non-template nucleotide (usually adenosine) at the 3′ ends of DNA strands during thermal cycling. The final 60 °C hold is important to ensure complete terminal nucleotide addition, producing uniform allele sizes and good peak morphology. The developmental validation study tested thermal cycling final hold times of 0, 10, 20, 30 and 40 min (recommended hold time is 10 min). Allele peaks were examined for incomplete terminal nucleotide addition, which was the most likely consequence of insufficient hold times. Most loci showed normal peak morphology under all conditions, but FGA and D12S391 had small shoulder peaks for the 0-min-hold PCRs attributed to incomplete terminal nucleotide addition (Fig. 4).
Fig. 4. Effect of shortening the final extension time after normal thermal cycling for 29 cycles. Eliminating the normal 10 min final hold (top panel, “0 min”) resulted in incomplete terminal nucleotide addition, evident as split peaks, at some loci. The black arrows in the top panel show the minus-A shoulder peaks for the D12S391 and FGA loci due to incomplete 3′ adenylation.
3.2 Species specificity
Several non-human genomic DNAs, mostly those of common domestic species, were tested for cross-reactivity with the NGM SElect Kit. Results with the primate species showed expected cross-reactivity when tested at 1 ng DNA per PCR, which was extensive for both chimpanzee and gorilla and relatively minimal for macaque. Other species, which were tested at 10 ng per PCR, showed no reproducible peaks over the 50 RFU threshold except for horse, which had a reproducible VIC-labeled peak with a mobility of 96 nucleotides (nt), most likely associated with amelogenin. Fig. 5 shows several representative electropherogram profiles from selected non-human species.
Fig. 5. Representative electropherograms from a species specificity study. From top to bottom: chimpanzee and gorilla (1 ng per PCR); dog, cat and horse (10 ng per PCR); the bacterial pool sample; and a no-template control (NTC). The bacterial pool sample was added at an amount equivalent to approximately 100,000 genome copies per organism per PCR. The RFU scale (y-axis) was adjusted to 3500 RFU for primates and 100 RFU for all other samples. Data were collected using normal 29-cycle amplification.
3.3 Electrophoresis sizing and stutter calculations
The sizing precision studies using the allelic ladder revealed that the highest variability in sizing observed for any allele corresponded to a standard deviation (SD) of 0.074 nt in one set of injections for FGA allele 50.2, which was well below the target specification of 0.15 nt SD (data not shown).
Sizing accuracy between the allelic ladder alleles and the corresponding alleles in a set of 42 random human genomic DNA population samples revealed that the vast majority of alleles (>99.6%) were sized within ±0.25 nt of their corresponding allelic ladder peaks. A small number of data points fell outside of this range (restricted to the D2S1338, D3S1358 and SE33 loci); however, all data points fell well within the range of ±0.50 nt, which ensured that there was no danger of alleles being mis-typed due to insufficient sizing accuracy.
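The two sizing windows mentioned above can be checked with a small helper. This Python sketch is illustrative only: the function name `fraction_within` and the deviation values in the example are hypothetical, and the window is treated as inclusive by assumption.

```python
def fraction_within(deviations_nt, window_nt):
    """Fraction of sizing deviations with |deviation| <= window_nt."""
    if not deviations_nt:
        raise ValueError("no deviations supplied")
    inside = sum(1 for d in deviations_nt if abs(d) <= window_nt)
    return inside / len(deviations_nt)
```

With invented deviations `[0.1, -0.2, 0.3, 0.05]`, three of four values fall within ±0.25 nt (a fraction of 0.75) and all four fall within ±0.50 nt.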
Stutter peaks are a normal by-product of PCR amplification of STR loci. The process is thought to involve the slippage of the Taq polymerase enzyme when it encounters long stretches of highly repetitive template DNA. Stutter is usually more pronounced for shorter repeat motifs, so that a trinucleotide-repeat locus like D22S1045 will typically generate higher relative stutter peaks than the tetranucleotide-repeat loci in the NGM SElect Kit multiplex. D22S1045 not only displayed the highest level of minus-stutter (one repeat unit shorter than the main allele peak), it also had a significant level of plus-stutter (one repeat unit larger than the main allele peak), which could sometimes be detected as an allele peak if a plus-stutter filter were not used with the GeneMapper ID-X genotyping software. Complex STRs such as D1S1656 and SE33 contain blocks of both tetra- and di-nucleotide repeats and are subject not only to the expected minus-stutter (4 nt shorter) but also to a minus-2-nt stutter, which is typically much lower in peak height but may still be detected as an allele peak if dedicated locus-specific stutter filters are not used.
Locus-specific stutter filter settings for use with the genotyping software GeneMapper ID-X were determined from the results of a population study of 1080 individuals with the NGM SElect Kit. Stutter ranges for the 16 NGM SElect Kit STR loci are shown in Table 1, along with stutter filter settings that were calculated from the observed stutter ratio means and standard deviations for each locus. Minus-stutter filters were calculated for all 16 STR loci; additionally, D22S1045 has a plus-stutter filter, and D1S1656 and SE33 required a minus-2-nt stutter filter.
Table 1. Observed stutter for the NGM SElect Kit STR loci. Stutter relative peak heights were determined for genotype data from over 1000 individuals. Stutter values were expressed as the peak height ratio of stutter peaks to their corresponding main allele peaks, in percentage terms. Stutter values below the recommended filter are ignored by the GeneMapper ID-X genotyping software. The filter values (shown as decimal values rather than percent) were calculated using the formula of the mean stutter value plus three times the standard deviation. Data were generated using normal 29-cycle amplification.
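The filter formula in the Table 1 caption (mean stutter plus three standard deviations, expressed as a decimal) can be sketched as follows. This Python example is a minimal illustration with hypothetical names and invented stutter values; the use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
from statistics import mean, stdev


def stutter_filter(stutter_percents):
    """Stutter filter as a decimal: (mean + 3 * SD) / 100.

    stutter_percents holds the percent stutter values observed for one
    locus. Assumption: SD is the sample standard deviation.
    """
    return (mean(stutter_percents) + 3 * stdev(stutter_percents)) / 100


def passes_filter(stutter_height_rfu, allele_height_rfu, filter_value):
    """True if a peak exceeds the filter and would not be ignored as stutter."""
    return stutter_height_rfu / allele_height_rfu > filter_value
```

For the invented observations `[8.0, 10.0, 12.0]`, the mean is 10% and the sample SD is 2%, so the filter would be 0.16; a peak at 15% of the main allele height would then be filtered as stutter.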
3.4 Sensitivity
The sensitivity study was performed with the NGM, NGM SElect and SEfiler Plus kits. The results of the sensitivity study are summarized in Fig. 6. Full allele profiles were obtained for 007 and 9947A DNAs down to a level of 125 pg for all kits, below which stochastic dropout of alleles was observed. NGM SElect Kit mean allele counts for the lowest DNA level tested, 31.25 pg per reaction, were 24 alleles for 007 (out of 34 possible) and 24 alleles for 9947A (out of 30 possible). Whether they were amplified for 29 or 30 cycles, the NTC negative control PCRs for the NGM SElect and NGM kits were free of any reproducible artifact peaks over 50 RFU within the size range defined by the smallest and largest loci (72 nt for D10S1248 up to 440 nt for SE33).
Fig. 6. The sensitivity study was performed using dilution series of human genomic DNAs (007 and 9947A). The DNA dilutions were made to give between 1.0 ng and 31.25 pg of DNA per reaction in two-fold increments. The graph shows the percent of a full profile of 007 DNA (based on the mean allele count for three replicate PCRs at each dilution level) that was obtained for NGM SElect Kit PCRs amplified for 29 (blue bars) or 30 (red bars) cycles. A full profile for 007 DNA with the NGM SElect Kit contained 34 alleles total. Error bars show the standard deviation.
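The percent-of-full-profile metric plotted in Fig. 6 can be sketched as the mean replicate allele count divided by the full-profile allele count. This is a minimal Python illustration with a hypothetical function name; the replicate counts in the example are invented, except that a mean of 24 alleles out of 34 is the value reported for 007 DNA at 31.25 pg.

```python
from statistics import mean


def percent_full_profile(allele_counts, full_profile_alleles):
    """Mean allele count across replicates as a percent of a full profile."""
    if full_profile_alleles <= 0:
        raise ValueError("full profile allele count must be positive")
    return 100.0 * mean(allele_counts) / full_profile_alleles
```

A mean count of 24 alleles against a 34-allele full profile corresponds to roughly 70.6% of a full profile.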
3.5 Inhibited and degraded model samples
The robustness of the NGM SElect Kit was tested using model systems for degraded human genomic DNA and PCR inhibition.
The effect of DNA degradation on the amplification efficiency of the NGM SElect, NGM and SEfiler Plus kits is summarized in Table 2. All kits gave full profiles with all three replicates of the untreated “0 u” control, and for some (not all) replicates of the “3 u” sample. Full profiles were not obtained with the more highly degraded 4 u, 5 u or 6 u samples, whether the kit was amplified for 29 or 30 cycles. The electropherograms in Fig. 7 show a significant “ski slope” effect in the degraded samples, where smaller alleles were preferentially amplified relative to larger alleles; this effect became more pronounced with greater degrees of degradation. While allele detection for loci with larger amplicons (especially D2S1338, D18S51, FGA and SE33) dropped off significantly at higher degradation levels, the three “mini-STR” loci (D10S1248, D22S1045 and D2S441) and amelogenin produced mostly complete allele counts with even the most degraded “6 u” sample.
Table 2. Amplification efficiency of Raji DNA incubated with increasing doses of DNase I. Progressively degraded Raji DNA fractions (“Sample” column) are indicated by DNase I units. The untreated control is designated as “0 u”. “Count” columns are the mean number of alleles detected in three replicate PCRs with each kit/condition. “%Full” columns indicate the percentage of a full profile for the DNA that was represented by the mean allele count (the “0 u” untreated control always had full profiles). The NGM SElect and NGM kits were tested with both 29- and 30-cycle PCR amplification protocols, while the SEfiler Plus Kit was tested with its standard 30-cycle amplification protocol.
Fig. 7. Representative electropherograms from the artificially degraded DNA study. The performance of several kits was assessed using artificially degraded DNA samples treated with increasing concentrations of DNase I. Panels A, C and E show electropherograms for the untreated samples for the NGM SElect, NGM and SEfiler Plus kits, respectively. Panels B, D and F show electropherograms for the most degraded sample (6 u) with the same series of kits. Degraded DNA testing was performed with both 29 and 30 PCR cycles; the electropherograms shown were all amplified for 30 cycles.
Fig. 8 shows example electropherograms from the inhibition study, which used hematin and humic acid as model PCR inhibitors. Hematin and humic acid sample series were formulated to contain a range of inhibitor concentrations with a constant level of 007 DNA (1.0 ng total per 25 μL PCR). With their improved next-generation PCR chemistry, the NGM SElect and NGM kits were able to detect full allele profiles at all levels of inhibitor tested. In contrast, the earlier-generation SEfiler Plus Kit was able to amplify alleles at only the lowest levels of hematin and humic acid (50 μM and 25 ng/μL, respectively); no alleles were detected with this kit at higher levels of either inhibitor.
Fig. 8. The inhibited sample study tested kit performance with model samples prepared with humic acid or hematin PCR inhibitors. All samples contained 1.0 ng of 007 human genomic DNA with varying concentrations of PCR inhibitors. The NGM SElect and NGM kits used 29-cycle PCR amplification, while the SEfiler Plus Kit was amplified for 30 cycles according to its standard protocol. Electropherograms from uninhibited control samples, 100 ng/μL humic acid and 300 μM hematin, respectively, are shown in panels A–C for the NGM SElect Kit; D–F for the NGM Kit; and G–I for the SEfiler Plus Kit.
Table 3 shows the setup and allele count results for mixture sample analysis with the NGM SElect Kit. The minor contributor, BB-1060, had 18 unique alleles for the NGM SElect Kit (those that did not overlap with major contributor alleles or their stutter peaks). All of the unique alleles of the minor contributor were detected in all four replicate PCRs of the 1:1, 1:3 and 1:7 mixtures, whether the NGM SElect Kit was amplified for 29 or 30 cycles (the 1:7 mixture corresponded to 125 pg total of the minor contributor DNA per PCR).
Table 3. DNA mixture study. DNA mixtures were made between two human genomic DNAs. DNAs were always added so that their total amount was 1.0 ng per PCR. The minor allele count columns show the average minor contributor allele count (out of 18 total unique alleles) detected in 4 replicate PCRs with the NGM SElect Kit amplified for either 29 or 30 cycles.
Results for the higher mixing ratio samples (1:10 and 1:15) reflect the stochastic nature of amplifying low input DNA levels. Whereas all four 1:15 sample replicates that were amplified for 29 cycles gave full profiles, three replicates of the same sample that were amplified for 30 cycles each had 1–3 minor contributor alleles that were below the 50 RFU detection threshold. Likewise, the 29-cycle 1:15 sample PCRs detected more alleles than the 29-cycle 1:10 sample PCRs. In this case, since the 1:10 sample contained more minor contributor DNA than the 1:15 sample (91 pg vs. 63 pg), higher allele peaks and therefore more consistent detection of the minor contributor alleles of the 1:10 sample would have been expected.
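The minor contributor inputs quoted for the mixture study follow directly from the mixing ratio and the constant 1.0 ng total input. The short sketch below reproduces that arithmetic; the function name is a hypothetical helper introduced for illustration.

```python
def minor_contributor_pg(ratio_minor, ratio_major, total_ng=1.0):
    """Mass of minor contributor DNA (pg) in a minor:major mixture."""
    fraction = ratio_minor / (ratio_minor + ratio_major)
    return 1000.0 * total_ng * fraction


# Mixing ratios used in the mixture study, at 1.0 ng total DNA per PCR.
for major in (1, 3, 7, 10, 15):
    print(f"1:{major} -> {minor_contributor_pg(1, major):.1f} pg minor contributor")
```

The 1:7, 1:10 and 1:15 ratios give 125 pg, about 91 pg and 62.5 pg (quoted as 63 pg in the text), respectively.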
3.7 Population study and genotype concordance
A population study was performed using the NGM SElect Kit to amplify a set of over 1000 samples consisting of approximately equal numbers of African-American, Caucasian and Hispanic individuals. Genotyping results for the same sample set were concordant among the shared loci of the NGM SElect, NGM and SEfiler Plus kits with one exception: sample IB0668 was heterozygous for SE33 with the NGM SElect Kit, but one of the heterozygous alleles did not amplify with the SEfiler Plus Kit. This was likely a consequence of the kits having different primers for SE33. 100% concordance at all common loci was found when using the NIST Genotyping Standard SRM 2391b.
Allele frequency data from the population study were used to calculate the Probability of Identity (PI) for the NGM SElect Kit in three major population groups (African-American, Caucasian and Hispanic). Table 4 shows the PI values calculated for the NGM SElect, NGM and SEfiler Plus kits for the same set of population samples. Allele frequency distributions for the population groups are fully documented in the AmpFℓSTR NGM SElect Kit User Manual [
]. Population study results were also used to calculate the mean heterozygote peak height ratios for the NGM SElect Kit's 17 loci, which are shown in Table 5.
Table 4. Combined Probability of Identity (PI) values for the SGM Plus, SEfiler Plus, Identifiler, NGM and NGM SElect kit loci. Genotypes of human DNAs from 1080 individuals, with approximately equal representation among African American (344), U.S. Caucasian (346) and U.S. Hispanic (390) populations, were determined using the NGM SElect Kit as well as the SGM Plus, SEfiler Plus, Identifiler and NGM kits. Allele frequency results were used to calculate the Probability of Identity for each kit for the different population groups.
Table 5. Heterozygote peak height ratio calculations for all NGM SElect Kit loci obtained from genotyped population samples (DNA input was approximately 1 ng, with PCR amplification performed for 29 cycles). Only heterozygous allele peaks between 8000 and 12,000 RFU on a 3500xL genetic analyzer (the central range of peak height distribution) were included in the calculations.
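As an illustration of how a combined PI can be derived from allele frequencies, the sketch below uses the common definition of per-locus PI as the sum of squared genotype frequencies under Hardy-Weinberg equilibrium, multiplied across loci. This is an assumption about the method: the text does not state the exact formula used. The locus names and frequencies are hypothetical.

```python
from itertools import combinations
from math import prod


def locus_pi(freqs):
    """Per-locus probability of identity: sum of squared genotype frequencies."""
    homozygotes = sum(p ** 4 for p in freqs)  # (p_i^2)^2
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def combined_pi(loci):
    """Combined PI across loci, assuming the loci are independent."""
    return prod(locus_pi(freqs) for freqs in loci.values())


# Hypothetical allele frequencies for two illustrative loci (not kit data).
loci = {"LocusA": [0.5, 0.5], "LocusB": [0.25, 0.25, 0.25, 0.25]}
print(combined_pi(loci))
```

Because each added locus multiplies the combined value by a number below 1, a highly polymorphic locus with many alleles of low frequency lowers the combined PI substantially.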
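A mean heterozygote peak height ratio of this kind could be computed as sketched below. Two assumptions are made that the caption does not spell out: the ratio is taken as the lower peak height divided by the higher, and a heterozygote is included only when both of its peaks fall inside the 8000 to 12,000 RFU window. The peak heights are hypothetical.

```python
def peak_height_ratio(h1, h2):
    """Ratio of the lower to the higher peak of a heterozygote (assumed convention)."""
    return min(h1, h2) / max(h1, h2)


def mean_phr(pairs, low=8000, high=12000):
    """Mean ratio over heterozygotes whose peaks both lie within [low, high] RFU."""
    kept = [
        peak_height_ratio(a, b)
        for a, b in pairs
        if low <= a <= high and low <= b <= high
    ]
    return sum(kept) / len(kept) if kept else None


# Hypothetical peak height pairs (RFU) at one locus; the last pair is filtered out.
pairs = [(10000, 9000), (8000, 10000), (15000, 9000)]
print(mean_phr(pairs))
```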
A study of concordance between different capillary electrophoresis instrument platforms was performed using a set of 42 human genomic DNAs. The PCR products from the 42 samples were electrophoresed on 4 different CE instrument platforms (Materials and Methods) and full concordance was observed.
The inheritance of NGM SElect Kit STR alleles was traced through three CEPH Utah pedigree families: 1333, 1340 and 1345. Each CEPH family contained a set of genomic DNAs representing three generations of individuals: four grandparents, two parents and 7–9 children. Three intergenerational repeat-number changes were observed within the sets of CEPH families. Family 1333 had D12S391 allele 25 in a grandmother (1333–7341) change to allele 24 in her daughter (1333–6987). Family 1340 had D8S1179 allele 11 in a father (1340–7029) change to allele 12 in one of his sons (1340–7342). Family 1345 had a mutation at the SE33 locus that resulted in a change of allele 16 in a mother (1345–7348) to allele 15 in one of her sons (1345–7352). In all cases, the apparent mutation events resulted in either a gain or loss of a single repeat unit.
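The single-step nature of the three observed events can be checked with a few lines of arithmetic. The sketch below uses the parent and child allele designations reported above; the helper name is hypothetical, and simple subtraction applies only to whole-repeat alleles such as these, not to microvariant designations.

```python
def repeat_change(parent_allele, child_allele):
    """Signed change in repeat number from parent to child."""
    return child_allele - parent_allele


# (parent allele, child allele) for the three events reported in the CEPH families.
events = {
    "D12S391, family 1333": (25, 24),
    "D8S1179, family 1340": (11, 12),
    "SE33, family 1345": (16, 15),
}

for label, (parent, child) in events.items():
    step = repeat_change(parent, child)
    print(f"{label}: {step:+d} repeat")
```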
4. Discussion
The 17-plex NGM SElect Kit was designed to provide an equivalent kit to the 16-plex NGM Kit for laboratories wishing to include the SE33 locus in their forensic genotyping analyses. Because the placement of the SE33 locus within the pre-existing NGM Kit multiplex imposed constraints on the size range of SE33 alleles, a new primer set was developed that produces a larger amplicon than the SE33 primers for the earlier-generation AmpFℓSTR SEfiler™ and SEfiler Plus kits. The process of developing and validating SE33 primer sets led to the observation of sequence polymorphisms in certain variant alleles that could cause an undesirable mobility shift and resulting mis-typing of the alleles with an earlier set of prototype SE33 primers [
]. Significant effort was put into finding a set of SE33 primers that had maximum concordance with the primer sets of the SEfiler and SEfiler Plus kits, while also producing good amplification efficiency and consistent peak morphology.
Since the NGM Kit was first released commercially in early 2010, a subsequent population study identified certain rare, population-specific variant alleles of D2S441 and D22S1045 that were not detected by the earliest version of the kit (John Butler and Carolyn Hill, personal communication). Consequently, additional primers were added to the multiplex in a newer version of the kit to allow these rare alleles to be detected. The NGM SElect and NGM kits share identical primers for their common loci, including the newer variant-specific primers. The only difference between the two kits is the presence of SE33 primers in the NGM SElect Kit.
As specified in the SWGDAM guidelines, developmental validation studies examined many aspects of kit performance, such as sensitivity, species specificity, robustness, mixture studies and population studies. The studies done in parallel with the SEfiler Plus and NGM kits found performance improvements with the new kit relative to the previous-generation SE33-containing kit and equivalent performance to the NGM Kit.
The NGM SElect and NGM kits both contain the same pairs of syntenic loci: vWA and D12S391 on Chromosome 12, as well as D2S441 and D2S1338 on Chromosome 2. D2S441 and D2S1338 are distally located on opposite arms of Chromosome 2, but vWA and D12S391 are 6.2 million base pairs apart on Chromosome 12. While studies of NGM Kit population data have not shown any evidence of linkage disequilibrium between vWA and D12S391, some authors have cautioned that, due to their relatively close positions on Chromosome 12, the meiotic independence of the loci should not be assumed in kinship analysis. Nonetheless, the absence of any evidence of linkage disequilibrium at the population level should allow the product rule to be used for all STR markers in the kit for the purpose of calculating the rarity of DNA profiles [K.L. O’Connor, C.R. Hill, P.M. Vallone, J.M. Butler, Corrigendum to “Linkage disequilibrium analysis of D12S391 and vWA in U.S. population and paternity samples”, Forensic Sci. Int. Genet. (in press), http://dx.doi.org/10.1016/j.fsigen.2010.09.003]. The two kits contain identical primer sets among their 16 common loci, the same improved PCR chemistry and the same thermal cycling protocol and overall workflow to ensure equivalent performance and 100% concordance among their common loci. With the addition of the highly informative SE33 locus, the NGM SElect Kit offers even greater discriminatory power and the ability to better mesh with DNA databases in regions that have traditionally included SE33 in their standard set of loci.
Acknowledgments
The authors would like to thank Wilma Norona, Adam Broomer, and Michael Malicdem for technical assistance with this work.
References
Council of the European Union, Convention … on the Stepping up of Cross-border Cooperation, Particularly in Combating Terrorism, Cross-border Crime and Illegal Migration.
Identification of the heme compound copurified with deoxyribonucleic acid (DNA) from bloodstains, a major inhibitor of polymerase chain reaction (PCR) amplification.
K.L. O’Connor, C.R. Hill, P.M. Vallone, J.M. Butler, Corrigendum to “Linkage disequilibrium analysis of D12S391 and vWA in U.S. population and paternity samples” [Forensic Sci. Int. Genet. (in press), http://dx.doi.org/10.1016/j.fsigen.2010.09.003].