
A multipurpose panel of microhaplotypes for use with STR markers in casework

Open Access. Published: June 03, 2022. DOI: https://doi.org/10.1016/j.fsigen.2022.102729

      Highlights

      • A small panel of 24 highly informative microhaplotype loci has been identified.
      • The MH panel is good for individualization, ancestry inference, and mixture analysis.
      • The MHs are more informative than the 24 augmented CODIS STRs typed by CE.
      • The 24 MH panel is good enough to be a stand-alone panel for forensic casework.

      Abstract

      A small panel of highly informative loci that can be genotyped on the same equipment as the standard CODIS short tandem repeat (STR) markers has strong potential for application in forensic casework. Single nucleotide polymorphisms (SNPs) can be typed by a couple of methods on capillary electrophoresis (CE) machines and on sequencers, but the amount of information relative to the laboratory effort has hindered use of SNPs in actual casework. Insertion-deletion markers (InDels) suffer from similar problems. Microhaplotypes (MHs) are much more informative per locus but have similar technical difficulties unless they are typed by massively parallel sequencing (MPS). As forensic labs are acquiring sequencing machines, MHs become more likely to be used in casework, especially if multiplexed with STRs. Here we present the details of a multipurpose panel of 24 MHs with the highest effective number of alleles (Ae) from previous work. An augmented STR panel of 24 loci (20 CODIS markers plus four commonly typed STRs) is also considered. The Ae and ancestry informativeness (In) distributions of these two datasets are compared. The MH panel is shown to have better individualization and population distinction than the augmented CODIS STRs. We note that the 24 MHs should be better for mixture analyses than the STRs. Finally, we suggest that a commercial kit including both the standard CODIS markers and this set of 24 MHs would greatly improve the discrimination power over that of current commercial assays.


      1. Introduction

      Microhaplotypes (MHs) have been defined as small regions of DNA up to 300 bp composed of at least two SNPs and forming at least three haplotypes [
      • Kidd K.K.
      • Pakstis A.J.
      • Speed W.C.
      • Lagace R.
      • Chang J.
      • Wootton S.
      • Ihuegbu N.
      Microhaplotype loci are a powerful new type of forensic marker.
      ,
      • Kidd K.K.
      • Pakstis A.J.
      • Speed W.C.
      • Lagace R.
      • Chang J.
      • Wootton S.
      • Haigh E.
      • Kidd J.R.
      Current sequencing technology makes microhaplotypes a powerful new type of genetic marker for forensics.
      ]. Such short molecular regions with several SNPs within an amplicon can have very high heterozygosity when evaluated as haplotypes. Such regions are best typed by massively parallel sequencing (MPS). Several panels of microhaplotypes have been proposed and a few have been typed by MPS on several populations [
      • Phillips C.
      • McNevin D.
      • Kidd K.K.
      • Lagace R.
      • Wootton S.
      • de la Puente M.
      • Freire-Aradas A.
      • Mosquera-Miguel A.
      • Eduardoff M.
      • Gross T.
      • Dagostino L.
      • Power D.
      • Olson S.
      • Hashiyada M.
      • Oz C.
      • Parson W.
      • Schneider P.M.
      • Lareu M.V.
      • Daniel R.
      MAPlex – a massively parallel sequencing ancestry analysis multiplex for Asia-Pacific populations.
      ,
      • de la Puente M.
      • Phillips C.
      • Xavier C.
      • Amigo J.
      • Carracedo A.
      • Parson W.
      • Lareu M.V.
      Building a custom large-scale panel of novel microhaplotypes for forensic identification using MiSeq and Ion S5 massively parallel sequencing systems.
      ,
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ,
      • Oldoni F.
      • Bader D.
      • Fantinato C.
      • Wooton S.
      • Lagace R.
      • Kidd K.
      • Podini D.
      A sequence-based 74plex microhaplotype assay for analysis of forensic DNA mixtures.
      ,
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ,
      • Wu R.
      • Li H.
      • Li R.
      • Peng D.
      • Wang N.
      • Shen X.
      • Sun H.
      Identification and sequencing of 59 highly polymorphic microhaplotypes for analysis of DNA mixtures.
      ,
      • Zhao X.
      • Fan Y.
      • Zeye M.M.J.
      • He W.
      • Wen D.
      • Wang C.
      • Li J.
      • Hua Z.
      A novel set of short microhaplotypes based on non-binary SNPs for forensic challenging samples.
      ]. MHs have been shown to be powerful markers for ancestry prediction and mixture resolution [
      • Bennett L.
      • Oldoni F.
      • Long K.
      • Cisana S.
      • Maddela K.
      • Wootton S.
      • Chang J.
      • Hasegawa R.
      • Lagacé R.
      • Kidd K.K.
      • Podini D.
      Mixture deconvolution by massively parallel sequencing of microhaplotypes.
      ,
      • Oldoni F.
      • Castella V.
      • Grosjean F.
      • Hall D.
      Sensitive DIP-STR markers for the analysis of unbalanced mixtures from “touch” DNA samples.
      ] and can provide highly significant individualization. Presently, use of MHs in casework only occurs, if at all, subsequent to traditional short tandem repeat (STR) analysis. Microhaplotypes, as proposed in 2013 [
      • Kidd K.K.
      • Pakstis A.J.
      • Speed W.C.
      • Lagace R.
      • Chang J.
      • Wootton S.
      • Ihuegbu N.
      Microhaplotype loci are a powerful new type of forensic marker.
      ], can be incorporated into forensic casework in tandem with the standard 20 CODIS STR polymorphisms [
      • Butler J.M.
      ]. The MHs provide excellent mixture analysis, ancestry inference, and individualization, and the CODIS STRs provide nearly unique individualization as well as mixture interpretation when probabilistic genotyping is used.
      It is likely that the CODIS STR loci will remain the standard for forensic casework for a considerable period because of the large reference databases of criminal results for the CODIS markers and the financial burden of implementing a new technology in forensic laboratories. Any new marker routinely used in casework will need to be used in conjunction with the standard STR markers, and it should use the same PCR and analysis equipment. With current capillary electrophoresis (CE) methods the likelihood of incorporating MHs is low. However, as forensic laboratories begin to have MPS equipment and sequencing begins to be used for STR typing, the possibility of incorporating MHs becomes tenable. For MHs to be incorporated into routine casework it would be most efficient to multiplex STRs and MHs into a single assay. The ForenSeq DNA Signature Prep Kit (Verogen) provides a proof of principle that multiplexing STRs and SNP-based markers is possible. The question arises: which MHs should be incorporated?
      Our long-term objective has been to identify MHs worthy of converting to MPS [
      • Oldoni F.
      • Castella V.
      • Grosjean F.
      • Hall D.
      Sensitive DIP-STR markers for the analysis of unbalanced mixtures from “touch” DNA samples.
      ,
      • Kidd K.K.
      • Pakstis A.J.
      • Speed W.C.
      • Lagace R.
      • Wootton S.
      • Chang J.
      Selecting microhaplotypes optimized for different purposes.
      ]. The panel of 90 MHs [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ] is the result of those searches. Those 90 loci had excellent individualization and ancestry inference capabilities [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ]. To evaluate whether a set of highly informative loci might be appropriate for multiplexing with STRs in forensic casework applications, we selected 24 loci to match the number in the augmented STR set (Table 1 and Table S1). We could use larger numbers of microhaplotypes, but the goal was to have the same number of loci in the two sets of markers. This eliminates the statistical issues arising from using different numbers of markers when comparing forensic characteristics of different panels. We chose the MHs with the highest average effective number of alleles (Ae) values [
      • Kidd K.K.
      • Speed W.C.
      Criteria for selecting microhaplotypes: mixture detection and deconvolution.
      ] from among the 90 MH based on analyses of 79 populations [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ]. The selected 24 MHs are listed in Table 1. The SNPs in each of these MHs and the primer pairs used are in Gandotra et al. [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ] by MH name. Ideally the additional information provided by these 24 MHs would help in addressing all of the major forensic questions: individualization, ancestry inference, kinship analysis, and mixture resolution.
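      Table 1. The 24 microhaplotypes. Characteristics for the 79 population analysis.

      | MH name | Chr | Nucleotide position (GRCh37) | Molecular extent (bp) | #SNPs | Initial SNP | Allele 1 | Allele 2 |
      |---|---|---|---|---|---|---|---|
      | mh01KK-212 | 1 | 202,616,547 | 243 | 17 | rs16850184 | G | C,A |
      | mh02KK-014 | 2 | 228,524,072 | 239 | 16 | rs73084811 | G | A |
      | mh02KK-022 | 2 | 3,172,438 | 249 | 9 | rs13021132 | C | T |
      | mh02KK-029 | 2 | 69,138,957 | 236 | 14 | rs1036977184 | T | C |
      | mh02KK-134 | 2 | 161,079,411 | 104 | 8 | rs12469721 | A | T |
      | mh05KK-170 | 5 | 2,447,910 | 256 | 14 | rs116278333 | C | T |
      | mh07KK-009 | 7 | 18,861,121 | 182 | 16 | rs13244868 | C | G,A |
      | mh08KK-137 | 8 | 31,083,232 | 195 | 12 | rs35052638 | C | A,T |
      | mh09KK-145 | 9 | 4,763,309 | 218 | 9 | rs59428916 | G | A |
      | mh09KK-153 | 9 | 103,969,642 | 247 | 7 | rs62576887 | C | T |
      | mh10KK-162 | 10 | 3,160,652 | 266 | 13 | rs79563339 | C | G,del |
      | mh11KK-180 | 11 | 1,690,714 | 271 | 12 | rs377150512 | A | G |
      | mh11KK-183 | 11 | 20,020,042 | 217 | 12 | rs149151952 | T | C |
      | mh12KK-201 | 12 | 27,800,327 | 177 | 15 | rs11049080 | C | T |
      | mh13KK-213 | 13 | 23,765,409 | 216 | 11 | rs186130261 | A | G |
      | mh13KK-217 | 13 | 46,865,888 | 235 | 10 | rs76839632 | A | G |
      | mh13KK-218 | 13 | 54,060,710 | 263 | 7 | rs1450180563 | C | T |
      | mh13KK-221 | 13 | 101,759,088 | 253 | 12 | rs61973993 | T | C |
      | mh16KK-011 | 16 | 84,285,727 | 198 | 11 | novel | T | C |
      | mh16KK-259 | 16 | 83,973,819 | 248 | 14 | rs544552928 | G | A |
      | mh17KK-278 | 17 | 78,761,546 | 187 | 7 | rs4969266 | C | T |
      | mh20KK-306 | 20 | 895,313 | 219 | 7 | rs370342593 | C | T |
      | mh21KK-320 | 21 | 43,062,859 | 271 | 10 | rs2838081 | G | A |
      | mh22KK-340 | 22 | 49,060,976 | 261 | 11 | rs4925431 | A | G |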
      As an overview, Fig. 1 plots all 24 MHs by Ae and Informativeness (In) [
      • Rosenberg N.A.
      • Li L.M.
      • Ward R.
      • Pritchard J.K.
      Informativeness of genetic markers for inference of ancestry.
      ] calculated for the 79 populations in Pakstis et al. [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ]. The actual Ae and In values are given in Supplemental Table 1. The sequence amplicons used for these 24 MHs, as for all 90 MHs, are concentrated around the upper end of the size range of the CODIS STR amplicons (~100 bp to ~300 bp). However, these MH amplicons could be made smaller for many loci should that be an issue in optimizing the multiplex (Table S1). While most microhaplotypes published to date have relatively low Ae values (< 4), this selection shows that it is possible to find MHs that are highly heterozygous, with Ae values >> 4. Such loci are rare but do exist and can be found [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ,
      • Wu R.
      • Li H.
      • Li R.
      • Peng D.
      • Wang N.
      • Shen X.
      • Sun H.
      Identification and sequencing of 59 highly polymorphic microhaplotypes for analysis of DNA mixtures.
      ].
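      The two statistics plotted in Fig. 1 can be illustrated with a minimal sketch. It assumes the usual definitions: Ae as the reciprocal of the sum of squared allele (haplotype) frequencies in a population, and In as the informativeness statistic of Rosenberg et al. computed from allele frequencies in K populations. The function names and the toy frequencies are hypothetical and are not data from this study.

```python
import math

def effective_alleles(freqs):
    """Ae = 1 / sum(p_i^2) for one population's allele (haplotype) frequencies."""
    return 1.0 / sum(p * p for p in freqs)

def informativeness(pop_freqs):
    """Rosenberg-style In for one locus.

    pop_freqs is a list of K lists; each inner list holds the allele
    frequencies in one population, with alleles in the same order.
    Terms with zero frequency contribute nothing (0 * log 0 is taken as 0).
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    total = 0.0
    for j in range(n_alleles):
        mean_p = sum(pop[j] for pop in pop_freqs) / k
        if mean_p > 0:
            total -= mean_p * math.log(mean_p)
        for pop in pop_freqs:
            if pop[j] > 0:
                total += pop[j] * math.log(pop[j]) / k
    return total

# Toy frequencies for one hypothetical locus in two populations (invented values).
pop_a = [0.25, 0.25, 0.25, 0.25]
pop_b = [0.70, 0.10, 0.10, 0.10]

ae_values = [effective_alleles(p) for p in (pop_a, pop_b)]
mean_ae = sum(ae_values) / len(ae_values)    # average Ae across populations
locus_in = informativeness([pop_a, pop_b])   # In for this locus
```

      In this sketch a locus with four equally frequent haplotypes has Ae = 4, and In is zero when every population has identical frequencies, which is why a locus can score high on one statistic and low on the other.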
      Fig. 1. Scatterplot of the 24 microhaplotypes by their Ae and In values for 79 populations. The values are in Table S1. The values plotted are based on Pakstis et al.
      [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ]
      ; Table S1 also contains the values based on 30 populations
      [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ]
      .
      The choice of 24 MH loci for our comparison study was motivated by the availability of data on a global distribution of populations with genotypes for the 20 CODIS markers plus four other STR loci commonly studied. The STR allele frequencies were downloaded for the 57 available population samples from the popSTR [
      • Amigo J.
      • Phillips C.
      • Salas T.
      • Fernández Formoso F.L.
      • Carracedo A.
      • Lareu M.
      pop.STR – an online population frequency browser for established and new forensic STRs.
      ] online database (http://spsmart.cesga.es/about.php?dataset=strs_local) for the 20 Core CODIS STRs plus D6S1043, SE33, Penta D, and Penta E. The database is version 5.1.1 of SPSmart and the data were last updated in July of 2015 (dbSNP version 132). Some of the population samples downloaded lacked allele frequencies for various STRs. A final set of 32 populations had data on all 24 loci. We refer to these STRs as the augmented CODIS markers. For comparison to the MH data in Fig. 1, the Ae and In values of the 24 augmented CODIS markers are plotted in Fig. 2 and the values are in Table S2. Table S3 shows the chromosomal locations of the 24 STRs and the 24 MHs. Having selected these two sets of markers—24 MHs and 24 STRs—we proceed to document and compare their forensic characteristics. We show that the MH panel would provide valuable additional information if integrated into casework analyses; indeed, the 24 MHs are generally better than the 24 augmented CODIS markers.
      Fig. 2. Scatterplot of 24 augmented CODIS markers for Ae and In based on 32 populations. The same scale is used for Fig. 1 and Fig. 2 to allow better visual comparison. Because SE33 has such a high Ae value (average Ae = 14.69, In = 0.674), it is out of range and does not appear in this image.
      The 24 augmented CODIS markers have a noteworthy range of average Ae values (Fig. 2). As a test of the generality of this finding, we compared the popSTR average Ae values by locus with the individual locus values for the four U.S. Census datasets from NIST. The graph (Fig. S1), sorted by the augmented CODIS values for popSTR loci, shows that the values for the various loci are highly correlated across all of the data sources.

      2. Documentation of value for individualization

      The population specific combined random match probability (RMP) of these markers shows the same geographic gradient seen for other panels of microhaplotypes (Fig. 3) [
      • Kidd K.K.
      • Pakstis A.J.
      • Speed W.C.
      • Lagace R.
      • Wootton S.
      • Chang J.
      Selecting microhaplotypes optimized for different purposes.
      ,
      • Bulbul O.
      • Speed W.C.
      • Gurkan C.
      • Soundararajan U.
      • Rajeevan H.
      • Pakstis A.J.
      • Kidd K.K.
      Improving ancestry distinctions among Southwest Asian populations.
      ]. For this much smaller panel, 24 rather than the 90 analyzed by Pakstis et al. [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ], the RMP range necessarily involves much larger values: from 10^-24 in South American Indians down to 10^-39 for most African populations. This gradient reflects the most common genotype frequency, which ranges from 10^-17 down to 10^-38. Stated in words, the American Indians have the fewest different alleles and therefore the highest genotype frequencies, leading to the highest probability of a random match. All other populations show values intermediate between the larger values for the American Indian populations and the much smaller values in the African populations. The African populations have the most alleles, the lowest genotype frequencies, and the lowest probabilities of a random match.
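      The link between allele counts, genotype frequencies, and RMP can be sketched as follows. The text does not spell out the formula used, so this sketch assumes Hardy-Weinberg genotype proportions, a per-locus RMP equal to the sum of squared genotype frequencies, and independence among loci so that per-locus values multiply. The function names and the toy panel are hypothetical.

```python
import math
from itertools import combinations

def genotype_frequencies(freqs):
    """Hardy-Weinberg genotype frequencies for one locus."""
    homozygotes = [p * p for p in freqs]
    heterozygotes = [2 * p * q for p, q in combinations(freqs, 2)]
    return homozygotes + heterozygotes

def locus_rmp(freqs):
    """Probability that two random individuals share a genotype at one locus."""
    return sum(g * g for g in genotype_frequencies(freqs))

def combined_rmp(loci):
    """Product of per-locus RMPs, assuming the loci are independent."""
    return math.prod(locus_rmp(freqs) for freqs in loci)

def most_common_profile_frequency(loci):
    """Frequency of the multilocus profile built from each locus's commonest genotype."""
    return math.prod(max(genotype_frequencies(freqs)) for freqs in loci)

# Toy panel: 24 loci, each with six equally frequent haplotypes (invented values).
toy_panel = [[1 / 6] * 6 for _ in range(24)]
neg_log_rmp = -math.log10(combined_rmp(toy_panel))
neg_log_common = -math.log10(most_common_profile_frequency(toy_panel))
```

      Under these assumptions, populations with fewer and more unevenly distributed haplotypes give larger per-locus values, so their combined RMP is larger, which is the direction of the gradient described above.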
      Fig. 3. A negative logarithm plot of the population specific Random Match Probabilities and most common genotype frequencies for the 24 MH. Populations are in the same order as in Pakstis et al.
      [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ]
      and . For most of the world the RMP is around or less than 10^-30. See for the population names corresponding to the 3-character abbreviations.
      Fig. 4. A negative logarithm plot of the population specific RMP values for the augmented CODIS panel of 24 loci. (Data from popSTR downloaded January, 2022.) The values for this global set of populations fall around 10^-28 except for the Pacific and Native American populations. Note that the Dominicans are Afro-Caribbean and are “American” by geography, not ancestry.
      While these RMP values are larger than those for large panels of microhaplotypes such as in [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ] (RMP values for Africans as low as 10^-115), they are also a few orders of magnitude smaller than the range of the values for the augmented CODIS STR panel (Fig. S2 compares Fig. 3 with Fig. 4). The stand-alone values for these 24 MHs are at least as probative as the STRs for most of the world. The relevant RMP when MHs and STRs are multiplexed would be at most 10^-60. Thus, this small but highly selected panel has forensic value, exceeding the individualization value of the augmented CODIS panel. The information can be combined with the STR data if the markers are genetically independent. As seen in Supplemental Table S3, most pairs of markers are more than a megabase apart. The few that are closer are still far enough apart that no LD is expected.
      Kinship and parentage testing are related to the heterozygosity of the markers [
      • Butler J.M.
      Advanced Topics in Forensic DNA Typing: Interpretation.
      ]. The Ae value can be a measure of the statistical power for parentage testing. The higher the average Ae values, the higher the statistical power for clarifying more distant relationships [
      • Staadig A.
      • Tillmar A.
      Evaluation of microhaplotypes in forensic kinship analysis from a Swedish population perspective.
      ,
      • Wu R.
      • Chen H.
      • Li R.
      • Zang Y.
      • Shen X.
      • Hao B.
      • Wang Q.
      • Sun H.
      Pairwise kinship testing with microhaplotypes: can advancements be made in kinship inference with these markers?.
      ].

      3. Documentation of value for biogeographic ancestry

      Several methods are used to show the ability of a panel of ancestry informative markers (AIMs) to illuminate population relationships. STRUCTURE and Principal Components Analysis (PCA) are two that are commonly used. We have used both. We note that many previously published studies of SNPs on these population samples have demonstrated that they show no significant deviation from random mating expectations.

      3.1 STRUCTURE

      STRUCTURE v.2.3.4 [
      • Pritchard J.K.
      • Stephens M.
      • Donnelly P.
      Inference of population structure using multilocus genotype data.
      ] was used to evaluate and illustrate the clustering of individuals into a predefined number of groups of genetic similarity based on the set of 24 MH loci. The STRUCTURE analysis parameters were: 10,000 burn-in iterations and 10,000 Markov Chain Monte Carlo iterations, the admixture model, correlated allele frequencies, and 20 independent replicates for each predefined number of clusters (K) from K = 5 to K = 10. The input data file contained the genotypes of each individual, with no prior information on how the individuals were grouped into populations. Graphic output then grouped the individuals into their populations of origin with cluster inference indicated by color. The results for K = 8 and K = 9 are shown in Fig. 5.
      Fig. 5. STRUCTURE results for highest likelihood runs at K = 8 and 9 for the 24-microhaplotype, 79-population dataset. Each fine vertical line represents one individual. Blowups are shown for regions of several small populations to make the clustering and population labels clearer.
      The 79 populations in the STRUCTURE analysis (Fig. 5) are the same and in the same order as in Pakstis et al. [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ] and in Fig. 3 but here the 79 populations were analyzed using only the 24 MH selected for this study (Fig. 1; Table 1 and S1). The 24 MH STRUCTURE result in Fig. 5 is very similar to the corresponding figure of K = 6 and K = 7 in Pakstis et al. [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ] based on the whole panel of 90 MHs.

      3.2 Principal Components Analysis (PCA)

      PCA was performed with XLSTAT 2017 (http://www.xlstat.com/en/about-us/company.html) to compare the similarities and differences among the populations. The PCA of the 79 populations based on the 24 MH loci (Fig. 6) shows a complete separation on PC1 of the sub-Saharan African populations from the remaining populations and a distribution of those remaining on PC2 from a cluster of Europeans to a cluster of East Asians. This analysis and the STRUCTURE analysis document that the MH markers contain significant global ancestry information. Although these 24 MHs were selected for high Ae, they also have high In: the In range is from 0.25 to 0.86 (Fig. 1, Table S1).
      Fig. 6. PCA of 79 populations based on the 24 microhaplotypes.
      The 24 augmented CODIS markers have not been the subject of any STRUCTURE analyses or PCA that we are aware of. The popSTR dataset does not contain the individual-specific genotype profiles that would allow STRUCTURE analysis of the populations. However, other statistical approaches have shown that they can provide some ancestry information [
      • Pritchard J.K.
      • Stephens M.
      • Donnelly P.
      Inference of population structure using multilocus genotype data.
      ,
      • Algee-Hewitt B.F.B.
      • Edge M.D.
      • Kim J.
      • Li J.Z.
      • Rosenberg N.A.
      Individual identifiability predicts population identifiability in forensic microsatellite markers.
      ,
      • Alladio E.
      • Rocca C.D.
      • Barni F.
      • Dugoujon J.M.
      • Garofano P.
      • Semino O.
      • Berti A.
      • Novelletto A.
      • Vincenti M.
      • Cruciani F.
      A multivariate statistical approach for the estimation of the ethnic origin of unknown genetic profiles in forensic genetics.
      ]. We have used PCA on the population frequencies of the 24 STR loci (Fig. 7). The relationships seen are similar to those in Fig. 5 and Fig. 6. While the smaller set of populations with data for the STR PCA differs from the 79 used in Fig. 5 and Fig. 6, a global distribution of populations exists in both datasets. It is clear that combining the MH and STR data on a single set of populations would at minimum reinforce the major clusters and may clarify the relationships of many intermediate populations.
      Fig. 7. PCA of 32 populations based on the augmented CODIS data from popSTR database.

      4. Documentation of value for mixture analysis

      An area in which microhaplotypes can be especially informative is mixture resolution [
      • Bennett L.
      • Oldoni F.
      • Long K.
      • Cisana S.
      • Maddela K.
      • Wootton S.
      • Chang J.
      • Hasegawa R.
      • Lagacé R.
      • Kidd K.K.
      • Podini D.
      Mixture deconvolution by massively parallel sequencing of microhaplotypes.
      ,
      • Oldoni F.
      • Castella V.
      • Grosjean F.
      • Hall D.
      Sensitive DIP-STR markers for the analysis of unbalanced mixtures from “touch” DNA samples.
      ,
      • Oldoni F.
      • Podini D.
      Forensic molecular biomarkers for mixture analysis.
      ,
      • Coble M.D.
      • Bright J.-A.
      Probabilistic genotyping software: an overview.
      ]. High Ae is especially relevant to better mixture resolution. We note that mixture analyses have two different objectives. One is a forensic question of whether a known individual might or might not have contributed to a mixture. While such probabilistic genotyping [
      • Coble M.D.
      • Bright J.-A.
      Probabilistic genotyping software: an overview.
      ] is now common with forensic STR loci, interpretation requires population allele frequencies and is complicated by stutter, especially when a minor contributor is in the stutter range of a major contributor to the mixture. Microhaplotypes have an advantage in the absence of stutter. Good population frequency data are becoming available for many microhaplotypes, including the 90 MHs that include these selected 24 [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ]. The second objective is the complete deconvolution of the mixture to estimate the genotypes contributing to the mixture. Again, absence of stutter helps resolve the alleles in a mixture but only quantitative data and allele frequency data can provide the additional conversion into genotypes of individuals. The combinatorics of multiple loci complicates knowing the multiple locus genotype even with perfect single locus deconvolution.
      The maximum amount of information about a two-person mixture occurs when there are four alleles (haplotypes) observed at a locus in the analysis of a mixture. Obviously, this cannot occur when the haplotyped locus has only two or three alleles in the population. If only two alleles (haplotypes) exist in the population, as is generally the case for individual SNPs, one can infer a mixture if there are very different quantitative values for the two alleles. If three alleles exist in the population, similar quantitative differences will allow some inference of genotypes although the existence of a mixture can be certain if three alleles are seen in an analysis [
      • Kidd K.K.
      • Pakstis A.J.
      • Speed W.C.
      • Lagace R.
      • Chang J.
      • Wootton S.
      • Haigh E.
      • Kidd J.R.
      Current sequencing technology makes microhaplotypes a powerful new type of genetic marker for forensics.
      ]. It is only possible to see four alleles in a two-person mixture if at least four haplotypes (alleles) exist in the population. The probability of fully resolving the mixture at a locus will be a function of the allele frequencies in the population.
      It is possible to estimate a probability of seeing three or four alleles at a locus as proof that a mixture exists if some simplifying assumptions are made. Actual probabilities are functions of the array of allele frequencies of the persons in the mixture. That is too complex to deal with other than by simulations; instead, we approximate each locus as having Ae equally frequent alleles. For simplicity we round each Ae value down to the next lower integer, which gives a minimum estimate of the probability of observing all four alleles in a two-person mixture (Table 2).
      Table 2. Probabilities of finding four different alleles in a mixture detection analysis for a two-person mixture. The Ae values are based on the observation for the 24 loci in Table S1 using the 30-population dataset sequenced in Gandotra et al.
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      or the 79-population dataset including some phase inferred haplotypes
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      .
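      | Ae interval | Probability of 4 alleles being different at one locus | Number of loci in interval for 30 populations | Probability 1 - (1 - prob)^n | Number of loci in interval for 79 populations | Probability 1 - (1 - prob)^n |
      |---|---|---|---|---|---|
      | 4 < 5 | 0.09375 | 0 | 0 | 3 | 0.256 |
      | 5 < 6 | 0.19200 | 11 | 0.904 | 11 | 0.904 |
      | 6 < 7 | 0.27778 | 4 | 0.728 | 3 | 0.623 |
      | 7 < 8 | 0.34985 | 0 | 0.0 | 3 | 0.725 |
      | 8 < 9 | 0.41016 | 5 | 0.929 | 2 | 0.652 |
      | > 9 | 0.46091 | 4 | 0.916 | 2 | 0.709 |
      | Cumulative Probability | | 24 | 0.9998 | 24 | 0.9992 |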
      For these distributions of the 24 MH panel we can calculate the probabilities of the various possibilities following the logic in [
      • Kidd K.K.
      • Speed W.C.
      Criteria for selecting microhaplotypes: mixture detection and deconvolution.
      ]. In this case we are calculating the probability of seeing 4 alleles for a two-person mixture given the number of alleles with the integer Ae value. Table 2 shows that whether we use the Ae values based on 30 populations [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ] or based on 79 populations [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ], the probability of seeing at least one locus of the 24 loci with 4 alleles is greater than 0.999. Note, this is a conservative estimate using the lower bound of each Ae interval; the true estimate using the exact Ae values would be higher. As the Ae increases, the number of combinations of four different alleles increases even as the allele frequencies become smaller. The result is an increasing probability of at least one locus having four different alleles in a two-person mixture.
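      The arithmetic behind Table 2 can be reproduced with a short sketch. It assumes that each locus behaves as if it had k equally frequent alleles, where k is the lower integer of its Ae interval, and that loci are independent; the per-locus probability that four sampled alleles are all different is then k(k-1)(k-2)(k-3)/k^4. The function names are hypothetical, and the locus counts are those listed in Table 2.

```python
import math

def prob_four_distinct(k):
    """P(four sampled alleles are all different) with k equally frequent alleles."""
    if k < 4:
        return 0.0
    return k * (k - 1) * (k - 2) * (k - 3) / k ** 4

def prob_at_least_one(p, n):
    """P(at least one of n independent loci shows four alleles): 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p) ** n

def cumulative_probability(loci_by_ae):
    """P(at least one locus in the whole panel shows four alleles)."""
    p_none = math.prod(
        (1.0 - prob_four_distinct(k)) ** n for k, n in loci_by_ae.items()
    )
    return 1.0 - p_none

# Lower integer of each Ae interval -> number of loci in that interval (Table 2).
loci_30_pops = {4: 0, 5: 11, 6: 4, 7: 0, 8: 5, 9: 4}
loci_79_pops = {4: 3, 5: 11, 6: 3, 7: 3, 8: 2, 9: 2}

panel_30 = cumulative_probability(loci_30_pops)
panel_79 = cumulative_probability(loci_79_pops)
```

      With these inputs both panel-level values exceed 0.999, in line with the cumulative row of Table 2; small differences in the last decimal place can arise from rounding of the intermediate values.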
      Results from actual mixture studies illustrate the value of the high Ae markers in this set of 24 MHs (Fig. 8). These examples are based on the SNPs originally used to define the loci (cf. ALFRED; https://alfred.med.yale.edu) and incorporated in the ThermoFisher software accompanying the 74-locus multiplex [
      • Oldoni F.
      • Bader D.
      • Fantinato C.
      • Wooton S.
      • Lagace R.
      • Kidd K.
      • Podini D.
      A sequence-based 74plex microhaplotype assay for analysis of forensic DNA mixtures.
      ]. In the actual sequencing, additional sites are seen [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ]. In the four mixture examples (Fig. 8) at least one locus (illustrated) allows an estimate of the minimum number of contributors in the mixture. In some cases reasonable quantitative considerations can help estimate the genotypes contributing to the actual mixture. Even if full deconvolution is not possible, a valid estimate of the minimum number of contributors is important for probabilistic genotyping [
      • Coble M.D.
      • Bright J.-A.
      Probabilistic genotyping software: an overview.
      ].
      Fig. 8. Examples of mixture results. The mixture ratios are given for each example and the read numbers for the haplotypes seen are plotted. The four-person results for mh13KK-218 indicate at least four persons contributed to the mixture. The four-person results for mh21KK-320 indicate at least three persons contributed. The three-person results for mh13KK-218 indicate at least three persons contributed. The three-person results for mh02KK-134 indicate at least three persons contributed to the mixture.
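      The minimum number of contributors follows from simple allele counting: each person carries at most two alleles at an autosomal locus, so the locus with the most distinct haplotypes sets the lower bound. The sketch below illustrates that rule only; the function name, locus labels, and haplotype strings are hypothetical and are not taken from Fig. 8, and the sketch ignores read-depth thresholds and sequencing artifacts.

```python
from math import ceil


def min_contributors(haplotypes_by_locus):
    """Lower bound on the number of contributors to a mixture.

    haplotypes_by_locus maps a locus name to the haplotypes observed
    at that locus. Each diploid contributor carries at most two
    alleles per locus, so the bound is ceil(max distinct alleles / 2).
    """
    max_alleles = max(len(set(h)) for h in haplotypes_by_locus.values())
    return ceil(max_alleles / 2)


# Hypothetical observations (not real data).
observed = {
    "locus_A": ["GCTA", "ACTA", "GTTA", "GCCA", "ACCG"],  # 5 distinct haplotypes
    "locus_B": ["TTG", "TCG", "CCG"],                     # 3 distinct haplotypes
}
print(min_contributors(observed))
```

      Five distinct haplotypes at a single locus cannot come from two people, so this hypothetical mixture has at least three contributors.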

      5. Discussion

      To date, most studies of MHs have assumed by default that MHs would be used independently of, or subsequent to, the ordinary casework analysis of a sample with forensic STR loci. A major focus has been on demonstrating value for ancestry inference (e.g., de la Puente et al. [
      • de la Puente M.
      • Ruiz-Ramirez J.
      • Ambroa-Conde A.
      • Xavier C.
      • Amigo J.
      • Casares de Cal M.
      • Gomez-Tato A.
      • Carracedo A.
      • Parson W.
      • Phillips C.
      • Lareu M.V.
      Broadening the applicability of a custom multi-platform panel of microhaplotypes: bio-geographical ancestry inference and expanded reference data.
      ], Zou et al. [
      • Zou X.
      • He G.
      • Liu J.
      • Jiang L.
      • Wang M.
      • Chen P.
      • Hou Y.
      • Wang Z.
      Screening and selection of 21 novel microhaplotype markers for ancestry inference in ten Chinese subpopulations.
      ]) or mixture deconvolution [
      • Oldoni F.
      • Bader D.
      • Fantinato C.
      • Wooton S.
      • Lagace R.
      • Kidd K.
      • Podini D.
      A sequence-based 74plex microhaplotype assay for analysis of forensic DNA mixtures.
      ,
      • Bennett L.
      • Oldoni F.
      • Long K.
      • Cisana S.
      • Maddela K.
      • Wootton S.
      • Chang J.
      • Hasegawa R.
      • Lagacé R.
      • Kidd K.K.
      • Podini D.
      Mixture deconvolution by massively parallel sequencing of microhaplotypes.
      ], two areas of weakness for the forensic STR loci. Our analyses are directed toward documenting that a small, selected set of MHs can address casework issues and supplement the CODIS markers. Actual use in casework is becoming possible as more and more labs are considering sequencing for casework analyses.
      We have compared the two sets of markers in Figs. 1 and 2 for their Ae and In values. We show that the 24 MHs we selected are better, on average, than the 24 augmented CODIS markers in terms of both Ae and In. All of the MHs have an Ae that exceeds 4.8 whereas 10 of the STR loci fall below that value. Only 8 of the MHs have an In value below 0.40 whereas only 3 of the augmented CODIS markers have an In value above 0.40.
      In Table S1 we show that the In and Ae values of the 24 MHs depend on the set of populations used to determine those values. The 30 populations are mostly 1000 Genomes [
      • 1000 Genomes Consortium Project
      • Auton A.
      • Brooks L.D.
      • Durbin R.M.
      • Garrison E.P.
      • Kang H.M.
      • Korbel J.O.
      • Marchini J.L.
      • McCarthy S.
      • McVean G.A.
      • Abecasis G.R.
      A global reference for human genetic variation.
      ] populations plus a few others [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ]. Inclusion of those additional populations may have increased the Ae relative to that based on the 1000 Genomes populations alone. The 79 populations include more East Asian and Native American populations, which generally have lower values of Ae but would contribute to a higher In.
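      The dependence of average Ae on the choice of populations can be illustrated with a short sketch. It uses the standard definition of the effective number of alleles, Ae = 1 / sum(p_i^2), where p_i are the allele frequencies at a locus in one population. The population names and frequencies below are hypothetical values chosen for easy arithmetic, not data from this study.

```python
def effective_alleles(freqs):
    """Effective number of alleles, Ae = 1 / sum(p_i^2), for one
    locus in one population. freqs must sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 / sum(p * p for p in freqs)


def mean_ae(freqs_by_population):
    """Average Ae of one locus across a set of populations."""
    values = [effective_alleles(f) for f in freqs_by_population.values()]
    return sum(values) / len(values)


# Hypothetical haplotype frequencies at one locus (not real data).
small_panel = {
    "pop_1": [0.25, 0.25, 0.25, 0.25],     # Ae = 4
    "pop_2": [0.2, 0.2, 0.2, 0.2, 0.2],    # Ae = 5
}
# Adding a less variable population lowers the average Ae.
larger_panel = dict(small_panel, pop_3=[0.5, 0.5])  # Ae = 2

print(mean_ae(small_panel))
print(mean_ae(larger_panel))
```

      In this toy case the same locus averages about 4.5 over two populations and about 3.7 once a less variable third population is added, which is why rankings by Ae are only comparable when the same set of populations is used.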
      The RMP values of the two marker sets illustrate several points. First, the reference populations for the STR loci constitute a poor global reference. Second, data for the MH reference populations show the large differences in allele frequencies also seen for several panels of SNPs. Although the STR RMPs show no large difference across Africa and Eurasia, the set of East Asian populations is not broad. One possible explanation for the absence of a difference is the higher mutation rate of the STRs compared to MHs, which would counter the loss of alleles by random genetic drift. Alternatively, the one East Asian population may just be an outlier.
      With the exception of three loci, the values of average Ae are > 5.0 for the 24 MH markers in this panel. To account for differences in sets of populations studied, an Ae of 4.5 seems to be a good working level for future selection of candidate high-Ae markers. Other studies on fewer populations have found some of these 24 MHs to have high Ae. Turchi et al. [
      • Turchi C.
      • Melchionda F.
      • Pesaresi M.
      • Tagliabracci A.
      Evaluation of a microhaplotypes panel for forensic genetics using massive parallel sequencing technology.
      ] found 14 MHs to have an Ae > 4.5, and five of these are among the 24. Pang et al. [
      • Pang J.-B.
      • Rao M.
      • Chen Q.-F.
      • Ji A.-Q.
      • Zhang C.
      • Kang K.-L.
      • Wu H.
      • Ye J.
      • Nie S.-J.
      • Wang L.
      A 124-plex microhaplotype panel based on next-generation sequencing developed for forensic applications.
      ] found 8 MHs with Ae > 4.5, four of which are among the 24. Both of those studies analyzed a subset of the 182 loci in [
      • Kidd K.K.
      • Pakstis A.J.
      • Speed W.C.
      • Lagace R.
      • Wootton S.
      • Chang J.
      Selecting microhaplotypes optimized for different purposes.
      ]. Thus, they are not independent of the present search.
      Several studies of other sets of populations have published markers that could fall within the range of Ae values for these 24 MHs, i.e., Ae > 4.5 [
      • de la Puente M.
      • Phillips C.
      • Xavier C.
      • Amigo J.
      • Carracedo A.
      • Parson W.
      • Lareu M.V.
      Building a custom large-scale panel of novel microhaplotypes for forensic identification using MiSeq and Ion S5 massively parallel sequencing systems.
      ,
      • Wu R.
      • Li H.
      • Li R.
      • Peng D.
      • Wang N.
      • Shen X.
      • Sun H.
      Identification and sequencing of 59 highly polymorphic microhaplotypes for analysis of DNA mixtures.
      ]. They should be evaluated in efforts toward a better panel. A problem with comparing studies for statistics like Ae is that different sets of populations have been used. The Ae rankings can differ depending on the population (Table S1). However, a large global panel should be adequate for identifying the markers with very high Ae. Also, while Ae and In are correlated theoretically [
      • Algee-Hewitt B.F.B.
      • Edge M.D.
      • Kim J.
      • Li J.Z.
      • Rosenberg N.A.
      Individual identifiability predicts population identifiability in forensic microsatellite markers.
      ], the correlation in our studies is weak at the lower levels of Ae. Considerations of In value may be relevant in decisions among individual MHs when the number of MHs is to be kept small. The 1000 Genomes dataset can be used for comparison but does not have a good representation of Native Americans.
      An advantage of the mMHseq methodology used for the 90 microhaplotypes [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ] and applicable to the 24 MHs in this study is that markers can easily be removed or added. We expect this initial set may be modified when more high Ae markers have been tested on a large global set of populations comparable to the 79 populations studied for these 24 MHs. A future research project will be to find more microhaplotypes that have an average Ae > 4.5 for a large set of populations. Given the lower levels of genetic variation in human populations located farther from Africa, MHs with high Ae will be less common and will require a focused search in those populations. It is especially important to find more markers that have higher Ae values for the East Asian, Pacific, and Native American populations. While the Pacific populations are small in a global context, the Native American and East Asian (including Chinese) populations are not. Fortunately, there are resources that will allow such searches.
      Wu et al. [
      • Wu R.
      • Li H.
      • Li R.
      • Peng D.
      • Wang N.
      • Shen X.
      • Sun H.
      Identification and sequencing of 59 highly polymorphic microhaplotypes for analysis of DNA mixtures.
      ] deliberately searched for MHs with Ae > 4 using the 1000 Genomes data. In the Chinese populations alone, they identified many with an average Ae > 5 and a few with an average Ae > 6. These loci should help balance the RMP for East Asians. Because the analyses [
      • Wu R.
      • Li H.
      • Li R.
      • Peng D.
      • Wang N.
      • Shen X.
      • Sun H.
      Identification and sequencing of 59 highly polymorphic microhaplotypes for analysis of DNA mixtures.
      ] were based on 1000 Genomes data and most of the 30 populations [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ] were in the 1000 Genomes database, comparing the Ae rankings of the two sets is a reasonable approximation. Sorting the two sets of MHs together by Ae shows that the top 24 MHs consist of 14 MHs from Gandotra et al. [
      • Gandotra N.
      • Speed W.C.
      • Qin W.
      • Tang Y.
      • Pakstis A.J.
      • Kidd K.K.
      • Scharfe C.
      Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
      ] and 10 from Wu et al. [
      • Wu R.
      • Li H.
      • Li R.
      • Peng D.
      • Wang N.
      • Shen X.
      • Sun H.
      Identification and sequencing of 59 highly polymorphic microhaplotypes for analysis of DNA mixtures.
      ] (Table S5). Given the different rankings when more populations are typed (cf. Table S1), this sorting is not final but an example of the need to compare markers using the same global set of populations. Some of the Wu markers might displace some of the top 24 Gandotra markers if identical reference populations were used. Resolution of an improved set comparable to these best 24 MHs by Ae will necessarily await more comparable population sets. Thus, the data suggest that an improved panel could be developed from MHs already identified. What is missing is a comparison based on studies of the same populations.
      Ultimately, a set of MHs needs to be agreed upon by the forensic community. Such agreement should enhance the development of a commercial panel, one that optimizes the multiplexing of the STRs and MHs. Software to separately interpret the different amplicons (those for STRs and those for MHs) from one sequencing run will need to be written, but software already exists for each type of sequence alone. Moreover, Verogen already markets a kit that multiplexes STRs and SNPs, providing a proof of principle that STR loci can be multiplexed with small amplicons containing SNP-like information.
      Microhaplotypes are often considered much less heterozygous than STRs, with one estimate that 86% of ~380 MHs had Ae values ranging from 2.0 to 4.0. MHs with Ae values > 5 were especially rare based on a review of 7 different publications on MHs [
      • Wu R.
      • Li H.
      • Li R.
      • Peng D.
      • Wang N.
      • Shen X.
      • Sun H.
      Identification and sequencing of 59 highly polymorphic microhaplotypes for analysis of DNA mixtures.
      ]. Many of those studies used at least the 1000 Genomes data, so there is a global, albeit imperfect, perspective. Our study shows that, while rare, MHs with high Ae values do exist in sufficient numbers for meaningful analyses.
      A very large MH array may be difficult to multiplex while preserving the depth of coverage needed to identify alleles of the minor contributors in mixtures. To avoid this potential problem we have focused on a smaller panel of size comparable to the augmented CODIS panel. Amplicons could be made smaller for many loci in future iterations of an MH panel should amplicon size be an issue in optimizing the multiplex. Even if these MHs are never multiplexed with the standard CODIS markers, this 24-MH panel is an excellent stand-alone panel for follow-up testing when information from STR analyses is insufficient in casework. Of course, the entire set of 90 MHs is an even better panel for forensic analysis if MH analyses by MPS are an independent follow-up to STR typing by CE.

      6. Conclusion and recommendation

      This panel of 24 microhaplotypes has been shown to be excellent, for its size, for individualization, for ancestry inference, and, in theory, for mixture analysis. It has the advantage of using the same sequencing analysis that is coming into use for the standard forensic STRs. We propose that a panel of markers for forensic casework be developed to include these MHs in addition to the CODIS markers. We recommend that the set of 24 microhaplotypes in this study be the initial addition to the new casework kit. We think that these 24 MH loci are adequately spaced among the CODIS markers to be statistically independent for forensic analyses. We have shown that the 24 MHs are very informative and add forensic value in individualization, ancestry inference, and mixture resolution. They are worth incorporating into a forensic casework panel. Indeed, as the database of MHs from casework accumulates, MHs will become sufficient to be a casework panel by themselves.

      Funding

      This work was funded in part by National Institute of Justice (NIJ) Grant 2018-75-CX-0041 awarded to KKK by the National Institute of Justice, Office of Justice Programs of the United States Department of Justice and by the United States National Institutes of Health Grant R01 HD102537 to CS and by NIJ Grant 2017-DN-BX-0164 to DP. Points of view in this presentation are those of the authors and do not necessarily represent the official position or policies of the U.S. Department of Justice.

      CRediT authorship contribution statement

      KKK and AJP designed the study, analyzed the data, and wrote the initial draft of the paper. All authors read the paper and helped edit the initial draft.

      Data Availability

      See Data availability section in Pakstis et al. [
      • Pakstis A.J.
      • Gandotra N.
      • Speed W.C.
      • Murtha Michael
      • Scharfe Curt
      • Kidd Kenneth K.
      The population genetics characteristics of a 90 locus panel of microhaplotypes.
      ].

      Declaration of Competing Interest

      None.

      Acknowledgments

      The authors thank Dr. Francoise R. Friedlaender for her expert help in formatting and labeling the STRUCTURE bar plots. Special thanks go to the many hundreds of individuals who volunteered to give blood or saliva samples for studies of gene frequency variation and to the many colleagues who helped collect the samples. Some cell lines were obtained from the National Laboratory for the Genetics of Israeli Populations at Tel Aviv University.

      Appendix A. Supplementary material

      References

        • Kidd K.K.
        • Pakstis A.J.
        • Speed W.C.
        • Lagace R.
        • Chang J.
        • Wootton S.
        • Ihuegbu N.
        Microhaplotype loci are a powerful new type of forensic marker.
        Forensic Sci. Int. Genet. Suppl. Ser. 2013; 4: e123-e124
        • Kidd K.K.
        • Pakstis A.J.
        • Speed W.C.
        • Lagace R.
        • Chang J.
        • Wootton S.
        • Haigh E.
        • Kidd J.R.
        Current sequencing technology makes microhaplotypes a powerful new type of genetic marker for forensics.
        Forensic Sci. Int. Genet. 2014; 12: 215-224. https://doi.org/10.1016/j.fsigen.2014.06.01
        • Phillips C.
        • McNevin D.
        • Kidd K.K.
        • Lagace R.
        • Wootton S.
        • de la Puente M.
        • Freire-Aradas A.
        • Mosquera-Miguel A.
        • Eduardoff M.
        • Gross T.
        • Dagostino L.
        • Power D.
        • Olson S.
        • Hashiyada M.
        • Oz C.
        • Parson W.
        • Schneider P.M.
        • Lareu M.V.
        • Daniel R.
        MAPlex – a massively parallel sequencing ancestry analysis multiplex for Asia-Pacific populations.
        Forensic Sci. Int. Genet. 2019; 42: 213-226. https://doi.org/10.1016/j.fsigen.2019.06.022
        • de la Puente M.
        • Phillips C.
        • Xavier C.
        • Amigo J.
        • Carracedo A.
        • Parson W.
        • Lareu M.V.
        Building a custom large-scale panel of novel microhaplotypes for forensic identification using MiSeq and Ion S5 massively parallel sequencing systems.
        Forensic Sci. Int. Genet. 2020; 45: 102213. https://doi.org/10.1016/j.fsigen.2019.102213
        • Gandotra N.
        • Speed W.C.
        • Qin W.
        • Tang Y.
        • Pakstis A.J.
        • Kidd K.K.
        • Scharfe C.
        Validation of novel forensic DNA markers using multiplex microhaplotype sequencing.
        Forensic Sci. Int. Genet. 2020; 47: 102275. https://doi.org/10.1016/j.fsigen.2020.102275
        • Oldoni F.
        • Bader D.
        • Fantinato C.
        • Wooton S.
        • Lagace R.
        • Kidd K.
        • Podini D.
        A sequence-based 74plex microhaplotype assay for analysis of forensic DNA mixtures.
        Forensic Sci. Int. Genet. 2020; 49: 102367. https://doi.org/10.1016/j.fsigen.2020.102367
        • Pakstis A.J.
        • Gandotra N.
        • Speed W.C.
        • Murtha Michael
        • Scharfe Curt
        • Kidd Kenneth K.
        The population genetics characteristics of a 90 locus panel of microhaplotypes.
        Hum. Genet. 2021; 140: 1753-1773. https://doi.org/10.1007/s00439-021-02382-0
        • Wu R.
        • Li H.
        • Li R.
        • Peng D.
        • Wang N.
        • Shen X.
        • Sun H.
        Identification and sequencing of 59 highly polymorphic microhaplotypes for analysis of DNA mixtures.
        Int. J. Leg. Med. 2021; 135: 1137-1149. https://doi.org/10.1007/s00414-020-02483-x
        • Zhao X.
        • Fan Y.
        • Zeye M.M.J.
        • He W.
        • Wen D.
        • Wang C.
        • Li J.
        • Hua Z.
        A novel set of short microhaplotypes based on non-binary SNPs for forensic challenging samples.
        Int. J. Leg. Med. 2022; 136: 43-53. https://doi.org/10.1007/s00414-021-02719-4
        • Bennett L.
        • Oldoni F.
        • Long K.
        • Cisana S.
        • Maddela K.
        • Wootton S.
        • Chang J.
        • Hasegawa R.
        • Lagacé R.
        • Kidd K.K.
        • Podini D.
        Mixture deconvolution by massively parallel sequencing of microhaplotypes.
        Int. J. Leg. Med. 2019; 133: 719-729. https://doi.org/10.1007/s00414-019-02010-7
        • Oldoni F.
        • Castella V.
        • Grosjean F.
        • Hall D.
        Sensitive DIP-STR markers for the analysis of unbalanced mixtures from “touch” DNA samples.
        Forensic Sci. Int. Genet. 2017; 28: 111-117. https://doi.org/10.1016/j.fsigen.2017.02.004
        • Butler J.M.
        Fundamentals of Forensic DNA Typing. first edition. Elsevier Science, 2009 (eBook ISBN: 9780080961767)
        • Kidd K.K.
        • Pakstis A.J.
        • Speed W.C.
        • Lagace R.
        • Wootton S.
        • Chang J.
        Selecting microhaplotypes optimized for different purposes.
        Electrophoresis. 2018; 39: 2815-2823
        • Kidd K.K.
        • Speed W.C.
        Criteria for selecting microhaplotypes: mixture detection and deconvolution.
        Invest. Genet. 2015; 6: 1. https://doi.org/10.1186/s13323-014-0018-3
        • Rosenberg N.A.
        • Li L.M.
        • Ward R.
        • Pritchard J.K.
        Informativeness of genetic markers for inference of ancestry.
        Am. J. Hum. Genet. 2003; 73: 1402-1422. https://doi.org/10.1086/380416
        • Amigo J.
        • Phillips C.
        • Salas T.
        • Fernández Formoso F.L.
        • Carracedo A.
        • Lareu M.
        pop.STR – an online population frequency browser for established and new forensic STRs.
        Forensic Sci. Int. Genet. 2009; Suppl. 2: 361-362. https://doi.org/10.1016/j.fsigss.2009.08.178
        • Bulbul O.
        • Speed W.C.
        • Gurkan C.
        • Soundararajan U.
        • Rajeevan H.
        • Pakstis A.J.
        • Kidd K.K.
        Improving ancestry distinctions among Southwest Asian populations.
        Forensic Sci. Int. Genet. 2018; 35: 703-711. https://doi.org/10.1007/s00414-017-1748-6
        • Butler J.M.
        Advanced Topics in Forensic DNA Typing: Interpretation.
        Academic Press, Oxford, 2015
        • Staadig A.
        • Tillmar A.
        Evaluation of microhaplotypes in forensic kinship analysis from a Swedish population perspective.
        Int. J. Leg. Med. 2021; 135: 1151-1260. https://doi.org/10.1007/s00414-021-02509-y
        • Wu R.
        • Chen H.
        • Li R.
        • Zang Y.
        • Shen X.
        • Hao B.
        • Wang Q.
        • Sun H.
        Pairwise kinship testing with microhaplotypes: can advancements be made in kinship inference with these markers?.
        Forensic Sci. Int. 2021; 325: 110875. https://doi.org/10.1016/j.forsciint.2021.110875
        • Pritchard J.K.
        • Stephens M.
        • Donnelly P.
        Inference of population structure using multilocus genotype data.
        Genetics. 2000; 155: 945-959
        • Algee-Hewitt B.F.B.
        • Edge M.D.
        • Kim J.
        • Li J.Z.
        • Rosenberg N.A.
        Individual identifiability predicts population identifiability in forensic microsatellite markers.
        Curr. Biol. 2016; 26: 935-942. https://doi.org/10.1016/j.cub.2016.01.065
        • Alladio E.
        • Rocca C.D.
        • Barni F.
        • Dugoujon J.M.
        • Garofano P.
        • Semino O.
        • Berti A.
        • Novelletto A.
        • Vincenti M.
        • Cruciani F.
        A multivariate statistical approach for the estimation of the ethnic origin of unknown genetic profiles in forensic genetics.
        Forensic Sci. Int. Genet. 2020; 45: 102209. https://doi.org/10.1016/j.fsigen.2019.102209
        • Oldoni F.
        • Podini D.
        Forensic molecular biomarkers for mixture analysis.
        Forensic Sci. Int. Genet. 2019; 41: 107-119. https://doi.org/10.1016/j.fsigen.2019.04.003
        • Coble M.D.
        • Bright J.-A.
        Probabilistic genotyping software: an overview.
        Forensic Sci. Int. Genet. 2019; 38: 219-224
        • 1000 Genomes Consortium Project
        • Auton A.
        • Brooks L.D.
        • Durbin R.M.
        • Garrison E.P.
        • Kang H.M.
        • Korbel J.O.
        • Marchini J.L.
        • McCarthy S.
        • McVean G.A.
        • Abecasis G.R.
        A global reference for human genetic variation.
        Nature. 2015; 526: 68-74
        • Turchi C.
        • Melchionda F.
        • Pesaresi M.
        • Tagliabracci A.
        Evaluation of a microhaplotypes panel for forensic genetics using massive parallel sequencing technology.
        Forensic Sci. Int. Genet. 2019; 41: 120-127
        • Pang J.-B.
        • Rao M.
        • Chen Q.-F.
        • Ji A.-Q.
        • Zhang C.
        • Kang K.-L.
        • Wu H.
        • Ye J.
        • Nie S.-J.
        • Wang L.
        A 124-plex microhaplotype panel based on next-generation sequencing developed for forensic applications.
        Sci. Rep. 2020; 10: 1945. https://doi.org/10.1038/s41598-020-58980-x
        • de la Puente M.
        • Ruiz-Ramirez J.
        • Ambroa-Conde A.
        • Xavier C.
        • Amigo J.
        • Casares de Cal M.
        • Gomez-Tato A.
        • Carracedo A.
        • Parson W.
        • Phillips C.
        • Lareu M.V.
        Broadening the applicability of a custom multi-platform panel of microhaplotypes: bio-geographical ancestry inference and expanded reference data.
        Front. Genet. 2020; 11: 581041. https://doi.org/10.3389/fgene.2020.581041
        • Zou X.
        • He G.
        • Liu J.
        • Jiang L.
        • Wang M.
        • Chen P.
        • Hou Y.
        • Wang Z.
        Screening and selection of 21 novel microhaplotype markers for ancestry inference in ten Chinese subpopulations.
        Forensic Sci. Int. Genet. 2022; 58: 102687. https://doi.org/10.1016/j.fsigen.2022.102687