RMplex reveals population differences in RM Y-STR mutation rates and provides improved father-son differentiation in Japanese

Open Access. Published: August 19, 2022. DOI: https://doi.org/10.1016/j.fsigen.2022.102766

      Highlights

      • RMplex differentiated more father-son pairs in Japanese than in Europeans.
      • RMplex differentiated many more father-son pairs than Yfiler Plus.
      • Strong mutation rate differences were found compared to previous studies.
      • Mutation rate population differences may depend on Y-SNP haplogroup background.

      Abstract

      Rapidly mutating Y chromosomal short tandem repeat markers (RM Y-STRs), characterized by at least one mutation per 100 generations, are suitable for differentiating both related and unrelated males. The recently introduced multiplex method RMplex allows for the efficient analysis of 30 Y-STRs with increased mutation rates, including all 26 currently known RM Y-STRs. While currently available RM Y-STR mutation rates were established mostly from European individuals, here we applied RMplex to DNA samples of 178 genetically confirmed father-son pairs from East Asia. For several Y-STRs, we found significantly higher mutation rates in Japanese compared to previous estimates. The resulting father-son differentiation rate based on RMplex was significantly higher (52%) in Japanese than previously reported for Europeans (42%), and much higher than with Yfiler Plus in both sample sets (14% and 13%, respectively). Further analysis suggests that the higher mutation and relative differentiation rates in Japanese can in part be explained by Y-STR alleles that are, on average, longer than those in Europeans. Moreover, we show that the most striking difference, which was found in DYS712, could be linked to a Y-SNP haplogroup (O1b2-P49) that is common in Japanese and rare in other populations. We encourage the forensic Y-STR community to generate more RMplex data from more population samples of sufficiently large sample size in combination with Y-SNP data to further investigate population effects on mutation and relative differentiation rates. Until more RMplex data from more populations become available, caution should be exercised when applying RM Y-STR mutation rate estimates established in one population, such as Europeans, to forensic casework involving male suspects of paternal origin from other populations, such as non-Europeans.

      1. Introduction

      Y chromosomal short tandem repeats (Y-STRs) are commonly used in forensic DNA analysis, especially in sexual assault cases involving mixed DNA evidence to which the male perpetrator and the female victim contributed, as such male-female mixed DNA samples are notorious for showing difficulties in perpetrator identification based on autosomal STR profiling [
      • Kayser M.
      Forensic use of Y-chromosome DNA: a general overview.
      ]. However, due to the paternal inheritance of the male-specific part of the Y chromosome, Y-STR profiles are typically shared between patrilineally related men. In consequence, a Y-STR haplotype match between the male suspect and the evidence DNA does not necessarily indicate that the suspect contributed to the evidence sample, as the contributor could also have been one of his male relatives who share the same Y-STR haplotype. The need to overcome this limitation motivated the search for Y-STRs characterized by elevated mutation rates [
      • Ballantyne K.N.
      • et al.
      Mutability of Y-chromosomal microsatellites: rates, characteristics, molecular bases, and forensic implications.
      ]. Thirteen Y-STRs with mutation rates on the order of 10⁻² (at least 1 mutation every 100 generations per locus, mpg) were identified and termed rapidly mutating Y-STRs (RM Y-STRs) [
      • Ballantyne K.N.
      • et al.
      A new future of forensic Y-chromosome analysis: rapidly mutating Y-STRs for differentiating male relatives and paternal lineages.
      ]. Recently, a new set of RM Y-STRs was discovered by applying in silico search for candidate markers and empirical confirmation in father-son pairs [
      • Ralf A.
      • et al.
      Identification and characterization of novel rapidly mutating Y-chromosomal short tandem repeat markers.
      ], which increased the number of currently known RM Y-STRs to 26 and also identified Y-STRs with mutation rates higher than 5 × 10⁻³ but lower than 10⁻², termed fast mutating Y-STRs (FM Y-STRs). Moreover, a new multiplex genotyping assay, RMplex, was developed and validated; it analyses a total of 30 Y-STRs with increased mutation rates, comprising the 26 RM Y-STRs and 4 FM Y-STRs [
      • Ralf A.
      • et al.
      RMplex: an efficient method for analyzing 30 Y-STRs with high mutation rates.
      ]. It was shown that this new RMplex assay outperformed the state-of-the-art commercial Y-STR genotyping assay Yfiler™ Plus PCR Amplification Kit, increasing the father-son differentiation rate by a factor of three [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ].
      Notably, all currently known RM Y-STRs were discovered solely via mutation rate studies in Europeans [
      • Ballantyne K.N.
      • et al.
      Mutability of Y-chromosomal microsatellites: rates, characteristics, molecular bases, and forensic implications.
      ,
      • Ralf A.
      • et al.
      Identification and characterization of novel rapidly mutating Y-chromosomal short tandem repeat markers.
      ]. While some worldwide father-son and population data were previously established for the initial set of 13 RM Y-STRs [
      • Ballantyne K.N.
      • et al.
      Toward male individualization with rapidly mutating Y‐chromosomal short tandem repeats.
      ], non-European data for the new set of 26 RM Y-STRs and for the full set of 30 Y-STRs included in RMplex are not yet available. Some previous studies showed differences in mutation rates between males from different populations for the initial set of 13 RM Y-STRs [
      • Ballantyne K.N.
      • et al.
      Toward male individualization with rapidly mutating Y‐chromosomal short tandem repeats.
      ,
      • Adnan A.
      • et al.
      Improving empirical evidence on differentiating closely related men with RM Y-STRs: a comprehensive pedigree study from Pakistan.
      ,
      • Boattini A.
      • et al.
      Mutation rates and discriminating power for 13 rapidly-mutating Y-STRs between related and unrelated individuals.
      ,
      • Chen Y.
      • et al.
      Mutation rates of 13 RM Y-STRs in a Han population from Shandong province, China.
      ,
      • Yuan L.
      • et al.
      Mutation analysis of 13 RM Y-STR loci in Han population from Beijing of China.
      ,
      • Zgonjanin D.
      • et al.
      Mutation rate at 13 rapidly mutating Y-STR loci in the population of Serbia.
      ,
      • Zhang W.
      • et al.
      Multiplex assay development and mutation rate analysis for 13 RM Y-STRs in Chinese Han population.
      ]. While most previous Y-STR mutation rate studies lack Y-SNP haplogroup data, Claerhout et al. studied the effect of Y-SNP haplogroups on Y-STR mutation rates in European males [
      • Claerhout S.
      • et al.
      Determining Y-STR mutation rates in deep-routing genealogies: identification of haplogroup differences.
      ] and found lower overall average mutation rates within some Y-SNP haplogroups (i.e., haplogroup I & J) compared to others (i.e., R1b). Lower Y-STR mutation rates coincided with allele frequency distributions that were skewed towards shorter alleles for some Y-STRs in males with haplogroup I & J. Other previous STR mutation studies also revealed a strong impact of allele length, i.e., number of repeats, on mutability of STRs, with longer alleles leading to increased and shorter ones to decreased mutation rates [
      • Kelkar Y.D.
      • et al.
      The genome-wide determinants of human and chimpanzee microsatellite evolution.
      ,
      • Brinkmann B.
      • et al.
      Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat.
      ]. Many populations have varying Y-SNP haplogroup compositions, indicating different male founders who carried their own Y-STR haplotypes. Male founders with longer Y-STR alleles may lead to increased mutability of such Y-STRs in the population of descendants, but empirical evidence for this hypothesis is scarce.
      Y-STR mutations are rare events and thus mutation rate estimates are prone to stochastic effects. Hence, ideally, thousands of father-son pairs should be analyzed to make accurate Y-STR mutation rate estimates in samples from many populations; in reality, however, most studies only involve hundreds. Obviously, the larger the mutation rate of a Y-STR, the more mutations will be observed, thereby decreasing the impact of stochastic effects. This makes RM Y-STRs the most suitable markers for studying population effects on Y-STR mutation and male relative differentiation rates, even when sample size is limited.
      A recent study applied the current state-of-the-art commercial Yfiler™ Plus PCR Amplification Kit (Yfiler) to Japanese father-son pairs and found that the six RM Y-STR markers included in this commercial kit did not show the expected elevated mutation rates known from previous European studies [
      • Otagiri T.
      • et al.
      Mutation analysis for 25 Y-STR markers in Japanese population.
      ]. The authors therefore concluded that this commercial kit may be less suitable for the purpose of male relative differentiation in the Japanese population.
      In the current study, we applied RMplex to DNA-confirmed father-son pairs originating from Japan. The newly established mutation and relative differentiation rate data were compared to previously published consensus mutation rate estimates based on multiple studies [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ]. Additionally, Y haplogroup data were generated from genotyping 10 Y-SNPs to infer 8 Y haplogroups known to be frequent in Japanese, which allowed us to link the established Y-STR mutation data with the relevant Y-SNP haplogroup background information.

      2. Materials and methods

      2.1 DNA samples

      Genomic DNA was extracted from a total of 340 males using the QIAamp DNA Blood Mini Kit (Qiagen) following the manufacturer’s instructions. Of the 340 males included in this study, 296 originated from pairs of fathers with a single son (n = 148); in addition, the dataset included one father with three sons and 12 fathers with two sons. Lastly, the dataset included one family with four generations of male relatives (one individual per generation, three father-son pairs). In total, 178 father-son pairs were defined from this dataset. The biological relationship of all pairs of males was confirmed through autosomal STR typing. This study was approved by the Ethics committee of Shinshu University School of Medicine (Permission number: 667). All individuals included in this study provided informed consent. A total of 280 samples used in the present study overlap with those in the previous study based on Yfiler Plus [
      • Otagiri T.
      • et al.
      Mutation analysis for 25 Y-STR markers in Japanese population.
      ], while the remaining 60 samples were newly typed in the context of the present study.

      2.2 Amplification and genotyping

      Y-STR typing was done using the protocol previously described by Ralf et al. [
      • Ralf A.
      • et al.
      RMplex: an efficient method for analyzing 30 Y-STRs with high mutation rates.
      ], using the alternative forward primer for DYS570 as described in that publication. The total reaction volume was reduced to 10 µL, while keeping the primer concentrations the same as described previously [
      • Ralf A.
      • et al.
      RMplex: an efficient method for analyzing 30 Y-STRs with high mutation rates.
      ]. Amplification was performed using a GeneAmp® PCR System 9700 (Thermo Fisher Scientific). The resulting PCR products were analyzed using a 3500 Genetic Analyzer (Thermo Fisher Scientific) with POP-4 Polymer (Thermo Fisher Scientific), and the resulting electropherograms were analyzed using GeneMapper® ID-X Software v1.4 (Thermo Fisher Scientific).

      2.3 Y-SNP based haplogroup analysis

      Y-SNP testing was performed by developing a custom-made multiplex genotyping assay using the SNaPshot™ Multiplex Kit (Thermo Fisher Scientific). A total of 10 Y-SNPs were selected. Eight of these target specific (sub)haplogroups that were expected to be present in the Japanese male population: C-M130, D-M174, N-M231, O1a-M119, O1b2-P49, O2-M122, O-P186* (xM119,P49,M122), and Q-M242. Two additional intermediate Y-SNPs were included in the design: DE-M145 and CF-P143. The primer sequences and thermal cycler conditions are shown in supplementary Table S1. The PCR was performed in 10 µL volumes including 5 µL QIAGEN Multiplex PCR Plus Kit (Qiagen), PCR primers with the concentrations as detailed in supplementary Table S1 and 1 µL of DNA (~1 ng/µL). The PCR products were purified by adding 4 µL of 1 unit/µL EXO-SAP (Thermo Fisher Scientific) to the total volume of PCR products and incubating for 30 min at 37 °C, followed by 15 min at 80 °C for deactivation. SBE reactions were performed as described in Table S1 in a mix consisting of 2 µL SNaPshot Ready Reaction Mix (Thermo Fisher Scientific), 2 µL of primer mix, 5 µL of DNA-free water and 1 µL of purified PCR products. The SBE products were purified by adding 2 µL of 1 unit/µL SAP (Thermo Fisher Scientific) to the total volume of SBE products and incubating for 30 min at 37 °C, followed by 15 min at 80 °C for deactivation. Amplifications, enzymatic purifications and single base pair extensions were performed using a GeneAmp® PCR System 9700 (Thermo Fisher Scientific). The resulting products were analyzed using a 3500 Genetic Analyzer (Thermo Fisher Scientific) with POP-4 Polymer (Thermo Fisher Scientific), and the resulting electropherograms were analyzed using GeneMapper® ID-X Software v1.4 (Thermo Fisher Scientific).

      2.4 Data analysis

      Both the mutation rates and the differentiation rates were calculated using the frequentist approach. The Clopper-Pearson interval was used to determine the 95% confidence intervals of these rates. Fisher’s exact tests were used to determine the statistical significance of differences observed between the present study and previous studies. Pairwise Rst values were calculated by an in-house pipeline that performs per-marker allele comparisons for each pair of samples. The Rst value for a given pair was defined as the sum of the allele differences across all Y-STRs. Multi-copy markers pose additional complexity for this approach because it is typically not possible to tell which copies correspond to each other. To calculate the Rst value for such Y-STRs, the pipeline chose the shortest path: for example, if one individual typed with alleles 12, 16 was paired with another individual displaying alleles 13, 15, the pipeline derived a distance value of 2 (the sum of the difference between 12 and 13, and the difference between 15 and 16) instead of 6 (the sum of the difference between 12 and 15 and the difference between 13 and 16). R [

      R Core Team, R: a language and environment for statistical computing, 2013.

      ] was used to create boxplots.
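The shortest-path rule for multi-copy markers described above can be illustrated with a minimal Python sketch. This is not the in-house pipeline itself: the function names, the marker labels in the usage example, and the choice of the absolute repeat difference as the per-copy distance are assumptions made for illustration, and the sketch assumes both samples show the same number of copies at a locus.

```python
from itertools import permutations


def locus_distance(alleles_a, alleles_b):
    """Smallest summed repeat difference over all ways of pairing the copies.

    Assumption: both samples carry the same number of copies at this locus.
    """
    if len(alleles_a) != len(alleles_b):
        raise ValueError("sketch assumes equal copy numbers in both samples")
    return min(
        sum(abs(a - b) for a, b in zip(alleles_a, pairing))
        for pairing in permutations(alleles_b)
    )


def haplotype_distance(hap_a, hap_b):
    """Sum of per-locus distances over the loci present in both haplotypes."""
    shared = hap_a.keys() & hap_b.keys()
    return sum(locus_distance(hap_a[m], hap_b[m]) for m in shared)


# Worked example from the text: alleles 12, 16 versus 13, 15.
example = locus_distance((12, 16), (13, 15))

# Hypothetical two-marker haplotypes (labels are placeholders, not real loci).
hap_1 = {"single_copy_marker": (23,), "multi_copy_marker": (12, 16)}
hap_2 = {"single_copy_marker": (25,), "multi_copy_marker": (13, 15)}
total = haplotype_distance(hap_1, hap_2)
```

Enumerating all pairings is only practical for the small copy numbers seen in multi-copy Y-STRs; for single-copy markers the function reduces to a plain allele difference.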

      3. Results and discussion

      3.1 Mutation analysis

      In total, 157 mutations were observed amongst the 178 Japanese father-son pairs, of which 138 were detected using RMplex and 29 with Yfiler Plus; 10 of these were found at the six RM Y-STRs shared by the two methods. All mutations with the specific allele changes and the haplogroup of the pairs in which they occurred are shown in supplementary Table S2. Two 2-step mutations were observed, one in DYF399S1 and one in DYS712; the remaining 155 mutations (98.7%) were single-step mutations. With 178 father-son pairs, the sample size in the present study was relatively small for obtaining highly reliable mutation rate estimates. This uncertainty becomes apparent from the 95% confidence intervals, where even Y-STRs not showing a single mutation among the 178 pairs have an upper limit of 2 × 10⁻² mpg (Table 1). Because of this limited sample size, all conclusions ought to be treated with caution, at least until larger scale studies in the same population replicate these results.
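To make the interval calculation concrete, the following is a minimal sketch of a frequentist rate estimate with an exact (Clopper-Pearson) 95% confidence interval, using only the Python standard library. The function names and the bisection-based root search are assumptions made for illustration and are not taken from the study's own analysis code.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing, tol=1e-10):
    """Bisection on [0, 1] for a monotone function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for k events in n trials."""
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True
    )
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False
    )
    return lower, upper


# A marker without any observed mutation among 178 pairs.
zero_low, zero_high = clopper_pearson(0, 178)

# DYF399S1: 19 mutations among 178 pairs.
rate = 19 / 178
low, high = clopper_pearson(19, 178)
print(f"rate = {rate * 1e3:.1f}, CI = {low * 1e3:.1f}-{high * 1e3:.1f} (x10^-3)")
```

For zero observed mutations the upper limit has the closed form 1 − 0.025^(1/178), which is about 0.0205 and corresponds to the upper limit of roughly 2 × 10⁻² mentioned above.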
      Table 1. Empirically established locus-specific mutation rates of 49 Y-STRs by applying RMplex and Yfiler™ Plus to a total of 178 DNA-confirmed father-son pairs from Japan and their comparisons with consensus locus-specific reference mutation rates previously described
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      .
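      Marker | Assay | Total pairs | Mutations | Expansions | Contractions | Mutation rate (×10⁻³) | 95% confidence interval (×10⁻³) | Reference mutation rate (×10⁻³) | p-value#
      DYF399S1 | RMplex | 178 | 19 | 7 | 12 | 106.7 | 65.5–161.7 | 62.8 | 0.0280
      DYS712 | RMplex | 178 | 17 | 8 | 9 | 95.5 | 56.6–148.5 | 31.1 | 0.0001
      DYF1001 | RMplex | 178 | 17 | 1 | 16 | 95.5 | 56.6–148.5 | 48.0 | 0.0120
      DYF403S1a | RMplex | 178 | 9 | 5 | 4 | 50.6 | 23.4–93.8 | 27.3 | 0.0985
      DYF1000 | RMplex | 178 | 8 | 4 | 4 | 44.9 | 19.6–86.6 | 35.9 | 0.5305
      DYS713 | RMplex | 178 | 7 | 1 | 6 | 39.3 | 16–79.3 | 13.9 | 0.0162
      DYS458 | Yfiler Plus | 178 | 6 | 3 | 3 | 33.7 | 12.5–71.9 | 8.5 | 0.0051
      DYS724 | RMplex | 178 | 6 | 2 | 4 | 33.7 | 12.5–71.9 | 48.0 | 0.4643
      DYS1010 | RMplex | 178 | 5 | 1 | 4 | 28.1 | 9.2–64.3 | 14.0 | 0.1852
      DYS711 | RMplex | 178 | 4 | 2 | 2 | 22.5 | 6.2–56.5 | 26.6 | 1.0000
      DYS612 | RMplex | 178 | 4 | 1 | 3 | 22.5 | 6.2–56.5 | 16.3 | 0.5401
      DYF403S1b | RMplex | 178 | 4 | 4 | 0 | 22.5 | 6.2–56.5 | 9.1 | 0.0859
      DYS1005 | RMplex | 178 | 4 | 0 | 4 | 22.5 | 6.2–56.5 | 9.8 | 0.1191
      DYS576 | Yfiler Plus + RMplex | 178 | 4 | 1 | 3 | 22.5 | 6.2–56.5 | 12.7 | 0.2938
      DYS460 | Yfiler Plus | 178 | 4 | 2 | 2 | 22.5 | 6.2–56.5 | 4.3 | 0.0092
      DYS547 | RMplex | 178 | 3 | 3 | 0 | 16.9 | 3.5–48.5 | 14.7 | 0.7474
      DYS1007 | RMplex | 178 | 3 | 2 | 1 | 16.9 | 3.5–48.5 | 17.2 | 1.0000
      DYF404S1 | RMplex | 178 | 3 | 1 | 2 | 16.9 | 3.5–48.5 | 12.5 | 0.4921
      DYS1013 | RMplex | 178 | 2 | 1 | 1 | 11.2 | 1.4–40 | 10.8 | 1.0000
      DYR88 | RMplex | 178 | 2 | 1 | 1 | 11.2 | 1.4–40.0 | 26.3 | 0.3170
      DYS526b | RMplex | 178 | 2 | 1 | 1 | 11.2 | 1.4–40.0 | 12.3 | 1.0000
      DYF1002 | RMplex | 178 | 2 | 1 | 1 | 11.2 | 1.4–40.0 | 16.8 | 0.7650
      DYS570 | Yfiler Plus + RMplex | 178 | 2 | 2 | 0 | 11.2 | 1.4–40.0 | 8.3 | 0.6612
      DYS1003 | RMplex | 178 | 2 | 1 | 1 | 11.2 | 1.4–40.0 | 12.6 | 1.0000
      DYS481 | Yfiler Plus | 178 | 2 | 1 | 1 | 11.2 | 1.4–40.0 | 4.7 | 0.2141
      DYS1012 | RMplex | 178 | 2 | 1 | 1 | 11.2 | 1.4–40.0 | 15.8 | 1.0000
      DYS626 | RMplex | 178 | 2 | 0 | 2 | 11.2 | 1.4–40.0 | 8.6 | 0.6676
      DYS627 | Yfiler Plus + RMplex | 178 | 2 | 2 | 0 | 11.2 | 1.4–40.0 | 14.5 | 1.0000
      DYS449 | Yfiler Plus + RMplex | 178 | 1 | 0 | 1 | 5.6 | 0.1–30.9 | 11.2 | 0.7258
      DYS385 | Yfiler Plus | 178 | 1 | 1 | 0 | 5.6 | 0.1–30.9 | 7.5 | 1.0000
      DYF387S1 | Yfiler Plus + RMplex | 178 | 1 | 0 | 1 | 5.6 | 0.1–30.9 | 10.2 | 1.0000
      DYS456 | Yfiler Plus | 178 | 1 | 1 | 0 | 5.6 | 0.1–30.9 | 4.4 | 0.5443
      DYS393 | Yfiler Plus | 178 | 1 | 1 | 0 | 5.6 | 0.1–30.9 | 1.7 | 0.2719
      DYS19 | Yfiler Plus | 178 | 1 | 0 | 1 | 5.6 | 0.1–30.9 | 2.0 | 0.3041
      YGATAH4 | Yfiler Plus | 178 | 1 | 1 | 0 | 5.6 | 0.1–30.9 | 1.9 | 0.2980
      DYS442 | RMplex | 178 | 1 | 0 | 1 | 5.6 | 0.1–30.9 | 7.4 | 1.0000
      DYS635 | Yfiler Plus | 178 | 1 | 1 | 0 | 5.6 | 0.1–30.9 | 3.8 | 0.5005
      DYS389II | Yfiler Plus | 178 | 1 | 0 | 1 | 5.6 | 0.1–30.9 | 5.5 | 0.6267
      DYS533 | Yfiler Plus | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 3.5 | 1.0000
      DYS439 | Yfiler Plus | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 4.8 | 1.0000
      DYS391 | Yfiler Plus | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 2.5 | 1.0000
      DYS518 | Yfiler Plus + RMplex | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 13.3 | 0.1784
      DYS437 | Yfiler Plus | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 1.2 | 1.0000
      DYF393S1 | RMplex | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 7.1 | 0.6248
      DYS389I | Yfiler Plus | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 2.4 | 1.0000
      DYS448 | Yfiler Plus | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 0.8 | 1.0000
      DYS390 | Yfiler Plus | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 2.7 | 1.0000
      DYS438 | Yfiler Plus | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 0.3 | 1.0000
      DYS392 | Yfiler Plus | 178 | 0 | 0 | 0 | 0.0 | 0–20.5 | 0.8 | 1.0000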
      #Statistically significant differences (p-values < 0.05) are indicated in bold.
      The Y-STR marker that showed the most mutations was DYF399S1 with 19 mutations, which confirms previous studies in different populations showing that this multi-copy RM Y-STR is the most mutable of all previously characterized Y-STRs [
      • Ballantyne K.N.
      • et al.
      Mutability of Y-chromosomal microsatellites: rates, characteristics, molecular bases, and forensic implications.
      ,
      • Ralf A.
      • et al.
      Identification and characterization of novel rapidly mutating Y-chromosomal short tandem repeat markers.
      ,
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ,
      • Ballantyne K.N.
      • et al.
      Toward male individualization with rapidly mutating Y‐chromosomal short tandem repeats.
      ,
      • Adnan A.
      • et al.
      Improving empirical evidence on differentiating closely related men with RM Y-STRs: a comprehensive pedigree study from Pakistan.
      ,
      • Boattini A.
      • et al.
      Mutation rates and discriminating power for 13 rapidly-mutating Y-STRs between related and unrelated individuals.
      ,
      • Chen Y.
      • et al.
      Mutation rates of 13 RM Y-STRs in a Han population from Shandong province, China.
      ,
      • Yuan L.
      • et al.
      Mutation analysis of 13 RM Y-STR loci in Han population from Beijing of China.
      ,
      • Zgonjanin D.
      • et al.
      Mutation rate at 13 rapidly mutating Y-STR loci in the population of Serbia.
      ,
      • Zhang W.
      • et al.
      Multiplex assay development and mutation rate analysis for 13 RM Y-STRs in Chinese Han population.
      ]. However, in the current study DYF399S1 mutated even more frequently than previously observed with > 10% of the analyzed pairs showing a mutation at this RM Y-STR (Table 1). The noted increase in mutation rate was statistically significant (p-value: 0.028) relative to the reference mutation rate based on > 7500 father-son pairs of which only 6.3% displayed a mutation for this multi-copy marker [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ]. Another remarkable result was the high number of 17 mutations found at DYS712, which resulted in a mutation rate estimate of 9.6 × 10⁻² mpg. The mutation rate obtained here for DYS712 from Japanese father-son pairs is significantly higher than those previously obtained from European father-son pair data, namely 2.7 × 10⁻² mpg (p-value: <0.0001) [
      • Ralf A.
      • et al.
      Identification and characterization of novel rapidly mutating Y-chromosomal short tandem repeat markers.
      ] and 4.3 × 10⁻² mpg (p-value: 0.0138) [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ]. A study analyzing father-son pairs from the Shanxi Province in China reported a mutation rate of 3.0 × 10⁻² mpg for DYS712 [
      • Liu J.
      • et al.
      The construction and application of a new 17-plex Y-STR system using universal fluorescent PCR.
      ], which is similar to the rates previously obtained from European data for this marker, but significantly lower than the rate we obtained here for Japanese (p-value: 0.0030).
      Notably, the commonly used Y-STR DYS458 showed a mutation rate of 3.4 × 10⁻² mpg in the present study, which is remarkably higher than the rate of 8.4 × 10⁻³ mpg that was previously estimated in a study using European father-son pairs [
      • Ballantyne K.N.
      • et al.
      Mutability of Y-chromosomal microsatellites: rates, characteristics, molecular bases, and forensic implications.
      ]. Also, compared to the consensus estimate based on data from 11,830 father-son pairs [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ], the observed mutation rate in the current Japanese study is significantly higher (p-value: 0.0051) than the consensus mutation rate of 8.5 × 10⁻³ mpg that was previously obtained (Table 1). Other Y-STRs that show significant differences compared to the previously reported consensus mutation rate estimates [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ] are: DYF1001 (p-value 0.0120), DYS460 (p-value 0.0092), and DYS713 (p-value 0.0162). All six Y-STRs showed an increased mutation rate in the present study compared to the previous consensus estimates. Table 1 shows mutation rate estimates as obtained in the present study; these mutation rates are compared to the consensus Y-STR mutation rate estimates as described in Neuhuber et al., 2022 [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ].

      3.2 Differentiation of father-son pairs

      Based on Yfiler Plus alone, a total of 25 out of the 178 (14%) Japanese father-son pairs were differentiated, which is comparable to a previous study based on European males where 13% of father-son pairs were differentiated [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ]. Based on RMplex alone, a total of 93 out of the 178 pairs (52%) were differentiated, which reflects an approximately 3.7-fold increase compared to Yfiler Plus in the same samples, and is significantly higher (Fisher’s exact p-value: 0.0179) than in a previous study based on European father-son pairs where a differentiation rate of 42% was reported [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ]. Combining the data from both Yfiler Plus and RMplex resulted in an even higher father-son differentiation rate of 57%, with 101 out of the 178 Japanese pairs being separated. With an increase of less than five percentage points, the contribution of the non-overlapping Yfiler Plus markers to the differentiation of male relatives was limited, as expected based on their lower mutation rates.
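The differentiation rates, the fold increase, and the percentage-point gain quoted in this section follow directly from the reported pair counts. A short arithmetic sketch, with variable names chosen here for illustration:

```python
# Pair counts reported in this section (178 Japanese father-son pairs).
PAIRS = 178
differentiated = {"Yfiler Plus": 25, "RMplex": 93, "combined": 101}

# Differentiation rate = differentiated pairs / all pairs.
rates = {assay: n / PAIRS for assay, n in differentiated.items()}

# Fold increase of RMplex over Yfiler Plus in the same samples.
fold_increase = rates["RMplex"] / rates["Yfiler Plus"]

# Gain from adding the non-overlapping Yfiler Plus markers, in percentage points.
gain_points = 100 * (rates["combined"] - rates["RMplex"])

for assay, r in rates.items():
    print(f"{assay}: {100 * r:.1f}%")
print(f"fold increase: {fold_increase:.2f}, gain: {gain_points:.1f} points")
```

The gain is expressed in percentage points (a difference of two rates), not as a relative percentage change.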
      When looking more closely at the number of mutations that differentiate a father-son pair, we saw that 22 of the 25 pairs differentiated by Yfiler Plus (88%) only showed a mutation at a single Y-STR marker. In contrast, for RMplex and for both assays combined, 62% of the 93 and 57% of the 101 differentiated pairs were separated only by a single mutation, respectively. Furthermore, 8%, 27%, and 33% of the differentiated pairs were separated by mutations at two Y-STRs and 4%, 11%, and 8% of the differentiated pairs by mutations at three Y-STRs, for Yfiler Plus, RMplex, and the combined methods, respectively (Fig. 1). Mutations at four and five markers were only observed when both methods were combined and were each observed only in a single pair.
      Fig. 1. Percentage of father-son pairs analyzed with Yfiler Plus (25 Y-STRs), RMplex (30 Y-STRs), and both methods combined (49 Y-STRs) with mutations at zero, one, two, three, four, and five Y-STR markers per pair. None of these pairs was differentiated by mutations at more than five Y-STRs. The error bars represent the exact binomial 95% confidence interval (Clopper-Pearson).

      3.3 Differentiation of unrelated males

      To assess the efficiency in differentiating unrelated males, we compared the Y-STR haplotypes obtained with Yfiler Plus and RMplex combined in the total of 162 fathers and found that each of them carried a unique haplotype based on the full set of 49 Y-STRs. Notably, the same number of unique haplotypes was also seen when considering RMplex and Yfiler Plus separately. Hence, in the current study, no difference in the capability to differentiate unrelated males was seen between RMplex and Yfiler Plus, which was also reported in a previous European study [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ]. However, the relatively small sample size in both studies might have influenced this rather unexpected result, as the probability of observing shared haplotypes increases with sample size. Future studies with larger samples need to show whether both methods indeed perform equally well in differentiating unrelated men, or whether the identical performance of both methods was influenced by sample size effects in this Japanese and the previous European study. Despite the limited sample size, the data obtained in the current study do provide some insight into the potential to differentiate unrelated males by determining pairwise Rst values. Rst considers the mutational differences between haplotypes and is typically estimated from data of different populations to express the proportion of the diversity seen between populations. Here we estimated Rst between pairs of individual haplotypes, and not between populations as is typically done. This way, Rst provides an estimate of the difference in diversity between the haplotypes derived from the two genotyping methods. As shown in Fig. 2, there is a sharp distinction between Yfiler Plus and RMplex in pairwise Rst distributions, where Yfiler Plus clearly results in lower Rst values, indicating more similarity between the Yfiler Plus derived haplotypes from the unrelated males compared to those from RMplex. By extrapolation, it could be expected that more similarity in a small sample would translate to more overlapping haplotypes in a substantially larger sample. Although there is still a need for empirical evidence based on large numbers of unrelated males, the difference in Rst values between the two methods could be seen as a first indication that RMplex may be superior in differentiating not only related males but also unrelated ones.
      Fig. 2. Pairwise Rst-value distribution obtained with RMplex and Yfiler Plus based on the 162 unrelated Japanese males.

      3.4 Differences in Y-STR allele lengths between populations and Y-SNP haplogroups

      Y-SNP analysis was performed on 162 unrelated fathers, one of whom did not provide a full profile, leaving 161 unrelated males for which Y-SNP based haplogroups were established. The most commonly observed haplogroup in this Japanese dataset was O1b2-P49 (32%), followed by D-M174 (30%), O2-M122 (19%), and C-M130 (14%). The remaining four haplogroups were found in fewer than 5 individuals each (<2.5%). A previous study that analyzed Y-SNPs in commonly observed Japanese surnames found that 37% belonged to haplogroup D, 30% to O1b, 20% to O2, and 9% to C [
      • Ochiai E.
      • et al.
      Y chromosome analysis for common surnames in the Japanese male population.
      ]. Despite some stochastic variation, these results show similar frequencies of the most common haplogroups in the Japanese population.
      One of the most remarkable results obtained in the present study was the increased number of mutations found at DYS712. We compared the allele frequencies previously reported for DYS712 from a European population sample from Austria [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ], to the allele frequencies found in the current Japanese samples. Fig. 3 shows that there was more variability in the DYS712 Y-STR alleles found in the Japanese relative to the Europeans. Moreover, the Japanese sample had a higher median allele length compared to the European. When comparing the most common Y-SNP haplogroups in the Japanese sample set (Fig. 3), it became evident that the longer Y-STR alleles at DYS712 were especially common in males belonging to a subgroup of haplogroup O1b2 (O-P49). This was the most frequently observed haplogroup in the current Japanese sample and is completely absent from Europeans [
      • Navarro-López B.
      • et al.
      Phylogeographic review of Y chromosome haplogroups in Europe.
      ]. It is widely established that longer STR alleles are more prone to mutations [
      • Kelkar Y.D.
      • et al.
      The genome-wide determinants of human and chimpanzee microsatellite evolution.
      ,
      • Brinkmann B.
      • et al.
      Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat.
      ]. This notion is supported by the fact that, in the current study, 13 of the 17 mutations (76%) observed at DYS712 arose in fathers whose allele was longer than the population median of 23 in this Japanese dataset. That this effect is likely haplogroup-dependent is further supported by our finding that 9 of the 17 pairs (53%) that showed a mutation at DYS712 belonged to haplogroup O1b2 (O-P49). Notably, this haplogroup was rarely found in Eastern Han Chinese [
      • Lang M.
      • et al.
      Forensic characteristics and genetic analysis of both 27 Y-STRs and 143 Y-SNPs in Eastern Han Chinese population.
      ], which may explain why the high mutation rate found in the current Japanese dataset for DYS712 was not previously found in Chinese from the Shanxi region [
      • Liu J.
      • et al.
      Development of a new 17 Y-STRs system using fluorescent-labelled universal primers and its application in Shanxi population in China.
      ]. The opposite effect is also seen in the Japanese data: the frequently occurring haplogroup D generally displays short alleles at DYS712 (Fig. 3), and only 1 of the 17 pairs (6%) that showed a mutation at DYS712 belonged to haplogroup D, despite the high overall prevalence of this haplogroup in the studied population sample. Allele frequencies for all 49 Y-STRs analyzed with RMplex and Yfiler Plus are shown in supplementary Table S3; the haplotype frequencies of the multi-copy markers are shown in Table S4.
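      The percentages quoted in this paragraph can be recomputed from the reported counts. The short Python sketch below does so and adds a purely descriptive ratio (share among mutated pairs divided by share in the sample). That ratio, and the names `mutation_share` and `enrichment`, are illustrative assumptions and not part of the study's analysis. The sample shares refer to the 161 haplogrouped unrelated males, so the comparison is only a rough one.

```python
# Counts and percentages as reported in the text for DYS712 (Japanese dataset).
TOTAL_DYS712_MUTATIONS = 17

# Haplogroup shares among the 161 haplogrouped unrelated males (reported).
sample_share = {"O1b2-P49": 0.32, "D-M174": 0.30}

# Father-son pairs with a DYS712 mutation, per haplogroup (reported).
mutated_pairs = {"O1b2-P49": 9, "D-M174": 1}

# Mutations arising in fathers with an allele longer than the median of 23 (reported).
LONG_ALLELE_MUTATIONS = 13


def mutation_share(count, total=TOTAL_DYS712_MUTATIONS):
    """Fraction of all DYS712 mutations represented by `count`."""
    return count / total


def enrichment(haplogroup):
    """Illustrative ratio: share among mutated pairs / share in the sample."""
    return mutation_share(mutated_pairs[haplogroup]) / sample_share[haplogroup]


if __name__ == "__main__":
    print(f"Long-allele fathers: {mutation_share(LONG_ALLELE_MUTATIONS):.0%} of mutations")
    for hg in mutated_pairs:
        print(
            f"{hg}: {mutation_share(mutated_pairs[hg]):.0%} of mutations, "
            f"{sample_share[hg]:.0%} of sample, ratio {enrichment(hg):.2f}"
        )
```

      A ratio above 1 means the haplogroup is over-represented among the mutated pairs relative to its share in the sample, and a ratio below 1 means it is under-represented. With only 17 mutations, such ratios are descriptive and carry no statistical weight on their own.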
      Fig. 3. Boxplots showing the difference in allele length distribution of the RM Y-STR marker DYS712 between the current Japanese study (overall and stratified per predicted Y-SNP haplogroup), where this marker has a mutation rate of 9.6 × 10⁻², and a previous European study, where the mutation rate was 3.1 × 10⁻²
      [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ]
      .
      To further investigate the potential influence of allele length on mutation rates, we compared both the mutation rates and the mean allele lengths from the current Japanese study and the previous European study that used RMplex [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ] for the six Y-STRs that showed significantly different mutation rates between the current study and the previously established consensus mutation rates (Table 1). As evident from Table 2, the observed mutation rate differences were not always statistically significant in the direct comparison. Both studies had a relatively small sample size, leading to reduced statistical power, whereas the sample sizes underlying the consensus mutation rate estimates used in the prior comparisons were relatively large. Regardless, the higher occurrence of mutations in the current study on Japanese father-son pairs compared to the study based on European pairs was still clearly noticeable (Table 2). Interestingly, for all six Y-STRs, the mean allele length was also higher in the Japanese population than in the European population, albeit to different extents across the six markers (Table 2).
      Table 2. Direct comparison of the mutation rates and mean allele lengths between the current Japanese study and a previous European study [
      • Neuhuber F.
      • et al.
      Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
      ] for six Y-STRs with remarkably high mutation rate estimates in the current Japanese study.
                 Mutation rates (×10⁻³)          Mean allele length
      Marker     Japanese  European  p-value#    Japanese  European
      DYF399S1   106.7     77.4      0.217       23.9      23.1
      DYF1001    95.5      36.0      0.005       64.3      62.5
      DYS712     95.5      43.4      0.014       23.4      21.4
      DYS713     39.3      17.0      0.139       46.1      43.0
      DYS458     33.7      13.2      0.103       17.0      16.2
      DYS460     22.5      7.5       0.115       10.7      10.5
      # The p-values are based on Fisher’s exact tests on the proportions of father-son pairs in which mutations were observed in each study.
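      To illustrate the kind of test named in the table footnote, the following Python sketch implements a two-sided Fisher's exact test for a 2×2 table of mutated versus non-mutated father-son pairs. The function name `fisher_exact_two_sided` is an assumption made for this example. The Japanese counts (17 mutated pairs out of 178 at DYS712) follow from the text, whereas the European counts below are hypothetical placeholders and are not taken from either study, so the resulting p-value is not meant to reproduce the tabulated one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability of observing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so that floating-point ties are included.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Japanese DYS712 counts as reported: 17 mutated pairs out of 178.
jp_mutated, jp_pairs = 17, 178
jp_rate_per_thousand = jp_mutated / jp_pairs * 1000

# HYPOTHETICAL European counts, used only to make the example runnable.
eu_mutated, eu_pairs = 20, 500

p = fisher_exact_two_sided(
    jp_mutated, jp_pairs - jp_mutated,
    eu_mutated, eu_pairs - eu_mutated,
)

if __name__ == "__main__":
    print(f"Japanese DYS712 rate: {jp_rate_per_thousand:.1f} x 10^-3")
    print(f"Two-sided p-value (hypothetical European counts): {p:.4f}")
```

      Replacing the placeholder counts with the actual numbers of mutated and total pairs from the European study would give the direct comparison for a given marker.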
      Due to the limited sample size, the evidence we present here that the increased RM Y-STR mutation and relative differentiation rates in Japanese compared to Europeans are explained by population effects is not statistically clear-cut. However, our findings show a clear trend, which should be confirmed with more Japanese data in future studies. Moreover, other population samples of paternally related males from different parts of the world and of suitable size should also be analyzed with RMplex and relevant Y-SNPs to collect more empirical evidence on whether population differences in RM Y-STR mutation and relative differentiation rates truly exist and can be explained.
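      As a rough illustration of how per-marker mutation rates translate into father-son differentiation, the sketch below computes the probability that at least one of the six markers in Table 2 mutates in a single father-son transmission. It assumes that markers mutate independently, which is a simplification, and it covers only these six markers, so the values are not the RMplex differentiation rates reported in this study. The function name `p_at_least_one_mutation` is an assumption made for this example.

```python
from math import prod

# Mutation rates (x 10^-3) for the six markers, copied from Table 2.
japanese_rates = {
    "DYF399S1": 106.7, "DYF1001": 95.5, "DYS712": 95.5,
    "DYS713": 39.3, "DYS458": 33.7, "DYS460": 22.5,
}
european_rates = {
    "DYF399S1": 77.4, "DYF1001": 36.0, "DYS712": 43.4,
    "DYS713": 17.0, "DYS458": 13.2, "DYS460": 7.5,
}


def p_at_least_one_mutation(rates_per_thousand):
    """P(at least one marker mutates) = 1 - product of (1 - rate_i).

    Assumes independent mutation events across markers (a simplification).
    """
    return 1.0 - prod(1.0 - r / 1000.0 for r in rates_per_thousand)


jp = p_at_least_one_mutation(japanese_rates.values())
eu = p_at_least_one_mutation(european_rates.values())

if __name__ == "__main__":
    print(f"Japanese (six markers): {jp:.3f}")
    print(f"European (six markers): {eu:.3f}")
```

      Under these assumptions, the six-marker probability is higher for the Japanese rates than for the European rates, which is in line with the direction of the trend described above.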

      4. Conclusion

      Here, we showed for the first time the efficiency with which RMplex differentiates non-European, i.e., Japanese, father-son pairs. The differentiation rate (52%) was significantly higher than previously established in Europeans (42%), and much higher than with the current state-of-the-art commercial Y-STR kit Yfiler Plus (14%). Furthermore, we show that Y-STR mutation rates may depend on Y-STR allele lengths and that this effect can be linked to Y-SNP haplogroup background. By showing preliminary evidence for population differences in RM Y-STR mutation and male relative differentiation rates, our study highlights that the rates obtained from one population may not necessarily be transferable to samples from another population. For forensic practice, this means that using RM Y-STR mutation rates established in, for instance, Europeans to interpret RMplex casework results from a suspect of non-European paternal ancestry may be error-prone. Additionally, our study calls for more RMplex population studies of suitably large sample size (i.e., larger than used here) to further study potential population effects on RM Y-STR mutation and relative differentiation rates, in which Y-SNP haplogroups should also be analyzed to better understand the rate differences that may be observed between populations. Ultimately, our study implies that there may be a need to look beyond locus-specific mutation rates and to start establishing population-specific, haplogroup-specific, or perhaps allele-specific mutation rates for each locus.

      References

        • Kayser M.
        Forensic use of Y-chromosome DNA: a general overview.
        Hum. Genet. 2017; 136: 621-635
        • Ballantyne K.N.
        • et al.
        Mutability of Y-chromosomal microsatellites: rates, characteristics, molecular bases, and forensic implications.
        Am. J. Hum. Genet. 2010; 87: 341-353
        • Ballantyne K.N.
        • et al.
        A new future of forensic Y-chromosome analysis: rapidly mutating Y-STRs for differentiating male relatives and paternal lineages.
        Forensic Sci. Int. Genet. 2012; 6: 208-218
        • Ralf A.
        • et al.
        Identification and characterization of novel rapidly mutating Y-chromosomal short tandem repeat markers.
        Hum. Mutat. 2020; 41: 1680-1696
        • Ralf A.
        • et al.
        RMplex: an efficient method for analyzing 30 Y-STRs with high mutation rates.
        Forensic Sci. Int. Genet. 2021; 55: 102595
        • Neuhuber F.
        • et al.
        Improving the differentiation of closely related males by RMplex analysis of 30 Y-STRs with high mutation rates.
        Forensic Sci. Int. Genet. 2022; 102682
        • Ballantyne K.N.
        • et al.
        Toward male individualization with rapidly mutating Y‐chromosomal short tandem repeats.
        Hum. Mutat. 2014; 35: 1021-1032
        • Adnan A.
        • et al.
        Improving empirical evidence on differentiating closely related men with RM Y-STRs: a comprehensive pedigree study from Pakistan.
        Forensic Sci. Int. Genet. 2016; 25: 45-51
        • Boattini A.
        • et al.
        Mutation rates and discriminating power for 13 rapidly-mutating Y-STRs between related and unrelated individuals.
        PLoS One. 2016; 11: e0165678
        • Chen Y.
        • et al.
        Mutation rates of 13 RM Y-STRs in a Han population from Shandong province, China.
        Forensic Sci. Int. Genet. Suppl. Ser. 2017; 6: e346-e348
        • Yuan L.
        • et al.
        Mutation analysis of 13 RM Y-STR loci in Han population from Beijing of China.
        Int. J. Leg. Med. 2019; 133: 59-63
        • Zgonjanin D.
        • et al.
        Mutation rate at 13 rapidly mutating Y-STR loci in the population of Serbia.
        Forensic Sci. Int. Genet. Suppl. Ser. 2017; 6: e377-e379
        • Zhang W.
        • et al.
        Multiplex assay development and mutation rate analysis for 13 RM Y-STRs in Chinese Han population.
        Int. J. Leg. Med. 2017; 131: 345-350
        • Claerhout S.
        • et al.
        Determining Y-STR mutation rates in deep-routing genealogies: identification of haplogroup differences.
        Forensic Sci. Int. Genet. 2018;
        • Kelkar Y.D.
        • et al.
        The genome-wide determinants of human and chimpanzee microsatellite evolution.
        Genome Res. 2008; 18: 30-38
        • Brinkmann B.
        • et al.
        Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat.
        Am. J. Hum. Genet. 1998; 62: 1408-1415
        • Otagiri T.
        • et al.
        Mutation analysis for 25 Y-STR markers in Japanese population.
        Leg. Med. 2021; 50: 101860
        • R Core Team
        R: a language and environment for statistical computing.
        2013
        • Liu J.
        • et al.
        The construction and application of a new 17-plex Y-STR system using universal fluorescent PCR.
        Int. J. Leg. Med. 2020; 134: 2015-2027
        • Ochiai E.
        • et al.
        Y chromosome analysis for common surnames in the Japanese male population.
        J. Hum. Genet. 2021; 66: 731-738
        • Navarro-López B.
        • et al.
        Phylogeographic review of Y chromosome haplogroups in Europe.
        Int. J. Leg. Med. 2021; 135: 1675-1684
        • Lang M.
        • et al.
        Forensic characteristics and genetic analysis of both 27 Y-STRs and 143 Y-SNPs in Eastern Han Chinese population.
        Forensic Sci. Int. Genet. 2019; 42: e13-e20
        • Liu J.
        • et al.
        Development of a new 17 Y-STRs system using fluorescent-labelled universal primers and its application in Shanxi population in China.
        Forensic Sci. Int. Genet. Suppl. Ser. 2019; 7: 95-97