Forensic Laboratory for DNA Research, Department of Human Genetics, Leiden University Medical Center, Postzone S-05-P, P.O. Box 9600, 2300 RC Leiden, The Netherlands
Department of Human Biological Traces, Netherlands Forensic Institute, P.O. Box 24044, 2490 AA The Hague, The Netherlands
• All 5 Y-STR multiplexes efficiently generated genotyping data for 2085 Dutch donors.
• 99.0% of the haplotypes were unique when all 36 Y-STR marker units were combined.
• 19 Y-STR marker units were present in multiple kits and showed 0.002% discordance.
Abstract
The genotypes of 36 Y-chromosomal short tandem repeat (Y-STR) marker units were analysed in a Dutch population sample of 2085 males. Profiling results were compared for several partially overlapping kits, i.e. PowerPlex Y, Yfiler, PowerPlex Y23, and two in-house designed multiplexes with rapidly mutating Y-STRs. Nineteen Y-STR marker units, of which two are rapidly mutating, reside in at least two of these multiplexes, and for these markers concordance testing was performed. Two samples showed discordant genotyping results and the probable causative base change was revealed by Sanger sequencing. In addition, we encountered concordant, but aberrant genotyping results including one allele with low peak height and several null alleles. For 12 samples, this involved a null allele in two adjacent loci suggesting a large and recurrent deletion as the samples represent three distinct haplogroups. For each marker unit, the allele counts and frequencies are presented, as are the haplotype counts and haplotype diversities for several combinations of markers.
The number of Y-chromosomal short tandem repeat (Y-STR) markers for routine forensic and population genetic use has grown considerably over the past few years. Initially, a minimal haplotype set of nine Y-STR marker units was recommended for forensic use [
]. The subsequently developed and commercially available multiplexes contain a growing number of Y-STR marker units, such as 12 in the PowerPlex® Y System (PPY, Promega, released in 2003), 17 in the AmpFlSTR® Yfiler®, (Yfiler, Life Technologies, released in 2004), 23 in the PowerPlex Y23 System (PPY23, Promega, released in 2012) and 27 in the AmpFlSTR® Yfiler® Plus Kit [
Y-STRs can be of great value in stains with small quantities of male DNA and overwhelming amounts of female DNA, for instance in sexual assault cases. They can also be very useful in kinship analyses, due to their strict paternal inheritance pattern. However, as a result of the relatively low mutation rate for the commonly used Y-STRs, it is difficult, if not impossible, to differentiate between closely related males. The introduction of 13 rapidly mutating (RM) Y-STRs with median mutation rates about 6.5 times higher than the Yfiler STRs [
]. We use the term “marker unit” for previously defined distinct Y-STR markers, e.g. for DYS385 a separate “a” and “b” part are described and these are counted as two marker units (resulting for instance in 17 marker units for Yfiler in total), while DYF387S1 is counted as one marker unit even though it can show up to three alleles (resulting in 15 RM marker units in total). These 36 marker units were tested in 2085 DNA samples from Dutch male blood donors. For the 19 Y-STR marker units that are present in more than one set, concordance testing was performed and discordant alleles were subsequently analysed with Sanger sequencing. Allele counts and frequencies are reported together with the haplotype counts and haplotype diversities for several marker combinations. All PowerPlex Y23 haplotypes have been submitted to the publicly available Y Chromosome Haplotype Reference Database (YHRD) [
A total of 2085 male blood donors with self-defined Dutch ancestry were sampled from 99 locations across the Netherlands, while excluding major cities to avoid very recent admixture effects. All volunteers had given their informed consent. A detailed description of the samples is given in [Clinal distribution of human genomic diversity across the Netherlands despite archaeological evidence for genetic discontinuities in Dutch population history].
2.2 Marker units, DNA amplification, capillary electrophoresis (CE) and DNA profile analysis
All 2085 DNA samples were amplified with five Y-STR multiplex PCRs, targeting 36 marker units (present in 32 different Y-STRs, of which one has a “I” and a “II” part, i.e. DYS389, and three have an “a” and a “b” part, i.e. DYF403S1, DYS385 and DYS526). Three of these multiplexes are commercially available: PPY and PPY23 from Promega Corporation (Promega, Madison, WI, USA) and Yfiler from Life Technologies (Life Tech, Foster City, CA, USA). All 12 PPY marker units reside in Yfiler, and all 17 Yfiler marker units are represented in PPY23 (Table 1). The other two multiplexes (RMY1 and RMY2) were redesigned in-house based on the three RM Y-STR multiplexes published in [
]. They analyse 15 rapidly mutating Y-STR marker units (that reside in 13 Y-STRs). RMY1 holds six and RMY2 nine marker units, and RMY2 contains two marker units overlapping with PPY23 (Table 1).
Table 1. Marker units present in five Y-STR multiplexes together with the number and percentage of discordant results.
DNA amplification with the Yfiler, PPY, and PPY23 multiplexes was performed according to the manufacturer's protocols, but with half of the reaction volume. PCR products were detected by CE on an ABI Prism 3100 Genetic Analyzer (Life Tech), using a 36 cm array, POP-4 and dye set G5 (for Yfiler and PPY23) or C (for PPY). 1 μL sample or allelic ladder was mixed with 11.6 μL ddH2O and 0.4 μL GeneScan™ LIZ 600 Size Standard (Life Tech) for Yfiler, with 11.5 μL ddH2O and 0.5 μL ILS600 (Promega) for PPY, or with 11 μL ddH2O and 1 μL CC5 ILS500 Y23 (Promega) for PPY23, and analysed after 3 min of denaturation and 3 min on ice. CE injection settings were 1 kV for 22 s for Yfiler and PPY, and 3 kV for 5 s for PPY23. The Y-STR profiles were analysed using GeneMapper v. 3.0 (Life Tech) for PPY or GeneMarker v. 1.75 (Softgenetics, LLC., State College, PA, USA) for Yfiler and PPY23 with a detection threshold of 30 rfu.
RMY1 and RMY2 PCRs were performed in a 10 μL reaction volume using 1× QIAGEN Multiplex PCR Buffer (Qiagen, Venlo, the Netherlands), primers as described in Supplementary Table S1 and 1.0 ng DNA. The PCR protocol starts with a pre-denaturation step for 10 min at 94 °C, followed by a step-down PCR of 10 cycles at 94 °C for 30 s, 65 °C (1 °C/cycle) for 30 s and 72 °C for 1 min, and 23 cycles (for RMY1) or 25 cycles (for RMY2) of 94 °C for 30 s, 50 °C for 30 s and 72 °C for 1 min, with a final extension at 60 °C for 45 min. PCR products were detected by CE on an ABI Prism 3130xl Genetic Analyzer (Life Tech), using a 36 cm array, POP-7 and dye set G5. 1 μL sample was mixed with 8.7 μL Hi-Di™ Formamide (Life Tech) and 0.3 μL GeneScan™ LIZ 600 Size Standard (Life Tech), and analysed after 4 min of denaturation and 5 min on ice. CE injection settings were 3 kV for 10 s. The RM Y-STR profiles were analysed using GeneMapper® ID-X v. 1.1.1 (Life Tech) with a detection threshold of 50 rfu. For most markers a stutter filter of 20% was applied, except for DYS518 and DYS526b (both 25%), DYS570 (30%) and DYS612 (35%).
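The marker-specific stutter percentages listed above can be written down as a small lookup table. The following is a minimal sketch, not the GeneMapper ID-X implementation: the function name `filter_peaks`, the simplified model in which only a peak one repeat shorter than a taller peak is considered stutter, and the peak heights are assumptions made for illustration.

```python
# Stutter filter percentages as stated in the text; every marker unit not
# listed here uses the default of 20%.
DEFAULT_STUTTER = 0.20
STUTTER_FILTER = {"DYS518": 0.25, "DYS526b": 0.25, "DYS570": 0.30, "DYS612": 0.35}
DETECTION_THRESHOLD_RFU = 50  # detection threshold used for the RM Y-STR profiles


def filter_peaks(marker, peaks):
    """Return the alleles kept for one marker unit.

    `peaks` maps allele (in repeat units) to peak height in rfu. A peak is
    dropped when it is below the detection threshold, or when it lies one
    repeat below another peak and its height is under the stutter
    percentage of that parent peak (an assumed, simplified -1 repeat model).
    """
    ratio = STUTTER_FILTER.get(marker, DEFAULT_STUTTER)
    kept = []
    for allele, height in sorted(peaks.items()):
        if height < DETECTION_THRESHOLD_RFU:
            continue
        parent = peaks.get(allele + 1)
        if parent is not None and height < ratio * parent:
            continue  # treated as stutter of the parent peak
        kept.append(allele)
    return kept


# Hypothetical peak heights, for illustration only.
print(filter_peaks("DYS612", {35: 300, 36: 1000}))  # 300 rfu is below 35% of 1000 rfu
print(filter_peaks("DYS570", {17: 320, 18: 1000}))  # 320 rfu is above 30% of 1000 rfu
```

In this sketch the first call keeps only allele 36, while the second keeps both alleles, which is the kind of result that would then need manual review as a possible second allele.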
Singleplex PCRs were performed in a 25 μL reaction volume using PCR buffer I (Life Tech) with 1.5 mM MgCl2, 0.2 mM dNTP mix (Life Tech), 2 units AmpliTaq Gold (Life Tech) and 2 pmol of each HPLC-purified primer (Supplementary Table S1). Amplification, purification, sequencing, detection and sequence analysis were performed as described in [
2.4 Allele counts, allele frequencies, haplotypes and haplotype diversities
Based on the Y-STR data, haplotypes were constructed and compared using Excel (Microsoft, Redmond, WA, USA) for all 2085 donors. For each allele in each marker unit, the number of occurrences was counted. Allele frequencies were calculated by dividing the count for a specific allele by the total number of counted alleles for that marker unit (which was not always 2085, due to null alleles or additional alleles in multi copy marker units). Haplotype diversities were calculated using Arlequin v3.5.1.3 [
] and an adjusted version of an Excel worksheet kindly provided by Ballantyne (personal communication) to be able to calculate numbers with more than four digits after the decimal point (our version of the worksheet is available on request). For multi copy marker units, the “empty cells” were filled with a dummy variable for donors that showed fewer than the maximum number of alleles.
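The allele frequency calculation described above can be sketched in a few lines. This is an illustration only, not the Excel workflow used in the study: the function name `allele_frequencies`, the list-per-donor data layout and the example calls are assumptions. The denominator is the number of counted alleles rather than the number of donors, so null alleles and additional alleles are handled as described.

```python
from collections import Counter


def allele_frequencies(calls):
    """Allele counts and frequencies for one marker unit.

    `calls` holds one list of alleles per donor: an empty list for a null
    allele, one allele for a typical single copy result, and several
    alleles for multi copy marker units or donors with an extra allele.
    """
    counts = Counter(allele for donor in calls for allele in donor)
    total = sum(counts.values())  # counted alleles, not the number of donors
    return {allele: (n, n / total) for allele, n in sorted(counts.items())}


# Hypothetical calls for five donors at one marker unit: one null allele
# and one donor showing two alleles.
calls = [[14], [15], [14], [], [14, 15]]
for allele, (n, freq) in allele_frequencies(calls).items():
    print(allele, n, round(freq, 3))
```

With these five hypothetical donors, five alleles are counted in total, so the frequencies are based on a denominator of five even though one donor contributes none and another contributes two.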
2.5 Haplogroups and familial relationships
For 12 donors, Y-SNP analysis was performed to determine their haplogroup using the methods described in [
] (http://www.bonaparte-dvi.com). To this end, fictive family trees were produced in which one of the donors of a pair was fixed (grey square in Fig. 1) and the other donor was tested for all the other possible male relationships (eight white squares in Fig. 1). Additional relationship testing was performed with a version of RelPair [
] that was adjusted to enable the analysis of a dataset containing 2085 individuals (details are available on request).
Fig. 1. Fictive family tree used to deduce the most likely family relationship between two donors. Based on genotyping information at 23 autosomal STRs, Bonaparte software is used to calculate the log10(LR) for the different relationships by fixing one of the donors of a pair (grey square) and testing the second donor for the other possible male relationships (eight white squares).
DNA samples of 2085 male donors were analysed with five Y-STR multiplexes: PPY, Yfiler, PPY23, RMY1 and RMY2 (both in-house designed, based on the markers published in [
]). Of the 36 Y-marker units analysed by these multiplexes, 19 reside in two or three systems (Table 1) and enable concordance testing. Two discordances were found (Table 2): for one person DYS448 showed an allele 19 for PPY23 and no allele with Yfiler, while for another person Yfiler resulted in an allele call 23 for DYS635 with no result for PPY23. Using Sanger sequencing, for both discordances single base changes were disclosed: an A > G transition 49 nucleotides prior to the DYS448 repeat motif, and a T > A transversion 7 nucleotides before the DYS635 repeat structure. As the primer positions for these markers are not publicly available, we cannot check whether these nucleotide changes are located at the primer binding sites for the kits showing the null allele. Both Davis et al. [
] did not find any discordance in the 17 overlapping loci between Yfiler and PPY23 in their sample sets of 951 American and 535 Belgian donors, respectively. This is consistent with the low discordance of 0.002% that we observe in our larger Dutch dataset (Table 1).
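Concordance testing of this kind amounts to comparing, per donor, the calls for each marker unit that is typed by more than one kit. The sketch below assumes a nested dictionary layout and a hypothetical function name `find_discordances`; the example profiles are invented, with one donor modelled on the DYS448 discordance described above. A null allele seen in every kit counts as concordant here, in line with the concordant null alleles reported in the text.

```python
def find_discordances(profiles, shared_markers):
    """Compare calls for marker units typed by more than one kit.

    `profiles` maps donor -> kit -> marker unit -> allele call (None for no
    result). `shared_markers` maps each marker unit to the kits containing
    it. Returns the number of donor/marker comparisons and the discordances.
    """
    compared, discordant = 0, []
    for donor, kits in profiles.items():
        for marker, kit_names in shared_markers.items():
            calls = {kit: kits[kit].get(marker) for kit in kit_names if kit in kits}
            if len(calls) < 2:
                continue  # nothing to compare for this donor and marker unit
            compared += 1
            if len(set(calls.values())) > 1:
                discordant.append((donor, marker, calls))
    return compared, discordant


# Hypothetical data for two donors and two shared marker units.
shared = {"DYS448": ["Yfiler", "PPY23"], "DYS635": ["Yfiler", "PPY23"]}
profiles = {
    "donor_A": {"Yfiler": {"DYS448": None, "DYS635": 23},
                "PPY23": {"DYS448": 19, "DYS635": 23}},
    "donor_B": {"Yfiler": {"DYS448": 20, "DYS635": 23},
                "PPY23": {"DYS448": 20, "DYS635": 23}},
}
compared, discordant = find_discordances(profiles, shared)
print(compared, discordant)
```

Dividing the number of discordances by the number of comparisons gives a discordance percentage; how comparisons are counted for marker units present in three kits is a design choice that this sketch does not settle.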
Table 2. Y-STR discordances and null alleles in a population sample of 2085 Dutch males.
Besides the two discordances described above, 32 other null alleles were observed. For seven donors, a null allele was found on DYF403S1b, which is only present in RMY2 (Table 2). For one person DYS439 showed no results in all three commercial kits (PPY, Yfiler and PPY23; Table 2). In 12 different samples both DYS448 (present in Yfiler and PPY23) and DYS626 (present in RMY1) showed no results (Table 2). These marker units are located 52.2 kbp from each other with none of the other markers situated between them [
Dynamic nature of the proximal AZFc region of the human Y chromosome: multiple independent deletion and duplication events revealed by microsatellite analysis.
]), but since DYS626 is less commonly typed it is unclear whether these have such a double null allele as well. In order to test whether the 12 persons with this double null allele in our sample set are related in the male lineage, their haplogroups were determined using Y-SNPs. Six of them demonstrated haplogroup I, five had haplogroup R1a and one showed haplogroup R1b. Therefore, we deduce that the 12 persons with this double null allele do not originate from one male lineage, and that this double null allele is recurrent and identical by state among the different haplogroups. When examining the Y-STR haplotypes for the persons belonging to the same haplogroup (I or R1a), it was noticed that two donors in haplogroup I showed haplotypes with only one difference between them, while all others displayed at least five differences (data not shown). This one difference was detected in the rapidly mutating DYF403S1b marker and we infer that these two donors may be (closely) related in the male lineage, which would mean that, in this case, the double null allele is identical by descent. However, relationship testing based on 23 autosomal STRs did not suggest a first or second degree relationship between these donors (data not shown).
A noteworthy observation occurred for single copy marker DYS576, as in one person an additional allele 14 of low peak height was found next to a much higher allele 18 in both PPY23 and RMY2 profiles. The peak height ratio between both alleles varied between 0.12 and 0.31 in four independent amplifications with both multiplexes. The presence of the two alleles was confirmed with Sanger sequencing, although the signals for allele 14 were again very low and did not allow detecting a possible primer binding site mutation. As the PCR primers for Sanger sequencing were positioned at least 100 nucleotides further up- and downstream than those used in RMY2 (and the primer positions for PPY23 are unknown), we infer either the presence of multiple primer binding site mutations, or a chimeric situation that is specific for this Y-STR marker as none of the other Y-STR or autosomal markers showed additional weak alleles. More detailed sequence information may be obtained from next generation sequencing [
], but for now it remains unclear what causes the presence of the second lower allele on DYS576 in this sample.
3.2 Allele counts, allele frequencies and haplotypes
Four RM Y-STR marker units (DYF387S1, DYF399S1, DYF403S1a and DYF404S1) most often show multiple alleles per marker (between one and five alleles, Table 1), and are therefore categorised as multi copy markers. The other 32 marker units are considered single copy markers, although 14 of these, including the previously described DYS576, show a second allele in one to six of the 2085 samples (Table 1). In 26 of the 32 cases, the second allele differs by only one repeat from the first allele, but size differences of up to six repeats have been found. Except for the previously described sample showing two unbalanced alleles in DYS576, both alleles are balanced in the other cases. Therefore, one needs to realise that a second allele on a marker unit that is believed to be a single copy marker does not always reflect a mixture of two donors.
All haplotypes are presented in Supplementary Table S2. For all 36 marker units, the alleles present in the 2085 DNA samples were counted and their frequencies were calculated (Table 3). DYS393 and DYS437 show the smallest allelic range with only five different alleles in our Dutch population sample; DYF399S1 has the largest range with 36 different alleles.
Table 3. Allele counts and frequencies per marker unit.
Next, we examined the haplotypes resulting from different combinations of Y-STR marker units: the minimum YHRD marker set, the various commercial kits (PPY, Yfiler and PPY23), the rapidly mutating Y-STRs (RMY1 + RMY2), and all 36 marker units together (PPY23 + RMY1 + RMY2). Table 4 shows the level of uniqueness of haplotypes (the number of times a haplotype was observed) and how many haplotypes have that level of uniqueness (the number of occurrences in our 2085 samples). In general, with more Y-STR markers, more unique haplotypes are found. The PPY23 markers resulted in 92.5% unique haplotypes (1929 haplotypes occurred only once (Table 4), haplotype diversity = 0.999959494976 (Table 5)), which is in the same range as the 93.5% described for the European group analysed with PPY23 by Purps et al. [
]. For the RM Y-STRs (RMY1 + RMY2), 98.4% unique haplotypes were observed (2052 haplotype singletons (Table 4), haplotype diversity = 0.999991714881 (Table 5)), which is somewhat lower than the 100% reported by Ballantyne et al. [
] for the 112 Dutch samples in their set. When combining all 36 Y-STR marker units, 2065 haplotypes were seen just once (99.0% unique haplotypes (Table 4), haplotype diversity = 0.999995397156 (Table 5)) and ten were each seen twice (representing ten haplotype pairs), resulting in 2075 different haplotypes for the complete set of 2085 samples.
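The figures for the complete 36 marker unit set can be checked from the counts given above (2065 haplotypes seen once, ten seen twice). The sketch below assumes the commonly used estimator HD = n/(n − 1) · (1 − Σ p_i²); the text does not spell out the formula applied by Arlequin or the worksheet, but with these counts this estimator agrees with the reported value of 0.999995397156. The function name `haplotype_summary` is hypothetical.

```python
def haplotype_summary(spectrum):
    """Summarise a haplotype frequency spectrum.

    `spectrum` maps "times a haplotype was observed" to "number of
    haplotypes observed that many times".
    """
    n = sum(k * m for k, m in spectrum.items())   # sampled donors
    distinct = sum(spectrum.values())             # different haplotypes
    unique_fraction = spectrum.get(1, 0) / n      # singletons per donor
    sum_p2 = sum(m * (k / n) ** 2 for k, m in spectrum.items())
    diversity = n / (n - 1) * (1 - sum_p2)        # assumed estimator, see lead-in
    return n, distinct, unique_fraction, diversity


# Counts for all 36 marker units as stated in the text.
n, distinct, unique_fraction, diversity = haplotype_summary({1: 2065, 2: 10})
print(n, distinct, f"{unique_fraction:.1%}", f"{diversity:.12f}")
```

The same function can be applied to the other rows of Table 4 once the full spectrum for a marker combination is entered, since the diversity depends on all multiplicity classes and not only on the number of singletons.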
Table 4. Uniqueness of Y-STR haplotypes in 2085 Dutch samples.
]. Bonaparte software was used to deduce the most likely family relationship between the two donors residing in one haplotype pair, based on fictive family trees in which one of the donors of a pair was fixed (grey square in Fig. 1) and the other donor was tested for all the other possible male relationships (eight white squares in Fig. 1). When the donors were switched, slightly different log10(LR) scores were obtained, due to the differences in genotypes and their corresponding allele frequencies in the formulae, but all results were comparable, as expected (results not shown). Based on the log10(LR) results, we infer that two of the haplotype pairs have a father/son relationship (log10(LR) of 8.1 or 10.5), two have a brother/brother relationship (log10(LR) of 6.3 or 12.2) and the other six are likely to have a more distant relationship than the eight relationships tested in Fig. 1 (log10(LR) between −28.3 and 1.6). These results were confirmed by RelPair analyses (results not shown). Although the study was designed to sample unrelated individuals, it appears that a few family relationships are present in such a large population sample.
4. Concluding remarks
All the commercial multiplexes (PPY, PPY23 and Yfiler) and the redesigned RM Y-STR multiplexes (RMY1 and RMY2) tested in this study functioned well and efficiently generated genotyping data for all 2085 Dutch donors. Very little discordance (0.002%) was detected in our data set, which contained 19 Y-STR marker units that were present in multiple (two or three) kits. This might be due to little nucleotide variation in the areas around the targeted markers, or companies using similar primers. The percentage of unique haplotypes was 92.5% for the 23 marker units in PPY23, 98.4% for the 15 RM Y-STR marker units, and it was even raised to 99.0% when all 36 marker units were combined, resulting in a very high discriminating power for Y-STR standards.
Acknowledgements
This study was supported by a grant from the Netherlands Genomics Initiative/Netherlands Organization for Scientific Research (NWO) within the framework of the Forensic Genomics Consortium Netherlands. We thank Kaye Ballantyne for her assistance in the haplotype diversity calculations.
References
de Knijff P., Kayser M., Caglia A., Corach D., Fretwell N., Gehrig C., Graziosi G., Heidorn F., Herrmann S., Herzog B., Hidding M., Honda K., Jobling M., Krawczak M., Leim K., Meuser S., Meyer E., Oesterreich W., Pandya A., Parson W., Penacino G., Perez-Lezaun A., Piccinini A., Prinz M., Roewer L. Chromosome Y microsatellites: population genetic and evolutionary aspects.
Clinal distribution of human genomic diversity across the Netherlands despite archaeological evidence for genetic discontinuities in Dutch population history.
Dynamic nature of the proximal AZFc region of the human Y chromosome: multiple independent deletion and duplication events revealed by microsatellite analysis.