Servicio General de Investigación Genómica: Banco de ADN (SGIker), UPV/EHU, Vitoria-Gasteiz, Spain
BIOMICs Research Group, Centro de Investigación “Lucio Lascaray”, University of the Basque Country, Vitoria-Gasteiz, Spain
The GHEP-ISFG Working Group performed a collaborative exercise to monitor current practice in mitochondrial (mt)DNA reporting. The participating laboratories were invited to evaluate a hypothetical case example and assess the statistical significance of a match between the haplotypes of a case (hair) sample and a suspect. A total of 31 forensic laboratories participated, of which all but one used the EMPOP database. Nevertheless, we observed a tenfold range of reported LR values (32 to 333.4), which was mainly due to the selection of different reference datasets in EMPOP but also to differences in the applied formulae. The results suggest the need for more standardization as well as additional research to harmonize the reporting of mtDNA evidence.
The interpretation of matching DNA profiles involves a statistical evaluation to assess the strength of the evidence. Likelihood ratios (LRs) and Random Match Probabilities (RMPs) are two of several possible ways of expressing it. Different procedures apply to nuclear and mitochondrial markers because of their genetic properties. The variation of autosomal Short Tandem Repeat loci (aSTRs) can be attributed mainly to slippage mutation events and, to a much smaller extent, to recombination. The distribution of allele frequencies has traditionally been estimated from relatively small samples of randomly selected individuals (e.g. a few hundred), as the number of observed alleles for an STR locus is usually not very large. Further, Mendelian inheritance allows for the assumption that populations are more or less in equilibrium (that is, the allelic frequencies are roughly maintained across generations). Thus, generally accepted criteria have been established for estimating allele and genotype frequencies in aSTRs.
In contrast, polymorphism in mtDNA is restricted to single mutational base exchanges, with an apparent lack of recombination. The spatial distribution of contemporary mtDNA haplotypes reflects the history of the radiating lineages that left their footprints during human migration. Numerous population studies have established and confirmed that the number of unique (novel) mtDNA haplotypes usually well exceeds 50% of the total sample size and is only reduced in populations with restricted gene flow, genetic bottlenecks and/or strong founder effects with limited subsequent maternal genetic exchange. The rarity of an mtDNA profile is traditionally estimated by querying datasets (databases) of randomly sampled individuals. The high number of unobserved “alleles” requires much larger sample sizes than for aSTRs. In fact, haplotype frequency estimates from small mtDNA population databases are often biased, and even in larger databases the estimation of rare haplotypes poses a major challenge.
Between 1990 and 2000, numerous individual laboratories started to establish local mtDNA datasets for frequency estimation that were particularly small and, as was found out later, often fraught with error. The European DNA Profiling Group (EDNAP, www.ednap.org) initiated collaborative exercises on mtDNA typing to understand the sources of and reasons for error. Human clerical error and phantom mutations (artificial signals due to suboptimal experimental conditions) were found to be responsible for the majority of problems in the generated data. As a consequence, improved experimental protocols and specific guidelines were developed to raise the quality of mtDNA datasets. Sequence data generated under these and similar protocols form the forensic data source in EMPOP, an internet-based resource presenting mtDNA haplotypes from all over the world (www.empop.org). Through continued collaborative initiatives, the total sample size has since exceeded 16,000 (Release 6, March 2012), resulting in improved estimation of haplotype frequencies.
The interpretation and statistical evaluation of mtDNA typing results is not always straightforward. It is not uncommon for differences to be observed between the haplotypes of maternal relatives. The allelic frequency distribution of aSTR markers is known to differ between populations; however, the effect on the final LR is not as pronounced as with mtDNA haplotypes, where it is more crucial to select the relevant population database. Also, correction for subpopulation effects is more difficult with mtDNA and may lead to higher correction factors than for aSTRs. Other issues typically encountered when analyzing mtDNA are the sequence range of the query haplotype, the inclusion or exclusion of point and/or length heteroplasmies, the selection of the appropriate reference population and the methodology applied to correct for sampling bias.
The Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics (GHEP-ISFG) has for many years shown considerable interest in the use and standardization of mtDNA analysis for forensic purposes. So far, a lot of information has been made available on the protocols and mtDNA analysis strategies that forensic laboratories use. Little, however, is documented about how member laboratories report mtDNA results. In order to obtain an overview of the communication of mtDNA evidence, the GHEP-ISFG proposed an exercise on the interpretation of mtDNA results in its 2011 proficiency test.
2. Hypothetical case example
Participating laboratories were asked to evaluate a hypothetical forensic case (homicide in Barcelona) in which the mtDNA haplotype of a hair shaft found in the hand of the victim matched the suspect's haplotype (see Table 1). The laboratories were also asked to consider the evidence under the following two hypotheses:
(a) Prosecutor: Hp: the hair shaft originates from the suspect or from a maternal relative of the suspect.
(b) Defense: Hd: the hair shaft originates from an unknown individual of the European population not related to the suspect.
Table 1. MtDNA haplotypes from a hypothetical case example. The hair sample was recovered from a hand of the victim and matched the suspect's haplotype.
Two additional questions had to be answered by the laboratories in order to better evaluate their results: (i) which mtDNA database they have used and (ii) the number of matches that were found in that database.
A total of 31 laboratories participated in this GHEP-ISFG exercise (not including the laboratory managing the EMPOP database in Innsbruck).
3. Results
Almost all participating laboratories (30 of 31) used the EMPOP database to infer the rarity of the given control region mtDNA haplotype (Table 2). The exercise took place in spring 2011, when EMPOP Release 4 was online with a total of 12,785 haplotypes. Of these, 2667 were full control region profiles from Europe (3201 full control region profiles with west Eurasian metapopulation affiliation). Under standard query settings (pattern search, disregarding differences in C-stretches; see below) the mtDNA haplotype in question was observed 26 times in Europe (29 times in “west Eurasia”). EMPOP provided the uncorrected frequency estimates for the queries (9.749e−3 and 9.060e−3, respectively) along with the lower and upper bounds of the 95% confidence interval (C.I.) ([6.661e−3; 1.425e−2] and [6.315e−3; 1.298e−2], respectively), estimated according to Wilson.
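The frequencies and interval bounds quoted above can be checked with a few lines of Python. This is a minimal sketch of the Wilson score interval, not EMPOP's own code: the function name `wilson_interval` and the choice of z = 1.96 for the 95% level are assumptions made for this illustration.

```python
import math


def wilson_interval(x, n, z=1.96):
    """Return (lower, upper) Wilson score bounds for x matches among n haplotypes.

    z = 1.96 is assumed here for a two-sided 95% interval.
    """
    p = x / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Match counts and dataset sizes as quoted in the text (EMPOP Release 4).
for label, x, n in [("Europe", 26, 2667), ("west Eurasia", 29, 3201)]:
    lower, upper = wilson_interval(x, n)
    print(f"{label}: f = {x / n:.3e}, 95% C.I. = [{lower:.3e}; {upper:.3e}]")
```

Run on the two query results, the sketch should give bounds that agree closely with the values quoted in the text, up to rounding.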
F = forensic data, L = literature data, details see text.
Match type indicates query settings with respect to heteroplasmy, details see text.
x = matches in the database; N = database size; p = x/N; q = 1 − p.
g = number of generations (in this case 2 generations were considered, see details in the text); μ = mutation rate per base and per generation (3.6 × 10^−6).
a This laboratory performed the search disregarding insertions/deletions in the homopolymeric C-tract around position 309.
The reported LR values by the participating laboratories ranged from 32.0 to 333.4 (Table 2 and Fig. 1). This can in part be explained by the varying selection of the reference dataset including (i) the entire database content (N = 10, LRs between 200 and 241), (ii) the European portion of EMPOP (N = 17, LRs from 94 to 124), (iii) the South European portion of EMPOP (N = 1, LR = 123.7) and (iv) the Spanish population as reference dataset (N = 2, LR = 32). One laboratory (#25) reported a different result based on its own database. Varying LR values within the first two groups (i and ii) are further due to the following:
(a) Applied formulae: Eleven laboratories (35%) reported the uncorrected frequency, as the haplotype in question was observed more than once in the datasets (with the exception of the two laboratories that used the Spanish dataset; Table 2). Thirteen laboratories (42%) added the suspect's haplotype to the database [(x + 1)/(N + 1); x = number of matches in the database, N = size of the database]. Five laboratories (16%) added both haplotypes (evidence and suspect) to the database [(x + 2)/(N + 2)]. One laboratory (#25) used the upper bound of the confidence interval (C.I.), p + 1.96 √(pq/N) with p = x/N and q = 1 − p, where x is the number of matches in its local database, and one laboratory (#23) used the (x + 1)/(N + 1) correction plus the upper bound of the C.I.
(b) The combined use of forensic and literature data or forensic data only: EMPOP (v2.1, Release 4) contained two types of mtDNA data with regard to source, forensic and literature. While the former are supported by high-quality raw sequencing data, the latter are based on lower quality sequence data only. Seven laboratories (23%) selected only the forensic data source of EMPOP, while the remaining laboratories queried both data sources (except one laboratory that used a different database).
(c) Point heteroplasmy (PHP): All but one laboratory used the pattern search mode (standard settings), which regards two sequences that differ only at heteroplasmic positions as non-exclusive (e.g. 152Y matches 152C). In contrast, the literal search mode would result in an exclusion for the given example.
(d) Length variants: All but two laboratories (#1 and #17) performed the EMPOP query under standard settings with respect to the treatment of differences at the HVS-2 homopolymeric C-tract, i.e. differences were ignored. As with PHP, it is common practice not to formulate an exclusion based on differences there. With the query settings of laboratories #1 and #17, only 8 exact matches (instead of 26) to the haplotype in question were found, as all database haplotypes with 309.1C were excluded from the search. These two laboratories consequently reported lower estimates of the relative haplotype frequency and therefore higher LRs.
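To make the effect of the different formulae concrete, the frequency estimators listed above can be compared side by side. This is a minimal sketch: the function names are hypothetical, the counts are the European figures quoted in the text (26 or 8 matches among 2667 haplotypes), and the LR is taken as 1/f.

```python
import math


def frequency_estimates(x, n, z=1.96):
    """Haplotype frequency under the estimators reported by the laboratories."""
    p = x / n
    return {
        "x/N (uncorrected)": p,
        "(x+1)/(N+1)": (x + 1) / (n + 1),
        "(x+2)/(N+2)": (x + 2) / (n + 2),
        "p + 1.96*sqrt(pq/N)": p + z * math.sqrt(p * (1 - p) / n),
    }


def likelihood_ratios(x, n):
    """LR = 1/f for each estimator, valid when the two haplotypes match."""
    return {name: 1 / f for name, f in frequency_estimates(x, n).items()}


# 26 matches under standard settings, 8 when C-tract differences are not ignored.
for x in (26, 8):
    print(f"x = {x}, N = 2667")
    for name, lr in likelihood_ratios(x, 2667).items():
        print(f"  {name:<22} LR = {lr:.1f}")
```

With 26 matches the uncorrected LR is about 102.6, and each correction lowers it further. With only 8 exact matches the uncorrected LR rises to about 333.4, which coincides with the upper end of the reported range.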
Fig. 1. Summary of reported LR results (N = 31 labs).
4. Discussion
To the best of our knowledge, this is the first report on a collaborative exercise on the statistical evaluation of mtDNA results. A total of 31 laboratories participated in the study, in which a hypothetical case example was described and the resulting mtDNA haplotypes were provided (Tables 1 and 2). Although all but one used the same database (EMPOP v2.1, Release 4), we observed considerable variation in the reported LR values. The reasons are discussed briefly below.
4.1 The selected source dataset
The suspect's haplotype (which matched that of the hair) belongs to haplogroup HV0. This haplogroup is typically found in west Eurasia, particularly in Europe and North America. Some HV0 lineages are also present in northern parts of Africa and extend over the Middle East to Central Asia and the Indian sub-continent, albeit at much lower frequencies.
Under the defense hypothesis that the hair shaft originated from an unknown individual of the European population unrelated to the suspect, it would be appropriate to use the European portion of EMPOP (or of any other database) as the reference dataset. Alternatively, the west Eurasian metapopulation in EMPOP could have been selected, which includes additional “European” lineages sampled outside the borders of contemporary Europe (e.g. US “Caucasians”). One laboratory selected the Southern European dataset, which is another interpretation of the defense hypothesis. While it is not fully in agreement with Hd, it may represent a more cautious way of evaluating the evidence.
Restricting the query dataset to the Kingdom of Spain may seem logical (at least to a layperson) as the crime happened there, but it violates the defense hypothesis, as this would exclude the possibility that the suspect derives from other European populations. The resulting LR is smaller (Table 2), as the haplotype has not been observed in the very small sample of full control region (CR) haplotypes available for Spain.
The selection of the entire database as reference dataset resulted in the highest LR values but violates the defense hypothesis and – even more problematic – leads to underestimation of the haplotype's frequency to the disadvantage of the suspect, e.g. by including all East Asian populations that are characterized by a different phylogenetic background and usually do not harbor HV0 lineages.
The variability of the reported answers demonstrates that more research, education and harmonization are needed to improve consistency in mtDNA interpretation, although some flexibility will (need to) remain in response to different possible formulations of the defense hypothesis.
4.2 Applied formulae
Some of the observed variation in reported LR values can be attributed to differences in the applied formulae. Roughly one third of the laboratories used the uncorrected haplotype frequency, since the haplotype was observed more than once (except for the two laboratories that used Spain as reference population). The recommendations of the International Society for Forensic Genetics and of the EDNAP group state that the observations in the database should be corrected for sampling bias, which was state of the art with the smaller mtDNA databases available at that time. Further research is needed to evaluate whether sampling correction is still required with today's database sizes. Other laboratories added the suspect's haplotype, or both the suspect's and the crime scene sample's haplotypes, to the database, which resulted in increasingly conservative estimates. Apparently, understanding and practice differ among laboratories, which highlights the need for harmonization. Diverse approaches can be found in the literature that differ not only in their mathematical interpretation but, more importantly, in the underlying hypotheses.
4.3 LR calculation
As no differences between evidence and suspect haplotypes were present in this exercise, the LR, p(E|Hp)/p(E|Hd), can be calculated as 1/f (f = haplotype frequency). Almost all laboratories did so. One laboratory calculated the LR using the formula e^(−gμ)/f, where g is the number of generations and μ the mutation rate of the control region. This laboratory considered g = 2 since “a maternal relative” was included in the prosecutor's hypothesis, and used an HVR mutation rate of 3.6 × 10^−6. The resulting LR (233.6), however, was within the range reported by the laboratories that used the same reference dataset (LRs between 200 and 241).
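A short calculation shows why the mutation term barely changes the result. The sketch below evaluates e^(−gμ) for the values stated in the text (g = 2, μ = 3.6 × 10^−6). The function name is hypothetical, and the frequency f = 26/2667 is used purely as an illustrative input, not as the value this laboratory queried.

```python
import math


def lr_with_mutation(f, g, mu):
    """LR = exp(-g * mu) / f for matching haplotypes g generations apart."""
    return math.exp(-g * mu) / f


g, mu = 2, 3.6e-6            # values stated in the text
f = 26 / 2667                # illustrative frequency only (European query)

factor = math.exp(-g * mu)
print(f"exp(-g*mu)       = {factor:.7f}")
print(f"LR without term  = {1 / f:.3f}")
print(f"LR with term     = {lr_with_mutation(f, g, mu):.3f}")
```

Because gμ is of the order of 10^−5, the factor is almost exactly 1 and the LR is practically identical to 1/f, which is consistent with this laboratory's value falling inside the range of the others.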
Earlier guidelines encouraged the estimation of mutation rates with respect not only to the number of generations or years, but also to the tissue type of an individual. The somatic mutation rate depends on several factors, including the number of cell divisions (and therefore the age of an individual) and the metabolic activity of the tissue. In fact, it was previously described that active tissues such as hair follicles or muscle show a higher mutation rate than tissues with little cell division or low energy requirement. The body of data, however, is not large and consistent enough to allow reliable values for mutation rates to be deduced. More research is needed here.
4.4 Quality of EMPOP reference data
Regarding the source of reference data for their queries, seven laboratories based their conclusions on EMPOP forensic data only. Both forensic and literature data presented in EMPOP undergo rigorous quality control, and great scrutiny is applied to unveil possible errors (e.g. reference bias, phantom mutation, base mis-scoring, nomenclature issues, alignment violation, clerical errors, sample mix-up, phylogenetic sense). As indicated in the description on the EMPOP website, the term forensic refers to the collection of haplotypes for which high quality raw sequencing data are available, so that questionable positions can be evaluated at any time. The term literature data refers to haplotypes that were either extracted from reliable publications or sent to EMPOP for review without accompanying raw data. The evaluation of literature data follows the same stringent review procedure as for forensic data, and authors/submitters have been contacted for raw data of questionable observations. However, raw lane data may not always be available in full extent and quality, and therefore literature data may not be of the same quality as forensic data.
Apparently, some forensic users are unsure which data sources to query in a forensic case. The classification into forensic and literature data was established for the first EMPOP Release (2006), when little was still known about the efficiency of detecting errors in data tables. We believe that both data sources are of very high quality, especially compared with the uncorrected literature, and therefore represent valuable resources for forensic applications. We understand that the classification scheme is confusing, and we will perform further research to establish whether forensic and literature data can be pooled in the future to avoid additional complexity in database searches.
4.5 Heteroplasmy
Point heteroplasmy (PHP) is observed in approx. 6% of blood and saliva samples (in the entire control region), and more frequently in hair and metabolically active tissues (e.g. muscle). In the forensic context, two sequences are usually not excluded as originating from the same lineage/individual when they differ at PHP positions only (assuming that the homoplasmic variant is included in the PHP). Database queries should reflect this convention, i.e. a haplotype including PHP at 152 (152Y) should not be considered different from the same haplotype carrying 152C (or T) instead. While this can easily be addressed in a database search when the query haplotype shows the PHP (e.g. by querying both variants in separate searches), a user cannot know how many haplotypes in the database carry PHP and which positions are affected. It is therefore important that the query engine of a database implements this convention, which is the case in EMPOP when pattern search is used. Under the literal search mode such sequences would be considered one-step neighbors, as only exact matches are regarded as identical. This function was implemented to enable specific searches for PHP positions and their relative frequencies in EMPOP.
Therefore, pattern searches are always preferred when evidence has to be statistically evaluated. In this exercise the effect of the two modes was small. Two haplotypes went missing under the literal search mode (laboratories 11 and 28): 72C 152Y A263G 315.1C 16298C and 72C 263G 309.1C 309.2C 315.1C 16144W 16298C (24 matches instead of 26). Nevertheless, greater differences can be observed for other haplotypes (see Table 3).
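The difference between the two search modes can be illustrated with a small comparison routine. This is only a sketch of the convention described above and not EMPOP's query engine. Several parts are assumptions made for illustration: the query haplotype (Table 1 is not reproduced here), the reference bases given for positions 152 and 16144, the rule that two symbols are compatible when their IUPAC base sets overlap, and all function names.

```python
import re

# IUPAC nucleotide codes: each symbol stands for a set of possible bases.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"},
    "W": {"A", "T"}, "K": {"G", "T"}, "M": {"A", "C"},
}
# Accepts '152Y', 'A263G' and '309.1C' (optional leading reference base).
TOKEN = re.compile(r"^[ACGT]?(\d+(?:\.\d+)?)([A-Z])$")

# Assumed reference bases for the two heteroplasmic positions used below.
REFERENCE = {"152": "T", "16144": "T"}


def parse_haplotype(text):
    """Turn '72C 152Y A263G' into {'72': 'C', '152': 'Y', '263': 'G'}."""
    variants = {}
    for token in text.split():
        match = TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognised variant: {token}")
        position, symbol = match.groups()
        variants[position] = symbol
    return variants


def is_match(query, candidate, mode="pattern", ignore_c_tract=True):
    """Compare two haplotypes under 'pattern' or 'literal' rules."""
    q, c = parse_haplotype(query), parse_haplotype(candidate)
    for position in set(q) | set(c):
        if ignore_c_tract and position.startswith("309."):
            continue  # length variants around position 309 are disregarded
        # An unlisted position carries the reference base (None if unknown).
        a = q.get(position, REFERENCE.get(position))
        b = c.get(position, REFERENCE.get(position))
        if a == b:
            continue
        if mode == "literal" or a is None or b is None:
            return False
        if not IUPAC[a] & IUPAC[b]:
            return False  # no base in common, so a true difference
    return True


QUERY = "72C 263G 315.1C 16298C"  # hypothetical query haplotype
DATABASE = [
    "72C 152Y A263G 315.1C 16298C",
    "72C 263G 309.1C 309.2C 315.1C 16144W 16298C",
]
for haplotype in DATABASE:
    print(haplotype, is_match(QUERY, haplotype, "pattern"),
          is_match(QUERY, haplotype, "literal"))
```

Under these assumptions both database haplotypes are retained by the pattern comparison and dropped by the literal one, mirroring the two haplotypes that went missing in the exercise.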
Table 3. Summary of haplotype queries under pattern and literal search modes in EMPOP (v2.1, Release 6; Europe). LRs = 1/f.
The GHEP-ISFG is known as an active working group on forensic mtDNA matters. With this study it has pioneered the field by performing a collaborative exercise on the statistical evaluation of mtDNA evidence. The results derived from the interpretation of a hypothetical case example show that, although using the same database, laboratories arrived at discordant statistical interpretations, with reported LR values spanning one order of magnitude.
The reasons for this can be summarized in a few categories. There is an apparent lack of harmonization with respect to the query settings, which can have a considerable effect on the result. Although the usage of EMPOP is described in the help section of the website, the lack of harmonization can be addressed by adapting the database settings, by providing more and more easily understandable directions for use, and by continued education. Apparently, some participants selected a defense hypothesis that did not adequately consider the circumstances of the case. This became evident from the selection of different reference datasets for the query. This problem is difficult for the database to solve; it lies in the responsibility of the reporting person.
The observed variability does not come as a surprise. The field of mtDNA analysis has always been a niche application in forensic science and therefore suffers from an evident lack of the guidance and education needed to comply with the high demands of stable and comparable procedures in the international forensic community. This study demonstrates the need for further research and database development to strengthen the application in the field.
Acknowledgements
Josefina Gómez (Unit of Guarantee Quality of the National Institute of Toxicology and Forensic Sciences, Madrid, Spain) was the organizer of the whole GHEP-ISFG Proficiency Test 2011. We would like to express our sincere gratitude for her help in managing the participating laboratories. WP was partly supported by the Austrian Science Fund FWF (P22880).
References
Buckleton J. A framework for interpreting evidence. In: Buckleton J., Triggs C.M., Walsh S.J. Forensic DNA Evidence Interpretation. CRC Press, London, 2005: 27-63.
Generating population data for the EMPOP database – an overview of the mtDNA sequencing and data evaluation processes considering 273 Austrian control region sequences as example.
‘Mitominis’: multiplex PCR analysis of reduced size amplicons for compound sequence analysis of the entire mtDNA control region in highly degraded samples.
Results of the 1999-2000 collaborative exercise and proficiency testing program on mitochondrial DNA of the GEP-ISFG: an inter-laboratory study of the observed variability in the heteroplasmy level of hair from the same donor.
Investigation of heteroplasmy in the human mitochondrial DNA Control Region: a synthesis of observations from more than 5000 global population samples.