Research paper | Volume 33, P33-37, March 2018

The redesigned Forensic Research/Reference on Genetics-knowledge base, FROG-kb

Open Access. Published: November 14, 2017. DOI: https://doi.org/10.1016/j.fsigen.2017.11.009

      Abstract

      The Forensic Resource/Reference on Genetics-knowledge base (FROG-kb) web site <https://frog.med.yale.edu/FrogKB/> was introduced in 2011. In the five years since the previous publication, ongoing research into how the database can better serve forensics has resulted in an extensive redesign of the database interface and functionality. Originally designed as a prototype to support forensic use of single nucleotide polymorphisms (SNPs), FROG-kb provides a freely accessible web interface that facilitates forensic practice and can be useful for teaching and research. Based on knowledge gained through its use, the web interface has been redesigned for easier navigation through the multiple components. The site also has functional enhancements, extensive new documentation, and new reference panels of SNPs with new curated data. FROG-kb focuses on SNPs and provides reference population data for several published panels of individual identification SNPs (IISNPs) and several published panels of ancestry inference SNPs (AISNPs). For each of the marker panels with reference population data, FROG-kb calculates random match probabilities (RMP) and relative likelihoods of ancestry for a user-entered genotype profile (either completely or partially specified). Example genotype profiles are available and the User’s Manual presents interpretation guidelines for the calculations. The extensive documentation, along with ongoing updates, makes FROG-kb a comprehensive tool for facilitating use of SNPs in forensic practice and education. An overview of the new FROG-kb, with examples and material explaining the results of its use, is presented here.


      1. Introduction

      In 2011 we introduced FROG-kb (Forensic Resource/Reference on Genetics-knowledge base) (https://frog.med.yale.edu/FrogKB/), an open access web tool, as a reference database of population allele frequencies for Single Nucleotide Polymorphisms (SNPs) likely to be used in forensics. Thus, the focus has been on di-allelic markers as distinct from the standard multiallelic short tandem repeat (STR) polymorphisms (STRPs) traditionally used in forensic sciences. FROG-kb allows viewing and retrieval of forensically relevant data as well as calculation of statistics on several forensically relevant published sets of SNPs and one panel of Insertion-Deletion polymorphisms (InDels) [Rajeevan et al., 2012]. Since the introduction of FROG-kb, SNPs have gained in importance in forensic sciences. Consequently, FROG-kb has changed considerably from the original description [Rajeevan et al., 2012], with new functionalities and expanded data. These results of ongoing research into database and interface design, as well as the newly incorporated population genetic data, warrant this description of the current version of FROG-kb.
      As background, we note that the usefulness of DNA genotyping in forensic sciences is completely dependent on the existence of reference data. A random match probability (RMP) is calculated using the frequencies of a subject’s alleles in a population; which population the RMP is calculated for can be relevant and will be case dependent. The need for web-based tools and databases to predict population affiliations by allowing calculation of random match probabilities in forensic cases is well recognized. Many databases exist for standard sets of Short Tandem Repeat (STR) Polymorphisms (STRPs), e.g., STRBase (http://www.cstl.nist.gov/strbase/) [Ruitberg et al., 2001], the European Network of Forensic Science Institutes (ENFSI) DNA working group database STRidER (STRs for identity ENFSI Reference database, http://strider.online/) [Gill et al., 2003; Welch et al., 2012], and PopAffiliator (http://cracs.fc.up.pt/popaffiliator/). The same requirement exists for database(s) with reference allele frequencies for di-allelic markers of forensic interest, SNPs and InDels. In many ways, multiple reference populations are more important for SNPs than for the standard forensic STR markers because the very high mutation rates and global heterozygosity of STRPs result in relatively low levels of global differentiation [Algee-Hewitt et al., 2016], whereas SNPs can show the maximum possible difference, with alternative alleles fixed in different populations.
      The enhancements to the database and the redesign of the interface to FROG-kb have involved many changes that are individually small but helpful and/or important in standardization. Some of them are mentioned here; for those planning to actually use FROG-kb, more detail is given in the supplemental material and in the online User’s Manual. We have also included here material to help in understanding both the potential uses of FROG-kb and the interpretation of the results of the calculations made possible through the FROG-kb web site.

      2. Basic redesign and update of FROG-kb

      The original purpose of FROG-kb was to be a prototype that, from a forensic perspective, could serve as a tool facilitating use of SNPs in forensic practice and for teaching and research. FROG-kb focuses on individual identification SNPs (IISNPs) and ancestry inference SNPs (AISNPs). For those two types of markers the interface allows the user to query the reference data for many different panels of SNPs for a multisite genotype of an individual. The web site returns the probability of that genotype in each of the reference populations and the likelihood ratio of the most probable population compared to each alternative specified population, all based on the data in the underlying database. Through the connections into ALFRED, the ALlele FREquency Database (https://alfred.med.yale.edu/), the “knowledge base” component of FROG-kb provides details on the population frequency data and the molecular definitions of the polymorphisms. This paper focuses on the web site of FROG-kb and what functionalities are available; the original paper provides a description of the underlying database structure and bioinformatics aspects [Rajeevan et al., 2012].

      3. Basic navigation of FROG-kb

      A more intuitive user-friendly interface has been designed. All pages have a series of buttons across the top for top-level navigation through the web site (Fig. 1). The Home Page text gives a brief summary of the functions available in FROG-kb.
      Fig. 1. Screen shot of the main navigation buttons.
      Sub-menus exist, specific to each button. For example, selecting the Documentation button on the top-level navigation opens a sub-menu (Fig. 2) with options for more detailed information. This is recommended as a first step for all new users because it leads to the User’s manual. The Manual button opens the downloadable comprehensive user manual with text and graphical elements designed to make FROG-kb navigation easier for the user. The manual provides navigation pointers to our graphical user interface. It is the ultimate resource for information on the database/web interface and we welcome input on improving it. The manual explains the buttons for all of the various sub-menus.
      Fig. 2. Screen shot showing the buttons for options available under Documentation.

      4. Reference panels available

      In Fig. 1, there are two buttons for SNP panels (IISNP, AISNP) that open pages with the respective panels of each type available for use (Table 1). For each of the IISNP and AISNP panels there are (1) a list of the specific SNPs available for calculations with url links to ALFRED and dbSNP, (2) a list of the reference populations with links to ALFRED and (3) a link for data entry. Explanatory information, examples, and the reference allele frequencies used are also available for each of the panels.
      Table 1. The specific SNP panels available for IISNPs (1a) and AISNPs (1b), with the number of populations included for likelihood calculations.

      (1a) Summary of IISNP panels in FROG-kb
      IISNP panel                                         Populations included
      KiddLab - 45 Unlinked IISNP                         45
      KiddLab - List of 86 IISNPs                         45
      SNPforID 52-plex                                    20
      Qiagen Investigator DIPplex kit                     28

      (1b) Summary of AISNP panels in FROG-kb
      AISNP panel                                         Populations included
      Seldin's list of 128 AISNPs                         90
      SNPforID 34-plex                                    53
      KiddLab - Set of 55 AISNPs                          139
      Kayser's set of 24 AI Markers                       73
      Daniele Podini's list of 32 AISNPs                  111
      Eurasiaplex 23 SNP Panel                            76
      Nievergelt's Set of 41 AIMs                         123
      Li’s panel of 74 AIMs                               78
      Overlap set of AISNPs                               72
      Combined panel of 192 SNPs                          79
      Pacifiplex                                          34
      EUROFORGEN Global ancestry-informative SNP panel    40

      For each panel, the number of reference populations currently available is given. Relevant references for the panels are contained within the database. The specific reference populations for each panel are listed as part of the information for each panel with links to their definitions in ALFRED. Data for many of the various reference populations are the result of data collection on the populations studied in the Kidd Lab.
      The phenotype inference (PISNP) button currently links only to the 6-SNP IrisPlex [Walsh et al., 2011] calculation; the eye color prediction in FROG-kb uses the formula from the publication. We note that this formula may not be accurate in all parts of the world [Liu et al., 2014].
      Besides these SNP sets in FROG-kb, additional IISNP and AISNP panels are also available from the ALFRED SNP Sets page under the Search tab on the ALFRED homepage. These have not, so far, had sufficient population data to have priority for entry into FROG-kb.

      5. New panels addressing the ‘empty matrix’ issue

      One of the user requests following the initial release of FROG-kb was for a way to calculate statistics using SNPs from more than one published panel. Meeting that request has been difficult because of the empty matrix problem: different SNP panels have been studied on different populations [Soundararajan et al., 2016]. The likelihood comparisons that are fundamental to forensic ancestry inference require that all SNPs have allele frequency data for all reference populations. We have added two AISNP panels that partially address the empty matrix issue: the “Overlap set of AISNPs” (Overlap set) and the “Combined panel of 192 SNPs” (Combined panel). The “Overlap set” is comprised of 44 of the 46 SNPs that occur in three or more of 21 different published AI panels involving 1397 markers in total [Soundararajan et al., 2016]. Two of the SNPs have data for only a few of the populations; 44 of the 46 SNPs have complete data for 72 reference populations. Unfortunately, these 44 SNPs do not allow biogeographic resolution by STRUCTURE [Pritchard et al., 2000] beyond about five “quasi-continental” regions. The “Combined panel” is comprised of subsets of the KiddLab-55 [Kidd K.K. et al., 2014; Pakstis et al., 2017], SeldinLab-128 [Kosoy et al., 2009; Kidd J.R. et al., 2011] and SNPforID 34-plex [Phillips et al., 2007] AISNPs. Seventy-nine reference populations have data for 192 of the union of 200 SNPs. With this integrated combined panel, any user-defined subset of the 192 SNPs can be used to calculate the likelihoods of a sample originating from any of the 79 populations.
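      The completeness requirement behind the empty matrix problem can be illustrated with a minimal sketch. It assumes a hypothetical in-memory layout (`freqs[population][snp_id]`) and made-up SNP and population names; it is not FROG-kb's actual data model. One helper lists the populations usable for a chosen SNP set, the other lists the SNPs typed in every population.

```python
from typing import Dict, Iterable, List

# Hypothetical layout (an assumption for illustration):
# freqs[population][snp_id] = frequency of one designated allele.
FreqTable = Dict[str, Dict[str, float]]


def complete_populations(freqs: FreqTable, snps: Iterable[str]) -> List[str]:
    """Return the populations that have a frequency for every SNP in `snps`."""
    wanted = set(snps)
    return sorted(pop for pop, table in freqs.items() if wanted.issubset(table))


def shared_snps(freqs: FreqTable) -> List[str]:
    """Return the SNPs that have a frequency in every population."""
    tables = list(freqs.values())
    if not tables:
        return []
    common = set(tables[0])
    for table in tables[1:]:
        common.intersection_update(table)
    return sorted(common)


# Toy data: PopB lacks rs3, so it cannot serve as a reference
# population for a panel that includes rs3.
toy = {
    "PopA": {"rs1": 0.10, "rs2": 0.80, "rs3": 0.55},
    "PopB": {"rs1": 0.40, "rs2": 0.20},
    "PopC": {"rs1": 0.90, "rs2": 0.50, "rs3": 0.05},
}

usable = complete_populations(toy, ["rs1", "rs2", "rs3"])
overlap = shared_snps(toy)
```

      In this toy case, enlarging the SNP set shrinks the set of usable populations, which is the trade-off the Overlap set and the Combined panel are described as balancing.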

      6. Ongoing data curation and entry

      The usefulness of any database is dependent on the quality of its contents. New published panels and additional reference population data for existing SNP panels are systematically added to FROG-kb. Data are added from scanning of published literature, from Kidd Lab data collection, from collaborators, and through data submissions by researchers. SNPs included in FROG-kb are made consistent to represent alleles on the forward strand.
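      Reporting all alleles on one strand usually means complementing alleles that a source reported on the reverse strand. The following is only a sketch of that idea, assuming the reported strand is known from annotation and encoded as "+" or "-"; the function names are hypothetical and the curation procedure actually used for FROG-kb is not described here.

```python
from typing import Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def to_forward_strand(alleles: Tuple[str, str], reported_strand: str) -> Tuple[str, str]:
    """Return the allele pair as read on the forward ("+") strand."""
    if reported_strand == "+":
        return tuple(alleles)
    if reported_strand == "-":
        return tuple(COMPLEMENT[a] for a in alleles)
    raise ValueError(f"unknown strand: {reported_strand!r}")


def is_strand_ambiguous(alleles: Tuple[str, str]) -> bool:
    """A/T and C/G SNPs read the same on both strands, so the alleles
    alone cannot reveal which strand a source used."""
    first, second = alleles
    return COMPLEMENT[first] == second


flipped = to_forward_strand(("A", "G"), "-")
```

      For A/T and C/G SNPs the strand has to come from the source's annotation, because complementing the allele pair returns the same pair.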
      The reference frequency data entered for each of these panels have supporting population information. Web site links exist to pages in ALFRED for more details and allele frequency data tables for specific populations. The comprehensive set of reference populations available for most of these panels includes the 26 population samples from Phase III of the 1000 Genomes Project Consortium [1000 Genomes Consortium, 2012; Sudmant et al., 2015]. When a new population has data for all SNPs in a panel, it becomes a reference population sample included in the computations in the FROG-kb interface. Data often exist on additional populations for individual SNPs and are accessible through ALFRED.

      7. Data availability

      Because it is important to document the exact allele frequency estimates used in the likelihood calculations, tables of the values used for the various panels are available for download through Frequencies Download in the Documentation sub-menu and the sub-menu available when each panel is selected.

      8. Interpreting the results from FROG-kb

      8.1 The statistical results

      For each SNP panel FROG-kb calculates the probability of the user-entered multi-locus genotype in each of the reference populations. If a SNP is not included in the data entered, it is not used in the calculation. The results of the calculation are displayed as a table with three columns: each line contains 1) the name of the reference population sample with its geographic region and sample size, 2) the probability of the entered genotype occurring in that population, and 3) the likelihood ratio of the most probable population to the specific reference population (Fig. 3).
      Fig. 3. Screenshot of the result page showing the calculation result displayed as three columns for an IISNP dataset of a Korean individual.
      The populations are ordered by their probabilities of generating the entered genotype from highest to lowest. Note that the example in Fig. 3 uses an IISNP panel with loci selected for little variation in allele frequencies. Thus, very distant populations have very similar probabilities of generating the specific multi-locus genotype found in this Korean individual.

      8.2 Random match probability

      The probability of the entered genotype is equivalent to a random match probability (RMP) assuming no deviation from Hardy-Weinberg ratios in the population. The results for one of the IISNP panels (Fig. 3) provide an indication of how rare the genotype is globally. The largest value, the one listed at the top, provides an upper bound for the RMP among the populations tested. In this example the relative likelihoods for the populations in Fig. 3 do not provide useful ancestry information because the SNPs in the IISNP panels were generally chosen to have similar allele frequency values around the world. The results for an AISNP panel can also be interpreted simply as an indication of the upper bound for the RMP among the reference populations.
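      The genotype probability described above can be sketched in a few lines. This is an illustration under stated assumptions, not FROG-kb's code: each SNP is di-allelic, genotype frequencies follow Hardy-Weinberg ratios, and the loci are treated as independent so that per-SNP probabilities multiply. The function names, SNP identifiers, and frequencies are made up. SNPs with no entered genotype are skipped, matching the behavior described in Section 8.1.

```python
from typing import Dict, Optional


def genotype_probability(p: float, copies: int) -> float:
    """Hardy-Weinberg probability of carrying `copies` (0, 1 or 2) of the
    allele whose population frequency is `p`."""
    q = 1.0 - p
    if copies == 2:
        return p * p
    if copies == 1:
        return 2.0 * p * q
    if copies == 0:
        return q * q
    raise ValueError("copies must be 0, 1 or 2")


def profile_probability(freqs: Dict[str, float],
                        profile: Dict[str, Optional[int]]) -> float:
    """Probability of a multi-locus profile in one population.

    `profile` maps SNP id -> allele count, or None when the SNP was not
    typed. Untyped SNPs are left out of the product. Independence between
    loci is an assumption of this sketch.
    """
    prob = 1.0
    for snp, copies in profile.items():
        if copies is None:
            continue
        prob *= genotype_probability(freqs[snp], copies)
    return prob


# Toy example with made-up frequencies; rs3 is untyped and ignored.
toy_freqs = {"rs1": 0.5, "rs2": 0.2, "rs3": 0.9}
toy_profile = {"rs1": 1, "rs2": 0, "rs3": None}
rmp = profile_probability(toy_freqs, toy_profile)
```

      Repeating the calculation with each reference population's frequencies gives one probability per population; the largest of these is the upper bound on the RMP mentioned in the text.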

      8.3 Inference of ancestry

      In the case of ancestry inference each probability can also represent the likelihood that the specific population is the origin of the entered genotype. Fig. 4 is an example. In ancestry inference the absolute value has no meaning; only the relative likelihoods are meaningful. The population with the highest probability is the most likely ancestral population among the set of reference populations. Dividing the highest likelihood by those for the other populations yields likelihood ratios representing how many times more likely the entered genotype is in the most likely population compared to occurring in the specific population. These range from 1 to progressively larger numbers for the less likely populations of origin. More detailed information on Results of the Calculations can be found in Section 5.2 of the User’s Manual.
      Fig. 4. An example of ancestry inference for a JPT individual using the Combined Panel of 192 SNPs. This screen shot also includes the graph of the log likelihoods showing the full range of nearly 100 orders of magnitude.
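      The ranking and likelihood ratios described above can be sketched as follows. Because the likelihoods can span many orders of magnitude, this sketch works with log10 values; the function name, population labels, and numbers are hypothetical and are not taken from FROG-kb output.

```python
from typing import Dict, List, Tuple


def rank_populations(log10_lik: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Order populations from most to least likely.

    Returns (population, log10 likelihood, log10 likelihood ratio), where
    the ratio is best likelihood / this population's likelihood, so the
    top entry has log10 ratio 0 (ratio 1).
    """
    best = max(log10_lik.values())
    ordered = sorted(log10_lik.items(), key=lambda item: item[1], reverse=True)
    return [(pop, value, best - value) for pop, value in ordered]


# Made-up log10 likelihoods for three hypothetical reference populations.
toy = {"PopB": -21.0, "PopA": -20.0, "PopC": -25.5}
ranked = rank_populations(toy)
```

      A log10 ratio of 1 corresponds to a likelihood ratio of 10, and a log10 ratio of 5.5 to a ratio of roughly 316,000, so staying on the log scale keeps very large ratios readable and avoids floating-point overflow.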
      Multiple factors affect how one interprets the results from FROG-kb for one of the AISNP panels. Many are discussed in [Kidd, 2016]. One of the first is that none of the panels contains reference populations truly representative of the human species. The inference of ancestry for an unknown DNA sample (individual) can only be as good as the global coverage of the reference population samples. If the true population of origin is not among the reference populations, the results cannot identify it. In Fig. 4, even a separate sample of Japanese is a poorer fit to the JPT individual than a sample of Han Chinese. Were there no Japanese reference samples, Chinese and Vietnamese would be the most likely ancestries. If the unknown comes from one of the closely related reference populations, any distinction is questionable a priori because the true population of origin may not be the most likely, or may not be significantly different from the most likely. Moreover, those issues can be different for different sets of SNPs.
      Using the likelihood framework makes it clear that the “most likely” may not be meaningfully different from other highly likely populations. A very relevant point is that there is a finite probability of the unknown genotype arising in almost every population in the world. Thus, the “most likely” is simply that, the most likely, and others are less likely to extremely unlikely. If the likelihood ratio among the more likely populations is within a factor of 10 of the most likely, there is no meaningful basis for distinguishing among those potential ancestral populations. Even a ratio of up to 100 includes populations that cannot be meaningfully excluded from possibly being ancestral for the specific genotype.
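      The rule of thumb in the preceding paragraph can be written as a small helper. This is a sketch only: the labels are hypothetical, and treating the boundaries at exactly 10 and 100 as inclusive is an assumption, since the text gives approximate guidance and not exact cutoffs.

```python
def interpret_ratio(likelihood_ratio: float) -> str:
    """Classify a population by its likelihood ratio relative to the most
    likely population (ratio = best likelihood / this likelihood, so >= 1)."""
    if likelihood_ratio < 1.0:
        raise ValueError("ratio relative to the most likely population must be >= 1")
    if likelihood_ratio <= 10.0:
        # No meaningful basis for distinguishing it from the most likely.
        return "indistinguishable from most likely"
    if likelihood_ratio <= 100.0:
        # Less likely, but not meaningfully excluded.
        return "cannot be excluded"
    return "less likely"


labels = [interpret_ratio(r) for r in (1.0, 7.5, 50.0, 1e6)]
```

      The labels describe relative support among the reference populations only; as the text notes, the true population of origin may not be in the reference set at all.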
      The fact that the “most likely” ancestral population cannot be interpreted as the true ancestral population may be easier to understand when one considers the fact that the SNPs being used are polymorphic and hence different individuals in the same population will have different genotypes. An example for two individuals from Kerala in India is elaborated in [Kidd, 2016]. For one individual, the likelihoods favor populations from South India. For the other individual, a Pakistani population is the most likely and it is not possible to exclude other more northern groups in India.

      9. Examples and exercises in ancestry inference using FROG-kb

      Part of the objective of FROG-kb as a tool for forensic sciences is facilitating the understanding of the results of the calculations. Details of the FROG-kb calculations and more text pertaining to the interpretation of the results are available in the online User’s Manual. To further help with the understanding of the FROG-kb likelihood results, we have included in the Supplemental Materials specific examples to illustrate important aspects such as the dependence of results on the specific SNPs used. We have also included in the supplemental material some exercises to help with the inference of ancestry using FROG-kb.

      10. Conclusion

      FROG-kb is a unique web site offering access to reference data for many published SNP panels that have forensic relevance and the ability to calculate relevant statistics for an unknown forensic sample when it has genotypes for the SNPs in one of those panels. The web site has undergone significant redesign with enhancements in functionality and user friendliness since the original version was put online in 2011. In addition to the major reorganization of the interface, new published SNP panels have been added with their reference population data. The redesign of the interface to FROG-kb has also involved many individually small changes that are helpful and/or important in standardization. Some of the more important changes are briefly mentioned above, with additional information given in the supplemental data. An online User’s Manual has been developed and updated to be more useful for those planning to actually use FROG-kb. The text in the User’s Manual and the example data provided for each panel are designed to help forensic scientists understand the results of the calculations. As part of ongoing curation of the database, efforts will be made to increase the reference panels and populations and to enhance the educational value of FROG-kb.

      Conflicts of interest

      None.

      Acknowledgments

      The underlying database and the FROG-kb and ALFRED web interfaces are supported by grant 2016-DN-BX-0162 to K.K. Kidd by the U.S. National Institute of Justice. Web site redesign was partially supported by the Forensic Technology Center of Excellence (2011-DN-BX-K564) awarded by the U.S. National Institute of Justice, Office of Investigative Sciences. The opinions, findings, and conclusions or recommendations expressed in this publication/program/exhibition are those of the author(s) and do not necessarily reflect those of the United States Department of Justice.

      Appendix A. Supplementary data

      Supplementary data accompany the online version of this article.

      References

        • Rajeevan H., Soundararajan U., Pakstis A.J., Kidd K.K. Introducing the Forensic Research/Reference on Genetics knowledge base, FROG-kb. Investig. Genet. 2012; 3: 18. https://doi.org/10.1186/2041-2223-3-18
        • Ruitberg C.M., Reeder D.J., Butler J.M. STRBase: a short tandem repeat DNA database for the human identity testing community. Nucleic Acids Res. 2001; 29: 320-322.
        • Gill P., Foreman L., Buckleton J.S., Triggs C.M., Allen H. A comparison of adjustment methods to test the robustness of an STR DNA database comprised of 24 European populations. Forensic Sci. Int. 2003; 131: 184-196.
        • Welch L.A., Gill P., Phillips C., Ansell R., Morling N., Parson W., et al. European network of forensic science institutes (ENFSI): evaluation of new commercial STR multiplexes that include the European standard set (ESS) of markers. Forensic Sci. Int. Genet. 2012; 6: 819-826. https://doi.org/10.1016/j.fsigen.2012.03.005
        • Algee-Hewitt B.F., Edge M.D., Kim J., Li J.Z., Rosenberg N.A. Individual identifiability predicts population identifiability in forensic microsatellite markers. Curr. Biol. 2016; 26: 935-942. https://doi.org/10.1016/j.cub.2016.01.065
        • Walsh S., Liu F., Ballantyne K.N., van Oven M., Lao O., Kayser M. IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information. Forensic Sci. Int. Genet. 2011; 5: 170-180. https://doi.org/10.1016/j.fsigen.2010.02.004
        • Liu F., Walsh S., Kayser M. Of sex and IrisPlex eye colour prediction: a reply to Martinez-Cadenas, et al. Forensic Sci. Int. Genet. 2014; 9: e5-6. https://doi.org/10.1016/j.fsigen.2013.06.006
        • Soundararajan U., Yun L., Shi M., Kidd K.K. Minimal SNP overlap among multiple panels of ancestry informative markers argues for more international collaboration. Forensic Sci. Int. Genet. 2016; 23: 25-32. https://doi.org/10.1016/j.fsigen.2016.01.013
        • Pritchard J.K., Stephens M., Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000; 155: 945-959.
        • Kidd K.K., Speed W.C., Pakstis A.J., Furtado M.R., Fang R., Madbouly A., et al. Progress toward an efficient panel of SNPs for ancestry inference. Forensic Sci. Int. Genet. 2014; 10: 23-32. https://doi.org/10.1016/j.fsigen.2014.01.002
        • Pakstis A.J., Kang L., Liu L., Zhang Z., Ji T., Grigorenko E.L., et al. Increasing the reference populations for the 55 AISNP panel: the need and benefits. Int. J. Legal Med. 2017; 131: 913-917. https://doi.org/10.1007/s00414-016-1524-z
        • Kosoy R., Nassir R., Tian C., White P.A., Butler L.M., Silva G., et al. Ancestry informative marker sets for determining continental origin and admixture proportions in common populations in America. Hum. Mutat. 2009; 30: 69-78. https://doi.org/10.1002/humu.20822
        • Kidd J.R., Friedlaender F.R., Speed W.C., Pakstis A.J., De La Vega F.M., Kidd K.K. Analyses of a set of 128 ancestry informative single-nucleotide polymorphisms in a global set of 119 population samples. Investig. Genet. 2011; 2: 1. https://doi.org/10.1186/2041-2223-2-1
        • Phillips C., Salas A., Sanchez J.J., Fondevila M., Gomez-Tato A., Alvarez-Dios J., et al. Inferring ancestral origin using a single multiplex assay of ancestry-informative marker SNPs. Forensic Sci. Int. Genet. 2007; 1: 273-280. https://doi.org/10.1016/j.fsigen.2007.06.008
        • 1000 Genomes Consortium, Abecasis G.R., Auton A., Brooks L.D., DePristo M.A., Durbin R.M., Handsaker R.E., et al. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012; 491: 56-65. https://doi.org/10.1038/nature11632
        • Sudmant P.H., Rausch T., Gardner E.J., Handsaker R.E., Abyzov A., Huddleston J., et al. An integrated map of structural variation in 2,504 human genomes. Nature. 2015; 526: 75-81. https://doi.org/10.1038/nature15394
        • Kidd K.K. Thoughts on estimating ancestry. Chapter 7. In: Amorim A., Budowle B. (Eds.), Handbook of Forensic Genetics: Biodiversity and Heredity in Civil and Criminal Investigation. 2016.