Research paper | Volume 50, 102395, January 2021

The impact of correlations between pigmentation phenotypes and underlying genotypes on genetic prediction of pigmentation traits

  • Yan Chen
    Affiliations
    Department of Genetic Identification, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands

    Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China

    University of Chinese Academy of Sciences, Beijing, China
  • Wojciech Branicki
    Affiliations
    Malopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  • Susan Walsh
    Affiliations
    Department of Biology, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, IN, USA
  • Michael Nothnagel
    Affiliations
    Cologne Center for Genomics, University of Cologne, Cologne, Germany

    University Hospital Cologne, Cologne, Germany
  • Manfred Kayser
    Footnotes
    1 These authors contributed equally to this work.
    Affiliations
    Department of Genetic Identification, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  • Fan Liu
    Correspondence
    Corresponding author.
    Footnotes
    1 These authors contributed equally to this work.
    Affiliations
    Department of Genetic Identification, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands

    Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China

    University of Chinese Academy of Sciences, Beijing, China
  • on behalf of the VISAGE Consortium
  • Author Footnotes
    1 These authors contributed equally to this work.
Open Access | Published: September 24, 2020 | DOI: https://doi.org/10.1016/j.fsigen.2020.102395

      Highlights

      • Pigmentation phenotypes and underlying genotypes are highly correlated.
      • Previous genetic prediction models did not consider phenotype correlations.
      • Testing impact of pigmentation trait correlation on genetic pigmentation prediction.
      • Observed correlated pigmentation phenotypes improve genetic pigmentation prediction.
      • DNA-predicted correlated phenotypes have no impact on genetic pigmentation prediction.

      Abstract

      Predicting appearance phenotypes from genotypes is relevant for various areas of human genetic research and applications such as genetic epidemiology, human history, anthropology, and particularly in forensics. Many appearance phenotypes, and thus their underlying genotypes, are highly correlated, with pigmentation traits serving as primary examples. However, all available genetic prediction models, including those for pigmentation traits currently used in forensic DNA phenotyping, ignore phenotype correlations. Here, we investigated the impact of appearance phenotype correlations on genetic appearance prediction in the exemplary case of three pigmentation traits. We used data for categorical eye, hair and skin colour as well as 41 DNA markers utilized in the recently established HIrisPlex-S system from 762 individuals with complete phenotype and genotype information. Based on these data, we performed genetic prediction modelling of eye, hair and skin colour via three different strategies, namely the established approach of predicting phenotypes solely based on genotypes while not considering phenotype correlations, and two novel approaches that considered phenotype correlations, either incorporating truly observed correlated phenotypes or DNA-predicted correlated phenotypes in addition to the DNA predictors. We found that using truly observed correlated pigmentation phenotypes as additional predictors increased the DNA-based prediction accuracies for almost all eye, hair and skin colour categories, with the largest increase for intermediate eye colour, brown hair colour, dark to black skin colour, and particularly for dark skin colour. Outcomes of dedicated computer simulations suggest that this prediction accuracy increase is due to the additional genetic information that is implicitly provided by the truly observed correlated pigmentation phenotypes used, yet not covered by the DNA predictors applied. 
In contrast, considering DNA-predicted correlated pigmentation phenotypes as additional predictors did not improve the performance of the genetic prediction of eye, hair and skin colour, which was in line with the results from our computer simulations. Hence, in practical applications of DNA-based appearance prediction where no phenotype knowledge is available, such as in forensic DNA phenotyping, it is not advised to use DNA-predicted correlated phenotypes as predictors in addition to the DNA predictors. At the very least, this is not recommended for the pigmentation traits and the established pigmentation DNA predictors tested here.


      1. Introduction

      All human appearance traits are highly heritable phenotypes, with examples of body height with up to 80 % [
      • Silventoinen K.
      • Sammalisto S.
      • Perola M.
      • Boomsma D.I.
      • Cornes B.K.
      • Davis C.
      • et al.
      Heritability of adult body height: a comparative study of twin cohorts in eight countries.
      ], facial shapes with up to 90 % [
      • Xiong Z.
      • Dankova G.
      • Howe L.J.
      • Lee M.K.
      • Hysi P.G.
      • de Jong M.A.
      • et al.
      Novel genetic loci affecting facial shape variation in humans.
      ], hair shape with up to 95 % [
      • Medland S.E.
      • Zhu G.
      • Martin N.G.
      Estimating the heritability of hair curliness in twins of European ancestry.
      ], and hair and eye colour with up to 99 % [
      • Lin B.D.
      • Mbarek H.
      • Willemsen G.
      • Dolan C.V.
      • Fedko I.O.
      • Abdellaoui A.
      • et al.
      Heritability and genome-wide association studies for hair color in a Dutch twin family based sample.
      ] estimated heritability values. Various genome-wide association studies (GWASs), the more recent ones with large sample sizes, have revealed numerous underlying genes for each of several human appearance traits such as eye colour [
      • Sulem P.
      • Gudbjartsson D.F.
      • Stacey S.N.
      • Helgason A.
      • Rafnar T.
      • Magnusson K.P.
      • et al.
      Genetic determinants of hair, eye and skin pigmentation in Europeans.
      ,
      • Kayser M.
      • Liu F.
      • Janssens A.C.
      • Rivadeneira F.
      • Lao O.
      • van Duijn K.
      • et al.
      Three genome-wide association studies and a linkage analysis identify HERC2 as a human iris color gene.
      ,
      • Liu F.
      • Wollstein A.
      • Hysi P.G.
      • Ankra-Badu G.A.
      • Spector T.D.
      • Park D.
      • et al.
      Digital quantification of human eye color highlights genetic association of three new loci.
      ], hair colour [
      • Sulem P.
      • Gudbjartsson D.F.
      • Stacey S.N.
      • Helgason A.
      • Rafnar T.
      • Magnusson K.P.
      • et al.
      Genetic determinants of hair, eye and skin pigmentation in Europeans.
      ,
      • Han J.
      • Kraft P.
      • Nan H.
      • Guo Q.
      • Chen C.
      • Qureshi A.
      • et al.
      A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation.
      ,
      • Hysi P.G.
      • Valdes A.M.
      • Liu F.
      • Furlotte N.A.
      • Evans D.M.
      • Bataille V.
      • et al.
      Genome-wide association meta-analysis of individuals of European ancestry identifies new loci explaining a substantial fraction of hair color variation and heritability.
      ], eyebrow colour [
      • Peng F.
      • Zhu G.
      • Hysi P.G.
      • Eller R.J.
      • Chen Y.
      • Li Y.
      • et al.
      Genome-wide association studies identify multiple genetic loci influencing eyebrow color variation in Europeans.
      ], skin colour [
      • Sulem P.
      • Gudbjartsson D.F.
      • Stacey S.N.
      • Helgason A.
      • Rafnar T.
      • Magnusson K.P.
      • et al.
      Genetic determinants of hair, eye and skin pigmentation in Europeans.
      ,
      • Han J.
      • Kraft P.
      • Nan H.
      • Guo Q.
      • Chen C.
      • Qureshi A.
      • et al.
      A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation.
      ,
      • Visconti A.
      • Duffy D.L.
      • Liu F.
      • Zhu G.
      • Wu W.
      • Chen Y.
      • et al.
      Genome-wide association study in 176,678 Europeans reveals genetic loci for tanning response to sun exposure.
      ], hair structure [
      • Medland S.E.
      • Nyholt D.R.
      • Painter J.N.
      • McEvoy B.P.
      • McRae A.F.
      • Zhu G.
      • et al.
      Common variants in the trichohyalin gene are associated with straight hair in Europeans.
      ,
      • Liu F.
      • Chen Y.
      • Zhu G.
      • Hysi P.G.
      • Wu S.
      • Adhikari K.
      • et al.
      Meta-analysis of genome-wide association studies identifies 8 novel loci involved in shape variation of human head hair.
      ], hair loss in men [
      • Hagenaars S.P.
      • Hill W.D.
      • Harris S.E.
      • Ritchie S.J.
      • Davies G.
      • Liewald D.C.
      • et al.
      Genetic prediction of male pattern baldness.
      ], facial shape [
      • Xiong Z.
      • Dankova G.
      • Howe L.J.
      • Lee M.K.
      • Hysi P.G.
      • de Jong M.A.
      • et al.
      Novel genetic loci affecting facial shape variation in humans.
      ], and body height [
      • Lango Allen H.
      • Estrada K.
      • Lettre G.
      • Berndt S.I.
      • Weedon M.N.
      • Rivadeneira F.
      • et al.
      Hundreds of variants clustered in genomic loci and biological pathways affect human height.
      ,
      • Wood A.R.
      • Esko T.
      • Yang J.
      • Vedantam S.
      • Pers T.H.
      • Gustafsson S.
      • et al.
      Defining the role of common variation in the genomic and biological architecture of adult human height.
      ,
      • Marouli E.
      • Graff M.
      • Medina-Gomez C.
      • Lo K.S.
      • Wood A.R.
      • Kjaer T.R.
      • et al.
      Rare and low-frequency coding variants alter human adult height.
      ], explaining varying portions of their heritability and highlighting their complex genetic nature. Some appearance phenotypes are highly correlated with each other in human populations, such as certain facial shapes [
      • Xiong Z.
      • Dankova G.
      • Howe L.J.
      • Lee M.K.
      • Hysi P.G.
      • de Jong M.A.
      • et al.
      Novel genetic loci affecting facial shape variation in humans.
      ], and especially the different human pigmentation traits [
      • Peng F.
      • Zhu G.
      • Hysi P.G.
      • Eller R.J.
      • Chen Y.
      • Li Y.
      • et al.
      Genome-wide association studies identify multiple genetic loci influencing eyebrow color variation in Europeans.
      ]. It is common knowledge that, for instance, blue eye colour shows a tendency to co-occur with blond hair and light skin colour in Europeans, red hair colour typically co-occurs with pale skin colour and often with freckles in Europeans, and dark skin colour usually co-occurs with black hair and brown eye colour in Africans, certain Asian populations, New Guineans and Australian Aborigines. Moreover, GWAS outcomes have revealed that the different human pigmentation traits share a large (but not complete) proportion of underlying genetic components that explains their phenotypic correlations [
      • Sulem P.
      • Gudbjartsson D.F.
      • Stacey S.N.
      • Helgason A.
      • Rafnar T.
      • Magnusson K.P.
      • et al.
      Genetic determinants of hair, eye and skin pigmentation in Europeans.
      ,
      • Han J.
      • Kraft P.
      • Nan H.
      • Guo Q.
      • Chen C.
      • Qureshi A.
      • et al.
      A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation.
      ,
      • Peng F.
      • Zhu G.
      • Hysi P.G.
      • Eller R.J.
      • Chen Y.
      • Li Y.
      • et al.
      Genome-wide association studies identify multiple genetic loci influencing eyebrow color variation in Europeans.
      ].
      Previously, predictive DNA markers were identified and statistical prediction models were developed for all pigmentation-related traits, namely eye colour [
      • Liu F.
      • van Duijn K.
      • Vingerling J.R.
      • Hofman A.
      • Uitterlinden A.G.
      • Janssens A.C.
      • et al.
      Eye color and the prediction of complex phenotypes from genotypes.
      ,
      • Walsh S.
      • Liu F.
      • Ballantyne K.N.
      • van Oven M.
      • Lao O.
      • Kayser M.
      IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information.
      ,
      • Ruiz Y.
      • Phillips C.
      • Gomez-Tato A.
      • Alvarez-Dios J.
      • Casares de Cal M.
      • Cruz R.
      • et al.
      Further development of forensic eye color predictive tests.
      ], head hair colour [
      • Hysi P.G.
      • Valdes A.M.
      • Liu F.
      • Furlotte N.A.
      • Evans D.M.
      • Bataille V.
      • et al.
      Genome-wide association meta-analysis of individuals of European ancestry identifies new loci explaining a substantial fraction of hair color variation and heritability.
      ,
      • Branicki W.
      • Liu F.
      • van Duijn K.
      • Draus-Barini J.
      • Pospiech E.
      • Walsh S.
      • et al.
      Model-based prediction of human hair color using DNA variants.
      ,
      • Walsh S.
      • Liu F.
      • Wollstein A.
      • Kovatsi L.
      • Ralf A.
      • Kosiniak-Kamysz A.
      • et al.
      The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
      ,
      • Sochtig J.
      • Phillips C.
      • Maronas O.
      • Gomez-Tato A.
      • Cruz R.
      • Alvarez-Dios J.
      • et al.
      Exploration of SNP variants affecting hair colour prediction in Europeans.
      ], skin colour [
      • Maronas O.
      • Phillips C.
      • Sochtig J.
      • Gomez-Tato A.
      • Cruz R.
      • Alvarez-Dios J.
      • et al.
      Development of a forensic skin colour predictive test.
      ,
      • Walsh S.
      • Chaitanya L.
      • Breslin K.
      • Muralidharan C.
      • Bronikowska A.
      • Pospiech E.
      • et al.
      Global skin colour prediction from DNA.
      ,
      • Chaitanya L.
      • Breslin K.
      • Zuniga S.
      • Wirken L.
      • Pospiech E.
      • Kukla-Bartoszek M.
      • et al.
      The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
      ], eyebrow colour [
      • Peng F.
      • Zhu G.
      • Hysi P.G.
      • Eller R.J.
      • Chen Y.
      • Li Y.
      • et al.
      Genome-wide association studies identify multiple genetic loci influencing eyebrow color variation in Europeans.
      ], and freckles [
      • Hernando B.
      • Ibanez M.V.
      • Deserio-Cuesta J.A.
      • Soria-Navarro R.
      • Vilar-Sastre I.
      • Martinez-Cadenas C.
      Genetic determinants of freckle occurrence in the Spanish population: towards ephelides prediction from human DNA samples.
      ,
      • Kukla-Bartoszek M.
      • Pospiech E.
      • Wozniak A.
      • Boron M.
      • Karlowska-Pik J.
      • Teisseyre P.
      • et al.
      DNA-based predictive models for the presence of freckles.
      ]. The recently established HIrisPlex-S system [
      • Chaitanya L.
      • Breslin K.
      • Zuniga S.
      • Wirken L.
      • Pospiech E.
      • Kukla-Bartoszek M.
      • et al.
      The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
      ] currently represents the most complete DNA-based pigmentation prediction tool, allowing simultaneous prediction of eye, head hair, and skin colour from DNA, including low quality and low quantity forensic DNA, based on 41 carefully selected DNA markers and three separate prediction models. HIrisPlex-S reflects an extension of the previously developed IrisPlex system for eye colour [
      • Walsh S.
      • Liu F.
      • Ballantyne K.N.
      • van Oven M.
      • Lao O.
      • Kayser M.
      IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information.
      ] and the previous HIrisPlex system for eye and hair colour prediction [
      • Walsh S.
      • Liu F.
      • Wollstein A.
      • Kovatsi L.
      • Ralf A.
      • Kosiniak-Kamysz A.
      • et al.
      The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
      ]. Genotyping assays of the HIrisPlex-S system are available based on SNaPshot single base extension technology and capillary electrophoresis (CE) [
      • Chaitanya L.
      • Breslin K.
      • Zuniga S.
      • Wirken L.
      • Pospiech E.
      • Kukla-Bartoszek M.
      • et al.
      The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
      ], in addition to two widely used massively parallel sequencing (MPS) technologies: Ion Torrent and Illumina [
      • Breslin K.
      • Wills B.
      • Ralf A.
      • Ventayol Garcia M.
      • Kukla-Bartoszek M.
      • Pospiech E.
      • et al.
      HIrisPlex-S system for eye, hair, and skin color prediction from DNA: massively parallel sequencing solutions for two common forensically used platforms.
      ]. Moreover, the VISAGE Consortium recently incorporated the HIrisPlex-S DNA markers together with continental ancestry informative DNA markers to function as all-in-one tools for both MPS platforms separately (Xavier et al. under review; Palencia-Madrid et al. under review). All available genotyping assays of the IrisPlex, the HIrisPlex, and the HIrisPlex-S systems have been forensically validated [
      • Chaitanya L.
      • Breslin K.
      • Zuniga S.
      • Wirken L.
      • Pospiech E.
      • Kukla-Bartoszek M.
      • et al.
      The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
      ,
      • Breslin K.
      • Wills B.
      • Ralf A.
      • Ventayol Garcia M.
      • Kukla-Bartoszek M.
      • Pospiech E.
      • et al.
      HIrisPlex-S system for eye, hair, and skin color prediction from DNA: massively parallel sequencing solutions for two common forensically used platforms.
      ,
      • Walsh S.
      • Lindenbergh A.
      • Zuniga S.B.
      • Sijen T.
      • de Knijff P.
      • Kayser M.
      • et al.
      Developmental validation of the IrisPlex system: determination of blue and brown iris colour for forensic intelligence.
      ,
      • Walsh S.
      • Chaitanya L.
      • Clarisse L.
      • Wirken L.
      • Draus-Barini J.
      • Kovatsi L.
      • et al.
      Developmental validation of the HIrisPlex system: DNA-based eye and hair colour prediction for forensic and anthropological usage.
      ]. The three statistical prediction models, i.e., the IrisPlex model for eye colour [
      • Walsh S.
      • Liu F.
      • Ballantyne K.N.
      • van Oven M.
      • Lao O.
      • Kayser M.
      IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information.
      ], the HIrisPlex model for hair colour [
      • Walsh S.
      • Liu F.
      • Wollstein A.
      • Kovatsi L.
      • Ralf A.
      • Kosiniak-Kamysz A.
      • et al.
      The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
      ], and the HIrisPlex-S model for skin colour prediction [
      • Chaitanya L.
      • Breslin K.
      • Zuniga S.
      • Wirken L.
      • Pospiech E.
      • Kukla-Bartoszek M.
      • et al.
      The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
      ] are all publicly available in their most updated versions [
      • Chaitanya L.
      • Breslin K.
      • Zuniga S.
      • Wirken L.
      • Pospiech E.
      • Kukla-Bartoszek M.
      • et al.
      The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
      ] via the website https://hirisplex.erasmusmc.nl/.
      These laboratory and/or statistical tools are already of practical relevance for DNA-based pigmentation trait prediction in several different applications such as forensic investigation [
      • Kayser M.
      Forensic DNA Phenotyping: predicting human appearance from crime scene material for investigative purposes.
      ,
      • Schneider P.M.
      • Prainsack B.
      • Kayser M.
      The use of forensic DNA phenotyping in predicting appearance and biogeographic ancestry.
      ], human history inference [
      • King T.E.
      • Fortes G.G.
      • Balaresque P.
      • Thomas M.G.
      • Balding D.
      • Delser P.M.
      • et al.
      Identification of the remains of King Richard III.
      ], and anthropological research [
      • Brace S.
      • Diekmann Y.
      • Booth T.J.
      • van Dorp L.
      • Faltyskova Z.
      • Rohland N.
      • et al.
      Ancient genomes indicate population replacement in early Neolithic Britain.
      ], with more applications being expected in the future. Within the concept of Forensic DNA Phenotyping, predicting pigmentation traits of an unknown crime scene sample donor directly from crime scene DNA can provide useful investigative leads to find unknown perpetrators of the crime, in cases without a DNA profile match with a known suspect [
      • Kayser M.
      Forensic DNA Phenotyping: predicting human appearance from crime scene material for investigative purposes.
      ,
      • Schneider P.M.
      • Prainsack B.
      • Kayser M.
      The use of forensic DNA phenotyping in predicting appearance and biogeographic ancestry.
      ]. In human history investigations, DNA-based pigmentation prediction can reveal the pigmentation of historical individuals from analysis of their remains [
      • King T.E.
      • Fortes G.G.
      • Balaresque P.
      • Thomas M.G.
      • Balding D.
      • Delser P.M.
      • et al.
      Identification of the remains of King Richard III.
      ]. In anthropological and human evolutionary research, genetic pigmentation prediction captures how humans and human populations may have looked in the past, including the distant past [
      • Brace S.
      • Diekmann Y.
      • Booth T.J.
      • van Dorp L.
      • Faltyskova Z.
      • Rohland N.
      • et al.
      Ancient genomes indicate population replacement in early Neolithic Britain.
      ], and allows a deeper understanding of the evolutionary history of human pigmentation traits [
      • Wilde S.
      • Timpson A.
      • Kirsanow K.
      • Kaiser E.
      • Kayser M.
      • Unterlander M.
      • et al.
      Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 y.
      ].
      However, all currently available genetic prediction models for pigmentation traits ignore the well-known correlations between the different pigmentation phenotypes and their underlying genotypes. It could be expected that considering phenotype correlations in the genetic prediction modelling may increase the accuracy of DNA-based prediction for eye, hair and skin colour, which to the best of our knowledge has not yet been reported. Here, we empirically tested for the impact of correlations between appearance phenotypes and their underlying genotypes on DNA-based appearance prediction using pigmentation traits as a classical example. We applied categorical eye, hair and skin colour phenotype data and genotype data for the 41 HIrisPlex-S DNA markers from 762 individuals for whom complete phenotype and genotype information was available. Based on these data, we empirically estimated pigmentation phenotype correlations and their proportions that were attributable to the 41 HIrisPlex-S DNA markers. Next, we performed genetic prediction modelling of eye, hair and skin colour via three different strategies, namely the established approach of predicting phenotypes solely based on genotypes while not considering phenotype correlations, and two novel approaches considering phenotype correlations, either incorporating truly observed correlated phenotypes or DNA-predicted correlated phenotypes as additional predictors, and compared the prediction accuracies of these different models by empirical observation. Finally, we conducted computer simulations, emulating the three different prediction strategies, to better understand the impact of phenotype and genotype correlations on DNA-based phenotype prediction in an effort to interpret the outcomes of the empirical pigmentation prediction modelling we obtained.

      2. Materials and methods

      2.1 Phenotype and genotype data

      The data used here were all from the previous IrisPlex, HIrisPlex and HIrisPlex-S projects (https://hirisplex.erasmusmc.nl/) as described elsewhere [
      • Walsh S.
      • Liu F.
      • Ballantyne K.N.
      • van Oven M.
      • Lao O.
      • Kayser M.
      IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information.
      ,
      • Walsh S.
      • Liu F.
      • Wollstein A.
      • Kovatsi L.
      • Ralf A.
      • Kosiniak-Kamysz A.
      • et al.
      The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
      ,
      • Chaitanya L.
      • Breslin K.
      • Zuniga S.
      • Wirken L.
      • Pospiech E.
      • Kukla-Bartoszek M.
      • et al.
      The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
      ] and represent a subset of 762 individual datasets of different populations from Europe and the US for which a complete pigmentation phenotype profile (i.e., categorical eye, hair, and skin colour) and a complete genotype profile (i.e., all 41 HIrisPlex-S DNA markers) were jointly available (Table 1). Samples had been collected for the purpose of appearance genetic research under written informed consent, and sample collections were approved by the Ethics Committee of the Jagiellonian University (KBET/17/B/2005), the Commission on Bioethics of the Regional Board of Medical Doctors in Krakow (48 KBL/OIL/2008), and by the Indiana University Ethical Institutional Review Board (#1409306349). As previously described in detail [
      • Walsh S.
      • Liu F.
      • Ballantyne K.N.
      • van Oven M.
      • Lao O.
      • Kayser M.
      IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information.
      ,
      • Walsh S.
      • Liu F.
      • Wollstein A.
      • Kovatsi L.
      • Ralf A.
      • Kosiniak-Kamysz A.
      • et al.
      The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
      ,
      • Chaitanya L.
      • Breslin K.
      • Zuniga S.
      • Wirken L.
      • Pospiech E.
      • Kukla-Bartoszek M.
      • et al.
      The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
      ], eye colour was classified into three categories: blue, intermediate, and brown; hair colour into four categories: red, blond, brown, and black; and skin colour into five categories: very pale, pale, intermediate, dark, and dark to black. The 41 HIrisPlex-S DNA markers were described elsewhere [
      • Walsh S.
      • Liu F.
      • Wollstein A.
      • Kovatsi L.
      • Ralf A.
      • Kosiniak-Kamysz A.
      • et al.
      The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
      ,
      • Chaitanya L.
      • Breslin K.
      • Zuniga S.
      • Wirken L.
      • Pospiech E.
      • Kukla-Bartoszek M.
      • et al.
      The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
      ].
      Table 1. Characteristics of the study population representing a subset of the HIrisPlex-S dataset with complete data on eye, hair, and skin colour phenotypes and 41 HIrisPlex-S SNP genotypes.

      Trait          Category         N      Proportion of trait categories (%)   Males/Females
      All                             762                                         328/434
      Eye colour     Blue             328    43.04                                158/170
                     Intermediate     97     12.73                                37/60
                     Brown            337    44.23                                133/204
      Hair colour    Red              17     2.23                                 7/10
                     Blond            330    43.31                                145/185
                     Brown            328    43.04                                124/204
                     Black            87     11.42                                52/35
      Skin colour    Very Pale        28     3.67                                 13/15
                     Pale             355    46.59                                142/213
                     Intermediate     315    41.34                                145/170
                     Dark             36     4.72                                 19/17
                     Dark to Black    28     3.67                                 9/19

      2.2 Statistical analyses

      Since the subsequent genetic prediction modelling used only a subset of the original data sets previously applied for the initial IrisPlex eye colour, HIrisPlex hair colour and HIrisPlex-S skin colour models, we first evaluated this reduced data subset with respect to sufficient statistical power and concordant SNP effects. To this end, ordinal phenotypes were considered as continuous variables by assigning ascending integer values, i.e. 1, 2 and so forth (from the lightest to the darkest levels), to the ordered trait categories and Z-transformed ($Z = (y - \bar{y}) / \mathrm{sd}(y)$, where $y$ is the trait, $\bar{y}$ is the mean value of $y$, and $\mathrm{sd}(y)$ is the standard deviation of $y$). Then we tested each SNP for a phenotypic association using a linear regression model adjusted for sex, thereby taking advantage of the robustness of the linear regression approach in the absence of a normally distributed dependent variable. In addition, multivariate analysis of variance (MANOVA) was conducted to test the association between a given SNP and the three pigmentation phenotypes simultaneously. For association testing, two missense MC1R DNA variants rs1805007_T (R151C) and rs1805008_T (R160W), which are known to be involved in human red hair and related pigmentation phenotypes in a compound heterozygote manner [
      • Liu F.
      • Struchalin M.V.
      • Duijn K.
      • Hofman A.
      • Uitterlinden A.G.
      • Duijn C.
      • et al.
      Detecting low frequent loss-of-function alleles in genome wide association studies with red hair color as example.
      ], were collapsed into three possible genotypes wt/wt, wt/R, and R/R, where R is the risk haplotype consisting of at least one minor allele from either of the two MC1R variants and wt is the wild-type haplotype consisting of only major alleles. Then, this MC1R compound marker was treated as a discrete bi-allelic marker in the association analysis, while the remaining MC1R SNPs were used as discrete markers.
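      The preprocessing and single-marker test described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, genotypes are assumed to be unphased minor-allele counts (0, 1 or 2), two minor alleles at different MC1R variants are assumed to lie on different haplotypes, and sex is assumed to be coded 0/1. The MANOVA step is not shown.

```python
import numpy as np

def z_transform(levels):
    """Z-transform an ordinal trait coded 1, 2, ... from lightest to darkest."""
    y = np.asarray(levels, dtype=float)
    return (y - y.mean()) / y.std(ddof=1)  # sample standard deviation

def collapse_mc1r(rs1805007_t, rs1805008_t):
    """Collapse two MC1R variants into 0 (wt/wt), 1 (wt/R) or 2 (R/R).

    Assumption: unphased counts; one minor allele at each variant is
    treated as a compound heterozygote, i.e. as R/R.
    """
    total = np.asarray(rs1805007_t) + np.asarray(rs1805008_t)
    return np.minimum(total, 2)

def sex_adjusted_effect(z, dosage, sex):
    """Fit z ~ intercept + dosage + sex; return (beta, t) for the marker."""
    z = np.asarray(z, dtype=float)
    X = np.column_stack([np.ones(len(z)), dosage, sex]).astype(float)
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ coef
    sigma2 = resid @ resid / (len(z) - X.shape[1])  # residual variance
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1], coef[1] / se
```

      The t-statistic returned here would still need to be converted to a P-value with a t distribution on n - 3 degrees of freedom.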
      The 41 HIrisPlex-S DNA markers are located at 11 different genetic loci. For each locus, the associated SNP with the smallest P-value (i.e. the strongest association effect) in a MANOVA was selected and its allelic effects across different phenotypes were investigated. In this analysis, ordinal phenotypes were considered as continuous variables and Z-transformed as described above. The effect alleles were selected to have a colour darkening effect and the allelic effect was estimated as $E = 2f(1-f)\beta^2$, where $f$ is the frequency of the effect allele and $\beta$ is the regression coefficient from linear models. $E$ ranges from 0 for no effect to 1 for fully explaining the phenotype variance and can be interpreted as the genetically explained proportion of the phenotype variance. Ordinal phenotypes (from the lightest to the darkest levels) were considered as continuous variables and Pearson's correlation coefficient (r) was calculated for each pair of phenotypes.
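      As a hedged illustration of the allelic effect and the pairwise phenotype correlation, the sketch below uses hypothetical function names and, for brevity, an unadjusted simple regression rather than the sex-adjusted model. Under Hardy-Weinberg proportions the variance of the allele count is 2f(1-f), which is why E equals the explained variance for a Z-transformed trait.

```python
import numpy as np

def allelic_effect(z, dosage):
    """Return E = 2 f (1 - f) beta^2 for a Z-transformed trait z.

    dosage holds effect-allele counts (0, 1, 2); beta is the slope of a
    simple linear regression of z on dosage (no covariates, an
    assumption made here to keep the sketch short).
    """
    z = np.asarray(z, dtype=float)
    dosage = np.asarray(dosage, dtype=float)
    f = dosage.mean() / 2.0             # effect-allele frequency
    beta = np.polyfit(dosage, z, 1)[0]  # regression slope
    return 2.0 * f * (1.0 - f) * beta ** 2

def pearson_r(a, b):
    """Pearson correlation between two ordinal traits coded as integers."""
    return float(np.corrcoef(a, b)[0, 1])
```

      For example, `pearson_r(eye_levels, hair_levels)` would give the eye-hair correlation, where `eye_levels` and `hair_levels` are hypothetical integer-coded trait vectors.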
      To quantify the extent to which the correlation between two phenotypes can be explained by their shared explanatory factors under investigation, we derived a statistic
      $$C = \frac{\mathrm{cor}(y_1,\hat{y}_1)\,\mathrm{cor}(\hat{y}_1,\hat{y}_2)\,\mathrm{cor}(\hat{y}_2,y_2)}{\mathrm{cor}(y_1,y_2)}$$


      where $y_1$ and $y_2$ represent any two of the three correlated pigmentation phenotypes, and $\hat{y}_1$ and $\hat{y}_2$ represent the predicted values from the respective linear models. C typically ranges between 0 and 1, although values outside this interval are possible under extreme scenarios. A value of 0 indicates that the considered explanatory factors cannot explain any of the observed phenotypic correlation, while 1 represents the case that the observed correlation can be perfectly explained by the considered factors. For example, with values of 0.5 for C and 0.8 for the observed phenotypic correlation, 50 % of this correlation can be explained by the considered set of predictors, whereas the remaining 50 % is due to other shared factors: genetic ones, such as other SNPs or other forms of genetic variation, and non-genetic ones, such as age, sex or environmental factors. We then applied this approach to the 41 HIrisPlex-S SNPs constituting the set of considered predictors to assess the proportion of phenotypic correlation that can be explained by these genetic markers.
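A minimal sketch of how C could be computed is given below, assuming ordinary least-squares fits with an intercept for both phenotypes on the same predictor matrix. The function names and the two simulated scenarios are illustrative assumptions: in the first, all phenotypic correlation comes from the predictors; in the second, an unobserved shared factor h contributes as much covariance as the predictors do.

```python
import numpy as np

def fitted(y, X):
    """Fitted values from a least-squares regression of y on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return X1 @ coef

def cor(a, b):
    return np.corrcoef(a, b)[0, 1]

def c_statistic(y1, y2, X):
    """C = cor(y1, y1_hat) * cor(y1_hat, y2_hat) * cor(y2_hat, y2) / cor(y1, y2)."""
    y1_hat, y2_hat = fitted(y1, X), fitted(y2, X)
    return cor(y1, y1_hat) * cor(y1_hat, y2_hat) * cor(y2_hat, y2) / cor(y1, y2)

rng = np.random.default_rng(0)
n = 5000
X = rng.standard_normal((n, 3))   # hypothetical accessible predictors
h = rng.standard_normal(n)        # hypothetical unobserved shared factor

# Scenario 1: the predictors are the only shared component.
y1 = X.sum(axis=1) + rng.standard_normal(n)
y2 = X.sum(axis=1) + rng.standard_normal(n)
c_all_shared = c_statistic(y1, y2, X)

# Scenario 2: predictors and h each contribute a covariance of 3.
y1 = X.sum(axis=1) + np.sqrt(3) * h + rng.standard_normal(n)
y2 = X.sum(axis=1) + np.sqrt(3) * h + rng.standard_normal(n)
c_half_shared = c_statistic(y1, y2, X)
```

With least-squares fits that include an intercept, the product in the numerator reduces to the covariance of the two fitted values divided by the product of the phenotype standard deviations, so C can be read as the share of the phenotypic covariance that is reproduced by the fitted values.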
      Moreover, we empirically assessed the statistical properties of C by computer simulations in order to ensure that C yields an unbiased estimate of the explained correlation. To this end, we decomposed the set of all contributing factors into three mutually exclusive sets (i.e. a variance decomposition by orthogonal factors), namely genetic factors that are shared between phenotypes and known, accessible and therefore included in the model, $s_1$; genetic and environmental factors that are shared between phenotypes but are unknown, $s_2$; and genetic and environmental factors that are unique to a particular phenotype, $u$. We therefore assume that accessible shared environment has a negligible effect in our model. More specifically, we generated two correlated traits, $y_1$ and $y_2$, for 1000 individuals by simulating their shared and unique explanatory components with identical effect sizes.
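      $$y_1 = u_1 + \alpha s_1 + s_2$$

      $$y_2 = u_2 + \alpha s_1 + s_2$$

      $$u_1, u_2 \ \text{independent and identically distributed (i.i.d.)} \sim N\!\left(0, \tfrac{1+\alpha^2}{9}\right)$$

      $$s_1, s_2 \ \text{i.i.d.} \sim N(0, 1)$$

      $$\alpha = 0.00, 0.33, 0.50, 0.65, 0.82, 1.00, 1.22, 1.53, 2.00, 3.00$$
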
      where $u_1$ and $u_2$ represent explanatory factors having effects unique to $y_1$ and $y_2$, respectively, $s_1$ represents accessible genetic factors, $s_2$ represents unknown factors, and $\alpha$ regulates the proportion of the shared component variance explained by $s_1$. For pigmentation traits, $u_1$ and $u_2$ may represent genetic and environmental factors having an effect on one pigmentation trait but not on another and vice versa, $s_1$ may combine all 41 SNPs and $s_2$ may represent yet to be discovered genetic factors influencing both traits. The expectation of C is $\alpha^2/(1+\alpha^2)$ (see detailed derivation in Supplementary Materials). The estimated C was then investigated under a range of expected values, E(C) = 0 %, 10 %, 20 %, 30 %, 40 %, 50 %, 60 %, 70 %, 80 %, 90 % (corresponding to α = 0.00, 0.33, 0.50, 0.65, 0.82, 1.00, 1.22, 1.53, 2.00, 3.00). The simulation was conducted with 1000 replicates for each E(C).
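The simulation design above can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the authors' code: the variance of u1 and u2 is read here as (1 + α²)/9, the number of replicates is reduced to 200 to keep the run short, and the function names are hypothetical.

```python
import numpy as np

def fitted(y, x):
    """Fitted values from a least-squares regression of y on x plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return X1 @ coef

def cor(a, b):
    return np.corrcoef(a, b)[0, 1]

def c_statistic(y1, y2, x):
    y1_hat, y2_hat = fitted(y1, x), fitted(y2, x)
    return cor(y1, y1_hat) * cor(y1_hat, y2_hat) * cor(y2_hat, y2) / cor(y1, y2)

def simulate_c(alpha, n=1000, reps=200, seed=0):
    """Mean estimated C over replicates when only s1 is accessible."""
    rng = np.random.default_rng(seed)
    sd_u = np.sqrt((1.0 + alpha ** 2) / 9.0)   # assumed reading of the u variance
    estimates = []
    for _ in range(reps):
        s1, s2 = rng.standard_normal(n), rng.standard_normal(n)
        u1, u2 = rng.normal(0.0, sd_u, n), rng.normal(0.0, sd_u, n)
        y1 = u1 + alpha * s1 + s2
        y2 = u2 + alpha * s1 + s2
        estimates.append(c_statistic(y1, y2, s1))
    return float(np.mean(estimates))

# Compare the simulated mean with the expectation alpha^2 / (1 + alpha^2).
results = {alpha: simulate_c(alpha) for alpha in (0.5, 1.0, 3.0)}
for alpha, c_mean in results.items():
    print(alpha, round(c_mean, 3), round(alpha ** 2 / (1 + alpha ** 2), 3))
```

Under this reading of the design, the unique components scale with 1 + α², so the phenotypic correlation stays constant across the α grid while only the share attributable to s1 changes.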
      To evaluate the impact of incorporating additional correlated phenotypes as predictors in the model on prediction accuracy for the targeted phenotype, we compared three strategies by including different sets of predictors. In strategy I, we assessed the baseline prediction accuracies when solely using the 41 HIrisPlex-S DNA markers as the predictors. In strategy II, in addition to the 41 HIrisPlex-S DNA markers, the fitted values of the correlated phenotypes that were predicted from the 41 DNA markers were used as additional predictors. In strategy III, in addition to the 41 HIrisPlex-S DNA markers, the truly observed correlated phenotypes were used as additional predictors. In particular for strategy II and III, for predicting eye colour, DNA-predicted (II) or truly observed (III) hair colour and skin colour phenotypes were included as additional predictors; for predicting hair colour, DNA-predicted (II) or truly observed (III) eye and skin colour phenotypes were included as additional predictors; and when predicting skin colour, DNA-predicted (II) or truly observed (III) eye and hair colour phenotypes were included as additional predictors in the genetic prediction modelling.
      All predictions were made via multinomial logistic regression models, using standard leave-one-out (LOO) cross-validation (CV). However, in contrast to previous IrisPlex, HIrisPlex, and HIrisPlex-S studies that used 6, 22, and 36 of the 41 HIrisPlex-S SNPs for eye, hair and skin colour prediction, respectively, we considered all 41 HIrisPlex-S SNPs for the prediction of all three pigmentation traits in the current study in order to test for the complete effect of correlated phenotypes and genotypes. Therefore, the prediction outcomes obtained here for the standard approach not considering correlated phenotypes (strategy I) are not directly comparable with previously reported eye, hair, and skin colour prediction outcomes based on the 6 IrisPlex, 22 HIrisPlex, and 36 HIrisPlex-S markers, respectively. The prediction accuracies were derived using the Area Under the receiver operating characteristic Curves (AUC) as well as other commonly used prediction statistics including sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). All statistical analyses and result visualization were conducted in R version 3.5.3 [https://www.R-project.org/] using the following packages: stats version 3.5.3, nnet version 7.3-12, and pROC version 1.15.0.
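For readers less familiar with the accuracy statistics named above, the following sketch computes them for a single category against all others (for example "blue" versus "not blue"). It does not reproduce the leave-one-out multinomial fit. The toy labels and probabilities are hypothetical, and the 0.5 probability threshold used to obtain class calls is an assumption made here; the source does not state a threshold.

```python
import numpy as np

def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def confusion_metrics(labels, predicted):
    """Sensitivity, specificity, PPV and NPV from binary truth and binary calls."""
    labels = np.asarray(labels, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = np.sum(labels & predicted)
    tn = np.sum(~labels & ~predicted)
    fp = np.sum(~labels & predicted)
    fn = np.sum(labels & ~predicted)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical held-out predictions for one category versus the rest.
truth = np.array([1, 1, 1, 0, 0, 0, 0, 0])
prob = np.array([0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05])

auc_value = auc(truth, prob)                       # 14 of 15 pairs ranked correctly
metrics = confusion_metrics(truth, prob >= 0.5)    # assumed 0.5 threshold
```

In a leave-one-out setting, each entry of `prob` would come from a model fitted without the corresponding individual, so the statistics are computed on predictions the model has not seen.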

      2.3 Computer simulations

      To better understand and interpret the results from the empirical prediction modelling described above, we conducted computer simulations as specified in the following. We generated two correlated traits y1 and y2 for 1000 individuals by simulating their shared and unique explanatory components with the identical effect sizes,
      $$y_1 = u_{11} + u_{12} + s_1 + \alpha s_2$$

      $$y_2 = u_{21} + u_{22} + s_1 + \alpha s_2$$

      $$u_{11}, u_{12}, u_{21}, u_{22}, s_1, s_2 \ \text{i.i.d.} \sim N(0, 1)$$

      where $u_{11}$ and $u_{12}$ represent explanatory factors having effects unique to $y_1$, and $u_{21}$ and $u_{22}$ are explanatory factors unique to $y_2$; only $u_{11}$ and $u_{21}$ are accessible to investigators, whereas $u_{12}$ and $u_{22}$ are not. Similarly, $s_1$ and $s_2$ are shared explanatory factors having effects on both $y_1$ and $y_2$, of which only $s_1$ is accessible to investigators. $\alpha$ is used to regulate the proportion of phenotype variance ($V = \alpha^2/(3+\alpha^2)$) explained by the inaccessible factor $s_2$. The variance explained ($R^2$) was derived from linear models fitted using three different sets of predictors as specified below,
      Model I: $y_1 \sim u_{11} + s_1$

      Model II: $y_1 \sim u_{11} + s_1 + \hat{y}_2$

      Model III: $y_1 \sim u_{11} + s_1 + y_2$

      where $\hat{y}_2$ is the fitted $y_2$ using $u_{21}$ and $s_1$. Thus, Model I mimics a typical genotype-phenotype prediction analysis without considering phenotype and genotype correlations, analogous to strategy I in our empirical prediction analysis. Model II mimics the scenario in which additional correlated phenotypes are not available, but were predicted from the same set of pre-selected SNPs and used as additional predictors, analogous to strategy II in our empirical prediction analysis. Model III mimics the scenario in which truly observed correlated phenotypes are available for prediction and used as additional predictors, analogous to strategy III in our empirical prediction analysis. All simulations were conducted under V = 0 %, 10 %, 30 %, 50 %, 70 % (α = 0.00, 0.58, 1.13, 1.73, 2.65). In addition, we repeated the simulation process after dichotomizing y at its mean value and estimated the AUC values using logistic models, which mimics scenarios of binary trait analysis. All simulations were conducted for 1000 replicates under each investigated model/scenario.
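The three models can be sketched for the continuous case as follows. This is a single-replicate illustration with hypothetical function names, not the authors' code; the binary (AUC) variant and the 1000 replicates are omitted.

```python
import numpy as np

def design(*predictors):
    """Design matrix with an intercept column followed by the predictors."""
    return np.column_stack([np.ones(len(predictors[0])), *predictors])

def fitted(y, *predictors):
    X = design(*predictors)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef

def r_squared(y, *predictors):
    """Proportion of variance in y explained by a least-squares fit."""
    resid = y - fitted(y, *predictors)
    return 1.0 - resid.var() / y.var()

def simulate_models(alpha, n=1000, seed=0):
    rng = np.random.default_rng(seed)
    u11, u12, u21, u22, s1, s2 = rng.standard_normal((6, n))
    y1 = u11 + u12 + s1 + alpha * s2
    y2 = u21 + u22 + s1 + alpha * s2
    y2_hat = fitted(y2, u21, s1)                 # y2 predicted from accessible factors
    model_1 = r_squared(y1, u11, s1)             # accessible factors only
    model_2 = r_squared(y1, u11, s1, y2_hat)     # plus predicted y2
    model_3 = r_squared(y1, u11, s1, y2)         # plus observed y2
    return model_1, model_2, model_3

# alpha = 1.73 corresponds to V = 50 % in the design above.
r2_I, r2_II, r2_III = simulate_models(alpha=1.73)
```

In this linear setting the predicted phenotype is a combination of u21 and s1, so adding it to Model I only contributes the direction of u21, which is independent of y1. Any gain of Model II over Model I is therefore sampling noise, whereas the observed y2 carries the inaccessible shared factor s2 into Model III.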

      3. Results

      3.1 Data suitability check via genetic association testing

      Due to the need for complete eye, hair, and skin colour data and complete data for the 41 HIrisPlex-S SNPs in this study, we used data from 762 HIrisPlex-S subjects for whom such complete phenotype and genotype data were jointly available (Table 1). This dataset represents a subset of the data previously applied to develop the IrisPlex eye colour, HIrisPlex hair colour and HIrisPlex-S skin colour prediction models. Therefore, we first assessed the suitability of this data subset for genetic prediction modelling by testing whether the previously reported genetic associations and SNP effects can be replicated in this particular dataset. To this end, we conducted genetic association testing using MANOVA and linear regression for all 41 HIrisPlex-S SNPs regarding eye, hair and skin colour. While with MANOVA all three pigmentation traits were considered in a combined way, with linear regression analysis each pigmentation trait was tested separately. As may be expected, there was a high correlation between the genetic association outcomes from both approaches (Table 2).
      Table 2. Association between the 41 HIrisPlex-S DNA markers and eye, hair, and skin colour in the study dataset (N = 762).
      P-value columns: MANOVA (eye/hair/skin colour combined), followed by linear regression for eye colour, hair colour, and skin colour.
      Nr.  SNP          EA   OA   Freq  MBp    Locus     Gene       MANOVA P   Eye P      Hair P     Skin P
      1    rs16891982   C    G    0.12  33.95  5p13.2    SLC45A2    1.44E-87   5.89E-23   5.03E-32   5.50E-81
      2    rs28777      C    A    0.1   33.96  5p13.2    SLC45A2    5.90E-69   1.31E-19   2.02E-25   5.03E-65
      3    rs12203592   T    C    0.1   0.4    6p25.3    IRF4       8.48E-08   7.55E-02   1.23E-01   1.32E-05
      4    rs4959270    A    C    0.44  0.46   6p25.3    EXOC2      1.16E-02   2.89E-02   1.09E-01   1.93E-03
      5    rs683        G    T    0.39  12.71  9p23      TYRP1      1.53E-13   2.04E-04   2.50E-08   2.15E-13
      6    rs10756819   G    A    0.4   16.86  9p22.2    BNC2       1.08E-02   8.80E-01   2.64E-01   1.51E-03
      7    rs1042602    T    G    0.33  88.91  11q14.3   TYR        6.27E-06   2.48E-03   7.57E-04   1.12E-06
      8    rs1393350    T    C    0.21  89.01  11q14.3   TYR        3.57E-02   2.01E-01   3.31E-02   5.66E-03
      9    rs1126809    A    G    0.19  89.02  11q14.3   TYR        9.20E-03   1.39E-01   1.59E-03   1.18E-02
      10   rs12821256   G    A    0.08  89.33  12q21.33  KITLG      1.54E-03   3.82E-04   4.99E-03   7.44E-03
      11   rs12896399   T    G    0.41  92.77  14q32.12  SLC24A4    6.42E-04   7.26E-03   3.63E-02   9.52E-05
      12   rs2402130    G    A    0.19  92.8   14q32.12  SLC24A4    5.90E-03   8.20E-03   7.82E-03   3.52E-03
      13   rs17128291   C    T    0.19  92.88  14q32.12  SLC24A4    5.06E-02   3.06E-01   8.70E-02   6.99E-03
      14   rs1545397    T    A    0.11  28.19  15q13.1   OCA2       3.81E-04   2.98E-04   3.60E-04   6.73E-03
      15   rs1800414    C    T    0.01  28.2   15q13.1   OCA2       6.73E-06   3.99E-04   2.82E-06   3.74E-04
      16   rs1800407    A    G    0.07  28.23  15q13.1   OCA2       3.65E-04   9.06E-03   4.05E-01   2.36E-02
      17   rs12441727   A    G    0.17  28.27  15q13.1   OCA2       1.85E-03   4.60E-04   2.61E-02   5.75E-03
      18   rs1470608    A    C    0.19  28.29  15q13.1   OCA2       2.28E-23   1.09E-11   1.48E-08   2.92E-22
      19   rs1129038    G    A    0.29  28.36  15q13.1   HERC2      7.02E-46   9.06E-32   8.29E-19   3.36E-31
      20   rs12913832   T    C    0.29  28.37  15q13.1   HERC2      1.81E-166  6.24E-144  1.11E-47   1.57E-47
      21   rs2238289    C    T    0.17  28.45  15q13.1   HERC2      6.75E-24   3.38E-14   7.92E-12   3.17E-20
      22   rs6497292    C    T    0.09  28.5   15q13.1   HERC2      1.81E-20   6.56E-10   1.18E-09   1.62E-19
      23   rs1667394    C    T    0.23  28.53  15q13.1   HERC2      5.19E-35   9.11E-19   4.03E-14   3.84E-30
      24   rs1426654    G    A    0.07  48.43  15q21.1   SLC24A5    1.82E-73   2.73E-17   2.01E-24   1.69E-71
      25   rs3114908    T    C    0.32  89.38  16q24.3   ANKRD11    1.36E-01   5.00E-01   1.07E-01   7.25E-01
      26   rs3212355    A    G    0     89.98  16q24.3   MC1R       5.42E-01   2.38E-01   7.44E-01   3.73E-01
      27   rs312262906  A    ---  0     89.99  16q24.3   MC1R       4.27E-02   1.61E-01   2.25E-01   4.41E-01
      28   rs1805005    T    G    0.09  89.99  16q24.3   MC1R       1.44E-01   7.12E-01   5.55E-01   2.20E-02
      29   rs1805006    A    C    0     89.99  16q24.3   MC1R       6.62E-01   6.48E-01   7.94E-01   3.88E-01
      30   rs2228479    A    G    0.1   89.99  16q24.3   MC1R       2.29E-02   4.92E-01   3.09E-01   2.16E-03
      31   rs11547464   A    G    0.01  89.99  16q24.3   MC1R       3.61E-01   1.86E-01   1.33E-01   8.87E-01
      32   rs1805007    T    C    0.06  89.99  16q24.3   MC1R       4.77E-06   3.74E-01   2.02E-04   1.89E-05
      33   rs201326893  A    C    0     89.99  16q24.3   MC1R       5.47E-03   2.37E-02   2.08E-01   5.62E-01
      34   rs1110400    C    T    0.01  89.99  16q24.3   MC1R       1.45E-01   3.32E-02   1.94E-01   1.08E-01
      35   rs1805008    T    C    0.06  89.99  16q24.3   MC1R       5.59E-05   4.23E-01   1.35E-03   5.07E-05
           Compound     ---  ---  0.13  ---    16q24.3   MC1R       1.77E-12   2.00E-01   1.35E-07   2.03E-10
      36   rs885479     T    C    0.05  89.99  16q24.3   MC1R       5.10E-05   6.96E-03   1.03E-05   1.33E-03
      37   rs1805009    C    G    0.01  89.99  16q24.3   MC1R       7.76E-06   8.28E-01   3.11E-04   5.48E-04
      38   rs8051733    C    T    0.31  90.02  16q24.3   DEF8       1.82E-01   2.51E-01   8.15E-02   6.78E-02
      39   rs6059655    T    C    0.05  32.67  20q11.22  RALY       3.01E-04   8.15E-01   3.95E-03   3.38E-04
      40   rs6119471    C    G    0.02  32.79  20q11.22  ASIP       8.59E-31   1.26E-06   4.58E-05   8.20E-33
      41   rs2378249    C    T    0.16  33.22  20q11.22  ASIP/PIGU  3.52E-01   2.96E-01   5.78E-01   4.48E-01
      Statistical significance thresholds: nominally significant p < 0.05 in italic, significant after Bonferroni correction for multiple testing p < 0.0012 in bold, and genome-wide significant p < 5 × 10^-8 in bold and italic.
      With MANOVA and by combining the three pigmentation traits, 32 (78 %) of the 41 HIrisPlex-S DNA markers from all 11 genetic loci showed nominally significant association (p < 0.05), 22 (54 %) DNA markers from 9 loci showed significant association after Bonferroni correction for multiple testing (p < 0.0012), and 11 (27 %) SNPs from 5 loci even showed significant association at the genome-wide level (p < 5 × 10^-8), which represents an over-conservative significance threshold in a candidate SNP approach as applied here (Table 2). On chromosome 16, the MC1R compound marker showed stronger association (p = 1.77 × 10^-12) than either marker separately, i.e., rs1805007 (p = 4.77 × 10^-6) and rs1805008 (p = 5.59 × 10^-5). For eye colour and hair colour, the most significant genetic association was seen for HERC2 rs12913832 (eye colour p = 6.24 × 10^-144, and hair colour p = 1.11 × 10^-47), while for skin colour the most significant association was observed for SLC45A2 rs16891982 (p = 5.5 × 10^-81).
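The three thresholds and the marker counts quoted above can be checked with a few lines. The P-values below are transcribed from the MANOVA column of Table 2 for the 41 markers (the compound marker is excluded); the Bonferroni threshold is taken here as 0.05/41, which is an assumption consistent with the rounded value of 0.0012 given in the text.

```python
# MANOVA P-values for markers 1-41, transcribed from Table 2.
manova_p = [
    1.44e-87, 5.90e-69, 8.48e-08, 1.16e-02, 1.53e-13, 1.08e-02, 6.27e-06,
    3.57e-02, 9.20e-03, 1.54e-03, 6.42e-04, 5.90e-03, 5.06e-02, 3.81e-04,
    6.73e-06, 3.65e-04, 1.85e-03, 2.28e-23, 7.02e-46, 1.81e-166, 6.75e-24,
    1.81e-20, 5.19e-35, 1.82e-73, 1.36e-01, 5.42e-01, 4.27e-02, 1.44e-01,
    6.62e-01, 2.29e-02, 3.61e-01, 4.77e-06, 5.47e-03, 1.45e-01, 5.59e-05,
    5.10e-05, 7.76e-06, 1.82e-01, 3.01e-04, 8.59e-31, 3.52e-01,
]

n_markers = len(manova_p)
thresholds = {
    "nominal": 0.05,
    "bonferroni": 0.05 / n_markers,   # about 0.0012 for 41 tests
    "genome_wide": 5e-8,
}

# Number of markers below each threshold.
counts = {name: sum(p < cutoff for p in manova_p) for name, cutoff in thresholds.items()}
print(counts)
```

With these inputs the counts come out as 32, 22 and 11 markers, matching the numbers reported in the paragraph above.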
      With linear regression analysis and by treating the three pigmentation phenotypes separately, we found 19 (46 %) DNA markers from 9 loci significantly associated with eye, hair and skin colour on the nominal significance level, 12 (29 %) DNA markers from 5 loci significantly associated on the Bonferroni significance level, and 9 (22 %) from 3 loci significantly associated on the over-conservative genome-wide significance level. In addition, 34 (83 %) DNA markers from all 11 loci showed nominal significant association for at least one of the three pigmentation phenotypes, 23 (56 %) from 10 loci based on the Bonferroni level, and 11 (27 %) from 5 loci on the over-conservative genome-wide level. Only 7 (17 %) of the 41 HIrisPlex-S DNA markers showed no association with any of the three pigmentation traits in this sample set. These 7 non-significant DNA markers were from two of the 11 genetic loci, MC1R (6 markers) and ASIP/RALY (one marker). However, both of these genetic loci showed significant associations with other SNPs in the HIrisPlex-S marker set.
      Next, for each of the 11 genetic loci covered by the 41 HIrisPlex-S DNA markers, the top-associated SNP from MANOVA was investigated with respect to the contribution of its alleles in explaining the observed variance of the three different pigmentation traits (Fig. 1 and Supplementary Table 1). HERC2 rs12913832_T explained an extraordinarily large proportion (48 %) of eye colour variance. Except for HERC2 rs12913832, the majority of the SNPs tested showed their largest effect on skin colour (1.4 %–25.6 % explained variance). The compound marker in MC1R explained 4 %–6 % of the phenotypic variance of hair and skin colour but had little effect on eye colour. The relatively small effect (4 % explained variance) of the MC1R compound marker on hair colour may be caused by the low frequency of red hair (2.2 %) in our dataset and by the fact that the remaining causal variants of MC1R were not considered in this compound marker. All of the allelic effects were in the same direction of darkening, except for IRF4 with a non-significant lightening effect on hair colour, indicating the homogeneous effect of genetic variants on pigmentation phenotypes.
      Fig. 1. Effects of the 11 top-associated HIrisPlex-S SNPs from the 11 genetic loci covered by the 41 HIrisPlex-S DNA markers on eye, hair, and skin colour (N = 762). Compound represents a collapsed compound heterozygosity marker based on a haplotype analysis of two pre-selected MC1R coding DNA variants rs1805007_T (R151C) and rs1805008_T (R160W). Note that this compound MC1R marker has a colour lightening effect on pigmentation phenotypes, but this effect was reversely depicted in the figure for convenience reasons. For underlying data see Supplementary Table 1 (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).
      Given that statistically significant associations and genetic effects we observed with this dataset are in broad agreement with findings from previous studies based on larger sample size [
      • Liu F.
      • Wollstein A.
      • Hysi P.G.
      • Ankra-Badu G.A.
      • Spector T.D.
      • Park D.
      • et al.
      Digital quantification of human eye color highlights genetic association of three new loci.
      ,
      • Han J.
      • Kraft P.
      • Nan H.
      • Guo Q.
      • Chen C.
      • Qureshi A.
      • et al.
      A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation.
      ,
      • Hysi P.G.
      • Valdes A.M.
      • Liu F.
      • Furlotte N.A.
      • Evans D.M.
      • Bataille V.
      • et al.
      Genome-wide association meta-analysis of individuals of European ancestry identifies new loci explaining a substantial fraction of hair color variation and heritability.
      ,
      • Visconti A.
      • Duffy D.L.
      • Liu F.
      • Zhu G.
      • Wu W.
      • Chen Y.
      • et al.
      Genome-wide association study in 176,678 Europeans reveals genetic loci for tanning response to sun exposure.
      ,
      • Jacobs L.C.
      • Wollstein A.
      • Lao O.
      • Hofman A.
      • Klaver C.C.
      • Uitterlinden A.G.
      • et al.
      Comprehensive candidate gene study highlights UGT1A and BNC2 as new genes determining continuous skin color variation in Europeans.
      ,
      • Liu F.
      • Visser M.
      • Duffy D.L.
      • Hysi P.G.
      • Jacobs L.C.
      • Lao O.
      • et al.
      Genetics of skin color variation in Europeans: genome-wide association studies with functional follow-up.
      ], we deemed this dataset useful for phenotype-genotype correlation analyses and for genetic prediction modelling.

      3.2 Phenotype correlations and genetic contributions

      A high correlation between eye, hair and skin colour is generally expected in human populations. In our dataset, all three pigmentation traits showed mid-range but highly statistically significant phenotype correlations with each other, namely Pearson correlation coefficients of 0.47 (p = 5.19 × 10^-43) for eye versus hair colour, 0.41 (p = 1.64 × 10^-32) for hair versus skin colour, and 0.36 (p = 9.42 × 10^-25) for eye versus skin colour (Table 3). That we did not obtain higher phenotype correlations may be explained by sampling errors due to the medium sample size and by phenotype categorization from underlying quantitative variables in combination with imperfect phenotype quality based on the phenotyping methods applied.
      Table 3. Quantification of pigmentation phenotype correlations and their proportions contributed by the 41 HIrisPlex-S DNA markers.
                     Eye colour         Hair colour        Skin colour
      Eye colour     *                  0.76               0.96
      Hair colour    0.47 (5.19E-43)    *                  0.87
      Skin colour    0.36 (9.42E-25)    0.41 (1.64E-32)    *
      Below the diagonal line: phenotype correlations estimated by Pearson correlation coefficient and their statistical significance. Above the diagonal line: contribution of the 41 HIrisPlex-S SNPs on the observed pigmentation phenotype correlations (C statistics).
      Moreover, given the shared genetic components between the three pigmentation phenotypes seen for several of the 41 HIrisPlex DNA markers in earlier studies [
      • Sulem P.
      • Gudbjartsson D.F.
      • Stacey S.N.
      • Helgason A.
      • Rafnar T.
      • Magnusson K.P.
      • et al.
      Genetic determinants of hair, eye and skin pigmentation in Europeans.
      ,
      • Han J.
      • Kraft P.
      • Nan H.
      • Guo Q.
      • Chen C.
      • Qureshi A.
      • et al.
      A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation.
      ], as well as in the present study (Fig. 1), a large contribution of these genetic markers on the phenotype correlations would be expected. To investigate this empirically, we derived a statistical estimate C (see method section for details) to assess the contribution of the 41 HIrisPlex-S DNA markers on the observed pigmentation phenotype correlations. With this analysis, we estimated the proportion of the shared genetic components explained by the 41 HIrisPlex-S DNA markers over all underlying shared genetic and non-genetic components, i.e., i) from unused shared DNA markers in LD with the used markers from the same genetic loci, ii) from unused shared DNA markers from other genetic loci not used here, and iii) from underlying shared non-genetic components such as environmental effects, which may all contribute to the observed phenotype correlations. The statistical property of C was examined via computer simulations, where the estimated C was in the expected range between 0 and 1. It strikingly corresponded to the proportion of the phenotype variance explained by the shared and accessible genetic components over the variance explained by all shared genetic and non-genetic components (Supplementary Figure S1). Based on our empirical data, the 41 HIrisPlex-S SNPs explained 96 % of the phenotype correlation between eye and skin colour, 87 % of the phenotype correlation between hair and skin colour, and 76 % of the phenotype correlation between eye and hair colour (Table 3). This finding demonstrates that a large proportion of the pairwise pigmentation phenotype correlations is explained by the HIrisPlex-S DNA markers used, but there also is a remaining proportion that remains unexplained by these genetic markers that differs between the pairwise phenotype comparisons.

      3.3 Empirical impact of phenotype correlations on genetic phenotype prediction

      To investigate the impact of the correlations between the pigmentation phenotypes and the underlying genotypes on DNA-based prediction of eye, hair, and skin colour, we performed genetic prediction modelling for the three pigmentation traits based on three different strategies (for details see methods). As evident from Fig. 2 (Supplementary Table 2), the standard genetic prediction model that does not consider pigmentation phenotype correlations (strategy 1), and the novel model incorporating DNA-predicted correlated pigmentation phenotypes as additional predictors (strategy 2), performed almost identical or very similar across all eye and hair colour categories as well as for most skin colour categories. For skin colour (almost) the same AUC values were achieved with both prediction strategies for pale, intermediate and dark, while for very pale the strategy 2 model achieved a slightly lower (0.04) and for dark to black a slightly higher (0.02) AUC. Similar findings were obtained for other prediction accuracy estimates (Supplementary Table 2). Hence, using correlated pigmentation phenotypes that were predicted from a relatively small set of selected DNA markers as additional predictors had no or almost no impact on DNA-based prediction accuracy of eye, hair, and skin colour.
      Fig. 2. Accuracies of pigmentation phenotype prediction achieved by considering the 41 HIrisPlex-S SNPs (N = 762) based on three strategies. Strategy I (depicted in red) included the 41 HIrisPlex-S SNPs as sole predictors; strategy II (depicted in yellow) included the 41 HIrisPlex-S SNPs as predictors and as additional predictors the two respective other correlated pigmentation phenotypes predicted from the 41 DNA markers; strategy III (depicted in blue) included the 41 HIrisPlex-S SNPs as predictors and as additional predictors the two respective other correlated truly observed pigmentation phenotypes. For underlying data see Supplementary Table 2 (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).
      This was markedly different for the strategy 3 model incorporating truly observed correlated pigmentation phenotypes in addition to the HIrisPlex-S SNPs in the genetic prediction modelling. Compared to the strategy 1 model that only used HIrisPlex-S SNPs, with the strategy 3 model we observed a slight increase in AUC for blue eye colour (0.016), blond hair colour (0.014) and black hair colour (0.016), a slight decrease in AUC for very pale skin colour (0.02), and a noticeable increase in AUC for intermediate eye colour (0.037) and brown hair colour (0.030). However, a larger increase was seen for dark to black skin colour (0.056), and the AUC increase was largest for dark skin colour (0.176). For the remaining eye, hair and skin colour categories almost the same AUCs were obtained with both prediction strategies. Similar findings were obtained for other prediction accuracy estimates (Supplementary Table 2). The largest effects for dark and dark to black skin colour cannot necessarily be explained by low sample size, since very pale skin colour and red hair colour had similarly low sample sizes but showed much smaller effects (Table 1).

      3.4 Simulated impact of phenotype correlations on genetic phenotype prediction

      Next, we performed dedicated computer simulations to better understand and interpret these empirical findings using three models and considering both continuous and categorical phenotype information (for model details see Methods). These simulations showed, for both continuous and categorical phenotypes, that model I without considering phenotype correlations (comparable with the empirical strategy 1 model) and model II considering DNA-predicted correlated phenotypes (comparable with the empirical strategy 2 model) performed identically (Fig. 3), which agrees with our empirical findings. In contrast, model III considering truly observed correlated phenotypes (comparable with the empirical strategy 3 model) achieved higher prediction accuracies compared to models I and II, which is in line with our empirical results. This finding suggests that additional information from the unobserved shared component (s2) was included in model III using truly observed correlated phenotypes, but not in model II using DNA-predicted correlated phenotypes. Moreover, as the variance explained by s2 increases, more additional information is included, and the improvement in prediction accuracy therefore grows with model III that uses truly observed correlated phenotypes, which was not seen with model II that uses DNA-predicted correlated phenotypes (Fig. 3).
      Fig. 3. Results of a computer simulation investigating accuracy of genetic phenotype prediction without and with correlated phenotypes used as additional predictors. Model I (depicted in red) mimics the scenario of typical phenotype prediction from genotypes without considering correlated phenotypes as additional predictors, which corresponds to the strategy 1 empirical model (see red bars in ); model II (depicted in yellow) mimics the scenario of correlated phenotypes being used as additional predictors, which, however, were predicted using the same set of pre-selected DNA markers that were used for genetic prediction, which corresponds to the strategy 2 empirical model (see yellow bars in ); model III (depicted in blue) mimics the scenario of additional correlated truly observed phenotypes being used as additional predictors together with the predictive DNA markers, which corresponds to the strategy 3 empirical model (see blue bars in ). (A) Prediction accuracy evaluated by $R^2$ for continuous phenotypes. (B) Prediction accuracy evaluated by AUC for binary phenotypes (dichotomized from the continuous ones by mean values) (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).

      4. Discussion

      In the present study, we explored, quantified, and, via computer simulations, explained the extent to which correlated appearance phenotypes and underlying genotypes impact DNA-based appearance prediction, using pigmentation traits as an example.
      We empirically demonstrated mid-range and highly statistically significant correlations between all three pigmentation phenotypes, and quantified a large contribution on these phenotype correlations provided by the 41 HIrisPlex DNA markers. That the 41 HIrisPlex-S SNPs could not explain 100 % of the observed phenotypic correlations is likely explained by phenotyping classifications and additional genetic factors besides the 41 SNPs used here, e.g., SNPs in LD within the same regions and/or other unknown but shared genetic factors, and perhaps by additional non-genetic i.e., environmental factors not considered here. For instance, that considerably lower genetically explained correlations were achieved for the two phenotype comparisons that involve hair colour, compared to the one not involving hair colour, might be influenced by age-dependent hair colour change, which is not covered by the DNA markers used as previously reported for the 22 HIrisPlex hair colour SNPs [
      • Walsh S.
      • Liu F.
      • Wollstein A.
      • Kovatsi L.
      • Ralf A.
      • Kosiniak-Kamysz A.
      • et al.
      The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
      ], and thus not expected to be covered by the total of 41 HIrisPlex-S markers used here including these 22. Although not tested here, we expect that these estimates may decrease when applied to homogeneous population samples (e.g. Europeans only) due to their reduced phenotypic variance.
      In our empirical genetic prediction modelling based on the full set of 41 HIrisPlex-S DNA markers, we found no effect on prediction accuracy of DNA-predicted correlated pigmentation phenotypes used as additional predictors. In contrast a noticeable effect, mostly in increasing prediction accuracy, was seen when using truly observed correlated pigmentation phenotypes as additional predictors. The results from our computer simulations suggest that the prediction accuracy improvement achieved by considering truly observed correlated pigmentation phenotypes is explained by the extra genetic (and non-genetic) information that is provided by these correlated phenotypes. Such extra information is not provided when correlated pigmentation phenotypes were predicted from a limited number of DNA markers such as the 41 HIrisPlex-S markers used here. This explains why the prediction accuracies were almost identical or very similar, when comparing the genetic prediction models without considering phenotype correlations, and those that used DNA-predicted correlated phenotypes as additional predictors. Thus, it is not the prediction error of the DNA-predicted phenotypes that does not allow the prediction accuracy to improve, as one may naively assume. Instead, as our computer simulations suggest, it is the incomplete genetic information covered by the DNA markers used to predict the respective correlated pigmentation phenotype that does not provide a prediction improvement. This missing genetic information (together with unknown non-genetic information) is covered by the truly observed correlated phenotypes used as additional predictor, which in turn provides the prediction accuracy increase.
      Our results are relevant for a variety of fields where genetic appearance prediction models are used in practice, such as forensic casework, human history studies, and anthropological investigations, all of which typically lack any phenotype information, including phenotypes that are correlated with the phenotype of interest. In particular, our findings imply that it is not advisable in such practical applications to use DNA-predicted correlated phenotypes as extra factors in the prediction modelling, at least not when it comes to pigmentation prediction from a relatively small set of DNA markers such as the HIrisPlex-S DNA markers used here. Moreover, the conclusion from our simulation analysis provides valuable information for future research and development of better prediction models, as it demonstrates that using more independently contributing genetic predictors, in addition to the currently used ones, will increase prediction accuracy overall. However, our findings do not provide any information on how many additional DNA predictors would be needed to achieve the prediction improvement seen when using truly observed correlated phenotypes, which in turn would depend on the independently contributing effect size of the additional DNA predictors.
      Theoretical expectations and empirical evidence from GWASs based on common DNA variants, such as for human pigmentation traits [Hysi P.G. et al., Nat. Genet. 2018] or body height [Wood A.R. et al., Nat. Genet. 2014], indicate that newly identified significantly associated DNA variants from GWASs with larger sample sizes have similarly small or even smaller effects than those identified in previous GWASs with smaller sample sizes. Therefore, it is expected that many more (common) DNA predictors will be needed to further improve the prediction accuracy of already established appearance prediction models, compared to the number of DNA markers included in the established models. The first empirical evidence for this notion for human pigmentation traits comes from a recently published large GWAS on hair colour that additionally reports on the use of the many newly discovered DNA markers for hair colour prediction [Hysi P.G. et al., Nat. Genet. 2018]. Based on a large European discovery set of 290,891 individuals, this study found 124 genetic loci significantly associated with hair colour, of which 111 were novel and not identified in previous GWASs. Moreover, the study demonstrated that a new hair colour prediction model based on 258 independently associated SNPs, most of which were newly discovered, improved the prediction accuracy for all hair colour categories except red, compared to the previous HIrisPlex model. However, given the more than 10-times larger number of DNA predictors in the new model compared to the HIrisPlex model, the prediction accuracy increase this new model provided relative to HIrisPlex, i.e., an AUC increase from 0.67 to 0.79 for blond, from 0.66 to 0.76 for brown, and from 0.82 to 0.91 for black, was not overwhelming. Nevertheless, in the era of targeted massively parallel sequencing (MPS), hundreds of SNPs can now be utilized for practical applications including forensic DNA analysis [Ralf A. et al., Forensic Sci. Int. Genet. 2019], and MPS tool development for forensic DNA phenotyping using hundreds of SNPs is currently underway by the VISAGE Consortium.
      Although we used correlated pigmentation traits and pigmentation-predictive DNA markers as an example in the present study, similar findings may be expected for other correlated appearance traits. This expectation only holds as long as the phenotype correlations are similarly high and the applied predictive DNA markers explain similarly large proportions of the phenotype correlations as seen here for eye, hair and skin colour and the HIrisPlex-S DNA markers. However, the success of DNA-based pigmentation prediction is driven by the presence of major gene effects (e.g. HERC2, SLC45A2 and MC1R) together with minor gene effects, whereas correlated non-pigmentation appearance traits such as facial shape phenotypes are characterized by the absence of major gene effects as far as currently known [Xiong Z. et al., Elife 2019]. Therefore, we expect that the estimates we obtained here for correlated pigmentation traits may be higher than those obtained in future studies for other correlated appearance traits independent of pigmentation.

      Declaration of Competing Interest

      The authors declare that they have no competing interests.

      Acknowledgements

      We thank all volunteers whose data were used for this study. This study received support from the European Union's Horizon 2020 Research and Innovation programme under grant agreement No 740580 within the framework of the Visible Attributes through Genomics (VISAGE) Project and Consortium. During her stay in Rotterdam, YC was supported by the UCAS Joint PhD Training Program. The IUPUI US site was supported in part by the US National Institute of Justice (NIJ) under grant number 2014-DN-BX-K031. None of the funding organizations had any influence on the design, conduct or conclusions of the study.

      Appendix A. Supplementary data

      The following is Supplementary data to this article:

      References

        • Silventoinen K.
        • Sammalisto S.
        • Perola M.
        • Boomsma D.I.
        • Cornes B.K.
        • Davis C.
        • et al.
        Heritability of adult body height: a comparative study of twin cohorts in eight countries.
        Twin Res. 2003; 6: 399-408
        • Xiong Z.
        • Dankova G.
        • Howe L.J.
        • Lee M.K.
        • Hysi P.G.
        • de Jong M.A.
        • et al.
        Novel genetic loci affecting facial shape variation in humans.
        Elife. 2019; 8
        • Medland S.E.
        • Zhu G.
        • Martin N.G.
        Estimating the heritability of hair curliness in twins of European ancestry.
        Twin Res. Hum. Genet. 2009; 12: 514-518
        • Lin B.D.
        • Mbarek H.
        • Willemsen G.
        • Dolan C.V.
        • Fedko I.O.
        • Abdellaoui A.
        • et al.
        Heritability and genome-wide association studies for hair color in a Dutch twin family based sample.
        Genes (Basel). 2015; 6: 559-576
        • Sulem P.
        • Gudbjartsson D.F.
        • Stacey S.N.
        • Helgason A.
        • Rafnar T.
        • Magnusson K.P.
        • et al.
        Genetic determinants of hair, eye and skin pigmentation in Europeans.
        Nat. Genet. 2007; 39: 1443-1452
        • Kayser M.
        • Liu F.
        • Janssens A.C.
        • Rivadeneira F.
        • Lao O.
        • van Duijn K.
        • et al.
        Three genome-wide association studies and a linkage analysis identify HERC2 as a human iris color gene.
        Am. J. Hum. Genet. 2008; 82: 411-423
        • Liu F.
        • Wollstein A.
        • Hysi P.G.
        • Ankra-Badu G.A.
        • Spector T.D.
        • Park D.
        • et al.
        Digital quantification of human eye color highlights genetic association of three new loci.
        PLoS Genet. 2010; 6: e1000934
        • Han J.
        • Kraft P.
        • Nan H.
        • Guo Q.
        • Chen C.
        • Qureshi A.
        • et al.
        A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation.
        PLoS Genet. 2008; 4: e1000074
        • Hysi P.G.
        • Valdes A.M.
        • Liu F.
        • Furlotte N.A.
        • Evans D.M.
        • Bataille V.
        • et al.
        Genome-wide association meta-analysis of individuals of European ancestry identifies new loci explaining a substantial fraction of hair color variation and heritability.
        Nat. Genet. 2018; 50: 652-656
        • Peng F.
        • Zhu G.
        • Hysi P.G.
        • Eller R.J.
        • Chen Y.
        • Li Y.
        • et al.
        Genome-wide association studies identify multiple genetic loci influencing eyebrow color variation in Europeans.
        J. Invest. Dermatol. 2019; 139: 1601-1605
        • Visconti A.
        • Duffy D.L.
        • Liu F.
        • Zhu G.
        • Wu W.
        • Chen Y.
        • et al.
        Genome-wide association study in 176,678 Europeans reveals genetic loci for tanning response to sun exposure.
        Nat. Commun. 2018; 9: 1684
        • Medland S.E.
        • Nyholt D.R.
        • Painter J.N.
        • McEvoy B.P.
        • McRae A.F.
        • Zhu G.
        • et al.
        Common variants in the trichohyalin gene are associated with straight hair in Europeans.
        Am. J. Hum. Genet. 2009; 85: 750-755
        • Liu F.
        • Chen Y.
        • Zhu G.
        • Hysi P.G.
        • Wu S.
        • Adhikari K.
        • et al.
        Meta-analysis of genome-wide association studies identifies 8 novel loci involved in shape variation of human head hair.
        Hum. Mol. Genet. 2018; 27: 559-575
        • Hagenaars S.P.
        • Hill W.D.
        • Harris S.E.
        • Ritchie S.J.
        • Davies G.
        • Liewald D.C.
        • et al.
        Genetic prediction of male pattern baldness.
        PLoS Genet. 2017; 13: e1006594
        • Lango Allen H.
        • Estrada K.
        • Lettre G.
        • Berndt S.I.
        • Weedon M.N.
        • Rivadeneira F.
        • et al.
        Hundreds of variants clustered in genomic loci and biological pathways affect human height.
        Nature. 2010; 467: 832-838
        • Wood A.R.
        • Esko T.
        • Yang J.
        • Vedantam S.
        • Pers T.H.
        • Gustafsson S.
        • et al.
        Defining the role of common variation in the genomic and biological architecture of adult human height.
        Nat. Genet. 2014; 46: 1173-1186
        • Marouli E.
        • Graff M.
        • Medina-Gomez C.
        • Lo K.S.
        • Wood A.R.
        • Kjaer T.R.
        • et al.
        Rare and low-frequency coding variants alter human adult height.
        Nature. 2017; 542: 186-190
        • Liu F.
        • van Duijn K.
        • Vingerling J.R.
        • Hofman A.
        • Uitterlinden A.G.
        • Janssens A.C.
        • et al.
        Eye color and the prediction of complex phenotypes from genotypes.
        Curr. Biol. 2009; 19: R192-R193
        • Walsh S.
        • Liu F.
        • Ballantyne K.N.
        • van Oven M.
        • Lao O.
        • Kayser M.
        IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information.
        Forensic Sci. Int. Genet. 2011; 5: 170-180
        • Ruiz Y.
        • Phillips C.
        • Gomez-Tato A.
        • Alvarez-Dios J.
        • Casares de Cal M.
        • Cruz R.
        • et al.
        Further development of forensic eye color predictive tests.
        Forensic Sci. Int. Genet. 2013; 7: 28-40
        • Branicki W.
        • Liu F.
        • van Duijn K.
        • Draus-Barini J.
        • Pospiech E.
        • Walsh S.
        • et al.
        Model-based prediction of human hair color using DNA variants.
        Hum. Genet. 2011; 129: 443-454
        • Walsh S.
        • Liu F.
        • Wollstein A.
        • Kovatsi L.
        • Ralf A.
        • Kosiniak-Kamysz A.
        • et al.
        The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
        Forensic Sci. Int. Genet. 2013; 7: 98-115
        • Sochtig J.
        • Phillips C.
        • Maronas O.
        • Gomez-Tato A.
        • Cruz R.
        • Alvarez-Dios J.
        • et al.
        Exploration of SNP variants affecting hair colour prediction in Europeans.
        Int. J. Legal Med. 2015; 129: 963-975
        • Maronas O.
        • Phillips C.
        • Sochtig J.
        • Gomez-Tato A.
        • Cruz R.
        • Alvarez-Dios J.
        • et al.
        Development of a forensic skin colour predictive test.
        Forensic Sci. Int. Genet. 2014; 13: 34-44
        • Walsh S.
        • Chaitanya L.
        • Breslin K.
        • Muralidharan C.
        • Bronikowska A.
        • Pospiech E.
        • et al.
        Global skin colour prediction from DNA.
        Hum. Genet. 2017; 136: 847-863
        • Chaitanya L.
        • Breslin K.
        • Zuniga S.
        • Wirken L.
        • Pospiech E.
        • Kukla-Bartoszek M.
        • et al.
        The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation.
        Forensic Sci. Int. Genet. 2018; 35: 123-135
        • Hernando B.
        • Ibanez M.V.
        • Deserio-Cuesta J.A.
        • Soria-Navarro R.
        • Vilar-Sastre I.
        • Martinez-Cadenas C.
        Genetic determinants of freckle occurrence in the Spanish population: towards ephelides prediction from human DNA samples.
        Forensic Sci. Int. Genet. 2018; 33: 38-47
        • Kukla-Bartoszek M.
        • Pospiech E.
        • Wozniak A.
        • Boron M.
        • Karlowska-Pik J.
        • Teisseyre P.
        • et al.
        DNA-based predictive models for the presence of freckles.
        Forensic Sci. Int. Genet. 2019; 42: 252-259
        • Breslin K.
        • Wills B.
        • Ralf A.
        • Ventayol Garcia M.
        • Kukla-Bartoszek M.
        • Pospiech E.
        • et al.
        HIrisPlex-S system for eye, hair, and skin color prediction from DNA: massively parallel sequencing solutions for two common forensically used platforms.
        Forensic Sci. Int. Genet. 2019; 43: 102152
        • Walsh S.
        • Lindenbergh A.
        • Zuniga S.B.
        • Sijen T.
        • de Knijff P.
        • Kayser M.
        • et al.
        Developmental validation of the IrisPlex system: determination of blue and brown iris colour for forensic intelligence.
        Forensic Sci. Int. Genet. 2011; 5: 464-471
        • Walsh S.
        • Chaitanya L.
        • Clarisse L.
        • Wirken L.
        • Draus-Barini J.
        • Kovatsi L.
        • et al.
        Developmental validation of the HIrisPlex system: DNA-based eye and hair colour prediction for forensic and anthropological usage.
        Forensic Sci. Int. Genet. 2014; 9: 150-161
        • Kayser M.
        Forensic DNA Phenotyping: predicting human appearance from crime scene material for investigative purposes.
        Forensic Sci. Int. Genet. 2015; 18: 33-48
        • Schneider P.M.
        • Prainsack B.
        • Kayser M.
        The use of forensic DNA phenotyping in predicting appearance and biogeographic ancestry.
        Dtsch. Arztebl. Int. 2019; 51-52: 873-880
        • King T.E.
        • Fortes G.G.
        • Balaresque P.
        • Thomas M.G.
        • Balding D.
        • Delser P.M.
        • et al.
        Identification of the remains of King Richard III.
        Nat. Commun. 2014; 5: 5631
        • Brace S.
        • Diekmann Y.
        • Booth T.J.
        • van Dorp L.
        • Faltyskova Z.
        • Rohland N.
        • et al.
        Ancient genomes indicate population replacement in early Neolithic Britain.
        Nat. Ecol. Evol. 2019; 3: 765-771
        • Wilde S.
        • Timpson A.
        • Kirsanow K.
        • Kaiser E.
        • Kayser M.
        • Unterlander M.
        • et al.
        Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 y.
        Proc. Natl. Acad. Sci. U. S. A. 2014; 111: 4832-4837
        • Liu F.
        • Struchalin M.V.
        • Duijn K.
        • Hofman A.
        • Uitterlinden A.G.
        • Duijn C.
        • et al.
        Detecting low frequent loss-of-function alleles in genome wide association studies with red hair color as example.
        PLoS One. 2011; 6: e28145
        • Jacobs L.C.
        • Wollstein A.
        • Lao O.
        • Hofman A.
        • Klaver C.C.
        • Uitterlinden A.G.
        • et al.
        Comprehensive candidate gene study highlights UGT1A and BNC2 as new genes determining continuous skin color variation in Europeans.
        Hum. Genet. 2013; 132: 147-158
        • Liu F.
        • Visser M.
        • Duffy D.L.
        • Hysi P.G.
        • Jacobs L.C.
        • Lao O.
        • et al.
        Genetics of skin color variation in Europeans: genome-wide association studies with functional follow-up.
        Hum. Genet. 2015; 134: 823-835
        • Ralf A.
        • van Oven M.
        • Montiel Gonzalez D.
        • de Knijff P.
        • van der Beek K.
        • Wootton S.
        • et al.
        Forensic Y-SNP analysis beyond SNaPshot: high-resolution Y-chromosomal haplogrouping from low quality and quantity DNA using Ion AmpliSeq and targeted massively parallel sequencing.
        Forensic Sci. Int. Genet. 2019; 41: 93-106