Research paper | Volume 33, P66-71, March 2018

Inferring Chinese surnames with Y-STR profiles

  • Cheng-Min Shi ¹
    Affiliations
    CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  • Changzhen Li ¹
    Affiliations
    Jining Public Security Bureau, Shandong Province 272100, China
  • Liang Ma ¹
    Affiliations
    CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  • Lianjiang Chi
    Affiliations
    CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  • Jing Zhao
    Affiliations
    CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    Collaborative Innovation Center of Forensic Genomics, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    University of Chinese Academy of Sciences, Beijing 100049, China
  • Wuzhou Yuan
    Affiliations
    CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    University of Chinese Academy of Sciences, Beijing 100049, China
  • Zhendiao Zhou
    Affiliations
    CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    University of Chinese Academy of Sciences, Beijing 100049, China
  • Jiang-Wei Yan
    Correspondence
    Corresponding authors at: Collaborative Innovation Center for Forensic Genomics, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    Affiliations
    CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    Collaborative Innovation Center of Forensic Genomics, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    University of Chinese Academy of Sciences, Beijing 100049, China
  • Hua Chen
    Correspondence
    Corresponding authors at: Collaborative Innovation Center for Forensic Genomics, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    Affiliations
    CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    Collaborative Innovation Center of Forensic Genomics, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
    University of Chinese Academy of Sciences, Beijing 100049, China
  • Author Footnotes
    ¹ These authors contributed equally to this study.
Published: November 24, 2017
DOI: https://doi.org/10.1016/j.fsigen.2017.11.014

      Highlights

      • Two efficient computational methods were developed to infer surnames from Y-STR profiles.
      • More than 19,000 men bearing 266 surnames were typed for 17 Y-STR loci to demonstrate the performance of the methods.
      • The possibility of reliably inferring surnames from Y-STR profiles enables promising applications in forensics.

      Abstract

      Co-ancestry of human surnames and Y-chromosomes in most human populations and social groups suggests the possibility of inferring one from the other. However, such an intuitive perspective remains to be formally explored. In the present study, we develop two computational methods, based on cosine distance (d_cos) and coalescence distance (d_coal) respectively, to infer surnames from Y-STR profiles. We also survey Y-STR variations at 15 loci for 19,009 individuals of Shandong Province in China. For a total of 266 surnames included in the data set, our methods can pinpoint a single surname with an average accuracy of 65%, and reach an average accuracy higher than 80% when providing >4 candidate surnames. We also demonstrate that increasing the sample size of surnames and the number of STR loci improves the accuracy of surname inference. Our results indicate that the 15 non-duplicated Y-STR loci contain information from which surnames can be reliably inferred for Chinese populations, showing a promising application in forensics.
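
The cosine-distance idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which is not described in the text available here. It assumes that a Y-STR profile is encoded as a vector of repeat counts, one entry per locus, and that a query is assigned to the surnames whose reference profiles lie closest to it. The function names `cosine_distance` and `rank_surnames`, the four-locus profiles, and the surname labels are all hypothetical and invented for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length repeat-count vectors."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_surnames(query, reference, k=4):
    """Rank surnames by their closest reference profile to the query.

    reference is a list of (surname, profile) pairs. The nearest-profile
    rule is an assumption of this sketch, not a detail from the paper.
    """
    best = {}
    for surname, profile in reference:
        d = cosine_distance(query, profile)
        if surname not in best or d < best[surname]:
            best[surname] = d
    # Sort by distance, breaking ties alphabetically for determinism.
    return sorted(best, key=lambda s: (best[s], s))[:k]


# Toy reference panel: made-up repeat counts at four hypothetical loci.
reference = [
    ("Zhang", [14, 12, 23, 10]),
    ("Zhang", [14, 12, 24, 10]),
    ("Li", [16, 13, 21, 11]),
    ("Wang", [13, 14, 25, 9]),
]
query = [14, 12, 23, 10]

ranked = rank_surnames(query, reference, k=2)
print(ranked)
```

Returning the top k candidates rather than a single best match mirrors the abstract's distinction between pinpointing one surname and providing several candidate surnames.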
