Highlights
- A new South Asian-informative forensic ancestry marker panel of 36 SNPs, called Eurasiaplex-2, was compiled.
- SNPs were selected so that the South Asian-specific allele has zero or near-zero frequency in all populations outside the Indian subcontinent.
- A survey of 4097 worldwide samples shows an average of 11–14 South Asian-specific genotypes in South Asians vs. 0.2 in all other population samples.
- Forensic ancestry markers with near-absolute specificity, like the SNPs of Eurasiaplex-2, offer potential for highly informative panels differentiating worldwide populations.
Abstract
Keywords
1. Introduction
2. Methods and materials
2.1 Changing the concept of population informativeness when selecting forensic ancestry SNPs

2.2 Marker selection
M. Byrska-Bishop, U.S. Evani, X. Zhao, A.O. Basile, H.J. Abel, A.A. Regier, A. André Corvelo, W.E. Clarke, R. Musunuri, K. Nagulapalli, et al., High coverage whole genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios, bioRxiv preprint, posted February 7 2021 doi: 〈https://doi.org/10.1101/2021.02.06.430068〉.
2.3 Human genome variant databases accessed
2.4 Statistical considerations
3. Results
3.1 Screening South Asian-specific candidate SNPs

3.2 Selecting a core set of South Asian-specific SNPs for Eurasiaplex-2
3.2.1 Patterns of South Asian-specific genotype distributions
Genomic details and frequency of the alternative (South Asian-specific) allele in 1000 Genomes (1000G) groups and South Asian populations, HGDP-CEPH and gnomAD.

No. | Code | Chr. | GRCh37 | GRCh38 | rs-number | Ref. | Alt. | Gene | African (1000G) | European (1000G) | East Asian (1000G) | BEB (1000G) | GIH (1000G) | ITU (1000G) | PJL (1000G) | STU (1000G) | Pakistani (HGDP-CEPH) | South Asian (gnomAD) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1A | 1 | 27588988 | 27262497 | rs191008849 | T | C | WDTC1 | 0 | 0.0010 | 0.0010 | 0.2267 | 0.2500 | 0.2353 | 0.1927 | 0.2353 | 0.2171 | 0.1866 |
2 | 1B | 1 | 207023473 | 206850128 | rs370300597 | C | G | – | 0 | 0 | 0 | 0.2093 | 0.1977 | 0.1961 | 0.1302 | 0.2549 | 0.2072 | 0.1459 |
3 | 2A | 2 | 18702265 | 18520999 | rs373262633 | A | G | LOC105373454 | 0 | 0 | 0 | 0.1802 | 0.1977 | 0.2451 | 0.1354 | 0.1765 | 0.1836 | 0.1336 |
4 | 2B | 2 | 98816440 | 98199977 | rs183145214 | A | G | VWA3B | 0 | 0 | 0.0050 | 0.2151 | 0.1744 | 0.1520 | 0.2031 | 0.2157 | 0.1873 | 0.1861 |
5 | 3A | 3 | 44250057 | 44208565 | rs578118259 | T | G | – | 0 | 0 | 0.0010 | 0.2558 | 0.2791 | 0.2843 | 0.1875 | 0.2353 | 0.2444 | 0.1817 |
6 | 3B | 3 | 159603038 | 159885249 | rs369609492 | C | A | SCHIP1 | 0 | 0 | 0 | 0.2500 | 0.2209 | 0.2157 | 0.0938 | 0.1912 | 0.1836 | 0.1572 |
7 | 3C | 3 | 167730032 | 168012244 | rs375081853 | A | G | GOLIM4 | 0 | 0 | 0.0010 | 0.1570 | 0.1512 | 0.2059 | 0.1615 | 0.1912 | 0.1873 | 0.1588 |
8 | 4A | 4 | 115827968 | 114906812 | rs182767282 | T | C | NDST4 | 0 | 0.0010 | 0.0010 | 0.1453 | 0.1919 | 0.2157 | 0.2031 | 0.2598 | 0.2270 | 0.1703 |
9 | 4B | 4 | 152007237 | 151086085 | rs146398591 | A | G | – | 0 | 0.0010 | 0.0030 | 0.2733 | 0.2500 | 0.3235 | 0.1875 | 0.2794 | 0.2457 | 0.2192 |
10 | 4C | 4 | 167677008 | 166755857 | rs554572765 | A | C | SPOCK3 | 0 | 0 | 0 | 0.1860 | 0.1919 | 0.1912 | 0.1458 | 0.2010 | 0.1811 | 0.1068 |
11 | 5 | 5 | 33704125 | 33704020 | rs375710694 | T | C | ADAMTS12 | 0 | 0 | 0.0020 | 0.1512 | 0.1628 | 0.2206 | 0.1615 | 0.2059 | 0.1923 | 0.1468 |
12 | 6A | 6 | 117206569 | 116885406 | rs186371551 | G | A | RFX6 | 0 | 0.0050 | 0.0050 | 0.1279 | 0.1686 | 0.2304 | 0.1823 | 0.1716 | 0.2084 | 0.1703 |
13 | 6B | 6 | 130116672 | 129795527 | rs368650154 | C | T | – | 0 | 0 | 0.0010 | 0.1686 | 0.1395 | 0.2255 | 0.1615 | 0.2402 | 0.1935 | 0.1578 |
14 | 6C | 6 | 154459950 | 154138815 | rs368661757 | C | A | OPRM1 | 0 | 0 | 0.0010 | 0.2384 | 0.2500 | 0.2451 | 0.2135 | 0.2549 | 0.2444 | 0.1946 |
15 | 7 | 7 | 50238881 | 50199285 | rs368444091 | C | T | – | 0 | 0 | 0 | 0.1686 | 0.1860 | 0.1814 | 0.1667 | 0.1912 | 0.1774 | 0.1396 |
16 | 9 | 9 | 97560517 | 94798235 | rs187619767 | C | T | AOPEP | 0 | 0 | 0.0020 | 0.2442 | 0.2849 | 0.1618 | 0.1458 | 0.2255 | 0.1849 | 0.1740 |
17 | 10 | 10 | 122095086 | 120335574 | rs77510889* | A | G | – | 0.0714 | 0 | 0.0020 | 0.2267 | 0.2093 | 0.2304 | 0.2031 | 0.2206 | 0.2171 | 0.1593 |
18 | 11A | 11 | 29262859 | 29241312 | rs370097977 | T | C | – | 0 | 0 | 0.0050 | 0.1628 | 0.1686 | 0.1912 | 0.2031 | 0.2451 | 0.2258 | 0.1529 |
19 | 11B | 11 | 59462759 | 59695286 | rs375766368 | G | A | – | 0 | 0 | 0.0010 | 0.1453 | 0.1512 | 0.2157 | 0.1771 | 0.1520 | 0.1824 | 0.1222 |
20 | 11C | 11 | 72175159 | 72464115 | rs377589165 | G | A | – | 0 | 0 | 0 | 0.1512 | 0.1860 | 0.2206 | 0.1615 | 0.1863 | 0.1873 | 0.1514 |
21 | 12A | 12 | 4268703 | 4159537 | rs376263717 | C | T | – | 0 | 0.0020 | 0.0010 | 0.2267 | 0.2035 | 0.2598 | 0.1510 | 0.2010 | 0.1911 | 0.1655 |
22 | 12B | 12 | 22570119 | 22417185 | rs371763923 | A | G | – | 0 | 0 | 0.0020 | 0.2500 | 0.2442 | 0.2059 | 0.1510 | 0.2304 | 0.2022 | 0.1636 |
23 | 12C | 12 | 50428379 | 50034596 | rs368764180 | A | G | RACGAP1 | 0 | 0 | 0.0020 | 0.1570 | 0.1395 | 0.1961 | 0.0885 | 0.1863 | 0.1787 | 0.1388 |
24 | 13 | 13 | 56057499 | 55483364 | rs184748067 | G | A | – | 0 | 0.0060 | 0.0069 | 0.1744 | 0.1802 | 0.2353 | 0.1615 | 0.2304 | 0.1998 | 0.1600 |
25 | 14 | 14 | 65712298 | 65245580 | rs189013802 | G | A | – | 0 | 0 | 0.0079 | 0.1744 | 0.1744 | 0.2402 | 0.1823 | 0.2059 | 0.1873 | 0.1372 |
26 | 15 | 15 | 83236825 | 82568075 | rs17158407* | C | T | CPEB1 | 0 | 0.0010 | 0.0020 | 0.2558 | 0.2558 | 0.2990 | 0.3125 | 0.3333 | 0.2990 | 0.2681 |
27 | 16A | 16 | 3178971 | 3128970 | rs368479296 | C | T | ZNF213-AS1 | 0 | 0 | 0.0020 | 0.1802 | 0.1802 | 0.1422 | 0.1667 | 0.2108 | 0.1836 | 0.1645 |
28 | 16B | 16 | 23053815 | 23042494 | rs376893831 | G | T | – | 0 | 0 | 0.0010 | 0.2849 | 0.2674 | 0.1569 | 0.1823 | 0.1961 | 0.2146 | 0.1991 |
29 | 16C | 16 | 28588059 | 28576738 | rs370130302 | C | G | SGF29 | 0 | 0 | 0.0010 | 0.2907 | 0.2674 | 0.1961 | 0.1719 | 0.2206 | 0.2134 | 0.2024 |
30 | 16D | 16 | 33921593 | 34119126 | rs368738705 | C | T | – | 0 | 0 | 0 | 0.4186 | 0.4302 | 0.4853 | 0.3385 | 0.4412 | 0.4007 | 0.3329 |
31 | 16E | 16 | 46499858 | 46465946 | rs368538881 | C | T | – | 0 | 0 | 0 | 0.4128 | 0.4070 | 0.4951 | 0.3021 | 0.4608 | 0.4020 | 0.3549 |
32 | 16F | 16 | 48327788 | 48293877 | rs377323011 | A | G | LONP2 | 0 | 0 | 0.0010 | 0.3314 | 0.2965 | 0.4314 | 0.2760 | 0.3775 | 0.3362 | 0.2742 |
33 | 17A | 17 | 43964966 | 45887600 | rs369091847 | A | T | MAPT-AS1 | 0 | 0 | 0 | 0.1337 | 0.1512 | 0.2402 | 0.1250 | 0.2451 | 0.2035 | 0.1604 |
34 | 17B | 17 | 80660204 | 82702328 | rs376153825 | G | C | LOC105376791 | 0 | 0 | 0.0010 | 0.1279 | 0.1279 | 0.1765 | 0.1771 | 0.1716 | 0.1762 | 0.1273 |
35 | 19 | 19 | 8371240 | 8306356 | rs374908464 | A | G | CD320 | 0 | 0 | 0.0020 | 0.1802 | 0.2093 | 0.2500 | 0.1719 | 0.1078 | 0.1985 | 0.1740 |
36 | 20 | 20 | 4987550 | 5006904 | rs186201674 | C | T | SLC23A2 | 0.001 | 0.0040 | 0.0050 | 0.2093 | 0.2209 | 0.2255 | 0.1875 | 0.2451 | 0.2109 | 0.1545 |
Average | | | | | | | | | 0.002 | 0.001 | 0.002 | 0.214 | 0.216 | 0.240 | 0.182 | 0.233 | 0.219 | 0.178 |

\* SNP also identified in 1000 Genomes Phase I.
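
As a rough plausibility check on the table above, the column averages can be turned into an expected number of SNPs at which an individual carries at least one South Asian-specific allele. The sketch below is illustrative only and is not the authors' calculation. It assumes Hardy-Weinberg proportions, treats the 36 SNPs as independent with a common frequency equal to the column average, and takes a "South Asian-specific genotype" to mean a genotype with one or two copies of the alternative allele. The function and variable names are hypothetical.

```python
# Column averages copied from the "Average" row of the table above.
AVERAGE_ALT_FREQ = {
    "African": 0.002,
    "European": 0.001,
    "East Asian": 0.002,
    "BEB": 0.214,
    "GIH": 0.216,
    "ITU": 0.240,
    "PJL": 0.182,
    "STU": 0.233,
    "Pakistani (HGDP-CEPH)": 0.219,
    "South Asian (gnomAD)": 0.178,
}
N_SNPS = 36  # size of the Eurasiaplex-2 panel


def expected_carrier_genotypes(p, n_snps=N_SNPS):
    """Expected number of SNPs with at least one alternative allele.

    Assumption: Hardy-Weinberg proportions and the same allele
    frequency p at every SNP, so P(carrier) = 1 - (1 - p)^2.
    """
    return n_snps * (1.0 - (1.0 - p) ** 2)


if __name__ == "__main__":
    for population, p in AVERAGE_ALT_FREQ.items():
        print(f"{population:24s} {expected_carrier_genotypes(p):6.2f}")
```

Under these assumptions the five 1000 Genomes South Asian populations come out at roughly 12 to 15 carrier genotypes and the African, European and East Asian groups at well under 0.2, which is of the same order as the figures quoted in the Highlights. Using a single average frequency slightly overstates the count when per-SNP frequencies vary, so the values are an approximate guide only.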

3.2.2 South Asian-specific allele frequency estimates from Eurasiaplex-2 SNP genotypes


3.3 Analysis of the six Eurasiaplex-2 SNPs on chromosome 16
Internal ID | SNP | GRCh37 position | GRCh38 position | Distance from preceding SNP (cM) | Kosambi-adjusted Rc |
---|---|---|---|---|---|
16A | rs368479296 | 3178971 | 3128970 | – | – |
16B | rs376893831 | 23053815 | 23042494 | 38.681 | 0.324515 |
16C | rs370130302 | 28588059 | 28576738 | 11.3898 | 0.111968 |
16D | rs368738705 | 33921593 | 34119126 | 1.793 | 0.017922 |
16E | rs368538881 | 46499858 | 46465946 | 0.0422 | 0.000422 |
16F | rs377323011 | 48327788 | 48293877 | 0.3118 | 0.003118 |
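
The Rc values in the table above appear to follow the standard Kosambi mapping function, r = 0.5 · tanh(2d), with the genetic distance d expressed in Morgans. The short sketch below recomputes them from the tabulated cM distances. It is a check under that assumption, not the authors' own script, and the function and dictionary names are hypothetical.

```python
import math


def kosambi_recombination_fraction(distance_cm):
    """Kosambi mapping function: r = 0.5 * tanh(2d), with d in Morgans."""
    d_morgans = distance_cm / 100.0  # convert centimorgans to Morgans
    return 0.5 * math.tanh(2.0 * d_morgans)


# Inter-SNP distances (cM) copied from the table above, keyed by SNP pair.
DISTANCES_CM = {
    "16A-16B": 38.681,
    "16B-16C": 11.3898,
    "16C-16D": 1.793,
    "16D-16E": 0.0422,
    "16E-16F": 0.3118,
}

if __name__ == "__main__":
    for pair, cm in DISTANCES_CM.items():
        r = kosambi_recombination_fraction(cm)
        print(f"{pair}: {cm} cM -> Rc = {r:.6f}")
```

For very short distances the function is almost linear (r ≈ d), which is why the three closest pairs have Rc values that are essentially the cM distance divided by 100.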

3.4 Statistical analyses
3.4.1 Conventional Bayes analysis of South Asian population variability
3.4.2 Genetic cluster analysis with STRUCTURE comparing Eurasiaplex and Eurasiaplex-2 SNPs
3.4.3 Exploration of a simple South Asian-specific allele counting system

4. Discussion
Acknowledgements
Appendix A. Supplementary material
Supplementary material
User license
Creative Commons Attribution – NonCommercial – NoDerivs (CC BY-NC-ND 4.0)