Forensic Science International: Genetics | Research Article | Volume 7, Issue 4, P409-417, July 2013


STRait Razor: A length-based forensic STR allele-calling tool for use with second generation sequencing data

      Abstract

      Recent studies have demonstrated the capability of second generation sequencing (SGS) to provide coverage of short tandem repeats (STRs) found within the human genome. However, there are relatively few bioinformatic software packages capable of detecting these markers in the raw sequence data. The extant STR-calling tools are sophisticated, but are not always applicable to the analysis of the STR loci commonly used in forensic analyses. STRait Razor is a newly developed Perl-based software tool that runs on Linux/Unix operating systems and is designed to detect forensically relevant STR alleles in FASTQ sequence data, based on allelic length. It is capable of analyzing STR loci with repeat motifs ranging from simple to complex without the need for extensive allelic sequence data. STRait Razor is designed to interpret both single-end and paired-end data and relies on intelligent parallel processing to reduce analysis time. Users are presented with a number of customization options, including variable mismatch detection parameters, as well as the ability to easily allow for the detection of alleles at new loci. In its current state, the software detects alleles for 44 autosomal and Y-chromosome STR loci. The study described herein demonstrates that STRait Razor is capable of detecting STR alleles in data generated by multiple library preparation methods and two Illumina® sequencing instruments, with 100% concordance. The data also reveal noteworthy concepts related to the effect of different preparation chemistries and sequencing parameters on the bioinformatic detection of STR alleles.
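
The length-based idea in the abstract can be illustrated with a minimal Python sketch. It is not STRait Razor's Perl implementation. It assumes each locus is described by a pair of flanking anchor sequences and a repeat motif length, that mismatches are substitutions only, and that an allele is named by dividing the number of bases between the anchors by the motif length. The function names, the anchor sequences, and the `max_mismatches` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_with_mismatches(read, anchor, max_mismatches, start=0):
    """Return the first index >= start where anchor matches read with at most
    max_mismatches substitutions (Hamming distance), or -1 if there is none."""
    n = len(anchor)
    for i in range(start, len(read) - n + 1):
        mismatches = sum(1 for a, b in zip(read[i:i + n], anchor) if a != b)
        if mismatches <= max_mismatches:
            return i
    return -1


def call_allele(read, left_flank, right_flank, motif_len, max_mismatches=1):
    """Length-based call: (bases between the flanks) / (motif length).
    Returns a string such as '9' or '9.3', or None if a flank is not found."""
    left = find_with_mismatches(read, left_flank, max_mismatches)
    if left < 0:
        return None
    repeat_start = left + len(left_flank)
    right = find_with_mismatches(read, right_flank, max_mismatches, repeat_start)
    if right < 0:
        return None
    full, partial = divmod(right - repeat_start, motif_len)
    # Partial repeats follow the usual 'repeats.bases' convention, e.g. 9.3.
    return f"{full}.{partial}" if partial else str(full)


def count_alleles(reads, left_flank, right_flank, motif_len, max_mismatches=1):
    """Tally allele calls over many reads; reads missing a flank are skipped."""
    calls = (call_allele(r, left_flank, right_flank, motif_len, max_mismatches)
             for r in reads)
    return Counter(c for c in calls if c is not None)


if __name__ == "__main__":
    # Hypothetical anchors and a tetranucleotide motif, for illustration only.
    LEFT, RIGHT = "ACGTTGCA", "GGATCCTA"
    nine = "NN" + LEFT + "TCTA" * 9 + RIGHT + "NN"
    nine_three = "NN" + LEFT + "TCTA" * 9 + "TCT" + RIGHT + "NN"
    print(count_alleles([nine, nine, nine_three], LEFT, RIGHT, motif_len=4))
    # prints Counter({'9': 2, '9.3': 1})
```

Because the call depends only on the distance between the anchors, the sketch needs no catalogue of known allele sequences, which mirrors the abstract's claim that loci can be analyzed without extensive allelic sequence data. A practical caveat of this simplified version is that it takes the first approximate match of the right anchor, so anchors that resemble the repeat motif could produce a call that is too short.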

