Research paper | Volume 31, P40-47, November 2017

Secure and robust cloud computing for high-throughput forensic microsatellite sequence analysis and databasing

  • Sarah F. Bailey
    Affiliations
    NC State University, Molecular Biomedical Sciences, 1060 William Moore Dr., Raleigh NC, 27607, United States

    NC State University, Forensic Sciences Institute, 1060 William Moore Dr., Raleigh NC, 27607, United States
  • Melissa K. Scheible
    Affiliations
    NC State University, Molecular Biomedical Sciences, 1060 William Moore Dr., Raleigh NC, 27607, United States

    NC State University, Forensic Sciences Institute, 1060 William Moore Dr., Raleigh NC, 27607, United States
  • Christopher Williams
    Affiliations
    NC State University, Molecular Biomedical Sciences, 1060 William Moore Dr., Raleigh NC, 27607, United States

    NC State University, Forensic Sciences Institute, 1060 William Moore Dr., Raleigh NC, 27607, United States
  • Deborah S.B.S. Silva
    Affiliations
    NC State University, Molecular Biomedical Sciences, 1060 William Moore Dr., Raleigh NC, 27607, United States

    NC State University, Forensic Sciences Institute, 1060 William Moore Dr., Raleigh NC, 27607, United States
  • Marina Hoggan
    Affiliations
    NC State University, Molecular Biomedical Sciences, 1060 William Moore Dr., Raleigh NC, 27607, United States
  • Christopher Eichman
    Affiliations
    NC State University, College of Veterinary Medicine, Office of Information Technology, 1060 William Moore Dr., Raleigh NC, 27607, United States
  • Seth A. Faith
    Correspondence
    Corresponding author at: NC State University, Molecular Biomedical Sciences, 1060 William Moore Dr., Raleigh, NC, 27607, United States.
    Affiliations
    NC State University, Molecular Biomedical Sciences, 1060 William Moore Dr., Raleigh NC, 27607, United States

    NC State University, Forensic Sciences Institute, 1060 William Moore Dr., Raleigh NC, 27607, United States

      Highlights

      • A Cloud-based system was developed to analyze next-generation sequencing data.
      • Data from a variety of bench top and portable DNA sequencers were analyzed.
      • Results are reported following ISFG guidelines and stored in a relational database.
      • Benefits included on-demand scalability, ease-of-use, access controls, and security.

      Abstract

      Next-generation sequencing (NGS) is a rapidly evolving technology with demonstrated benefits for forensic genetic applications, but strategies to analyze and manage the massive datasets it produces are still in development. Here, the computing, data storage, connectivity, and security resources of the Cloud were evaluated as a model for forensic laboratory systems that produce NGS data. A complete front-to-end Cloud system was developed to upload, process, and interpret raw NGS data through a web browser dashboard. The system was extensible, supporting analysis of autosomal and Y-STRs from a variety of NGS instruments (Illumina MiniSeq and MiSeq, and Oxford Nanopore MinION). NGS data for STRs were concordant with standard reference materials previously characterized by capillary electrophoresis and Sanger sequencing. The computing power of the Cloud was implemented with on-demand auto-scaling so that multiple files could be analyzed concurrently. The system was designed to store the resulting data in a relational database, amenable to downstream sample interpretation and databasing applications, following the most recent nomenclature guidelines for sequenced alleles. Lastly, a multi-layered Cloud security architecture was tested; industry standards for securing data and computing resources were readily applied to the NGS system without adverse effects on bioinformatic analysis, connectivity, or data storage and retrieval. The results of this study demonstrate the feasibility of using Cloud-based systems for secured NGS data analysis, storage, databasing, and multi-user distributed connectivity.
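The abstract does not describe the database layout, so the following is only a minimal sketch of how sequenced STR allele calls might be kept in a relational database. Every table name, column name, locus name, sequence, and read count below is a hypothetical assumption made for illustration; none of it is taken from the paper or from a real sample. The sketch stores, for each call, both a length-based allele label and the full sequence string, since sequence-level reporting is the reason the abstract refers to nomenclature guidelines for sequenced alleles.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative
# assumptions, not the schema used by the authors.
SCHEMA = """
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE locus (
    locus_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE
);
CREATE TABLE allele_call (
    call_id    INTEGER PRIMARY KEY,
    sample_id  INTEGER NOT NULL REFERENCES sample(sample_id),
    locus_id   INTEGER NOT NULL REFERENCES locus(locus_id),
    ce_allele  TEXT    NOT NULL,  -- length-based allele label
    sequence   TEXT    NOT NULL,  -- full sequence string of the allele
    read_count INTEGER NOT NULL CHECK (read_count >= 0)
);
"""


def build_demo_db():
    """Create an in-memory database and load made-up demonstration rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO sample (name) VALUES (?)", ("demo_sample",))
    conn.execute("INSERT INTO locus (name) VALUES (?)", ("DEMO_LOCUS",))
    # Made-up calls: a simple AGAT repeat, 8 and 11 units long.
    calls = [
        (1, 1, "8", "AGAT" * 8, 1200),
        (1, 1, "11", "AGAT" * 11, 1100),
    ]
    conn.executemany(
        "INSERT INTO allele_call "
        "(sample_id, locus_id, ce_allele, sequence, read_count) "
        "VALUES (?, ?, ?, ?, ?)",
        calls,
    )
    conn.commit()
    return conn


def genotype(conn, sample_name, locus_name):
    """Return (ce_allele, sequence, read_count) rows, highest coverage first."""
    query = """
        SELECT a.ce_allele, a.sequence, a.read_count
        FROM allele_call AS a
        JOIN sample AS s ON s.sample_id = a.sample_id
        JOIN locus  AS l ON l.locus_id  = a.locus_id
        WHERE s.name = ? AND l.name = ?
        ORDER BY a.read_count DESC
    """
    return conn.execute(query, (sample_name, locus_name)).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for ce_allele, sequence, read_count in genotype(db, "demo_sample", "DEMO_LOCUS"):
        print(ce_allele, read_count, sequence)
```

Keeping samples, loci, and calls in separate tables is one common normalization choice; it lets downstream queries compare alleles across samples by sequence string as well as by length-based label.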

