Combining artificial neural network classification with fully continuous probabilistic genotyping to remove the need for an analytical threshold and electropherogram reading

  • Duncan Taylor
    Correspondence
    Corresponding author at: Forensic Science SA, GPO Box 2790, Adelaide, SA 5001, Australia.
    Affiliations
    Forensic Science SA, GPO Box 2790, Adelaide, SA 5001, Australia

    School of Biological Sciences, Flinders University, GPO Box 2100, Adelaide, SA 5001, Australia
  • John Buckleton
    Affiliations
    Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand

    University of Auckland, Department of Statistics, Auckland, New Zealand
Published: October 09, 2022
DOI: https://doi.org/10.1016/j.fsigen.2022.102787

      Highlights

      • EPG peaks are assigned probabilistically as being artefactual or not using neural networks.
      • Models in STRmix are extended to incorporate these peak label probabilities.
      • FaSTR processed mixtures without an analytical threshold (AT) or human intervention.
      • These mixtures, with peak label probabilities, were analysed in STRmix.
      • Performance exceeded a ‘standard’ analysis using an AT and human reading.

      Abstract

      In standard processing of electrophoretic data within a forensic DNA laboratory, one (or two) analysts designate peaks as either artefactual or non-artefactual, a process commonly referred to as profile ‘reading’. Recently, FaSTR™ DNA has been developed to use artificial neural networks to automatically classify fluorescence within an electropherogram as baseline, allele, stutter or pull-up. These classifications are based on probabilities assigned to each timepoint (scan) within the electropherogram. Instead of using these probabilities to assign fluorescence to a category, they can be used directly in the profile analysis. This has a number of advantages: increased objectivity in DNA profile processing, removal of the need for analysts to read profiles, and removal of the need for an analytical threshold. Models within STRmix™ were extended to incorporate the peak label probabilities assigned by FaSTR™ DNA. The performance of the model extensions was tested on a DNA mixture dataset comprising 2–4 person samples. This dataset was processed in a ‘standard’ manner using an analytical threshold of 50 rfu, analyst peak designations and STRmix™ V2.9 models. The same dataset was then processed in an automated manner using no analytical threshold, no analysts reading the profile, and the STRmix™ models extended to incorporate peak label probabilities. The results of both analyses were compared against the known DNA donors and a set of non-donors. The two processes performed very similarly, but the 0 rfu process delivered a large efficiency gain. Utilising peak label probabilities opens up a range of workflow efficiency gains and, beyond this, allows full use of all data within an electropherogram.
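
The difference between assigning fluorescence to a single category and carrying the label probabilities forward can be illustrated with a minimal sketch. This is not the FaSTR™ DNA or STRmix™ implementation: the function names, the `scan_probs` values and the "discarded mass" measure are all hypothetical and chosen only for illustration. Only the four category names come from the abstract.

```python
from typing import Dict

# Hypothetical label probabilities for one ambiguous scan.
# The four categories are those named in the abstract; the numbers are invented.
scan_probs: Dict[str, float] = {
    "baseline": 0.05,
    "allele": 0.45,
    "stutter": 0.40,
    "pull-up": 0.10,
}


def check_probabilities(probs: Dict[str, float], tol: float = 1e-9) -> None:
    """Raise ValueError unless the values form a probability distribution."""
    if any(p < 0.0 for p in probs.values()):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs.values()) - 1.0) > tol:
        raise ValueError("probabilities must sum to 1")


def hard_label(probs: Dict[str, float]) -> str:
    """Categorical 'reading': keep only the most probable label."""
    check_probabilities(probs)
    return max(probs, key=probs.get)


def discarded_mass(probs: Dict[str, float]) -> float:
    """Probability assigned to all labels that a hard call throws away."""
    check_probabilities(probs)
    return 1.0 - max(probs.values())


if __name__ == "__main__":
    print("hard label:", hard_label(scan_probs))
    print("probability ignored by the hard call:", round(discarded_mass(scan_probs), 2))
    print("retained for downstream analysis:", scan_probs)
```

In this invented example the hard call reports an allele even though more than half of the probability sits on other labels. A downstream model that receives the full `scan_probs` vector can still take that uncertainty into account.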
