Epigenetic age estimation in saliva and in buccal cells

Open Access. Published: August 27, 2022. DOI: https://doi.org/10.1016/j.fsigen.2022.102770

      Highlights

      • A tissue prediction model for saliva vs buccal cells was developed based on logistic regression.
      • An age prediction model covering both saliva and buccal swabs was developed based on multivariate quantile regression.
      • The robustness of the final age prediction model was tested by simulating single-marker losses.
      • For the markers chosen, a minimum of 10 ng of input DNA could be used for reproducible and viable DNA methylation testing.

      Abstract

      Age estimation based on epigenetic markers is a DNA intelligence tool with the potential to provide relevant information for criminal investigations, as well as to improve the inference of age-dependent physical characteristics such as male pattern baldness or hair color. Age prediction models have been developed based on different tissues, including saliva and buccal cells, which show different methylation patterns as they are composed of different cell populations. On many occasions in a criminal investigation, the origin of a sample or the proportion of tissues is not known with certainty, for example the provenance of cigarette butts, so use of combined models can provide lower prediction errors.
      In the present study, two tissue-specific and seven age-correlated CpG sites were selected from publicly available data from the Illumina HumanMethylation 450 BeadChip and bibliographic searches, to help build a tissue-prediction model and an age-prediction model, respectively. For the development of both models, a total of 184 samples (N = 91 saliva and N = 93 buccal cells) ranging from 21 to 86 years old were used. Validation of the models was performed using both k-fold cross-validation and an additional set of 184 samples (N = 93 saliva and N = 91 buccal cells, 21–86 years old).
      The tissue prediction model was developed using two CpG sites (HUNK and RUNX1) based on logistic regression that produced a correct classification rate for saliva and buccal swab samples of 88.59 % for the training set, and 83.69 % for the testing set. Despite these high success rates, a combined age prediction model was developed covering both saliva and buccal cells, using seven CpG sites (cg10501210, LHFPL4, ELOVL2, PDE4C, HOXC4, OTUD7A and EDARADD) based on multivariate quantile regression, giving a median absolute error (MAE) of ± 3.54 years and a correct classification rate (%CP±PI) of 76.08 % for the training set, and an MAE of ± 3.66 years and a %CP±PI of 71.19 % for the testing set. The addition of tissue-of-origin as a covariate to the model was assessed, but no improvement was detected in age predictions. Finally, considering the limitations usually faced by forensic DNA analyses, the robustness of the model and the minimum recommended amount of input DNA for bisulfite conversion were evaluated, with a minimum of 10 ng of genomic DNA giving reproducible results. The final multivariate quantile regression age predictor based on the models we developed has been placed in the open-access Snipper forensic classification website.

      1. Introduction

      Age estimation can provide key information in criminal, legal and anthropological investigations [
      • Freire-Aradas A.
      • Phillips C.
      • Lareu M.
      Forensic individual age estimation with DNA: from initial approaches to methylation tests.
      ]. In cases where there are no suspects and the DNA profiles recovered from forensic biological samples do not match with any profile stored in national DNA databases, age prediction can play an important role guiding police investigations, which can reduce the number of potential suspects [
      • Parson W.
      Age estimation with DNA: From forensic DNA fingerprinting to forensic (Epi) genomics: a mini-review.
      ]. Age estimation may also improve the prediction of phenotypic characteristics related to aging, e.g. hair colour [
      • Walsh S.
      • Liu F.
      • Wollstein A.
      • Kovatsi L.
      • Ralf A.
      • Kosiniak-kamysz A.
      • et al.
      The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
      ] or male pattern baldness [
      • Marcińska M.
      • Pośpiech E.
      • Abidi S.
      • Andersen J.D.
      • van den Berge M.
      • Carracedo Á.
      • et al.
      Evaluation of DNA variants associated with androgenetic alopecia and their potential to predict male pattern baldness.
      ]. Additionally, if the prediction models achieve sufficient accuracy, legal disputes could potentially be supported by age estimation [
      • Abbott A.
      DNA clock may aid refugee age check.
      ]. In all these cases, chronological age rather than biological age needs to be inferred [
      • Noroozi R.
      • Ghafouri-Fard S.
      • Pisarek A.
      • Rudnicka J.
      • Spólnicka M.
      • Branicki W.
      • et al.
      DNA methylation-based age clocks: From age prediction to age reversion.
      ].
      DNA methylation has become the gold standard biomarker for human age estimation. This epigenetic signature consists of the addition of a methyl group (-CH3) to carbon 5 (C5) of cytosines positioned next to guanines (CpG dinucleotides) [
      • Smith Z.D.
      • Meissner A.
      DNA methylation: roles in mammalian development.
      ]. Age correlation with DNA methylation has been largely confirmed by a broad range of epigenetic studies [
      • Rakyan V.K.
      • Down T.A.
      • Maslau S.
      • Andrew T.
      • Yang T.P.
      • Beyan H.
      • et al.
      Human aging-associated DNA hypermethylation occurs preferentially at bivalent chromatin domains.
      ,
      • Bocklandt S.
      • Lin W.
      • Sehl M.E.
      • Sa F.J.
      • Sinsheimer J.S.
      • Horvath S.
      • et al.
      Epigenetic predictor of age.
      ,
      • Bell J.T.
      • Tsai P.-C.
      • Yang T.-P.
      • Pidsley R.
      • Nisbet J.
      • Glass D.
      • et al.
      Epigenome-wide scans identify differentially methylated regions for age and age-related phenotypes in a healthy ageing population.
      ,
      • Heyn H.
      • Esteller M.
      DNA methylation profiling in the clinic: applications and challenges.
      ,
      • Hannum G.
      • Guinney J.
      • Zhao L.
      • Zhang L.
      • Hughes G.
      Genome-wide methylation profiles reveal quantitative views of human aging rates.
      ,
      • Horvath S.
      DNA methylation age of human tissues and cell types.
      ,
      • Johansson A.
      • Enroth S.
      • Gyllensten U.
      Continuous aging of the human DNA methylome throughout the human lifespan.
      ,
      • Reynolds C.A.
      • Tan Q.
      • Munoz E.
      • Jylhävä J.
      • Hjelmborg J.
      • Christiansen L.
      • et al.
      A decade of epigenetic change in aging twins: genetic and environmental contributions to longitudinal DNA methylation.
      ,
      • Wang Y.
      • Karlsson R.
      • Lampa E.
      • Zhang Q.
      • Hedman Å.K.
      • Almgren M.
      Epigenetic influences on aging: a longitudinal genome-wide methylation study in old Swedish twins.
      ]. Based on the DNA methylation values of age correlated CpG sites, multiple forensic age prediction models have been developed to date, reviewed in [
      • Freire-Aradas A.
      • Phillips C.
      • Lareu M.
      Forensic individual age estimation with DNA: from initial approaches to methylation tests.
      ]. Since DNA methylation is tissue-specific [
      • Moore L.D.
      • Le T.
      • Fan G.
      DNA methylation and its basic function.
      ], most of these epigenetic clocks have been based on specific forensic tissues, including blood [
      • Weidner C.I.
      • Lin Q.
      • Koch C.M.
      • Eisele L.
      • Beier F.
      • Ziegler P.
      • et al.
      Aging of blood can be tracked by DNA methylation changes at just three CpG sites.
      ,
      • Zbieć-Piekarska R.
      • Sólnicka M.
      • Kupiec T.
      • Parys-proszek A.
      • Makowska Z.
      • Paleczka A.
      • et al.
      Development of a forensically useful age prediction method based on DNA methylation analysis.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Aliferi A.
      • Ballard D.
      • Gallidabino M.D.
      • Thurtle H.
      • Barron L.
      • Syndercombe-Court D.
      DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models.
      ], buccal swabs [
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ,
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ,
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ], saliva [
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ,
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ] and semen [
      • Lee W.J.
      • Choung C.M.
      • Jung Y.J.
      • Lee H.Y.
      • Lim S.-K.
      A validation study of DNA methylation-based age prediction using semen in forensic casework samples.
      ,
      • Jenkins T.G.
      • Aston K.I.
      • Cairns B.
      • Smith A.
      • Carrell D.T.
      Paternal germ line aging: DNA methylation age prediction from human sperm.
      ]. More recently, skeletal remains, e.g., bones and teeth have been studied [
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ,
      • Lee H.Y.
      • Hong S.R.
      • Lee J.E.
      • Hwang I.K.
      • Kim N.Y.
      • Lee J.M.
      • et al.
      Epigenetic age signatures in bones.
      ].
      Whole blood is not uniformly composed of identical cell types, but consists of distinct cell populations in varying proportions. As distinct methylation profiles have been identified for peripheral blood mononuclear cells and granulocytes [
      • Reinius L.E.
      • Acevedo N.
      • Joerink M.
      • Pershagen G.
      • Dahlén S.-E.
      • Greco D.
      • et al.
      Differential DNA methylation in purified human blood cells: Implications for cell lineage and studies on disease susceptibility.
      ], cell heterogeneity could act as a confounder. However, studies have observed that DNA methylation for age correlated CpG sites does not vary significantly across sorted blood cells from healthy subjects [
      • Horvath S.
      DNA methylation age of human tissues and cell types.
      ], and subsequently, most forensic age prediction models were based on whole blood treated as a homogeneous tissue.
      Another tissue source that lacks cellular homogeneity is the oral cavity, where saliva and buccal swabs contain different proportions of leucocytes and epithelial cells [
      • Theda C.
      • Hwang S.H.
      • Czajko A.
      • Loke Y.J.
      • Leong P.
      • Craig J.M.
      Quantitation of the cellular content of saliva and buccal swab samples.
      ]. This difference in cell content could potentially create differences in DNA methylation for specific CpG sites, and this phenomenon was previously observed for ELOVL2 and FHL2 [
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ], indicating that both sample types cannot be considered a single biological source beforehand.
      Nevertheless, assigning the specific biological source (saliva or buccal cells) to forensic oral cavity specimens such as cigarette butts is difficult to achieve by deconvolution, so the development of a single age prediction model covering both tissues represents a practical approach.
      A similar approach has already been proposed by Horvath et al. [
      • Horvath S.
      • Oshima J.
      • Martin G.M.
      • Lu A.T.
      • Quach A.
      • Felton S.
      • et al.
      Epigenetic clock for skin and blood cells applied to Hutchinson Gilford progeria syndrome and ex vivo studies.
      ], developing the “skin & blood clock”, an epigenetic clock based on 391 CpGs that covers samples originating from blood, skin, saliva, buccal cells, as well as from four additional somatic tissues. The age prediction model reported by Jung et al. is more focused on forensic specimens, and is based on 5 CpG sites applicable to either blood, saliva or buccal cells [
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ].
      In the present study, we focused on specimens from the oral cavity, aiming to develop a tissue prediction model that can differentiate saliva from buccal cells, as well as an age prediction model covering both tissues, since most forensic samples related to the oral cavity will comprise a mixture of saliva and buccal cells. Additionally, including the tissue-of-origin as a covariate did not improve age predictions. Selection of candidate tissue-specific and age-correlated CpG sites was based on the assessment of public data from Illumina HumanMethylation 450 K. Then, 184 volunteers (21–86 years old) were analyzed using SNaPshot™, after collection of both saliva and buccal swabs from the same individual (N = 368). A proportion of the analyzed samples were used to develop the training set (N = 184), while an additional part was used as a testing set for model validation purposes (N = 184). As a result, a tissue prediction model (saliva vs buccal cells) using logistic regression and based on 2 CpG sites was developed. In parallel, an age prediction model covering these tissues together and based on multivariate quantile regression analysis was developed for 7 CpG sites showing the highest correlation with age. Since SNaPshot™ needs a preliminary step of bisulfite conversion that degrades the DNA, requiring high levels of input DNA, we evaluated serial dilutions with this detection system to determine the limits of the assay.

      2. Material and methods

      2.1 Samples, DNA extraction and quantification

      A total of 368 samples, 184 total saliva and 184 buccal cells, were collected from 184 healthy Spanish volunteers from 21 to 86 years old. Based on this set of samples, for the saliva-specific and buccal swab-specific age prediction models, the whole set of 184 saliva and 184 buccal swabs, respectively, were directly used as training sets. For the tissue-combined age prediction model, a random selection was made to generate training and test sets balanced in terms of sample size, distribution of ages and represented tissues. Each group had 184 individuals with the full age range 21–86 years. The training set consisted of 91 saliva and 93 buccal cell samples, while the testing group had 93 saliva and 91 buccal cell samples.
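The balanced split described above can be illustrated with a minimal Python sketch. It assumes, as the sample sizes suggest but the text does not state outright, that each donor contributes one tissue to the training set and the other tissue to the testing set. The function name, seed, donor identifiers and ages are hypothetical, and age balance is only approximated by pairing donors of neighbouring age. This simple scheme yields a 92/92 tissue split rather than the reported 91/93.

```python
import random

def split_by_donor(donors, seed=0):
    """Assign each donor's two tissues to opposite sets (assumed design).

    donors: list of (donor_id, age) tuples.
    Returns (train, test), each a list of (donor_id, age, tissue).
    """
    rng = random.Random(seed)
    ordered = sorted(donors, key=lambda d: d[1])  # neighbours have similar ages
    train, test = [], []
    for i in range(0, len(ordered), 2):
        pair = ordered[i:i + 2]
        tissues = ["saliva", "buccal"]
        rng.shuffle(tissues)  # which donor of the pair gives saliva to training
        for (donor_id, age), tissue in zip(pair, tissues):
            other = "buccal" if tissue == "saliva" else "saliva"
            train.append((donor_id, age, tissue))
            test.append((donor_id, age, other))
    return train, test

# Hypothetical donors: 184 individuals with ages spread over 21-86 years.
donors = [(f"D{i:03d}", 21 + (i * 65) // 183) for i in range(184)]
train, test = split_by_donor(donors)
```

Pairing donors by age before the random draw keeps the age distributions of the two sets close without any further stratification step.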
      All samples were taken with written informed consent obtained from the donors. Ethical approval was obtained from the ethics committee of investigation in Galicia, Spain (CAEI: 2013/543). Buccal swabs were air-dried and stored at room temperature, and total saliva was collected in 15 mL Falcon tubes and frozen at −20 °C until DNA extraction. Genomic DNA was extracted from the whole swab and from 500 µL of total saliva with phenol/chloroform extraction [
      • Köchl S.
      • Niederstätter H.
      • Parson W.
      DNA extraction and quantitation of forensic samples using the phenol-chloroform method and real-time PCR.
      ]. All DNA samples were quantified by Qubit® dsDNA High Sensitivity (HS) or dsDNA Broad Range (BR) Assay kits (Thermo Fisher) following manufacturer’s guidelines.

      2.2 CpG site selection

      Selection of candidate CpG sites was based on both bibliographic searches as well as statistical assessment of NCBI GEO methylation studies using public data from the Illumina HumanMethylation450KBeadChip. Tissue-specific CpG site selection was based on the statistical assessment of the methylation β-values from GSE48472 [
      • Slieker R.C.
      • Bos S.D.
      • Goeman J.J.
      • Bovée J.V.M.G.
      • Talens R.P.
      • Breggen R.
      • Van Der
      • et al.
      Identification and systematic annotation of tissue-specific differentially methylated regions using the Illumina 450k array.
      ] (blood, saliva and buccal cells). To check for absence of correlation with age for the selected tissue-specific markers, GSE87571 [
      • Johansson A.
      • Enroth S.
      • Gyllensten U.
      Continuous aging of the human DNA methylome throughout the human lifespan.
      ], GSE92767 [
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ] and GSE50586 [
      • Jones M.J.
      • Farré P.
      • Mcewen L.M.
      • Macisaac J.L.
      • Watt K.
      • Neumann S.M.
      • et al.
      Distinct DNA methylation patterns of cognitive impairment and trisomy 21 in down syndrome.
      ] were used. Furthermore, the bibliographic review was focused on publications from 2011 to 2019, and searched for markers presenting a high correlation with age in different tissues: blood [
      • Weidner C.I.
      • Lin Q.
      • Koch C.M.
      • Eisele L.
      • Beier F.
      • Ziegler P.
      • et al.
      Aging of blood can be tracked by DNA methylation changes at just three CpG sites.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ,
      • Huang Y.
      • Yan J.
      • Hou J.
      • Fu X.
      • Li L.
      • Hou Y.
      Developing a DNA methylation assay for human age prediction in blood and bloodstain.
      ], saliva [
      • Bocklandt S.
      • Lin W.
      • Sehl M.E.
      • Sa F.J.
      • Sinsheimer J.S.
      • Horvath S.
      • et al.
      Epigenetic predictor of age.
      ,
      • Marqueta-Gracia J.J.
      • Álvarez-Álvarez M.
      • Baeta M.
      • Palencia-Madrid L.
      • Prieto-Fernández E.
      • Ordoñana R.J.
      • et al.
      Genetics differentially methylated CpG regions analyzed by PCR-high resolution melting for monozygotic twin pair discrimination.
      ], and buccal cells [
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ,
      • Hamano Y.
      • Manabe S.
      • Morimoto C.
      • Fujimoto S.
      • Tamaki K.
      Forensic age prediction for saliva samples using methylation-sensitive high resolution melting: exploratory application for cigarette butts.
      ]. Additionally, methylation β-values from GSE92767 [
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ] were statistically assessed to seek to identify additional age-correlated CpG sites.

      2.3 Primer design

      The flanking regions of the selected CpGs were screened using the UCSC genome browser (https://genome.ucsc.edu/) for the current human genome assembly (GRCh38/hg38), covering 150 bp upstream and downstream of the target CpG. The PCR primers and Single Base Extension (SBE) primers were designed using BatchPrimer3 v1.0 [
      • You F.M.
      • Huo N.
      • Gu Y.Q.
      • Luo M.
      • Ma Y.
      • Hane D.
      • et al.
      BatchPrimer3: a high throughput web application for PCR and sequencing primer design.
      ] applying the following parameters for PCR primers: optimal melting temperature 58 °C, optimal primer length 20 bp and optimal amplicon length 90 bp; and for the SBE primer design: optimal melting temperature 50 °C and optimal probe length 20 bp. Poly-CT tails were added to the SBE primers for size separation.

      2.4 Bisulfite conversion, PCR conditions and purification of PCR products

      Bisulfite conversion of 100 ng of extracted genomic DNA was carried out with the MethylEdge™ Bisulfite Conversion System (Promega) following manufacturer’s guidelines, obtaining an elution volume of 20 µL. Multiplex PCR amplification was carried out in a 10.7 µL reaction volume containing 1.5 µL of converted DNA, 0.3 µL of 250 U AmpliTaq Gold™ DNA Polymerase, 1.5 µL of 10X Buffer II, 3.9 µL of 25 mM MgCl2 (all from Applied Biosystems, AB), 1.5 µL of 32 ng/µL bovine serum albumin, 1 µL of 10 mM GeneAmp® dNTP Mix with dTTP (AB) and 1 µL of primer mix (0.083–5 µM of each primer, Metabion International). PCR cycling used a GeneAmp® PCR system 2720 (AB) with cycling conditions: 95 °C for 11 min; 34 cycles of 94 °C for 20 s, 56 °C for 60 s and 72 °C for 30 s; and a final extension of 72 °C for 7 min.
      After checking amplification yields in 1 % agarose gels, 2.5 µL of PCR product was purified by adding 1 µL of ExoSAP-IT™ PCR Product Cleanup Reagent (AB) and incubating at 37 °C for 45 min and 80 °C for 15 min.
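As a quick arithmetic check on the reaction set-up, the short Python sketch below sums the listed component volumes and estimates an upper bound for the template carried into each PCR. The upper bound assumes complete DNA recovery after bisulfite conversion, which is an idealisation not claimed in the text; the function and variable names are illustrative only.

```python
from decimal import Decimal

# Per-reaction volumes in µL, as listed in the text.
PCR_MIX_UL = {
    "AmpliTaq Gold DNA Polymerase": Decimal("0.3"),
    "10X Buffer II": Decimal("1.5"),
    "25 mM MgCl2": Decimal("3.9"),
    "bovine serum albumin": Decimal("1.5"),
    "10 mM dNTP mix": Decimal("1.0"),
    "primer mix": Decimal("1.0"),
    "bisulfite-converted DNA": Decimal("1.5"),
}

# Decimal arithmetic avoids binary floating-point rounding in the sum.
total_ul = sum(PCR_MIX_UL.values())

def max_template_ng(input_ng, elution_ul=Decimal("20"), template_ul=Decimal("1.5")):
    """Upper bound on template per PCR, ASSUMING 100 % recovery after conversion."""
    return Decimal(input_ng) / elution_ul * template_ul

print(total_ul)              # total reaction volume in µL
print(max_template_ng(100))  # ng of converted DNA per reaction at 100 ng input
print(max_template_ng(10))   # ng of converted DNA per reaction at 10 ng input
```

Under this idealised assumption, 100 ng of input DNA corresponds to at most 7.5 ng of template per reaction, and 10 ng of input to at most 0.75 ng.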

      2.5 Single base extension and capillary electrophoresis

      Multiplex SBE reactions were performed in a total volume of 6 µL using 2 µL of purified PCR product, 2.5 µL of SNaPshot™ kit (AB) and 1.5 µL of SBE primers (0.51–6 µM of each primer, Metabion International) with cycling conditions: 30 cycles of 96 °C for 10 s, 55 °C for 5 s, and 60 °C for 30 s.
      After the SNaPshot reaction, extension products were purified by adding 1 µL of Shrimp Alkaline Phosphatase Recombinant (AB) to the total SNaPshot reaction and incubating at 37 °C for 80 min with inactivation at 85 °C for 15 min.
      Capillary electrophoresis was performed with an ABI3130xl Genetic Analyzer (AB) using 0.1 µL of GeneScan™ 120 LIZ™ dye Size Standard (Thermo Fisher) and 10 µL of HiDi™ Formamide (AB) per sample, adding 9.5 µL of load mix and 1.5 µL of purified SNaPshot product. Results were analyzed with GeneMapperID v3.2 (AB) and the DNA methylation level at each CpG was calculated by dividing the height of the methylated peak by the sum of the heights of the methylated and unmethylated peaks. The latter values were multiplied by a correction factor of 2 when working with reverse primers and 1.6 when working with forward primers, to overcome differences in fluorochrome signal intensities.
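The peak-height calculation can be sketched in Python as follows. The text does not make fully explicit which peak receives the correction factor; this sketch assumes it is the unmethylated peak and exposes that choice as a parameter so the alternative reading can be tested. All names and the example peak heights are hypothetical.

```python
CORRECTION_FACTORS = {"reverse": 2.0, "forward": 1.6}  # values from the text

def methylation_level(meth_height, unmeth_height, primer_orientation,
                      corrected_peak="unmethylated"):
    """Methylation level = methylated / (methylated + unmethylated).

    ASSUMPTION: the dye correction factor is applied to the unmethylated
    peak height; pass corrected_peak="methylated" for the other reading.
    """
    factor = CORRECTION_FACTORS[primer_orientation]
    if corrected_peak == "unmethylated":
        unmeth_height = unmeth_height * factor
    elif corrected_peak == "methylated":
        meth_height = meth_height * factor
    else:
        raise ValueError("corrected_peak must be 'methylated' or 'unmethylated'")
    total = meth_height + unmeth_height
    if total == 0:
        raise ValueError("no signal at this CpG")
    return meth_height / total

def mean_of_duplicates(level_a, level_b):
    """Average of the two replicate runs, as used for the statistical analyses."""
    return (level_a + level_b) / 2

# Hypothetical peak heights (relative fluorescence units) for one CpG.
rep1 = methylation_level(1000, 500, "reverse")
rep2 = methylation_level(800, 500, "forward")
level = mean_of_duplicates(rep1, rep2)
```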

      2.6 Statistical analyses

      All samples were run in duplicate. The average of the DNA methylation levels in both replicates was used for the statistical analyses. Correlations between age and DNA methylation levels were evaluated using the Spearman Correlation test (rs). To analyze the reproducibility of the dilutions and the inter-individual variability, the standard deviation (SD) was used (threshold SD > 0.1). Normality was assessed using the Shapiro-Wilk test applied to the residuals of the independent linear regression models tested for each CpG (p-value < 0.05). Logistic regression was used to develop the tissue prediction model using the pROC R package [
      • Robin X.
      • Turck N.
      • Hainard A.
      • Tiberti N.
      • Lisacek F.
      • Sanchez J.
      • et al.
      pROC: an open-source package for R and S+ to analyze and compare ROC curves.
      ]. A multivariate quantile regression model was used to build the age prediction model using the quantreg R package [
      Koenker R., Portnoy S., Ng P., Zeileis A., Grosjean P., Ripley B. Package quantreg: Quantile Regression. 2015.
      ]. Cross-validation of the prediction models was performed with a k-fold cross-validation (k = 10) using the cvTools R package [
      Alfons A. Package cvTools: Cross-validation tools for regression models. 2015.
      ]. The corresponding predictive accuracy was measured with the following performance metrics: sensitivity, specificity, area under the curve (AUC) and percentage of correct classifications for tissue prediction; and the median absolute error (MAE), the mean absolute error (MAEmean), the root-mean-square error (RMSE) and percentage of correct classifications within the prediction intervals (%CP±PI) for age prediction. The representation of predicted versus chronological age was made using the ggplot2 R package [
      Wickham H., Chang W. Package ggplot2: An implementation of the grammar of graphics. 2015.
      ]. All statistical analyses were carried out using R software v.4.0.3 [
      • R Core Team.
      R: A Language and Environment for Statistical Computing.
      ] with scripts developed in-house. The sensitivity analysis was carried out using input DNA quantities for bisulfite conversion of 100 ng, 75 ng, 50 ng, 25 ng, 10 ng and 1 ng.
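The performance metrics named above can be illustrated with a small Python sketch (the study itself used R). It assumes that %CP±PI counts the samples whose chronological age falls inside the prediction interval returned for that sample, and that saliva is treated as the positive class for sensitivity. The function names and toy values are hypothetical and are not study data.

```python
import math
import statistics

def age_prediction_metrics(chronological, predicted, lower, upper):
    """Return MAE (median), MAEmean, RMSE and %CP±PI for paired age lists."""
    errors = [abs(p - c) for c, p in zip(chronological, predicted)]
    # ASSUMPTION: a prediction is "correct" when the true age lies in [lower, upper].
    within = sum(lo <= c <= hi for c, lo, hi in zip(chronological, lower, upper))
    return {
        "MAE": statistics.median(errors),
        "MAEmean": statistics.fmean(errors),
        "RMSE": math.sqrt(statistics.fmean([e * e for e in errors])),
        "%CP±PI": 100.0 * within / len(errors),
    }

def classification_metrics(true_labels, predicted_labels, positive="saliva"):
    """Sensitivity, specificity and percentage of correct classifications."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "%correct": 100.0 * (tp + tn) / len(pairs),
    }

# Toy values for illustration only.
ages = [30, 45, 60, 75]
preds = [33, 44, 66, 73]
age_metrics = age_prediction_metrics(
    ages, preds, [p - 4 for p in preds], [p + 4 for p in preds]
)
tissue_metrics = classification_metrics(
    ["saliva", "saliva", "buccal", "buccal"],
    ["saliva", "buccal", "buccal", "buccal"],
)
```

Reporting the median alongside the mean absolute error is useful because the median is less affected by a few large errors, which typically occur at older ages.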

      3. Results

      3.1 Selection of candidate CpGs

      The selection of candidate CpGs was divided into tissue-specific CpGs and age-correlated CpGs.
      For selection of tissue-specific CpGs, the GSE48472 dataset was assessed [
      • Slieker R.C.
      • Bos S.D.
      • Goeman J.J.
      • Bovée J.V.M.G.
      • Talens R.P.
      • Breggen R.
      • Van Der
      • et al.
      Identification and systematic annotation of tissue-specific differentially methylated regions using the Illumina 450k array.
      ]. From this dataset, samples from saliva (N = 5), buccal cells (N = 5) and blood (N = 5) were selected and differences in the corresponding DNA methylation values calculated. A total of 17 CpG sites with the highest differences in DNA methylation levels were found (Table 1): 5 CpGs presenting the highest differences between blood and buccal cells (>|0.72|); 6 CpGs between blood and saliva (>|0.45|) and 6 CpGs between saliva and buccal cells (≥|0.5|).
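      Table 1. Summary of the 17 selected tissue-specific CpG sites based on the statistical assessment of GSE48472. Correlations with age above the threshold (rs >|0.5|) are marked with an asterisk (*).

      | Tissue comparison | Gene | CpG_ID | GRCh38 chromosome position | Difference between pairs of tissues | Correlation with age (rs blood) | Correlation with age (rs buccal cells) | Correlation with age (rs saliva) |
      |---|---|---|---|---|---|---|---|
      | Blood-Buccal cells | RUNX1 | cg04915566 | chr21:35049175 | 0.723 | 0.023 | 0.006 | -0.309 |
      | Blood-Buccal cells | MAML2 | cg08141395 | chr11:96254218 | 0.748 | 0.006 | 0.043 | -0.306 |
      | Blood-Buccal cells | RGS1 | cg10861751 | chr1:192575586 | 0.733 | -0.024 | 0.055 | -0.266 |
      | Blood-Buccal cells | EXD3 | cg13408086 | chr9:137326945 | 0.724 | 0.609* | -0.337 | -0.481 |
      | Blood-Buccal cells | NCKAP1L | cg16509569 | chr12:54497850 | 0.721 | -0.019 | -0.190 | -0.318 |
      | Blood-Saliva | CDC25B | cg02737268 | chr20:3799535 | 0.483 | 0.287 | -0.079 | 0.439 |
      | Blood-Saliva | DOT1L | cg04173586 | chr19:2167497 | 0.459 | 0.001 | 0.129 | 0.408 |
      | Blood-Saliva | RIN3 | cg15443535 | chr14:92687972 | 0.476 | -0.152 | 0.411 | 0.399 |
      | Blood-Saliva | none | cg16149628 | chr11:1771344 | 0.471 | 0.023 | 0.166 | 0.270 |
      | Blood-Saliva | RIN2 | cg16606773 | chr20:19975162 | 0.459 | -0.179 | -0.153 | -0.049 |
      | Blood-Saliva | WDFY1 | cg23363263 | chr2:223887272 | 0.452 | -0.102 | -0.043 | 0.218 |
      | Saliva-Buccal cells | none | cg01680010 | chr7:97017805 | 0.500 | 0.148 | -0.607* | -0.079 |
      | Saliva-Buccal cells | none | cg02939659 | chr14:101587733 | 0.500 | -0.048 | 0.472 | 0.389 |
      | Saliva-Buccal cells | HUNK | cg03044684 | chr21:31875719 | 0.503 | -0.065 | 0.055 | 0.247 |
      | Saliva-Buccal cells | PAX9 | cg07459252 | chr14:36661007 | 0.502 | 0.334 | -0.104 | -0.106 |
      | Saliva-Buccal cells | none | cg08466792 | chr5:3603113 | 0.512 | 0.575* | -0.362 | -0.378 |
      | Saliva-Buccal cells | SIM2 | cg25446076 | chr21:36710849 | 0.523 | 0.379 | -0.349 | -0.332 |
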
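The ranking behind Table 1 can be sketched in Python. The text states only that differences in DNA methylation values between tissues were calculated; this sketch assumes the difference is taken between per-tissue mean β-values. The function name, CpG identifiers and β-values are hypothetical and do not come from GSE48472.

```python
from itertools import combinations
from statistics import fmean

def pairwise_mean_differences(betas_by_tissue, min_abs_diff=0.0):
    """Rank CpGs by the difference in mean beta-value for each tissue pair.

    betas_by_tissue: {tissue: {cpg_id: [beta values across samples]}}
    ASSUMPTION: per-tissue means are compared; the source does not say which
    summary statistic was used.
    """
    means = {
        tissue: {cpg: fmean(values) for cpg, values in cpgs.items()}
        for tissue, cpgs in betas_by_tissue.items()
    }
    ranked = {}
    for a, b in combinations(sorted(means), 2):
        shared = means[a].keys() & means[b].keys()
        diffs = [(cpg, means[a][cpg] - means[b][cpg]) for cpg in shared]
        diffs = [d for d in diffs if abs(d[1]) >= min_abs_diff]
        ranked[(a, b)] = sorted(diffs, key=lambda d: abs(d[1]), reverse=True)
    return ranked

# Hypothetical beta-values for two made-up CpGs in three tissues.
betas = {
    "blood": {"cg_hypo1": [0.90, 0.92, 0.88], "cg_hypo2": [0.50, 0.55, 0.45]},
    "buccal": {"cg_hypo1": [0.15, 0.20, 0.25], "cg_hypo2": [0.48, 0.52, 0.50]},
    "saliva": {"cg_hypo1": [0.60, 0.55, 0.65], "cg_hypo2": [0.10, 0.05, 0.15]},
}
diffs = pairwise_mean_differences(betas)
```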
      Once the markers had been selected, absence of correlation with age was evaluated using the following datasets: GSE92767 (saliva) [
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ], GSE50586 (buccal cells) [
      • Jones M.J.
      • Farré P.
      • Mcewen L.M.
      • Macisaac J.L.
      • Watt K.
      • Neumann S.M.
      • et al.
      Distinct DNA methylation patterns of cognitive impairment and trisomy 21 in down syndrome.
      ] and GSE87571 (blood) [
      • Johansson A.
      • Enroth S.
      • Gyllensten U.
      Continuous aging of the human DNA methylome throughout the human lifespan.
      ]. From the 17 selected tissue-specific CpGs, three displayed correlations with age (rs >|0.5|): cg01680010 (rs = −0.607) in buccal cells, and cg13408086 (rs = 0.609) and cg08466792 (rs = 0.575) in blood; these were discarded. Based on these results, one CpG site per tissue combination was selected. This selection was initially based on the largest difference in DNA methylation values observed between pairs of tissues. However, several failures in PCR primer design led to a final selection of cg04915566 (RUNX1) for blood-buccal cells, cg16606773 (RIN2) for blood-saliva and cg03044684 (HUNK) for saliva-buccal cells.
      Selection of age-correlated CpGs was based on the assessment of DNA methylation values from GSE92767 (saliva samples, N = 54, 18–73 years old) [
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ]. To choose the method for marker selection, normality was evaluated for the GSE92767 data: for 15 % of the independent linear regression models (one per CpG), the residuals showed a lack of normality (p-value < 0.05), so the Spearman test was used. For this analysis, CpG sites presenting an absolute Spearman correlation coefficient equal to or greater than 0.8 were selected, providing 49 CpG sites correlated with age (Supplementary Table S1). From this preliminary set of sites, those CpGs with a minimum difference of 0.3 between the highest and lowest methylation values were selected, to give ten candidate CpGs (Table 2) for age prediction in saliva.
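      Table 2. Summary of the ten selected CpG sites correlated with age in saliva, based on the statistical assessment of GSE92767, as well as the 3 selected age correlated CpG sites in somatic tissues based on bibliographic review.

      | Source | Gene | CpG_ID | GRCh38 chromosome position | Correlation with age (rs) | Methylation differences at extreme ages |
      |---|---|---|---|---|---|
      | GSE92767 assessment | OTUD7A | cg04875128 | chr15:31483692 | 0.860 | 0.312 |
      | GSE92767 assessment | FHL2 | cg06639320 | chr2:105399282 | 0.824 | 0.322 |
      | GSE92767 assessment | TRIM59 | cg07553761 | chr3:160450189 | 0.803 | 0.305 |
      | GSE92767 assessment | RHBDL2 | cg10500653 | chr1:38941979 | 0.814 | 0.334 |
      | GSE92767 assessment | none | cg10501210 | chr1:207823675 | -0.864 | 0.674 |
      | GSE92767 assessment | none | cg10804656 | chr10:22334531 | 0.828 | 0.317 |
      | GSE92767 assessment | LHFPL4 | cg11084334 | chr3:9552580 | 0.846 | 0.345 |
      | GSE92767 assessment | none | cg13327545 | chr10:22334619 | 0.818 | 0.302 |
      | GSE92767 assessment | ELOVL2 | cg16867657 | chr6:11044644 | 0.898 | 0.385 |
      | GSE92767 assessment | HOXC4 | cg18473521 | chr12:54054481 | 0.824 | 0.389 |
      | Bibliographic review | PDE4C | none | chr19:18233131 | na | na |
      | Bibliographic review | EDARADD | cg09809672 | chr1:236394382 | na | na |
      | Bibliographic review | ASPA | cg02228185 | chr17:3476273 | na | na |
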
      As the statistical analysis for selection of age-correlated CpG sites was based on saliva samples, but the study also covered buccal cells, a bibliographic search was carried out to find genes that show correlation with age in additional somatic tissues. In the reviewed publications, certain markers were repeatedly found to correlate with age in the tissues of interest (saliva, buccal cells and blood): PDE4C [
      • Weidner C.I.
      • Lin Q.
      • Koch C.M.
      • Eisele L.
      • Beier F.
      • Ziegler P.
      • et al.
      Aging of blood can be tracked by DNA methylation changes at just three CpG sites.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ,
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ,
      • Marqueta-Gracia J.J.
      • Álvarez-Álvarez M.
      • Baeta M.
      • Palencia-Madrid L.
      • Prieto-Fernández E.
      • Ordoñana R.J.
      • et al.
      Genetics differentially methylated CpG regions analyzed by PCR-high resolution melting for monozygotic twin pair discrimination.
      ], EDARADD [
      • Bocklandt S.
      • Lin W.
      • Sehl M.E.
      • Sa F.J.
      • Sinsheimer J.S.
      • Horvath S.
      • et al.
      Epigenetic predictor of age.
      ,
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ,
      • Hamano Y.
      • Manabe S.
      • Morimoto C.
      • Fujimoto S.
      • Tamaki K.
      Forensic age prediction for saliva samples using methylation-sensitive high resolution melting: exploratory application for cigarette butts.
      ] and ASPA [
      • Weidner C.I.
      • Lin Q.
      • Koch C.M.
      • Eisele L.
      • Beier F.
      • Ziegler P.
      • et al.
      Aging of blood can be tracked by DNA methylation changes at just three CpG sites.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ,
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ,
      • Huang Y.
      • Yan J.
      • Hou J.
      • Fu X.
      • Li L.
      • Hou Y.
      Developing a DNA methylation assay for human age prediction in blood and bloodstain.
      ,
      • Marqueta-Gracia J.J.
      • Álvarez-Álvarez M.
      • Baeta M.
      • Palencia-Madrid L.
      • Prieto-Fernández E.
      • Ordoñana R.J.
      • et al.
      Genetics differentially methylated CpG regions analyzed by PCR-high resolution melting for monozygotic twin pair discrimination.
      ], and therefore these genes were the focus of further evaluation in our study (included in Table 2).

      3.2 Development of an optimized multiplex

      From the above analyses, 16 markers were selected: 3 tissue-specific CpG sites (RUNX1, RIN2 and HUNK) and 13 age-correlated CpG sites (OTUD7A, FHL2, TRIM59, RHBDL2, cg10501210, cg10804656, LHFPL4, cg13327545, ELOVL2, HOXC4, PDE4C, EDARADD and ASPA). PCR and SBE primers were successfully designed for the selected tissue-specific markers, and 11 age-correlated CpGs, with RHBDL2 and cg10804656 discarded from subsequent analyses. A summary of PCR and SBE primer information is outlined in Supplementary Table S2.
      First, each marker was analyzed in singleplex to check individual amplification performance. Once this initial step was accomplished, a multiplex covering all 14 CpGs was optimized. Markers TRIM59 and cg13327545 were not amplified in multiplex due to non-specific hybridizations, leading to a final optimized multiplex of the tissue-specific markers RUNX1, RIN2 and HUNK plus the age-correlated CpG sites OTUD7A, FHL2, cg10501210, LHFPL4, ELOVL2, HOXC4, PDE4C, EDARADD and ASPA. An example SNaPshot electropherogram of the optimized multiplex is shown in Fig. 1.
      Fig. 1. Example electropherogram of the optimized SNaPshot™ multiplex assay containing three tissue-specific and nine age-correlated CpG sites using 100 ng of genomic DNA.
      To check tissue-specificity and age correlation of the optimized marker set, a preliminary analysis using both saliva and buccal cells from two individuals at the age extremes (23 and 86 years old) was completed (Supplementary Table S3). Tissue specificity was not detected for RIN2, since the same unmethylated pattern was observed for both saliva and buccal cells. To check that the absence of methylation was not a technical problem, and since RIN2 was selected for detecting differences between blood and saliva, blood samples from the same individuals were tested, giving DNA methylation levels of 0.39 and 0.4, respectively.
      In the case of RUNX1, some dispersion was detected in the patterns displayed by age or tissue, preventing objective interpretation with this marker. However, HUNK had differences in average DNA methylation levels between both tissues (0.31 and 0.17 for saliva and buccal cells, respectively).
      For age-correlation, six of the nine CpG sites (cg10501210, LHFPL4, ELOVL2, PDE4C, ASPA and EDARADD) gave average DNA methylation difference between extreme ages equal to or higher than 0.19. Differences were displayed by HOXC4, OTUD7A and FHL2 at lower levels (0.1, 0.05 and 0.12, respectively).

      3.3 A statistical tissue prediction model

      The training set comprising 91 saliva samples and 93 buccal swabs was analyzed with the optimized multiplex to develop a tissue prediction model for saliva and buccal cells. The corresponding dispersion diagrams for the HUNK and RUNX1 markers are shown in Fig. 2. Dispersion correlated with the tissue-of-origin is observed, with higher methylation levels for HUNK in saliva samples, and for RUNX1 in buccal cells.
      Fig. 2. Dispersion diagrams (DNA methylation values versus chronological age) for HUNK and RUNX1 (tissue-specific CpG sites) for 184 individuals from 21 to 86 years old (N = 91 saliva and N = 93 buccal swabs).
      To predict the tissue of origin, logistic regression was applied to explore three different models: model 1 (HUNK plus RUNX1), model 2 (HUNK) and model 3 (RUNX1). The corresponding performance metrics are described in Table 3. Comparable AUC values of 0.95, 0.95 and 0.92 were obtained for models 1, 2 and 3, respectively. Similar percentages of correct classifications were also recorded, with model 1 having the highest value at 88.6 %. However, some differences were found in sensitivity and specificity values, considering buccal cell samples as the control (i.e., a high specificity indicates good classification of buccal cell samples and a high sensitivity indicates good classification of saliva samples). Model 1 gave a higher sensitivity (0.96) than specificity (0.82), so under model 1 saliva samples are classified better than buccal swab samples. In contrast, model 2 gave a sensitivity of 0.78 and a specificity of 0.96, and model 3 gave a sensitivity of 0.81 and a specificity of 0.9. Therefore, the single-marker models classify saliva samples less efficiently than buccal swab samples.
      Table 3. Summary of the predictive performance metrics for the three logistic models tested on the training set (N = 91 saliva and N = 93 buccal swabs, 21–86 years old). AUC: Area under the curve.
      | Model | CpG_ID | Gene | AUC | Sensitivity | Specificity | Correct classifications |
      |---|---|---|---|---|---|---|
      | Model 1 | cg03044684 & cg04915566 | HUNK & RUNX1 | 0.95 | 0.96 | 0.82 | 88.59 % |
      | Model 2 | cg03044684 | HUNK | 0.95 | 0.78 | 0.96 | 86.87 % |
      | Model 3 | cg04915566 | RUNX1 | 0.92 | 0.81 | 0.90 | 85.87 % |
      Considering the highest rate of correct classifications obtained (88.59 %), model 1 was selected for validation with a testing set of 184 samples (N = 93 saliva and N = 91 buccal cells). A correct tissue-of-origin prediction rate of 83.7 % for test set samples was obtained.
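The relationship between sensitivity, specificity and the correct classification rate can be illustrated with a short Python sketch. It assumes, as stated above, that buccal cells are the control class (so saliva is the positive class). The counts of 87 correctly classified saliva samples and 76 correctly classified buccal swabs are not reported in the text; they are a back-calculation used here only because they reproduce the rounded model 1 values in Table 3.

```python
def classification_metrics(true_labels, predicted_labels, positive="saliva"):
    """Sensitivity, specificity and overall correct rate for a two-class problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return {
        "sensitivity": tp / (tp + fn),   # share of saliva samples called saliva
        "specificity": tn / (tn + fp),   # share of buccal samples called buccal
        "correct_rate": (tp + tn) / (tp + fn + tn + fp),
    }


# Training set sizes from the text: 91 saliva and 93 buccal swabs.
# The split into correct and incorrect calls below is an assumed illustration.
true_labels = ["saliva"] * 91 + ["buccal"] * 93
predicted_labels = (
    ["saliva"] * 87 + ["buccal"] * 4      # saliva samples: 87 correct, 4 missed
    + ["buccal"] * 76 + ["saliva"] * 17   # buccal samples: 76 correct, 17 missed
)
metrics = classification_metrics(true_labels, predicted_labels)
```

With these assumed counts the three values round to 0.96, 0.82 and 88.59 %, which shows how a model can reach the highest overall rate while still misclassifying one tissue more often than the other.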

      3.4 A statistical age prediction model for saliva and buccal swab samples

      Tissue-specific as well as tissue-combined models were explored for age prediction. For the saliva-specific and buccal swab-specific age prediction models, 184 saliva and 184 buccal swab samples were used as training sets, respectively. The training set of 184 volunteers (N = 91 saliva and N = 93 buccal swabs) was used to develop the combined age prediction model for saliva and buccal cell samples. Dispersion plots in Fig. 3 show the patterns obtained for the cg10501210, LHFPL4, ELOVL2, PDE4C, HOXC4, OTUD7A, FHL2, ASPA and EDARADD markers. Six markers showed hypermethylation with increased age (LHFPL4, ELOVL2, PDE4C, HOXC4, OTUD7A and FHL2), while cg10501210, ASPA and EDARADD had decreasing methylation levels with increasing age. Considering both tissues combined (saliva and buccal swabs), the highest correlation with age was found in PDE4C (rs = 0.806) and LHFPL4 (rs = 0.805), followed by ELOVL2 (rs = 0.659), OTUD7A (rs = 0.642), EDARADD (rs = −0.572) and HOXC4 (rs = 0.569). However, low levels of correlation were detected in cg10501210, FHL2 and ASPA (rs = −0.313, 0.198 and −0.332, respectively). At the same time, these three markers showed the highest levels of dispersion between saliva and buccal cells (SD > 0.1). Taking each tissue independently, correlations followed a similar trend (Supplementary Fig. S1-S2). Whereas the highest age correlation was displayed by LHFPL4 and PDE4C (rs = 0.815 and 0.832 in saliva and buccal swabs, respectively), the lowest levels of correlation were observed in cg10501210 (rs = −0.429, −0.422), FHL2 (rs = 0.392, 0.231) and ASPA (rs = −0.521, −0.44).
      Fig. 3. Dispersion diagrams (DNA methylation values versus chronological age) for cg10501210, LHFPL4, ELOVL2, PDE4C, HOXC4, OTUD7A, FHL2, ASPA and EDARADD (age-correlated CpG sites) for 184 individuals from 21–86 years old (N = 91 saliva and N = 93 buccal swabs).
      Taking into account these observations, multivariate quantile regression was tested on several age prediction models consisting of different combinations of CpG sites: model 1 (9 CpGs: cg10501210, LHFPL4, ELOVL2, PDE4C, HOXC4, OTUD7A, FHL2, ASPA and EDARADD), model 2 (8 CpGs with ASPA excluded), model 3 (8 CpGs with FHL2 excluded), model 4 (8 CpGs with cg10501210 excluded), model 5 (7 CpGs with cg10501210 and FHL2 excluded), model 6 (7 CpGs with cg10501210 and ASPA excluded), model 7 (7 CpGs with FHL2 and ASPA excluded) and model 8 (6 CpGs with cg10501210, FHL2 and ASPA excluded). To evaluate the accuracy of the models, a k-fold cross-validation was carried out. The k-fold procedure divides the total number of individuals into groups of similar size; in this case, 10 groups were created, each containing 10 % of the subjects. Each group was used in turn as the test set, with the remaining nine groups together forming the training set, so that every model was tested on each of the ten groups. The corresponding performance metrics for the training sets are described in Table 4.
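The 10-fold partition described above can be sketched as follows. This is a minimal illustration of the splitting step only; the shuffling seed and the function name are assumptions, and the fitting of the quantile regression on each training partition is not shown.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k folds of similar size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # assumed: random assignment
    folds = [indices[i::k] for i in range(k)]     # sizes differ by at most one
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# 184 individuals, as in the combined training set.
splits = list(k_fold_splits(184, k=10))
fold_sizes = [len(test) for _, test in splits]
```

Because 184 is not a multiple of 10, the folds contain 18 or 19 individuals each; every individual appears in exactly one test fold and in the training partition of the other nine.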
      Table 4. Summary of predictive performance metrics for the eight multivariate quantile regression models tested, based on three training sets: the saliva training set (N = 184 saliva, 21–86 years old), the buccal swab training set (N = 184 buccal swabs, 21–86 years old) and the combined training set (N = 91 saliva and N = 93 buccal swabs, 21–86 years old). All data represent the k-fold cross-validation. The selected model (combined, model 7), based on the best balance between error and correct classification, is marked with an asterisk. MAE: median absolute error, MAEmean: mean absolute error, RMSE: root-mean-square error and %CP±PI: percentage of correct classifications within the prediction intervals.
      | Tissue | Model | CpG number | MAE | MAEmean | RMSE | %CP±PI |
      |---|---|---|---|---|---|---|
      | Saliva | Model 1 | 9 CpGs | ±3.17 | ±4.79 | 6.46 | 76.55 % |
      | | Model 2 | 8 CpGs with ASPA excluded | ±2.98 | ±4.66 | 6.4 | 77.11 % |
      | | Model 3 | 8 CpGs with FHL2 excluded | ±3.29 | ±4.76 | 6.47 | 75.49 % |
      | | Model 4 | 8 CpGs with cg10501210 excluded | ±3.79 | ±5.04 | 6.76 | 75.59 % |
      | | Model 5 | 7 CpGs with cg10501210 and FHL2 excluded | ±3.85 | ±5.17 | 6.93 | 76.61 % |
      | | Model 6 | 7 CpGs with cg10501210 and ASPA excluded | ±3.96 | ±4.97 | 6.69 | 74.45 % |
      | | Model 7 | 7 CpGs with FHL2 and ASPA excluded | ±3.31 | ±4.69 | 6.37 | 78.74 % |
      | | Model 8 | 6 CpGs with cg10501210, FHL2 and ASPA excluded | ±4.02 | ±5.10 | 6.91 | 78.74 % |
      | Buccal swab | Model 1 | 9 CpGs | ±3.85 | ±5.01 | 6.35 | 75.47 % |
      | | Model 2 | 8 CpGs with ASPA excluded | ±4.41 | ±5.09 | 6.45 | 74.91 % |
      | | Model 3 | 8 CpGs with FHL2 excluded | ±4.13 | ±4.90 | 6.24 | 75.53 % |
      | | Model 4 | 8 CpGs with cg10501210 excluded | ±4.45 | ±5.27 | 6.66 | 76.64 % |
      | | Model 5 | 7 CpGs with cg10501210 and FHL2 excluded | ±4.89 | ±5.52 | 6.94 | 74.42 % |
      | | Model 6 | 7 CpGs with cg10501210 and ASPA excluded | ±4.22 | ±5.15 | 6.63 | 73.86 % |
      | | Model 7 | 7 CpGs with FHL2 and ASPA excluded | ±4.16 | ±4.99 | 6.36 | 75.47 % |
      | | Model 8 | 6 CpGs with cg10501210, FHL2 and ASPA excluded | ±4.72 | ±5.43 | 6.86 | 76.64 % |
      | Combined (saliva and buccal swabs) | Model 1 | 9 CpGs | ±3.31 | ±4.57 | 6.06 | 74.38 % |
      | | Model 2 | 8 CpGs with ASPA excluded | ±3.66 | ±4.75 | 6.20 | 74.99 % |
      | | Model 3 | 8 CpGs with FHL2 excluded | ±3.67 | ±4.78 | 6.32 | 74.36 % |
      | | Model 4 | 8 CpGs with cg10501210 excluded | ±3.78 | ±5.05 | 6.52 | 77.72 % |
      | | Model 5 | 7 CpGs with cg10501210 and FHL2 excluded | ±4.18 | ±5.43 | 6.96 | 77.28 % |
      | | Model 6 | 7 CpGs with cg10501210 and ASPA excluded | ±3.77 | ±4.93 | 6.39 | 80.96 % |
      | | Model 7 * | 7 CpGs with FHL2 and ASPA excluded | ±3.54 | ±4.79 | 6.23 | 76.08 % |
      | | Model 8 | 6 CpGs with cg10501210, FHL2 and ASPA excluded | ±5.23 | ±5.93 | 7.54 | 74.06 % |
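For readers less familiar with the four metrics reported in Table 4, the following Python sketch shows one way they can be computed from a set of predictions. The function name and the toy values are hypothetical, and the reading of %CP±PI as the share of samples whose chronological age falls between the lower (q10) and upper (q90) predictions is an assumption based on the table caption.

```python
from statistics import mean, median


def prediction_metrics(true_age, pred_age, lower, upper):
    """Error and interval-coverage metrics for a set of age predictions."""
    abs_err = [abs(p - t) for t, p in zip(true_age, pred_age)]
    inside = [lo <= t <= hi for t, lo, hi in zip(true_age, lower, upper)]
    return {
        "MAE": median(abs_err),                          # median absolute error
        "MAEmean": mean(abs_err),                        # mean absolute error
        "RMSE": mean([e * e for e in abs_err]) ** 0.5,   # root-mean-square error
        "CP_PI": 100 * sum(inside) / len(inside),        # % inside the interval
    }


# Hypothetical toy values in years (not study data).
toy_true = [30, 40, 50, 60]
toy_pred = [32, 39, 55, 60]
toy_lower = [25, 35, 51, 50]
toy_upper = [35, 45, 60, 70]
toy_metrics = prediction_metrics(toy_true, toy_pred, toy_lower, toy_upper)
```

Because the MAE here is a median, a few large errors raise MAEmean and RMSE much more than the MAE, which is one reason the three error columns in Table 4 can rank the models differently.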
      Inter-training set comparisons show that the correct classification rates are similar among the three training sets (average %CP±PI: 76.66 %, 75.37 % and 76.23 % for the saliva, buccal swab and combined models, respectively). However, larger differences were found when comparing prediction errors, especially between the buccal swab-specific and the combined model (average MAE: ± 4.35 and ± 3.89, respectively). Nevertheless, the saliva-specific model showed a better prediction error than the combined model (average MAE: ± 3.55). Based on these results, and because many forensic specimens, e.g., cigarette butts, will comprise a mixture of saliva and buccal cells in varying proportions, the age prediction model to be developed was chosen to cover both tissues simultaneously (combined model).
      Intra-training set comparisons of the combined model showed that the highest error and lowest correct classification rate were obtained with model 8 (MAE: ± 5.23, RMSE: 7.54 and %CP±PI: 74.06 %), which lacks the three CpG sites with the lowest levels of correlation with age and highest dispersion between saliva and buccal cells (cg10501210, FHL2 and ASPA). When these CpG sites were included (model 1), the error decreased (MAE: ± 3.31) but the correct classification rate improved only marginally (74.38 %). Among all models tested, the best balance between error and correct classification was obtained with model 7, which excludes FHL2 and ASPA (MAE: ± 3.54, RMSE: 6.23 and %CP±PI: 76.08 %). We therefore selected the age prediction model for saliva and buccal cells based on CpGs cg10501210, LHFPL4, ELOVL2, PDE4C, HOXC4, OTUD7A and EDARADD. Predicted versus chronological age is plotted for the final 7-CpG age prediction model in Fig. 4. The 0.5 quantile is represented by a black line, the 0.1 and 0.9 quantiles by dashed dark red lines, and the gray line represents perfect correlation. The Fig. 4 plot shows that the 0.5 quantile line deviates further from the line of perfect correlation at older ages, possibly due to the low number of samples available for this age range. The non-parallel prediction intervals also show the reduced precision in the highest age ranges.
      Fig. 4. Predicted versus chronological age for the final age prediction model for saliva and buccal cells for A) the training set composed of 184 individuals from 21–86 years old (N = 91 saliva and N = 93 buccal swabs) and for B) the testing set composed of 184 samples from 21–86 years old (N = 93 saliva and N = 91 buccal swabs). Predictions were performed under multivariate quantile regression using seven markers: cg10501210, LHFPL4, ELOVL2, PDE4C, HOXC4, OTUD7A and EDARADD. The black diagonal line represents the 0.5 quantile and the discontinuous dark red lines the corresponding 0.1 and 0.9 quantiles. The gray line represents perfect correlation. The data represent the k-fold cross-validation.
      In addition to cross-validation, a further validation step used a testing set of 184 samples (N = 93 saliva and N = 91 buccal swabs) ranging from 21–86 years old, which were analyzed using the final age prediction model, providing an MAE of ± 3.66 years and 71.2 % correct classifications. The final online age prediction model developed in our study has now been placed in the open‐access Snipper forensic classification website and is freely available at: http://mathgene.usc.es/cgi-bin/snps/age_tools/processmethylation-saliva-buccalswab.cgi. The underlying model equations for predicted age and prediction intervals are the following:
      Predicted age in years = 29.33 − (50.52 × cg10501210) + (9.23 × LHFPL4) + (36.46 × ELOVL2) + (74.32 × PDE4C) + (11.23 × HOXC4) + (84.74 × OTUD7A) − (15.03 × EDARADD)

      Minimum Prediction (MinPred – q10) = 29.36 − (42.87 × cg10501210) + (15.41 × LHFPL4) + (11.09 × ELOVL2) + (74.17 × PDE4C) + (32.51 × HOXC4) + (29.13 × OTUD7A) − (20.54 × EDARADD)

      Maximum Prediction (MaxPred – q90) = 11.3 − (43.57 × cg10501210) + (20.74 × LHFPL4) + (54.72 × ELOVL2) + (78.25 × PDE4C) − (7.06 × HOXC4) + (179.95 × OTUD7A) + (4.16 × EDARADD)
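As an illustration, the three equations above can be written as a small Python function. This is a sketch and not the code behind the Snipper tool: the function and variable names are assumptions, and methylation values are assumed to be expressed as proportions between 0 and 1, in line with the values quoted elsewhere in the text. The intercepts and coefficients are copied from the equations.

```python
# Marker order shared by the three coefficient vectors below.
MARKERS = ("cg10501210", "LHFPL4", "ELOVL2", "PDE4C", "HOXC4", "OTUD7A", "EDARADD")

# (intercept, per-marker coefficients) for each quantile, taken from the equations.
COEFFICIENTS = {
    "q50": (29.33, (-50.52, 9.23, 36.46, 74.32, 11.23, 84.74, -15.03)),
    "q10": (29.36, (-42.87, 15.41, 11.09, 74.17, 32.51, 29.13, -20.54)),
    "q90": (11.30, (-43.57, 20.74, 54.72, 78.25, -7.06, 179.95, 4.16)),
}


def predict_age(methylation):
    """Return the predicted age (q50) and the q10/q90 interval limits in years.

    `methylation` maps each marker name to a methylation level (assumed 0-1).
    """
    values = [methylation[m] for m in MARKERS]
    return {
        quantile: intercept + sum(w * v for w, v in zip(weights, values))
        for quantile, (intercept, weights) in COEFFICIENTS.items()
    }


# Hypothetical inputs used only to exercise the function.
all_zero = predict_age({m: 0.0 for m in MARKERS})
only_pde4c = predict_age({**{m: 0.0 for m in MARKERS}, "PDE4C": 0.5})
```

Because each quantile has its own coefficients rather than a fixed offset from the median, the width of the interval between q10 and q90 changes with the methylation profile, which is consistent with the non-parallel prediction intervals seen in Fig. 4.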


      Once the age prediction model was generated, the possibility of including the tissue as an additional variable was evaluated. To assess this, the prediction model was generated again, adding the tissue-of-origin of each sample in the training set as a co-variable. For this extended model, an MAE of ± 3.84 years, RMSE of 6.31 and %CP±PI of 78.22 % were obtained after cross-validation. Next, in order to evaluate the test set, the 2-CpG tissue prediction model was used to predict the tissue-of-origin of the test samples. Adding the inferred tissue to the test set produced an MAE of ± 3.78 years, RMSE of 6.6 and %CP±PI of 70.11 %. Comparing these results with those obtained using the model without tissue source prediction indicates that including the tissue as a co-variable does not improve the model.

      3.5 Forensic validation of the age prediction model

      To evaluate the predictive tests developed for the analysis of typical forensic samples with degradation and/or low-level DNA, the robustness and sensitivity of the final model were assessed.
      A chain of models was generated by deleting, in turn, one of the CpGs included in the final model, simulating the random loss of one of the markers. For each of the six-CpG models generated, performance was evaluated by cross-validation of the training set and on the test set. Results are outlined in Supplementary Table S4. This analysis identified those markers with the strongest contribution to the final age prediction model. The exclusion of cg10501210 increased the MAE to ± 5.23 years in the cross-validation of the training set, and exclusion of PDE4C increased the classification error to ± 4.03 years in the testing set. Excluding the four other markers did not greatly affect the errors obtained compared to the full 7-CpG model, so the impact of their loss is minimal.
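The single-marker-loss simulation amounts to building every reduced marker set that omits exactly one CpG of the final model. A minimal Python sketch of that enumeration is given below; the function name is an assumption, and the refitting and evaluation of each reduced model are not shown.

```python
# The seven CpG markers of the final age prediction model.
FINAL_MARKERS = ["cg10501210", "LHFPL4", "ELOVL2", "PDE4C", "HOXC4", "OTUD7A", "EDARADD"]


def leave_one_marker_out(markers):
    """Map each dropped marker to the reduced marker set that remains."""
    return {dropped: [m for m in markers if m != dropped] for dropped in markers}


reduced_sets = leave_one_marker_out(FINAL_MARKERS)
```

Each reduced set would then be refitted and scored in the same way as the full model, so that the change in error can be attributed to the missing marker.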
      Lastly, bisulfite conversion was performed using 100 ng of genomic DNA. To evaluate whether lower quantities of input DNA could produce results of comparable quality, serial dilutions were tested on two individuals (23 and 79 years old) for both saliva and buccal cells, using input DNA quantities for bisulfite conversion of 100 ng, 75 ng, 50 ng, 25 ng, 10 ng and 1 ng. The corresponding DNA methylation values and predicted ages are listed in Supplementary Table S5. To evaluate the differences detected in DNA methylation values between input DNA quantities, the standard deviation (SD) was used for comparisons (Supplementary Table S6). No standard deviations higher than 0.1 were observed in any of the markers down to 10 ng. At 1 ng, only four markers presented a deviation higher than 0.1: ELOVL2 in two of the four samples analyzed (SD = 0.19 and SD = 0.20), and RUNX1 (SD = 0.20), cg10501210 (SD = 0.13) and HOXC4 (SD = 0.16) in single samples.
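The SD-based screen can be sketched in Python as follows. The methylation values below are hypothetical and are not taken from Supplementary Tables S5 or S6; the use of the sample standard deviation (n − 1 denominator) and the function name are also assumptions, since the text does not specify how the SD was computed. Only the 0.1 threshold comes from the text.

```python
from statistics import stdev


def unstable_markers(values_by_marker, threshold=0.1):
    """Return markers whose methylation SD across measurements exceeds the threshold."""
    flagged = {}
    for marker, values in values_by_marker.items():
        sd = stdev(values)  # sample standard deviation (assumed choice)
        if sd > threshold:
            flagged[marker] = sd
    return flagged


# Hypothetical methylation levels for one sample measured at several input
# DNA quantities, with the last value standing in for a low-input measurement.
toy_measurements = {
    "ELOVL2": [0.40, 0.42, 0.41, 0.80],
    "HOXC4": [0.30, 0.31, 0.29, 0.30],
}
toy_flagged = unstable_markers(toy_measurements)
```

In this toy case only the marker with one strongly deviating measurement is flagged, which is the pattern the text reports for the 1 ng input.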

      4. Discussion

      Individual age estimation has been a topic of great interest in forensic genetics in recent years. DNA methylation has become the biomarker of choice for inferring this characteristic [
      • Vidaki A.
      • Daniel B.
      • Court D.S.
      Forensic DNA methylation profiling — potential opportunities and challenges.
      ], with prediction models published using several techniques [
      • Zbieć-Piekarska R.
      • Sólnicka M.
      • Kupiec T.
      • Parys-proszek A.
      • Makowska Z.
      • Paleczka A.
      • et al.
      Development of a forensically useful age prediction method based on DNA methylation analysis.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ,
      • Hamano Y.
      • Manabe S.
      • Morimoto C.
      • Fujimoto S.
      • Tamaki K.
      Forensic age prediction for saliva samples using methylation-sensitive high resolution melting: exploratory application for cigarette butts.
      ,
      • Hong S.R.
      • Shin K.
      • Jung S.
      • Lee E.H.
      • Lee H.Y.
      Platform-independent models for age prediction using DNA methylation data.
      ,
      • Alghanim H.
      • Balamurugan K.
      • Mccord B.
      Development of DNA methylation markers for sperm, saliva and blood identification using pyrosequencing and qPCR/HRM.
      ] and different tissues [
      • Weidner C.I.
      • Lin Q.
      • Koch C.M.
      • Eisele L.
      • Beier F.
      • Ziegler P.
      • et al.
      Aging of blood can be tracked by DNA methylation changes at just three CpG sites.
      ,
      • Zbieć-Piekarska R.
      • Sólnicka M.
      • Kupiec T.
      • Parys-proszek A.
      • Makowska Z.
      • Paleczka A.
      • et al.
      Development of a forensically useful age prediction method based on DNA methylation analysis.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Aliferi A.
      • Ballard D.
      • Gallidabino M.D.
      • Thurtle H.
      • Barron L.
      • Syndercombe-Court D.
      DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models.
      ,
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ,
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ,
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ,
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ,
      • Lee W.J.
      • Choung C.M.
      • Jung Y.J.
      • Lee H.Y.
      • Lim S.-K.
      A validation study of DNA methylation-based age prediction using semen in forensic casework samples.
      ,
      • Jenkins T.G.
      • Aston K.I.
      • Cairns B.
      • Smith A.
      • Carrell D.T.
      Paternal germ line aging: DNA methylation age prediction from human sperm.
      ,
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ,
      • Lee H.Y.
      • Hong S.R.
      • Lee J.E.
      • Hwang I.K.
      • Kim N.Y.
      • Lee J.M.
      • et al.
      Epigenetic age signatures in bones.
      ], although most of them have focused on blood samples. Other tissues of relevance for forensic DNA analysis, for which age prediction models are beginning to be developed, are saliva and buccal cells [
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ,
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ,
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ,
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ]. Cellular composition of saliva and buccal swab samples has been shown to be different, with saliva composed of a majority of leukocytes and buccal swab samples of epithelial cells [
      • Theda C.
      • Hwang S.H.
      • Czajko A.
      • Loke Y.J.
      • Leong P.
      • Craig J.M.
      Quantitation of the cellular content of saliva and buccal swab samples.
      ]. However, it has also been observed in previous studies that cellular proportions can vary greatly between individuals, with the saliva samples containing a variable quantity of leukocytes in the range 16–95 %, and buccal swab samples between 5 %–65 % [
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ,
      • Thiede C.
      • Prange-Krex G.
      • Freiberg-Richter J.
      • Bornhäuser M.
      • Ehninger G.
      Buccal swabs but not mouthwash samples can be used to obtain pretransplant DNA fingerprints from recipients of allogeneic bone marrow transplants.
      ]. Taking this into consideration, an initial step of the present study was to develop a prediction model in order to infer the tissue of origin.
      The selected tissue prediction markers have not been previously reported. Each marker was selected considering differences between pairs of tissues: saliva versus buccal cells (HUNK), blood versus buccal cells (RUNX1) and blood versus saliva (RIN2). From these three candidate markers, RIN2 showed no variation in the DNA methylation patterns for the tissues of interest (saliva and buccal swabs) and therefore, was discarded from subsequent analyses. In contrast, differences were distinct for RUNX1 and particularly HUNK (Fig. 2), with this pair showing opposite trends in DNA methylation levels (average DNA methylation: 0.576 for saliva versus 0.199 for buccal cells in HUNK, and 0.297 for saliva versus 0.670 for buccal cells in RUNX1). Likely due to the possible variations in the composition of the tissues collected (higher percentages of leukocytes or epithelial cells) [
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ], in some samples, differences were also observed within the same tissue, for example RUNX1 gave differences up to 0.34 between some saliva samples. Additional cell-specific markers such as CD6, SERPINB5 [
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ] and PTPN7 [
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ] have been reported in other studies. The selection of these different markers could be due to the screening of alternative datasets. For the selection of CD6 and SERPINB5, Eipel et al. used datasets GSE50586 [
      • Jones M.J.
      • Farré P.
      • Mcewen L.M.
      • Macisaac J.L.
      • Watt K.
      • Neumann S.M.
      • et al.
      Distinct DNA methylation patterns of cognitive impairment and trisomy 21 in down syndrome.
      ] and GSE39981 [
      • Accomando W.P.
      • Wiencke J.K.
      • Houseman E.A.
      • Nelson H.H.
      • Kelsey K.T.
      Quantitative reconstruction of leukocyte subsets using DNA methylation.
      ], the former with data from buccal swab samples and the latter from blood samples. Using these data in combination, they selected CpGs that showed differences according to the tissue of origin. It should be noted that in our case we only used GSE50586 to evaluate whether tissue-specific markers related to buccal cells were correlated with age. On the other hand, the selection of PTPN7 came from the Hong et al. study evaluating DNA methylation differences between blood and buccal cells. In our case, we selected a dataset containing samples of different tissues for each individual [
      • Slieker R.C.
      • Bos S.D.
      • Goeman J.J.
      • Bovée J.V.M.G.
      • Talens R.P.
      • Van Der Breggen R.
      • et al.
      Identification and systematic annotation of tissue-specific differentially methylated regions using the Illumina 450k array.
      ], trying to limit the possible differences between individuals related to the varied cellular proportions in saliva and buccal swab samples.
      HUNK (hormonally up-regulated Neu-associated kinase) is a gene predicted to be involved in intracellular signal transduction and protein phosphorylation, while the protein encoded by RUNX1 (RUNX family transcription factor 1) is involved in the development of normal hematopoiesis. Once both genes were selected as candidate markers for the inference of the tissue-of-origin, logistic regression was used to explore the most accurate combination of markers, i.e., model 1 (HUNK and RUNX1), model 2 (HUNK only) and model 3 (RUNX1 only). The main difference between the two-CpG-site model and each of the single-CpG-site models was the imbalance between sensitivity and specificity. While the two-CpG-site model had a higher sensitivity than specificity (0.96 versus 0.82, respectively), the opposite was observed for the single-site models (0.78 versus 0.96 for model 2, and 0.81 versus 0.9 for model 3). Selection of the most accurate tissue prediction model was subsequently based on the additional metric of correct classification rate, with model 1 giving the best predictive performance of 88.59 %. Nevertheless, classifying these types of samples is complicated by the wide range of cellular proportions discussed above, as well as the admixed nature of some forensic specimens, e.g., cigarette butts. Therefore, for the second stage of the reported study, the generation of an age prediction model for oral cavity fluids covering both saliva and buccal cells was considered a better strategy than the development of different models for independent tissues. Even so, independent age prediction models for saliva and buccal cells were explored. Although the saliva-specific model showed the most accurate prediction (average MAE: ± 3.55 years), we decided to focus on the combined model since it will cover the maximum cell proportion variability in most forensic scenarios involving these samples.
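      The comparison of sensitivity, specificity and correct classification rate described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the data are synthetic values centred on the average methylation levels reported for HUNK and RUNX1, the spread (SD = 0.12), the sample sizes, the choice of saliva as the positive class and all function names are assumptions made for illustration, and the model is fitted by plain gradient descent.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit [intercept, w_HUNK, w_RUNX1] by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, xi, threshold=0.5):
    """Return 1 (saliva) if the predicted probability reaches the threshold, else 0 (buccal)."""
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
    return 1 if p >= threshold else 0


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity and correct classification rate (saliva = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ccr": (tp + tn) / len(pairs),
    }


# Synthetic (HUNK, RUNX1) methylation levels; the SD of 0.12 is an assumption.
rng = random.Random(0)


def draw(mean, sd=0.12):
    """Draw a methylation level clipped to the valid range [0, 1]."""
    return min(1.0, max(0.0, rng.gauss(mean, sd)))


saliva = [(draw(0.576), draw(0.297)) for _ in range(100)]
buccal = [(draw(0.199), draw(0.670)) for _ in range(100)]
X = saliva + buccal
y = [1] * 100 + [0] * 100

weights = fit_logistic(X, y)
metrics = classification_metrics(y, [predict(weights, xi) for xi in X])
print(metrics)
```

      Fitting the same function on a single column of X would give the single-marker analogues of models 2 and 3, so the balance between sensitivity and specificity can be compared across models in the same way.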
      To identify the most accurate age prediction model amongst nine saliva/buccal cell age correlated CpGs, different combinations of CpG sites were explored under multivariate quantile regression analysis testing up to eight different combined models (Table 4). Different age prediction models have been published based on different statistical tools, including linear regression [
      • Zbieć-Piekarska R.
      • Sólnicka M.
      • Kupiec T.
      • Parys-proszek A.
      • Makowska Z.
      • Paleczka A.
      • et al.
      Development of a forensically useful age prediction method based on DNA methylation analysis.
      ,
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ,
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ,
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ,
      • Huang Y.
      • Yan J.
      • Hou J.
      • Fu X.
      • Li L.
      • Hou Y.
      Developing a DNA methylation assay for human age prediction in blood and bloodstain.
      ,
      • Hong S.R.
      • Shin K.
      • Jung S.
      • Lee E.H.
      • Lee H.Y.
      Platform-independent models for age prediction using DNA methylation data.
      ,
      • Wózniak A.
      • Heidegger A.
      • Piniewska-Róg D.
      • Pośpiech E.
      • Xavier C.
      • Pisarek A.
      • et al.
      Development of the VISAGE enhanced tool and statistical models for epigenetic age estimation in blood, buccal cells and bones.
      ], quadratic regression [
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ], machine learning [
      • Hong S.R.
      • Shin K.
      • Jung S.
      • Lee E.H.
      • Lee H.Y.
      Platform-independent models for age prediction using DNA methylation data.
      ] and quantile regression [
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ]. Although linear regression is the most commonly applied statistical analysis for age prediction, in this study quantile regression was selected, as its main advantage is the ability to provide age-specific prediction intervals, in addition to the predicted age.
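      The way quantile regression yields age-specific prediction intervals can be sketched with a deliberately simplified example. The sketch uses a single hypothetical predictor rather than the seven CpG sites of the model, deterministic synthetic data whose scatter grows with methylation, and a brute-force fit that relies on the fact that, for one predictor, at least one optimal quantile line passes through two observations. The function names, the quantiles (0.1, 0.5 and 0.9) and the data are assumptions for illustration only.

```python
def pinball_loss(residual, q):
    """Check (pinball) loss: under-predictions weigh q, over-predictions weigh 1 - q."""
    return q * residual if residual >= 0 else (q - 1.0) * residual


def fit_quantile_line(x, y, q):
    """Exact one-predictor quantile regression by exhaustive search.

    Every line through two observations with distinct x is a candidate;
    the candidate with the lowest total pinball loss is returned.
    """
    best = None
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue
            slope = (y[j] - y[i]) / (x[j] - x[i])
            intercept = y[i] - slope * x[i]
            loss = sum(
                pinball_loss(yk - (intercept + slope * xk), q)
                for xk, yk in zip(x, y)
            )
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]


# Synthetic data: age rises with methylation and the scatter widens with it.
n = 40
pattern = [-1.0, -0.5, 0.0, 0.5, 1.0]
meth = [i / (n - 1) for i in range(n)]
age = [20 + 60 * m + pattern[i % 5] * (2 + 10 * m) for i, m in enumerate(meth)]

# One fitted line per quantile: lower bound, median prediction, upper bound.
lines = {q: fit_quantile_line(meth, age, q) for q in (0.1, 0.5, 0.9)}


def predict_interval(m):
    """Return (lower, predicted, upper) age for a methylation level m."""
    lower, median, upper = (lines[q][0] + lines[q][1] * m for q in (0.1, 0.5, 0.9))
    return lower, median, upper


for m in (0.1, 0.9):
    print(m, predict_interval(m))
```

      With these synthetic data the interval widens from about 6 years at a methylation level of 0.1 to about 22 years at 0.9, which is the behaviour a single global error estimate from linear regression cannot express.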
      Among the CpG combinations tested, model 7 was selected as the most accurate age prediction model because it gave the best balance between error and correct classification rate, with an MAE of ± 3.54 years, an RMSE of 6.23 and a %CP±PI of 76.08 %. Model 7 comprised CpG sites cg10501210, LHFPL4, ELOVL2, PDE4C, HOXC4, OTUD7A and EDARADD, and excluded FHL2 and ASPA, whose contribution to the tested models was insufficient to improve predictive performance. This was not unexpected given the low age correlations displayed (0.198 and −0.332, respectively) plus high levels of tissue dispersion (SD = 0.145 and 0.162, respectively).
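      The three performance figures used to compare models can be computed as in the following minimal sketch. It assumes that MAE denotes the median absolute error, as defined in the abstract, and that %CP±PI is the percentage of samples whose chronological age falls inside the predicted interval. The function names and the example values are hypothetical and are not data from this study.

```python
import math
import statistics


def median_absolute_error(true_ages, predicted_ages):
    """Median of the absolute differences between chronological and predicted age."""
    return statistics.median(abs(t - p) for t, p in zip(true_ages, predicted_ages))


def rmse(true_ages, predicted_ages):
    """Root mean squared error, which penalises large misses more than the median does."""
    n = len(true_ages)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true_ages, predicted_ages)) / n)


def percent_within_interval(true_ages, lower, upper):
    """Percentage of samples whose true age lies within [lower, upper]."""
    hits = sum(1 for t, lo, hi in zip(true_ages, lower, upper) if lo <= t <= hi)
    return 100.0 * hits / len(true_ages)


# Made-up values for four donors, for illustration only.
true_ages = [25, 40, 55, 70]
predicted = [27, 39, 60, 68]
lower = [22, 35, 56, 60]
upper = [32, 44, 66, 76]

print(median_absolute_error(true_ages, predicted))
print(rmse(true_ages, predicted))
print(percent_within_interval(true_ages, lower, upper))
```

      Because the median is insensitive to a few large misses while the RMSE is not, reporting both, as done for model 7, shows whether a low typical error hides occasional large deviations.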
      Previous age predictors targeting the oral cavity have been developed as tissue-independent models, obtaining prediction errors close to ± 5 years, including Bocklandt et al. [
      • Bocklandt S.
      • Lin W.
      • Sehl M.E.
      • Sa F.J.
      • Sinsheimer J.S.
      • Horvath S.
      • et al.
      Epigenetic predictor of age.
      ], Eipel et al. [
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ] and Schwender et al. [
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ] with reported MAEs of ± 5.2 years (saliva), ± 4.3 years (buccal cells) and ± 5.11 years (buccal cells), respectively. Common to all three studies is the use of just three CpG sites compared to the seven of the present study, which could explain the higher prediction errors observed. Additional tissue-independent models presenting prediction errors similar to the present study, such as Hong et al. [
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ], Jung et al. [
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ] and Wozniak et al. [
      • Wózniak A.
      • Heidegger A.
      • Piniewska-Róg D.
      • Pośpiech E.
      • Xavier C.
      • Pisarek A.
      • et al.
      Development of the VISAGE enhanced tool and statistical models for epigenetic age estimation in blood, buccal cells and bones.
      ] with MAEs of ± 3.13 years (saliva), ± 3.55 years (buccal cells) and ± 2.5 years (buccal cells), respectively, were based on 5–7 CpG sites. While the models presented by Hong et al. and Wozniak et al. are uniquely focused on saliva and buccal swab samples, respectively, the combined model developed in our study covers both tissues, making it more reliable in forensic scenarios where a mixture of saliva and buccal cells is under study, such as cigarette butts. A similar strategy to the present study was developed by Jung et al. [
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ], building a 5-CpG tissue-combined age prediction model, including saliva, buccal swabs and blood samples. The prediction error obtained was an MAE of ± 3.55 years, practically identical to that of the present study. Considering all these results and the fact that models for other tissues also systematically present errors close to ± 3 years, it is reasonable to conclude that the lowest error obtainable with current technologies has been reached. Independently of the tissues covered, the main improvement provided by the prediction model proposed in the present study in comparison to the previous ones is the underlying statistical method used – quantile regression – providing not only the predicted age but the age-specific prediction intervals as well. Since errors are usually smaller for younger than for older individuals, providing an age-specific interval could improve the accuracy of results.
      Considering the models discussed above, it is evident that certain markers appear recurrently in multiple age predictors for saliva and buccal swabs, namely ELOVL2, PDE4C, EDARADD and KLF14. Genes ELOVL2, PDE4C and EDARADD are present in our model but with different CpG positions (except cg09809672 in EDARADD, shared with Schwender’s [
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ] model). Comparing the markers in these four genes in the other studies shows that only the CpG of PDE4C is shared between Eipel’s [
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ] and Schwender’s [
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ] models. In KLF14, not used in our study, only cg14361627 is shared between Hong’s [
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ] and Jung’s [
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ] models. This CpG is in the list of 49 CpGs of the preselected markers (Supplementary Table S1) but did not meet the selection criteria for our model. Our marker selection was based on the GSE92767 dataset [
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ], the same dataset used for Hong’s marker selection, although each study selected different markers from it. Hong used linear regression and stepwise regression to identify markers with an R2 greater than 0.65 and a difference between maximum and minimum β-scores greater than 0.1. In contrast, we used Spearman’s correlation to select markers with an absolute correlation greater than 0.8 and an absolute difference between extreme age donors greater than 0.3. The change in selection criteria when assessing the GSE92767 dataset was motivated by the lack of normality found for 15 % of the residuals of the models (independent linear regression models for each CpG in the dataset). Therefore, a non-parametric method such as the Spearman coefficient was found to be more suitable for this analysis.
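      The rank-based selection criteria described above can be sketched as follows. This is an illustration rather than the pipeline used in the study: it assumes that the "difference between extreme age donors" is the absolute difference in methylation between the youngest and the oldest donor, and the function names and example values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def passes_selection(ages, beta, min_abs_rs=0.8, min_abs_diff=0.3):
    """Apply both thresholds: |rs| > 0.8 and |beta(oldest) - beta(youngest)| > 0.3."""
    rs = spearman(ages, beta)
    youngest = beta[ages.index(min(ages))]
    oldest = beta[ages.index(max(ages))]
    return abs(rs) > min_abs_rs and abs(oldest - youngest) > min_abs_diff


# Made-up methylation values for seven donors, for illustration only.
ages = [21, 30, 42, 55, 63, 74, 86]
marker_up = [0.20, 0.28, 0.35, 0.41, 0.52, 0.60, 0.71]
marker_down = [0.80, 0.70, 0.66, 0.55, 0.50, 0.41, 0.35]
marker_flat = [0.50, 0.52, 0.49, 0.53, 0.51, 0.54, 0.50]

for name, beta in (("up", marker_up), ("down", marker_down), ("flat", marker_flat)):
    print(name, round(spearman(ages, beta), 3), passes_selection(ages, beta))
```

      Because only ranks enter the coefficient, the criterion does not depend on normally distributed residuals, which is the motivation given above for preferring it to a regression-based R2 threshold.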
      Regarding markers included in our prediction model, cg10501210 was reported as a marker related to aging in blood monocytes [
      • Tserel L.
      • Limbach M.
      • Saare M.
      • Kisand K.
      • Metspalu A.
      • Milani L.
      • et al.
      CpG sites associated with NRP1, NRXN2 and miR-29b-2 are hypomethylated in monocytes during ageing.
      ], showing a similar DNA methylation trend when analyzing saliva and buccal cell samples in our study. Although less evident than for FHL2 and ASPA, the correlation with age and tissue dispersion detected for this marker (rs = −0.313 and SD = 0.103) suggested exclusion from the final age prediction model. However, its removal from the final model had the greatest effect, as shown in the robustness analysis. The gene LHFPL4 (LHFPL tetraspan subfamily member 4) is a member of the superfamily of tetraspan transmembrane protein encoding genes. Mutations in one LHFP-like gene result in deafness in humans and mice, and a second LHFP-like gene is fused to a high-mobility group gene in a translocation-associated lipoma. To the best of our knowledge, our study detected this marker to be correlated with age in saliva and buccal cells for the first time. The cg11084334 CpG analyzed in LHFPL4 presented amongst the highest age correlation values (rs = 0.805), as well as showing minimal dispersion between tissues (SD = 0.039). Correlation with age in blood has been observed in other CpG positions of LHFPL4 (cg24866418 and cg12841266) [
      • Alsaleh H.
      • Haddrill P.R.
      Identifying blood-specific age-related DNA methylation markers on the Illumina methylationEPIC BeadChip.
      ]. The gene ELOVL2 (ELOVL fatty acid elongase 2) has been widely reported as a key age correlated marker [
      • Hannum G.
      • Guinney J.
      • Zhao L.
      • Zhang L.
      • Hughes G.
      Genome-wide methylation profiles reveal quantitative views of human aging rates.
      ,
      • Garagnani P.
      • Bacalini M.G.
      • Pirazzini C.
      • Gori D.
      • Giuliani C.
      • Mari D.
      • et al.
      Methylation of ELOVL2 gene as a new epigenetic marker of age Aging Cell.
      ,
      • Heyn H.
      • Li N.
      • Ferreira H.J.
      • Moran S.
      • Pisano D.G.
      • Gomez A.
      • et al.
      Distinct DNA methylomes of newborns and centenarians.
      ] and has been incorporated in most of the age prediction models developed so far. This marker has been reported to correlate with age in multiple forensic tissues such as blood [
      • Zbieć-Piekarska R.
      • Sólnicka M.
      • Kupiec T.
      • Parys-proszek A.
      • Makowska Z.
      • Paleczka A.
      • et al.
      Development of a forensically useful age prediction method based on DNA methylation analysis.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ,
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ,
      • Wózniak A.
      • Heidegger A.
      • Piniewska-Róg D.
      • Pośpiech E.
      • Xavier C.
      • Pisarek A.
      • et al.
      Development of the VISAGE enhanced tool and statistical models for epigenetic age estimation in blood, buccal cells and bones.
      ], saliva [
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ], buccal cells [
      • Jung S.-E.
      • Min S.
      • Rom S.
      • Hee E.
      • Shin K.
      • Young H.
      DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
      ,
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ,
      • Wózniak A.
      • Heidegger A.
      • Piniewska-Róg D.
      • Pośpiech E.
      • Xavier C.
      • Pisarek A.
      • et al.
      Development of the VISAGE enhanced tool and statistical models for epigenetic age estimation in blood, buccal cells and bones.
      ], teeth [
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ] and bones [
      • Wózniak A.
      • Heidegger A.
      • Piniewska-Róg D.
      • Pośpiech E.
      • Xavier C.
      • Pisarek A.
      • et al.
      Development of the VISAGE enhanced tool and statistical models for epigenetic age estimation in blood, buccal cells and bones.
      ]. More specifically, it is noteworthy that the cg16867657 CpG analyzed in our study has been reported in other studies to be correlated with age either in blood [
      • Zbieć-Piekarska R.
      • Sólnicka M.
      • Kupiec T.
      • Parys-proszek A.
      • Makowska Z.
      • Paleczka A.
      • et al.
      Development of a forensically useful age prediction method based on DNA methylation analysis.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ], buccal cells [
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ] or teeth [
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ]. Gene PDE4C (phosphodiesterase 4C) had the strongest correlation with age in saliva and buccal cells (rs = 0.806), and has been included in published age prediction models for different tissues including saliva, buccal cells and blood [
      • Weidner C.I.
      • Lin Q.
      • Koch C.M.
      • Eisele L.
      • Beier F.
      • Ziegler P.
      • et al.
      Aging of blood can be tracked by DNA methylation changes at just three CpG sites.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ,
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ,
      • Marqueta-Gracia J.J.
      • Álvarez-Álvarez M.
      • Baeta M.
      • Palencia-Madrid L.
      • Prieto-Fernández E.
      • Ordoñana R.J.
      • et al.
      Genetics differentially methylated CpG regions analyzed by PCR-high resolution melting for monozygotic twin pair discrimination.
      ]. In gene HOXC4 (homeobox C4), the cg18473521 CpG analyzed in this work has shown correlation with age in blood samples [
      • Naue J.
      • Hoefsloot H.C.J.
      • Mook O.R.F.
      • Rijlaarsdam-hoekstra L.
      • Zwalm M.C.H.Van Der
      • Henneman P.
      • et al.
      Chronological age prediction based on DNA methylation: massive parallel sequencing and random forest regression.
      ]. Gene OTUD7A (OTU deubiquitinase 7A), which encodes a protein acting on TNF receptor associated factor 6 (TRAF6) to control nuclear factor kappa B expression, is used for the first time in an age prediction model in our study. Although OTUD7A has previously shown correlation with age in blood [
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ] and saliva [
      • Hong S.R.
      • Jung S.E.
      • Lee E.H.
      • Shin K.J.
      • Yang W.I.
      • Lee H.Y.
      DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
      ], it was not included in any published model. Finally, EDARADD has been reported to show age-correlated CpG positions, with cg09809672, used in this study, also reported in previous blood, saliva, buccal cell and bone models [
      • Bocklandt S.
      • Lin W.
      • Sehl M.E.
      • Sa F.J.
      • Sinsheimer J.S.
      • Horvath S.
      • et al.
      Epigenetic predictor of age.
      ,
      • Freire-Aradas A.
      • Phillips C.
      • Mosquera-Miguel A.
      • Girón-Santamaría L.
      • Gómez-Tato A.
      • Casares De Cal M.
      • et al.
      Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
      ,
      • Schwender K.
      • Holländer O.
      • Klopfleisch S.
      • Eveslage M.
      • Danzer M.F.
      • Pfeiffer H.
      • et al.
      Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
      ,
      • Bekaert B.
      • Kamalandua A.
      • Zapico S.C.
      • Voorde W.Van De
      • Bekaert B.
      Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
      ,
      • Wózniak A.
      • Heidegger A.
      • Piniewska-Róg D.
      • Pośpiech E.
      • Xavier C.
      • Pisarek A.
      • et al.
      Development of the VISAGE enhanced tool and statistical models for epigenetic age estimation in blood, buccal cells and bones.
      ].
      Our study showed that the age predictive performance of the saliva and buccal cell model was not improved by adding tissue-of-origin information. A similar analysis was performed by Eipel et al. for buccal swab samples [
      • Eipel M.
      • Mayer F.
      • Arent T.
      • Ferreira M.R.P.
      • Birkhofer C.
      • Gerstenmaier U.
      • et al.
      Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
      ]. In Eipel’s study, the model combining age-correlated and cell-type-specific markers (training MAD: ± 4.66; testing MAD: ± 5.09) improved, in the testing set, on the model using age-correlated markers only (training MAD: ± 4.3; testing MAD: ± 7.03). This suggests that introducing the cellular composition as a co-variable has more effect than the tissue of origin. Therefore, assessment of the cellular proportions may be the most effective way to introduce tissue-of-origin information as a co-variable in an age prediction model, certainly for the buccal cavity.
      Finally, considering that in forensic DNA analysis degraded and low-level DNA concentrations are commonly encountered, our evaluations of the robustness of the model with missing data and amounts of input DNA for bisulfite conversion were particularly relevant.
      Similar predictive performance was obtained for all step-wise exclusions of markers with the exception of cg10501210 (MAE: ± 5.23 years, and %CP±PI = 74.06 %), whose absence produced the greatest increase in error. However, it should be noted that when data are missing, the methylated and unmethylated peaks that are detected could also be measured incorrectly. In this case, running duplicates or even triplicates of the sample is recommended in order to double-check the methylation values obtained.
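      A single-marker loss simulation of this kind can be sketched as a leave-one-marker-out loop. The sketch below is not the procedure used in the study: it assumes that the model is refitted without the missing marker, it substitutes ordinary least squares for quantile regression to keep the example short, and the three markers, their effect sizes and the noise level are synthetic and hypothetical.

```python
import random
import statistics


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    XtX = [[sum(xi[a] * xi[b] for xi in X) for b in range(p)] for a in range(p)]
    Xty = [sum(xi[a] * yi for xi, yi in zip(X, y)) for a in range(p)]
    return solve_linear_system(XtX, Xty)


def median_abs_error(coef, rows, y):
    """Median absolute difference between observed and fitted ages."""
    preds = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    return statistics.median(abs(t - p) for t, p in zip(y, preds))


def single_marker_loss(data, y):
    """Refit the model without each marker in turn and report the resulting error."""
    names = list(data)
    results = {}
    for dropped in [None] + names:
        kept = [m for m in names if m != dropped]
        rows = list(zip(*(data[m] for m in kept)))
        results[dropped or "none"] = median_abs_error(fit_ols(rows, y), rows, y)
    return results


# Synthetic data: marker_A carries most of the age signal (an assumption).
rng = random.Random(1)
n_samples = 60
markers = {
    name: [rng.random() for _ in range(n_samples)]
    for name in ("marker_A", "marker_B", "marker_C")
}
ages = [
    20 + 50 * a + 5 * b + 5 * c + rng.gauss(0, 2)
    for a, b, c in zip(markers["marker_A"], markers["marker_B"], markers["marker_C"])
]

errors = single_marker_loss(markers, ages)
for name, err in errors.items():
    print(name, round(err, 2))
```

      Ranking the markers by the error obtained after their removal identifies the one the model depends on most, which in the study was cg10501210.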
      An important factor for the forensic sensitivity of methylation tests is the bisulfite conversion step, which causes an aggressive reduction of the input DNA. Since the use of 100 ng is not common practice in casework, serial dilutions were evaluated, and those down to 10 ng showed no standard deviations greater than 0.1. For 1 ng input, some markers showed deviation values above the established limit for 3 of 4 samples. Thus, it is a viable strategy to start with a minimum of 10 ng of genomic DNA. Very similar results have been obtained by Aliferi et al. [
      • Aliferi A.
      • Ballard D.
      • Gallidabino M.D.
      • Thurtle H.
      • Barron L.
      • Syndercombe-Court D.
      DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models.
      ] and Wózniak et al. [
      • Wózniak A.
      • Heidegger A.
      • Piniewska-Róg D.
      • Pośpiech E.
      • Xavier C.
      • Pisarek A.
      • et al.
      Development of the VISAGE enhanced tool and statistical models for epigenetic age estimation in blood, buccal cells and bones.
      ], indicating that analyses with less than 10 ng of DNA caused significant variations in DNA methylation values. When comparing these studies to the data reported here, it is worth noting that different technologies have been used (massively parallel sequencing versus SNaPshot), suggesting that the limitation is not the detection methodology, but DNA degradation or loss during the bisulfite conversion process itself, or the stochastic variability of the analyzed molecules.
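      The reproducibility check applied to the dilution series can be sketched as a comparison of standard deviations against the 0.1 limit. The sketch assumes that the standard deviation is taken across replicate measurements of one CpG in one sample at each input amount; the function name and the readings are made up for illustration and are not data from this study.

```python
import statistics

SD_LIMIT = 0.1  # limit on the methylation standard deviation mentioned in the text


def flag_unstable_inputs(replicates_by_input, limit=SD_LIMIT):
    """Map each input amount (ng) to (sample SD, True if the SD exceeds the limit)."""
    report = {}
    for nanograms, values in replicates_by_input.items():
        sd = statistics.stdev(values)
        report[nanograms] = (sd, sd > limit)
    return report


# Hypothetical replicate methylation readings for a single CpG.
example = {
    100: [0.52, 0.50, 0.51],
    10: [0.55, 0.48, 0.51],
    1: [0.70, 0.35, 0.52],
}

report = flag_unstable_inputs(example)
for nanograms, (sd, unstable) in report.items():
    print(nanograms, round(sd, 3), "above limit" if unstable else "within limit")
```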

      Acknowledgements

      This project was funded by the Consellería de Cultura, Educación e Ordenación Universitaria e da Consellería de Economía, Emprego e Industria from Xunta de Galicia, Spain (Modalidade B, ED481B 2018/010) by a postdoctorate grant awarded to AFA. MVL is supported by the Ministerio de Educación, Cultura y Ciencia, Spain (PID2019-107876RB-I00). M.d.l.P. is supported by a post-doctorate grant funded by the Consellería de Cultura, Educación e Ordenación Universitaria e da Consellería de Economía, Emprego e Industria from Xunta de Galicia, Spain (ED481D-2021-008). J.R. is supported by the “Programa de axudas á etapa predoutoral” funded by the Consellería de Cultura, Educación e Ordenación Universitaria e da Consellería de Economía, Emprego e Industria from Xunta de Galicia, Spain (ED481A-2020/039).

      Appendix A. Supplementary material

      References

        • Freire-Aradas A.
        • Phillips C.
        • Lareu M.
        Forensic individual age estimation with DNA: from initial approaches to methylation tests.
        Forensic Sci. Rev. 2017; 29: 121-144
        • Parson W.
        Age estimation with DNA: From forensic DNA fingerprinting to forensic (Epi) genomics: a mini-review.
        Gerontology. 2018; 64: 326-332
        • Walsh S.
        • Liu F.
        • Wollstein A.
        • Kovatsi L.
        • Ralf A.
        • Kosiniak-kamysz A.
        • et al.
        The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA.
        Forensic Sci. Int Genet. 2013; 7: 98-115
        • Marcińska M.
        • Pośpiech E.
        • Abidi S.
        • Andersen J.D.
        • van den Berge M.
        • Carracedo Á.
        • et al.
        Evaluation of DNA variants associated with androgenetic alopecia and their potential to predict male pattern baldness.
        PLoS One. 2015; 10: e0127852
        • Abbott A.
        DNA clock may aid refugee age check.
        Nature. 2018; 561: 15
        • Noroozi R.
        • Ghafouri-Fard S.
        • Pisarek A.
        • Rudnicka J.
        • Spólnicka M.
        • Branicki W.
        • et al.
        DNA methylation-based age clocks: From age prediction to age reversion.
        Ageing Res Rev. 2021; 68: 101314
        • Smith Z.D.
        • Meissner A.
        DNA methylation: roles in mammalian development.
        Nat. Rev. Genet. 2013; 14: 204-220
        • Rakyan V.K.
        • Down T.A.
        • Maslau S.
        • Andrew T.
        • Yang T.P.
        • Beyan H.
        • et al.
        Human aging-associated DNA hypermethylation occurs preferentially at bivalent chromatin domains.
        Genome Res. 2010; 20: 434-439
        • Bocklandt S.
        • Lin W.
        • Sehl M.E.
        • Sa F.J.
        • Sinsheimer J.S.
        • Horvath S.
        • et al.
        Epigenetic predictor of age.
        PLoS One. 2011; 6: e14821
        • Bell J.T.
        • Tsai P.-C.
        • Yang T.-P.
        • Pidsley R.
        • Nisbet J.
        • Glass D.
        • et al.
        Epigenome-wide scans identify differentially methylated regions for age and age-related phenotypes in a healthy ageing population.
        PLoS Genet. 2012; 8: e1002629
        • Heyn H.
        • Esteller M.
        DNA methylation profiling in the clinic: applications and challenges.
        Nat. Rev. Genet. 2012; 13: 679-692
        • Hannum G.
        • Guinney J.
        • Zhao L.
        • Zhang L.
        • Hughes G.
        Genome-wide methylation profiles reveal quantitative views of human aging rates.
        Mol. Cell. 2013; 49: 359-367
        • Horvath S.
        DNA methylation age of human tissues and cell types.
        Genome Biol. 2013; 14: R115
        • Johansson A.
        • Enroth S.
        • Gyllensten U.
        Continuous aging of the human DNA methylome throughout the human lifespan.
        PLoS One. 2013; 8: e67378
        • Reynolds C.A.
        • Tan Q.
        • Munoz E.
        • Jylhävä J.
        • Hjelmborg J.
        • Christiansen L.
        • et al.
        A decade of epigenetic change in aging twins: genetic and environmental contributions to longitudinal DNA methylation.
        Aging Cell. 2020; 19: e13197
        • Wang Y.
        • Karlsson R.
        • Lampa E.
        • Zhang Q.
        • Hedman Å.K.
        • Almgren M.
        Epigenetic influences on aging: a longitudinal genome-wide methylation study in old Swedish twins.
        Epigenetics. 2018; 13: 975-987
        • Moore L.D.
        • Le T.
        • Fan G.
        DNA methylation and its basic function.
        Neuropsychopharmacology. 2013; 38: 23-38
        • Weidner C.I.
        • Lin Q.
        • Koch C.M.
        • Eisele L.
        • Beier F.
        • Ziegler P.
        • et al.
        Aging of blood can be tracked by DNA methylation changes at just three CpG sites.
        Genome Biol. 2014; 15: R24
        • Zbieć-Piekarska R.
        • Spólnicka M.
        • Kupiec T.
        • Parys-Proszek A.
        • Makowska Z.
        • Paleczka A.
        • et al.
        Development of a forensically useful age prediction method based on DNA methylation analysis.
        Forensic Sci. Int Genet. 2015; 17: 173-179
        • Freire-Aradas A.
        • Phillips C.
        • Mosquera-Miguel A.
        • Girón-Santamaría L.
        • Gómez-Tato A.
        • Casares De Cal M.
        • et al.
        Development of a methylation marker set for forensic age estimation using analysis of public methylation data and the Agena Bioscience EpiTYPER system.
        Forensic Sci. Int Genet. 2016; 24: 65-74
        • Aliferi A.
        • Ballard D.
        • Gallidabino M.D.
        • Thurtle H.
        • Barron L.
        • Syndercombe-Court D.
        DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models.
        Forensic Sci. Int Genet. 2018; 37: 215-226
        • Eipel M.
        • Mayer F.
        • Arent T.
        • Ferreira M.R.P.
        • Birkhofer C.
        • Gerstenmaier U.
        • et al.
        Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures.
        Aging. 2016; 8: 1034-1048
        • Jung S.-E.
        • Lim S.M.
        • Hong S.R.
        • Lee E.H.
        • Shin K.-J.
        • Lee H.Y.
        DNA methylation of the ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59 genes for age prediction from blood, saliva, and buccal swab samples.
        Forensic Sci. Int Genet. 2019; 38: 1-8
        • Schwender K.
        • Holländer O.
        • Klopfleisch S.
        • Eveslage M.
        • Danzer M.F.
        • Pfeiffer H.
        • et al.
        Development of two age estimation models for buccal swab samples based on 3 CpG sites analyzed with pyrosequencing and minisequencing.
        Forensic Sci. Int Genet. 2021; 53: 102521
        • Hong S.R.
        • Jung S.E.
        • Lee E.H.
        • Shin K.J.
        • Yang W.I.
        • Lee H.Y.
        DNA methylation-based age prediction from saliva: high age predictability by combination of 7 CpG markers.
        Forensic Sci. Int Genet. 2017; 29: 118-125
        • Lee W.J.
        • Choung C.M.
        • Jung Y.J.
        • Lee H.Y.
        • Lim S.-K.
        A validation study of DNA methylation-based age prediction using semen in forensic casework samples.
        Leg. Med. 2018; 31: 74-77
        • Jenkins T.G.
        • Aston K.I.
        • Cairns B.
        • Smith A.
        • Carrell D.T.
        Paternal germ line aging: DNA methylation age prediction from human sperm.
        BMC Genom. 2018; 19: 763
        • Bekaert B.
        • Kamalandua A.
        • Zapico S.C.
        • Van de Voorde W.
        • Decorte R.
        Improved age determination of blood and teeth samples using a selected set of DNA methylation markers.
        Epigenetics. 2015; 10: 922-930
        • Lee H.Y.
        • Hong S.R.
        • Lee J.E.
        • Hwang I.K.
        • Kim N.Y.
        • Lee J.M.
        • et al.
        Epigenetic age signatures in bones.
        Forensic Sci. Int Genet. 2020; 46: 102261
        • Reinius L.E.
        • Acevedo N.
        • Joerink M.
        • Pershagen G.
        • Dahlén S.-E.
        • Greco D.
        • et al.
        Differential DNA methylation in purified human blood cells: Implications for cell lineage and studies on disease susceptibility.
        PLoS One. 2012; 7: e41361
        • Theda C.
        • Hwang S.H.
        • Czajko A.
        • Loke Y.J.
        • Leong P.
        • Craig J.M.
        Quantitation of the cellular content of saliva and buccal swab samples.
        Sci. Rep. 2018; 8: 6944
        • Horvath S.
        • Oshima J.
        • Martin G.M.
        • Lu A.T.
        • Quach A.
        • Felton S.
        • et al.
        Epigenetic clock for skin and blood cells applied to Hutchinson Gilford progeria syndrome and ex vivo studies.
        Aging (Albany NY). 2018; 10: 1758-1775
        • Köchl S.
        • Niederstätter H.
        • Parson W.
        DNA extraction and quantitation of forensic samples using the phenol-chloroform method and real-time PCR.
        Methods Mol. Biol. 2005; 297: 13-30
        • Slieker R.C.
        • Bos S.D.
        • Goeman J.J.
        • Bovée J.V.M.G.
        • Talens R.P.
        • van der Breggen R.
        • et al.
        Identification and systematic annotation of tissue-specific differentially methylated regions using the Illumina 450k array.
        Epigenetics Chromatin. 2013; 6: 26
        • Jones M.J.
        • Farré P.
        • McEwen L.M.
        • MacIsaac J.L.
        • Watt K.
        • Neumann S.M.
        • et al.
        Distinct DNA methylation patterns of cognitive impairment and trisomy 21 in Down syndrome.
        BMC Med Genom. 2013; 6: 58
        • Huang Y.
        • Yan J.
        • Hou J.
        • Fu X.
        • Li L.
        • Hou Y.
        Developing a DNA methylation assay for human age prediction in blood and bloodstain.
        Forensic Sci. Int Genet. 2015; 17: 129-136
        • Marqueta-Gracia J.J.
        • Álvarez-Álvarez M.
        • Baeta M.
        • Palencia-Madrid L.
        • Prieto-Fernández E.
        • Ordoñana R.J.
        • et al.
        Differentially methylated CpG regions analyzed by PCR-high resolution melting for monozygotic twin pair discrimination.
        Forensic Sci. Int Genet. 2018; 37: e1-e5
        • Hamano Y.
        • Manabe S.
        • Morimoto C.
        • Fujimoto S.
        • Tamaki K.
        Forensic age prediction for saliva samples using methylation-sensitive high resolution melting: exploratory application for cigarette butts.
        Sci. Rep. 2017; 7: 10444
        • You F.M.
        • Huo N.
        • Gu Y.Q.
        • Luo M.
        • Ma Y.
        • Hane D.
        • et al.
        BatchPrimer3: a high throughput web application for PCR and sequencing primer design.
        BMC Bioinforma. 2008; 9: 253
        • Robin X.
        • Turck N.
        • Hainard A.
        • Tiberti N.
        • Lisacek F.
        • Sanchez J.
        • et al.
        pROC: an open-source package for R and S+ to analyze and compare ROC curves.
        BMC Bioinforma. 2011; 12: 77
        • Koenker R.
        • Portnoy S.
        • Ng P.
        • Zeileis A.
        • Grosjean P.
        • Ripley B.
        Package quantreg: Quantile Regression.
        2015
        • Alfons A.
        Package cvTools: Cross-validation tools for regression models.
        2015
        • Wickham H.
        • Chang W.
        Package ggplot2: An implementation of the grammar of graphics.
        2015
        • R Core Team
        R: A Language and Environment for Statistical Computing.
        R Foundation for Statistical Computing, Vienna, Austria. 2020
        • Vidaki A.
        • Daniel B.
        • Court D.S.
        Forensic DNA methylation profiling — potential opportunities and challenges.
        Forensic Sci. Int Genet. 2013; 7: 499-507
        • Hong S.R.
        • Shin K.
        • Jung S.
        • Lee E.H.
        • Lee H.Y.
        Platform-independent models for age prediction using DNA methylation data.
        Forensic Sci. Int Genet. 2019; 38: 39-47
        • Alghanim H.
        • Balamurugan K.
        • McCord B.
        Development of DNA methylation markers for sperm, saliva and blood identification using pyrosequencing and qPCR/HRM.
        Anal. Biochem. 2020; 611: 113933
        • Thiede C.
        • Prange-Krex G.
        • Freiberg-Richter J.
        • Bornhäuser M.
        • Ehninger G.
        Buccal swabs but not mouthwash samples can be used to obtain pretransplant DNA fingerprints from recipients of allogeneic bone marrow transplants.
        Bone Marrow Transpl. 2000; 25: 575-577
        • Accomando W.P.
        • Wiencke J.K.
        • Houseman E.A.
        • Nelson H.H.
        • Kelsey K.T.
        Quantitative reconstruction of leukocyte subsets using DNA methylation.
        Genome Biol. 2014; 15: R50
        • Woźniak A.
        • Heidegger A.
        • Piniewska-Róg D.
        • Pośpiech E.
        • Xavier C.
        • Pisarek A.
        • et al.
        Development of the VISAGE enhanced tool and statistical models for epigenetic age estimation in blood, buccal cells and bones.
        Aging. 2021; 13: 6459-6484
        • Tserel L.
        • Limbach M.
        • Saare M.
        • Kisand K.
        • Metspalu A.
        • Milani L.
        • et al.
        CpG sites associated with NRP1, NRXN2 and miR-29b-2 are hypomethylated in monocytes during ageing.
        Immun. Ageing. 2014; 11: 1
        • Alsaleh H.
        • Haddrill P.R.
        Identifying blood-specific age-related DNA methylation markers on the Illumina methylationEPIC BeadChip.
        Forensic Sci. Int. 2019; 303: 109944
        • Garagnani P.
        • Bacalini M.G.
        • Pirazzini C.
        • Gori D.
        • Giuliani C.
        • Mari D.
        • et al.
        Methylation of ELOVL2 gene as a new epigenetic marker of age.
        Aging Cell. 2012; 11: 1132-1134
        • Heyn H.
        • Li N.
        • Ferreira H.J.
        • Moran S.
        • Pisano D.G.
        • Gomez A.
        • et al.
        Distinct DNA methylomes of newborns and centenarians.
        Proc. Natl. Acad. Sci. USA. 2012; 109: 10522-10527
        • Naue J.
        • Hoefsloot H.C.J.
        • Mook O.R.F.
        • Rijlaarsdam-Hoekstra L.
        • van der Zwalm M.C.H.
        • Henneman P.
        • et al.
        Chronological age prediction based on DNA methylation: massive parallel sequencing and random forest regression.
        Forensic Sci. Int Genet. 2017; 31: 19-28