Development of an individualized dementia risk prediction model using deep learning survival analysis incorporating genetic and environmental factors
Alzheimer's Research & Therapy volume 16, Article number: 278 (2024)
Abstract
Background
Dementia is a major public health challenge in modern society. Early detection of high-risk dementia patients and timely intervention or treatment are of significant clinical importance. Neural network survival analysis represents the most advanced technology for survival analysis to date. However, there is a lack of deep learning-based survival analysis models that integrate both genetic and clinical factors to develop and validate individualized dynamic dementia risk prediction models.
Methods and results
This study is based on a large prospective cohort from the UK Biobank, which includes a total of 41,484 participants with an average follow-up period of 12.6 years. Initially, 364 candidate features (predictor variables) were screened. The top 30 key features were then identified by ranking the importance of each predictor variable using the Gradient Boosting Machine (GBM) model. A multi-model comparison strategy was employed to evaluate the predictive performance of four survival analysis models: DeepSurv, DeepHit, Kaplan–Meier estimation, and the Cox proportional hazards model (CoxPH). The results showed that the average Harrell's C-index for the DeepSurv model was 0.743, for the DeepHit model it was 0.633, for the CoxPH model it was 0.749, and for the Kaplan–Meier estimator model it was 0.500. In addition, the average D-Calibration Survival Measure was 6.014, 4408.086, 32274.743, and 1.508, respectively. The Brier score (BS) was used to assess the importance of features for the DeepSurv dementia prediction model, and the relationship between features and dementia was visualized using a partial dependence plot (PDP). To facilitate further research, the team deployed the DeepSurv dementia prediction model on AliCloud servers and designated it as the UKB-DementiaPre Tool.
Conclusion
This study successfully developed and validated the DeepSurv dementia prediction model for individuals aged 60 years and above, integrating both genetic and clinical data. The model was then deployed on AliCloud servers to promote its clinical translation. It is anticipated that this prediction model will provide more accurate decision support for clinical treatment and will serve as a valuable tool for the primary prevention of dementia.
Background
Dementia is a general term used to describe a range of progressive cognitive declines, characterized by the gradual loss of previously acquired cognitive abilities. The primary symptom is the progressive impairment of multiple cognitive functions, including memory, reasoning, judgment, and language [1]. According to estimates by the World Health Organization, between 5 and 8% of individuals aged 60 and above worldwide are affected by dementia. It is estimated that by 2030, the total number of dementia patients worldwide will reach 82 million, and by 2050, it will increase to 152 million [2]. Dementia encompasses more than 100 distinct diseases and conditions, with Alzheimer's disease (AD) representing the most prevalent form, accounting for 60–70% of cases [3].
Although the number of people with dementia is rising, studies have shown that the risk of dementia in some age groups in high-income countries may have actually declined over the past 25 years. This decline is likely due to improved education levels and better control of major cardiovascular risk factors, such as hypertension, diabetes, and hypercholesterolemia [4, 5]. These studies suggest that AD and other forms of dementia are not necessarily an inevitable consequence of aging. It is possible that some individuals may be able to prevent or delay the onset and progression of dementia by modifying their exposure to specific risk factors, such as hypertension, smoking, obesity, and diabetes.
Nevertheless, the current pharmacological treatments for dementia, particularly AD, are not optimal. Although some drugs can improve the symptoms of dementia or AD, they cannot completely halt the progression of the disease [6]. Consequently, the timely identification of individuals at high risk for dementia, along with the implementation of targeted interventions or treatments at an early stage, is crucial. Such approaches are expected to delay the onset of dementia, improve the prognosis for patients, reduce the overall mortality rate, and mitigate the social and familial impacts of the disease.
As research on dementia continues, an increasing number of risk factors associated with the disease have been identified. In recent years, there has been growing interest in developing new models for predicting dementia. In addition to traditional methods, such as logistic regression and the Cox proportional hazards regression model (CoxPH) for establishing dementia risk prediction models [7,8,9], the advancement of artificial intelligence has led to the application of machine learning techniques for the detection and prediction of dementia. These techniques hold great potential for enhancing our understanding of the disease and advancing the fields of psychiatry and neurology [10].
The CoxPH model is the standard semiparametric survival analysis model and is used to quantify the influence of observed covariates on the risk of an event, such as mortality. The model assumes that a patient's log-risk of an event is a linear combination of the patient's covariates, an assumption known as the linear proportional hazards condition [11]. However, in many applications, including the provision of personalized treatment recommendations, the assumption that the log-risk function is linear may be overly simplistic. Therefore, a more comprehensive set of survival models is required to more accurately reflect the nonlinear log-risk functions observed in survival data [12].
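For reference, the linearity assumption discussed above can be written in the conventional textbook form of the Cox model. The symbols used here (baseline hazard \(h_0\), coefficient vector \(\beta\), covariate vector \(x\)) are standard notation and are not taken from this study.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\log\frac{h(t \mid x)}{h_0(t)} = \beta^{\top} x
```

Under this form, the hazard ratio between any two individuals does not depend on time, and the log-risk \(\beta^{\top} x\) is linear in the covariates. Nonlinear survival models replace \(\beta^{\top} x\) with a more flexible function of \(x\).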
Neural network survival analysis represents the most advanced technology currently available for survival analysis [13]. Notable examples of this include DeepSurv, DeepHit, Logistic-Hazard, and others. The DeepSurv model employs deep learning to express the risk function of sensitive factors as a multilayer perceptron. This approach incorporates additional nonlinear activation functions and dropout techniques, which enhance the model's ability to capture the relationships between variables [12]. The complexity of the model increases when applied to real-world medical data. By considering the interactions between multi-gene information and clinical parameters, the integration of genetic data can be promoted, thereby providing insights for the primary prevention of dementia. Nevertheless, there is a lack of dynamic, personalized dementia risk prediction models that integrate genetic and clinical factors using deep learning survival analysis.
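As an illustration of how a DeepSurv-style network can be trained, the sketch below computes the average negative Cox partial log-likelihood for a set of predicted log-risk scores. It is a minimal, hedged example: the function name is hypothetical, tied event times receive no special correction, and the regularization term is omitted. It is not the implementation used in this study.

```python
import math

def neg_log_partial_likelihood(risk_scores, times, events):
    """Average negative Cox partial log-likelihood (no tie correction).

    risk_scores: predicted log-risk for each participant (e.g. a network output)
    times:       follow-up time for each participant
    events:      1 if the event (dementia) was observed, 0 if censored
    """
    n_events = sum(events)
    if n_events == 0:
        raise ValueError("at least one observed event is required")
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored participants only contribute through risk sets
        # Risk set: everyone still under observation at time t_i.
        log_denominator = math.log(sum(
            math.exp(risk_scores[j]) for j, t_j in enumerate(times) if t_j >= t_i
        ))
        total += risk_scores[i] - log_denominator
    return -total / n_events

# Toy data: three participants, all with observed events.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
well_ordered = neg_log_partial_likelihood([2.0, 1.0, 0.0], times, events)
badly_ordered = neg_log_partial_likelihood([0.0, 1.0, 2.0], times, events)
```

The loss is smaller when participants with earlier events receive higher risk scores, which is the behavior a gradient-based optimizer would exploit.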
The objective of this study was to construct and validate a dynamic, personalized dementia risk prediction model based on the UK Biobank database, which contains large-scale population genetic and clinical data, using the DeepSurv model. This model can assist medical practitioners and clinical teams in more accurately assessing the risk of dementia in patients, thereby facilitating the development of more personalized prevention and treatment plans and providing a reference for early dementia prevention.
Methods
Data source: UK Biobank
The UK Biobank is a large-scale, population-based prospective study designed to comprehensively investigate the genetic and non-genetic determinants of disease in middle-aged and older individuals. Its objective is to combine broad and precise exposure assessments with detailed tracking and characterization of numerous health-related outcomes, aiming to contribute to the development of innovative scientific knowledge by optimizing resource utilization. Between 2006 and 2010, the UK Biobank recruited over 500,000 individuals aged 40–69 years. The database enables the tracking of health-related events for all participants through a UK-wide networked system. Additionally, all participants provided written informed consent and were enrolled in the study only after approval from the Northwest Multicenter Research Ethics Committee (11/NW/0382). As a result, approval from the UK Biobank Ethics Committee and the Human Organisms Research Organization Bank meant that independent ethics approval was not required for the resources the researchers wished to use, unless re-engagement of participants was necessary [14, 15].
Inclusion and exclusion criteria
A total of 502,367 participants were initially included in this study. Participants were screened based on the following inclusion and exclusion criteria: Inclusion criteria: 1) Registered to participate in the research between 2006 and 2010; 2) Signed the UK Biobank subject research consent form; 3) Aged 60 years or older. Exclusion criteria: 1) Individuals with a history of all-cause dementia, AD, or vascular dementia (VD); 2) Individuals with incomplete genetic data. Finally, participants who remained dementia-free at follow-up were randomly sampled at a 1:5 ratio of incident dementia cases to dementia-free participants for inclusion in the modeling. The flowchart for participant selection is shown in Fig. 1, and data from 41,484 participants were ultimately included in the modeling.
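The 1:5 sampling step can be sketched as follows. With the 6,914 incident dementia cases reported in the Results, five dementia-free participants per case give 6,914 × 6 = 41,484 participants, which matches the reported cohort size. The function name, the integer IDs, and the size of the dementia-free pool are hypothetical placeholders.

```python
import random

def sample_modeling_cohort(case_ids, dementia_free_ids, controls_per_case=5, seed=0):
    """Return all cases plus a random sample of dementia-free participants."""
    n_controls = controls_per_case * len(case_ids)
    if n_controls > len(dementia_free_ids):
        raise ValueError("not enough dementia-free participants to sample from")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    controls = rng.sample(list(dementia_free_ids), n_controls)
    return list(case_ids) + controls

# Hypothetical integer IDs standing in for participant identifiers.
cases = range(6914)
dementia_free = range(100_000, 200_000)
cohort = sample_modeling_cohort(cases, dementia_free)
```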
Definition of dementia outcome
The determination of the dementia outcome event was based on outcomes defined by the UK Biobank database algorithm and hospital diagnostic records (Field IDs: 42018, 130840, 130842, 42020, 130836, 42022, 130838, and 42024). The main causes of dementia included AD, vascular dementia, and other forms of dementia. Follow-up for each participant ended at the earliest of the dementia outcome event, the recorded date of death, or the end of follow-up for dementia outcomes and deaths (13 November 2021).
Comparative analysis of dementia prediction models
Determination of features
Inclusion of features
A total of 364 features (predictor variables), including sociodemographic, family history, physical measurements, genetic data, and others, were initially included in this study. These features consisted primarily of clinically relevant data collected during the participant's baseline visit. Initial data screening was performed to: (1) exclude candidate predictor variables with missing values exceeding 10% of all participants, and (2) manually clean procedural metric variables (e.g., biospecimen processing metrics, diagnostic codes, meter IDs) that were not clinically meaningful. However, relatively lenient inclusion criteria were applied to avoid overlooking potential associations. Ultimately, 213 features were used in the study, including several generated features not directly available from the UK Biobank, notably history of myocardial infarction, history of hypertension, history of stroke, polygenic risk score (PRS) for dementia, and APOE (Ɛ4) carrier status.
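The first screening rule (dropping candidate predictors with more than 10% missing values) can be sketched as below. This is an illustrative example only: the data layout (a list of dictionaries with `None` marking a missing value) and the feature names are assumptions, not the study's actual data format.

```python
def drop_sparse_features(rows, max_missing=0.10):
    """Keep features whose share of missing values does not exceed max_missing.

    rows: list of dicts mapping feature name -> value, with None meaning missing.
    """
    n = len(rows)
    features = sorted({name for row in rows for name in row})
    kept = []
    for name in features:
        missing = sum(1 for row in rows if row.get(name) is None)
        if missing / n <= max_missing:
            kept.append(name)
    return kept

# Toy data: "age" has 10% missing (kept), "income" has 20% missing (dropped).
toy_rows = [{"age": 60 + i, "income": 1} for i in range(10)]
toy_rows[0]["age"] = None
toy_rows[1]["income"] = None
toy_rows[2]["income"] = None
kept_features = drop_sparse_features(toy_rows)
```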
The diagnosis of myocardial infarction, hypertension, and stroke at the time of study inclusion was further defined based on the timing of these diagnoses relative to participant enrollment. To assess genetic risk for dementia, this study used a multigene genetic risk score calculation method, which involves a weighted assessment of single nucleotide polymorphisms (SNPs). To minimize the risk of false-positive genetic risk scores, newly identified SNPs associated with AD in the UK Biobank database were not included in the genetic score. Instead, 29 SNP loci strongly associated with AD, as identified in previous genome-wide association studies (GWAS), were selected for this study (Supplementary material Table S1) [16,17,18]. Using the PRS calculation method published in a previous study [19], we calculated each participant's dementia PRS based on their SNPs and the corresponding weights (β coefficients) derived from the GWAS results [17]. Additionally, APOE genotypes were determined using the combined variants of rs429358 and rs7412. Participants carrying at least one APOE (Ɛ4) allele were classified as APOE (Ɛ4) carriers. All candidate features are listed in Supplementary material Table S2.
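A weighted PRS and APOE ε4 carrier status can be derived along the following lines. This is a hedged sketch: the SNP identifiers and β weights in the example are hypothetical, genotypes are assumed to be reported on the forward strand (rs429358 T/C, rs7412 C/T), and the rare haplotype carrying both minor alleles is ignored, as is common practice.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2) using GWAS beta weights."""
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"missing dosages for: {sorted(missing)}")
    return sum(weights[snp] * dosages[snp] for snp in weights)

def apoe_genotype(rs429358, rs7412):
    """Derive the APOE genotype from two unphased genotypes such as 'TC' and 'CC'."""
    if len(rs429358) != 2 or len(rs7412) != 2:
        raise ValueError("each genotype must contain exactly two alleles")
    e4 = rs429358.upper().count("C")  # C at rs429358 marks an e4 allele
    e2 = rs7412.upper().count("T")    # T at rs7412 marks an e2 allele
    e3 = 2 - e4 - e2                  # remaining alleles are e3
    if e3 < 0:
        raise ValueError("genotype combination is not resolvable")
    return "/".join(["e2"] * e2 + ["e3"] * e3 + ["e4"] * e4)

def is_e4_carrier(rs429358, rs7412):
    """True if the participant carries at least one APOE e4 allele."""
    return "e4" in apoe_genotype(rs429358, rs7412)

# Hypothetical SNP names and weights, for illustration only.
example_prs = polygenic_risk_score({"snp_1": 2, "snp_2": 1}, {"snp_1": 0.5, "snp_2": -0.2})
```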
Feature filtering and missing value imputation
To identify the subset of predictor features contributing most to model performance, we used the Gradient Boosting Machine (GBM) algorithm with default hyperparameters to calculate the importance of each feature, following a feature importance filtering method [20]. At this stage, the data were not preprocessed (no multiple imputation or normalization); categorical features were only converted to factors. As shown in Fig. 2, the top 30 features, ranked by their importance to the GBM model, were selected for the study. Although features with more than 10% missing values were excluded, a small proportion of values remained missing for some participants. The missing completely at random (MCAR) test gave P < 0.001, so the MCAR assumption was rejected. Assuming the missing data were missing at random (MAR), we imputed the missing values using Random Forest (RF) multiple imputation (using the mice package) [21]. This choice was made because the RF imputation algorithm (1) is well-suited for data missing at random, (2) can effectively handle both continuous and categorical variables, and most importantly, (3) does not require parametric forms and can effectively account for any non-linear relationships, complex interactions, and high dimensionality in the imputation model [22].
Model development and evaluation
Feature engineering
Based on the DeepSurv and DeepHit neural network models [12, 23], feature engineering is required before formal data analysis. We performed one-hot encoding to transform categorical data and applied feature scaling (standardization) to normalize the data [24, 25]. This process helps prevent certain features from disproportionately influencing the model and improves its stability and convergence speed [26].
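The two preprocessing steps described above can be sketched with the Python standard library. The function names and example values are illustrative assumptions; the study itself does not specify its implementation at this level of detail.

```python
import statistics

def one_hot(values):
    """Encode a categorical column as 0/1 indicator columns (one per category)."""
    categories = sorted(set(values))
    encoded = [[1 if value == category else 0 for category in categories]
               for value in values]
    return encoded, categories

def standardize(values):
    """Rescale a numeric column to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # constant column carries no information
    return [(value - mean) / sd for value in values]

encoded, categories = one_hot(["own", "rent", "own"])
z_scores = standardize([1.0, 2.0, 3.0])
```

In a cross-validated setting, the mean and standard deviation would normally be estimated on the training portion only and then applied to the validation and test portions, so that no information leaks across splits.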
Hyperparameter tuning
To optimize the performance and effectiveness of the neural network model, we fine-tuned the parameters of the neural network structure and training process. Hyperparameter tuning was performed, focusing on key parameters such as dropout, weight decay, learning rate, and the number of nodes per layer [23]. To obtain the optimal hyperparameter combination, this study employed an automatic tuner (mlr3 package), which automates the tuning process based on monitored metrics, specifically Harrell's C-index.
The hyperparameter tuning space is as follows: Dropout is primarily used to address the overfitting problem in neural network models by randomly "dropping" a fraction of nodes during training, preventing them from participating in updates [27]. The value of dropout typically ranges from 0 to 1. In this study, we set the dropout rate between 0 and 0.5. Weight decay is a regularization technique (L2 regularization) designed to reduce the impact of data noise and model variance by encouraging smaller weight values. This helps mitigate overfitting [28]. In this study, the weight decay parameter was set between 0 and 0.5. Learning rate controls the step size for parameter updates in each iteration of training [29]. A well-chosen learning rate ensures effective training, avoiding slow convergence (if too small) or instability (if too large). The learning rate was adjusted within a range of 0 to 1 in this study. Number of nodes per layer refers to the number of units in each hidden layer of the neural network [30]. In this study, the range for the number of nodes was set between 1 and 32.
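The search space described above can be expressed as a small sampling routine. This is a sketch under stated assumptions: the dictionary keys are hypothetical names, and uniform sampling within each range is assumed because the text does not state the sampling distribution.

```python
import random

# Continuous ranges taken from the text; the key names are illustrative.
CONTINUOUS_RANGES = {
    "dropout": (0.0, 0.5),
    "weight_decay": (0.0, 0.5),
    "learning_rate": (0.0, 1.0),
}
NODES_RANGE = (1, 32)  # number of units per hidden layer

def sample_configuration(rng):
    """Draw one hyperparameter configuration uniformly from the search space."""
    config = {name: rng.uniform(low, high)
              for name, (low, high) in CONTINUOUS_RANGES.items()}
    config["nodes_per_layer"] = rng.randint(*NODES_RANGE)  # inclusive bounds
    return config

rng = random.Random(0)
candidates = [sample_configuration(rng) for _ in range(5)]
```

In practice, learning rates are often sampled on a logarithmic scale, since useful values tend to span several orders of magnitude; that would be a deviation from the uniform assumption made here.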
Hyperparameter tuning
In the hyperparameter tuning process, this study employs a random search strategy with an iteration termination condition set to 60 iterations. Additionally, since both hyperparameter selection and model performance estimation are performed on the same dataset, traditional K-fold cross-validation may result in an overly optimistic evaluation of model performance. To overcome this issue, nested cross-validation techniques are used to address common problems related to overfitting and data bias [31]. In our study, threefold cross-validation was applied to generate different inner training and validation sets (inner resampling), while fivefold cross-validation was used to create different non-test and test sets (outer resampling). This nested sample resampling approach allows for a more accurate evaluation of model performance and facilitates hyperparameter tuning. Furthermore, an early stopping strategy was employed to halt training when model performance stopped improving, preventing overfitting. The Adam optimizer was used to achieve optimal results in a short time [32].
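The nested resampling scheme can be sketched as follows: the inner three-fold loop selects a configuration using only the non-test data, and the outer five-fold loop scores that choice on data never seen during tuning. The callable `fit_and_score` is a hypothetical stand-in for training a model and returning a metric such as Harrell's C-index; it is not part of any specific library.

```python
import random

def k_fold_indices(indices, k, rng):
    """Shuffle the indices and split them into k roughly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def nested_cv(n_samples, fit_and_score, candidate_configs,
              outer_k=5, inner_k=3, seed=0):
    """Return one outer-test score per outer fold.

    fit_and_score(config, train_idx, eval_idx) -> float (higher is better).
    """
    rng = random.Random(seed)
    outer_folds = k_fold_indices(range(n_samples), outer_k, rng)
    outer_scores = []
    for o, test_idx in enumerate(outer_folds):
        non_test = [i for f, fold in enumerate(outer_folds) if f != o for i in fold]
        inner_folds = k_fold_indices(non_test, inner_k, rng)

        def inner_score(config):
            # Mean validation score across the inner folds.
            scores = []
            for v, val_idx in enumerate(inner_folds):
                train_idx = [i for f, fold in enumerate(inner_folds)
                             if f != v for i in fold]
                scores.append(fit_and_score(config, train_idx, val_idx))
            return sum(scores) / len(scores)

        best_config = max(candidate_configs, key=inner_score)
        # Refit on all non-test data and evaluate once on the held-out fold.
        outer_scores.append(fit_and_score(best_config, non_test, test_idx))
    return outer_scores
```

With a random search, `candidate_configs` would be the sampled configurations (60 in this study), and the mean of the returned outer scores gives the performance estimate.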
Comparison of dementia prediction models
To objectively compare the performance of the DeepSurv, DeepHit, Kaplan–Meier estimator, and CoxPH models built on the UK Biobank dataset, a multi-model comparison approach using benchmarking was employed in this study (https://mlr3benchmark.mlr-org.com/index.html). Harrell's C-index: Also known as the concordance index (C-index), this measure assesses the discriminatory power of a model, i.e., its ability to rank participants correctly by risk [33]. D-Calibration Survival Measure: This measure indicates whether the probability estimates produced by the model's predictions are meaningful. It evaluates the calibration of the models by calculating their calibration level and comparing them to determine which model is better calibrated [34]. The calibration statistic s is defined as \(s=\frac{B}{n}\sum_{i=1}^{B}{\left(P_{i}-\frac{n}{B}\right)}^{2}\), where B denotes the number of 'buckets' that divide [0, 1] into equal intervals, n denotes the number of predictions, and \(P_{i}\) denotes the number of observations whose prediction falls in the ith interval ([0, 1/B), [1/B, 2/B), …, [(B − 1)/B, 1]). In this method, the degree of calibration is assessed by calculating the statistic s for each model. If \(s_{i} < s_{j}\), model i is considered better calibrated than model j; conversely, if \(s_{j} < s_{i}\), model j is considered better calibrated than model i.
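Both evaluation measures can be illustrated with short reference implementations. These are simplified sketches with hypothetical function names: the D-calibration function treats every observation as uncensored and takes, for each participant, the predicted survival probability at the observed event time; the choice of 10 buckets is illustrative.

```python
def harrell_c_index(times, events, risk_scores):
    """Share of comparable pairs in which the earlier event has the higher risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def d_calibration_statistic(survival_probs_at_event, n_buckets=10):
    """s = (B / n) * sum_i (P_i - n / B)^2, with P_i the count in bucket i."""
    n = len(survival_probs_at_event)
    counts = [0] * n_buckets
    for p in survival_probs_at_event:
        index = min(int(p * n_buckets), n_buckets - 1)  # p == 1 goes in the last bucket
        counts[index] += 1
    expected = n / n_buckets
    return (n_buckets / n) * sum((count - expected) ** 2 for count in counts)
```

A model that assigns the same risk to every participant, as the Kaplan–Meier estimator does, obtains a C-index of 0.5 under this definition, and predictions spread evenly across the buckets give s = 0.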
Development and interpretation of the DeepSurv dementia prediction model
In this study, the performance of the DeepSurv, DeepHit, Kaplan–Meier estimator, and CoxPH models was compared. The final selection of the DeepSurv model as the optimal choice for building the dementia prediction model, named the DeepSurv Dementia Prediction Model, was based on the results of two assessment metrics: Harrell's C-index and the D-Calibration Survival Measure.
The global interpretation of the DeepSurv Dementia Prediction Model is as follows: Brier score (BS): The BS measures the accuracy of probabilistic predictions, serving as an indicator of the model's calibration. It assesses the discrepancy between the predicted probabilities and the actual outcomes. The BS is one of the most commonly used evaluation metrics for this purpose [35]. Consequently, the importance of a feature in the model can be assessed by examining its impact on the model's calibration level. Partial dependence plot (PDP): A PDP is a tool used in machine learning for model interpretation. It illustrates the marginal effect of a particular feature on the prediction of a model, while accounting for the effects of all other features. PDPs offer visualizations of the relationship between the outcome and the feature, clearly showing whether the relationship is linear, monotonic, or more complex [36, 37].
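The two interpretation tools can be sketched on a toy model. The example below is a simplification under stated assumptions: it uses the plain Brier score for a single binary outcome, whereas a survival setting would use a time-dependent, censoring-adjusted version; feature importance is measured as the mean increase in Brier score after permuting a feature; and the toy prediction function and feature names are hypothetical.

```python
import random

def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(predicted_probs, outcomes)) / len(outcomes)

def permutation_importance(predict, rows, outcomes, feature, n_repeats=10, seed=0):
    """Mean increase in Brier score after shuffling one feature across rows."""
    rng = random.Random(seed)
    baseline = brier_score([predict(row) for row in rows], outcomes)
    increases = []
    for _ in range(n_repeats):
        shuffled = [row[feature] for row in rows]
        rng.shuffle(shuffled)
        permuted = [{**row, feature: value} for row, value in zip(rows, shuffled)]
        permuted_score = brier_score([predict(row) for row in permuted], outcomes)
        increases.append(permuted_score - baseline)
    return sum(increases) / n_repeats

def partial_dependence(predict, rows, feature, grid):
    """Average prediction when the feature is set to each grid value for all rows."""
    return [sum(predict({**row, feature: value}) for row in rows) / len(rows)
            for value in grid]

def toy_predict(row):
    """Hypothetical model: risk rises linearly with age and ignores 'noise'."""
    return min(1.0, max(0.0, 0.02 * (row["age"] - 60)))

toy_rows = [{"age": age, "noise": k} for k, age in enumerate([60, 70, 80, 90, 100])]
toy_outcomes = [0, 0, 0, 1, 1]
age_importance = permutation_importance(toy_predict, toy_rows, toy_outcomes, "age")
noise_importance = permutation_importance(toy_predict, toy_rows, toy_outcomes, "noise")
age_curve = partial_dependence(toy_predict, toy_rows, "age", [60, 80, 100])
```

A feature the model ignores has zero importance because permuting it leaves every prediction unchanged, and the partial dependence curve for age rises monotonically for this toy model.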
Creation and implementation of the UKB-DementiaPre tool
To facilitate the wider dissemination of the DeepSurv dementia prediction model developed in this study, we chose to deploy the model on an AliCloud server and named it the UKB-DementiaPre Tool. This will allow a larger number of individuals to utilize the application, benefiting clinical practices and providing a foundation for predicting dementia risk in the population. The model can be accessed via the provided link or QR code associated with the UKB-DementiaPre Tool.
Statistical analyses
The baseline features (the top 30 most important features) of all participants included in the study were statistically analyzed. The Kolmogorov–Smirnov test, suitable for testing the normality of large sample-size data [38], was employed to assess the data distribution of the two groups: participants diagnosed with dementia and those without a dementia diagnosis. Normally distributed data exhibit symmetry, where the mean effectively represents the central tendency, and variance measures the degree of dispersion. Conversely, non-normally distributed data tend to display skewness or extreme values; in such cases, the median is a more robust measure of central tendency, less affected by outliers, while quartiles provide a reliable measure of dispersion [39]. As a result, numerical data for features are expressed as mean ± standard deviation for normally distributed variables and as median (interquartile range, IQR) for non-normally distributed variables. Categorical data are presented as frequencies and proportions. Additionally, the chi-square test and Wilcoxon rank-sum test (using the epiDisplay package) were employed to compare features between participants with and without a dementia diagnosis during the follow-up period. All statistical analyses were conducted using R software (version 4.1.0). All tests were two-sided, and results were considered statistically significant when P < 0.05 [40].
Results
Baseline characteristics
A total of 6,914 participants were newly diagnosed with dementia during the follow-up study, which included 41,484 UK Biobank participants. The mean follow-up period was 12.6 years. A statistical summary of the baseline data is presented in Table 1. Participants with new-onset dementia were found to have a higher proportion of APOE (Ɛ4) carriers at baseline, were older at enrollment, and had a higher proportion receiving attendance, disability, or mobility allowances. They also had a higher proportion of individuals with a long-standing illness, disability, or infirmity. Additionally, these participants showed higher proportions of a history of diabetes, a history of stroke, slower cognitive function-reaction times, a higher prevalence of family history of AD or dementia, Parkinson’s disease, and a higher PRS for dementia. They also had a lower average total household income before tax and a lower proportion of homeownership.
Performance evaluation of dementia prediction models
As illustrated in Fig. 3, each data point represents the outcome of a specific evaluation of the DeepSurv, DeepHit, Kaplan–Meier estimator, and CoxPH models, with five evaluations conducted for each model. A greater divergence in the distribution of the data points or a wider span of the confidence intervals for each model indicates a more pronounced discrepancy in the results of the five model performance assessments. Figure 3A shows that the mean Harrell's C-index values for the DeepSurv, DeepHit, CoxPH, and Kaplan–Meier models were 0.743, 0.633, 0.749, and 0.500, respectively. This suggests that, among the four models, the DeepSurv and CoxPH models demonstrated superior discriminatory ability in predicting the onset of dementia. Figure 3B presents the average D-Calibration Survival Measure for the DeepSurv, DeepHit, CoxPH, and Kaplan–Meier models, which were 6.014, 4408.086, 32274.743, and 1.508, respectively. These results indicate that the DeepSurv model exhibited superior calibration and provided more meaningful probability estimates for model predictions. In contrast, the CoxPH model showed inferior calibration in predicting dementia onset. Therefore, to achieve an optimal balance between discriminative ability and calibration, the DeepSurv model was ultimately selected as the most suitable for constructing the dementia prediction model in this study. Additionally, the optimal hyperparameters for the DeepSurv model were extracted, and the learner parameters were updated accordingly.
Global interpretation of the DeepSurv dementia prediction model
To assess the importance of features (predictor variables) in the DeepSurv dementia prediction model, the Brier score was initially employed in this study. A higher Brier score-based importance value indicates that the feature is more important to the model. As shown in Fig. 4, the final 30 features included in the DeepSurv model were ranked based on their relative importance. The results revealed that APOE (Ɛ4) carriage, age, cognitive function-reaction time, history of diabetes mellitus, family history (mother's disease), and consumption of eggs, dairy, wheat, and sugar, as well as the percentage of right leg fat, were among the most influential variables in the model.
To further explore the relationship between individual features and dementia over time, a PDP was employed. As illustrated in Fig. 5, the following key observations were made: 1) Age: Older participants had a higher risk of developing dementia. 2) Genetic factors: APOE (Ɛ4) carriers exhibited a higher PRS for dementia. Participants with these genetic features were at increased risk if their mothers had AD/dementia, chronic bronchitis/emphysema, or major depression. Chronic illnesses and underlying physical conditions were also identified as risk factors. 3) Medical history and physical factors: The following factors were identified to be associated with an increased risk of dementia: history of diabetes, history of stroke, lower right leg fat ratio or left leg fat mass, receipt of attendance/disability/mobility allowance, long-term illness with disability or infirmity, living in sheltered accommodation or care homes, a higher number of self-reported non-cancer illnesses, more falls in the past year, use of prescription medication, lower peak expiratory flow rate, and doctor's diagnosis of other serious illness/disability. The presence of these features indicated a higher risk of dementia. 4) Economic factors: Lower average gross pre-tax household income, renting without owning a home, and being unemployed or unable to work due to illness or disability were associated with a higher risk of dementia. Additionally, a longer cognitive function-response time was linked to an increased dementia risk. Conversely, participants who frequently or mostly drove at high speeds on the motorway exhibited a lower risk of dementia. 5) Dietary and lifestyle features: Participants who consumed eggs (or egg-containing products), dairy products, wheat, and sugar, or who did not consume sugar or sugar-containing foods or drinks, exhibited a lower risk of dementia. 
Conversely, participants with longer daily TV time or engaged in physically demanding manual work (e.g., carpentry, digging) were found to have a higher risk of dementia. Interestingly, participants who consumed more alcohol 10 years ago (compared to their current intake) had a lower risk of dementia. No significant association was found between changes in alcohol intake (more, about the same, or less) and dementia risk. 6) Psychiatric factors: Participants who experienced no moodiness, nervousness, or lack of interest over the past two weeks were found to have a lower risk of dementia.
Deployment of the UKB-DementiaPre tool
To facilitate the clinical translation of the established predictive models, the UKB-DementiaPre Tool can be accessed via the following link: http://8.137.113.161:3838/UKBDementiaPre/ or by scanning the UKB-DementiaPre Tool QR code (Supplementary Material, Figure S1). The layout of the UKB-DementiaPre Tool page is shown in Fig. 6. Additionally, a concise overview of how to use the UKB-DementiaPre Tool is provided in Supplementary Material, Introduction 1.
Discussion
The primary findings of this study are as follows: Neural network survival analysis represents the state-of-the-art technique for survival analysis. In this study, a dynamic, individualized dementia risk prediction model for individuals aged 60 and above was developed using the DeepSurv model. The model was based on data from 41,484 participants with a mean follow-up of 12.6 years from the UK Biobank, incorporating both genetic and clinical factors. To make the DeepSurv dementia prediction model clinically applicable, we developed and deployed it on an AliCloud server, where it can be accessed via the provided link or QR code. Additionally, this study identified the top 30 features out of 213 that were most important to the model. A global interpretation of these 30 features was provided, offering deeper insights into the relationship between these features and dementia. This understanding is expected to aid in the early identification and prevention of dementia risk.
Survival analysis is a common method for analyzing medical time-to-event data. It is primarily used to examine statistical patterns of events (such as recurrence, death, or cure) over time in longitudinal studies. Through survival analysis, potential sensitive or risk factors can be further identified [41, 42]. The CoxPH model and the Kaplan–Meier estimator are traditional survival analysis models, while the DeepSurv and DeepHit models are survival analysis models based on deep learning techniques [43]. DeepSurv is a nonlinear version of the CoxPH model that leverages deep learning techniques. It is a neural network designed to predict the effect of patient covariates on their hazard rates by learning network weights [44]. Moreover, the DeepSurv model integrates deep learning concepts with the CoxPH model, expressing the sensitive factor risk function as a multilayer perceptron and incorporating additional nonlinear activation functions and techniques such as dropout [12]. In this study, the mean Harrell's C-index values for the DeepSurv, DeepHit, CoxPH, and Kaplan–Meier models were 0.743, 0.633, 0.749, and 0.500, respectively. The average D-Calibration Survival Measures for the four models were 6.014, 4408.086, 32274.743, and 1.508, respectively. These results demonstrated that the DeepSurv model outperformed the other models in balancing discriminative ability and calibration. Standardization and Min–Max scaling are two commonly used scaling methods in machine learning. Standardization transforms the data into a distribution with a mean of 0 and a standard deviation of 1, which helps prevent certain features from disproportionately influencing the model. It also improves the stability and convergence speed of the model [26]. Min–Max scaling scales the data to a specified range (usually [0, 1]), transforming each feature's minimum value to 0 and maximum value to 1.
Although Min–Max scaling ensures all features are on the same scale, it does not handle outliers well [45]. In this study, we chose Standardization based on the meaning and characteristics of the data in clinical contexts. We acknowledge that other scaling methods could potentially improve model performance; however, due to the time-consuming nature and high computational cost of model training and comparison, we did not further evaluate the impact of different scaling methods on model performance. Future research should explore this aspect further. The efficacy and functionality of neural networks depend not only on the network configuration and parameters established during training but also on the calibration of hyperparameters [46]. Commonly used hyperparameters in neural networks include dropout, weight decay, learning rate, and the number of nodes per layer [23, 47]. For model training in this study, we employed three-fold cross-validation to generate distinct inner training and validation sets (inner layer resampling), and five-fold cross-validation for different non-test and test sets (outer layer resampling). This approach allowed for a more precise evaluation of model performance and facilitated hyperparameter tuning.
In the context of the established DeepSurv dementia prediction model, interpreting the model features is crucial. To this end, this study employed the BS to assess feature importance. Additionally, PDP was used to provide further clarity regarding the nature of the relationship between features and outcomes, whether it is linear, monotonic, or more complex. Age is widely recognized as the most significant risk factor for dementia [48]. Dementia, particularly AD, results from a combination of genetic and environmental factors [49]. The PRS, which aggregates the effects of numerous disease-related genetic variants into a single score, has shown predictive value for a range of prevalent conditions, including dementia [50, 51]. The study found that participants whose mothers had AD exhibited an elevated risk of developing dementia, a finding consistent with previous research [52,53,54]. Additionally, a strong correlation exists between chronic illnesses, physical frailty, and the risk of dementia [55, 56]. A history of diabetes was identified as a significant risk factor for dementia among the participants [57, 58]. Moreover, a history of stroke was found to be a significant risk factor for dementia. Post-stroke cognitive impairment, which occurs between three and six months after a stroke, is characterized by specific regional cognitive deficits related to the location of stroke damage [59]. The results of this study further confirmed a significant correlation between poor physical health and an elevated risk of dementia. These findings are consistent with previous studies, which have shown that poor physical health and the presence of multiple health conditions are associated with an increased risk of dementia [60]. An elevated self-reported number of non-cancerous diseases and a higher prevalence of other severe conditions or disabilities diagnosed by a medical professional were significantly associated with an increased risk of dementia. 
Prior research has established that the presence of multiple diseases is associated with an elevated risk of developing dementia, AD, and vascular dementia (VD). Furthermore, a robust correlation exists between economic status and the likelihood of developing dementia [61]. The study also demonstrated that an increase in reaction time variability or a lengthening of the mean reaction time is associated with an elevated risk of developing dementia within the subsequent four years [62]. A more detailed discussion of the possible mechanisms by which the features in this dementia prediction model are associated with dementia can be found in Supplementary material, Discussion 1.
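One common way to turn the Brier score into a feature importance measure, as discussed earlier in this section, is permutation importance: shuffle one feature at a time and record how much the score worsens. The source does not spell out the exact procedure, so the sketch below is an assumption. It also simplifies the setting to a binary outcome at a fixed horizon with no censoring, whereas a survival Brier score would typically need to account for censored follow-up. The model `toy_model`, its weights, and the synthetic data are hypothetical.

```python
import math
import random


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def toy_model(row):
    """Hypothetical fixed risk model: only the first two features matter."""
    logit = -1.0 + 2.0 * row[0] + 0.5 * row[1] + 0.0 * row[2]
    return 1.0 / (1.0 + math.exp(-logit))


def permutation_importance(model, rows, outcomes, n_features, seed=0):
    """Increase in Brier score after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = brier_score([model(r) for r in rows], outcomes)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the outcome
        permuted = [r[:j] + [column[i]] + r[j + 1:] for i, r in enumerate(rows)]
        score = brier_score([model(r) for r in permuted], outcomes)
        importances.append(score - baseline)
    return importances


# Synthetic data (not UK Biobank): three standard-normal features per row,
# with outcomes drawn from the toy model's own predicted probabilities.
rng = random.Random(42)
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(2000)]
outcomes = [1 if rng.random() < toy_model(r) else 0 for r in rows]

importances = permutation_importance(toy_model, rows, outcomes, n_features=3)
print([round(v, 4) for v in importances])
```

A larger increase in the Brier score indicates a feature the model relies on more heavily. The third feature has zero weight in the toy model, so shuffling it leaves the predictions, and therefore the score, unchanged.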
To facilitate the clinical utilization of the DeepSurv dementia prediction model developed in this study, we have deployed the trained model on Alibaba Cloud servers. It can be accessed via a link or QR code. The development and deployment of this application will support its use in clinical settings and serve as a valuable tool for predicting dementia risk in the population.
Strengths and limitations
Main strengths: 1) This study is based on data from 41,484 participants in the UK Biobank, with an average follow-up time of 12.6 years. By combining rich genetic and clinical data, the study employs the most advanced survival analysis techniques to establish a dynamic and personalized dementia risk prediction model for individuals aged 60 and above. 2) To facilitate clinical translation, the DeepSurv dementia prediction model developed in this study is deployed on Alibaba Cloud servers and can be accessed for free through a link or QR code. 3) This study identified the top 30 features from a total of 213 and provided a global explanation of the model, offering valuable insights for future research on the primary prevention of dementia and further advancements in dementia prevention.
Limitations of this study: 1) The UK Biobank cohort predominantly represents European populations, mainly of white ethnicity. While most of the characteristics (predictor variables) included in the model are well-established factors influencing dementia risk, the DeepSurv dementia prediction model developed in this study may not be directly applicable to populations in other countries or regions. When used in different regions, some variables may need to be adjusted and further validated based on local demographics. 2) The age of the population included in this study was 60 years and above, which limits the applicability of the DeepSurv dementia prediction model to individuals outside this age range. 3) Although the UK Biobank’s dementia diagnoses are derived from hospital records and are updated dynamically, some participants may not have received regular or timely medical treatment. This limitation may affect the accuracy and generalizability of the model’s training data.
Conclusion
This study successfully developed and validated the DeepSurv dementia prediction model for individuals aged 60 and above by integrating genetic and clinical data. The model was then deployed on AliCloud servers to facilitate clinical translation. It is anticipated that this prediction model will provide more accurate treatment decision support in clinical practice and serve as a valuable reference for the primary prevention of dementia.
Data availability
The data underpinning our study's findings are accessible from the UK Biobank. However, due to a rigorous approval process, access to these data is restricted, and they are not publicly available.
Abbreviations
- AD: Alzheimer's disease
- BS: Brier Score
- C-index: Concordance index
- CoxPH: Cox proportional hazards
- DeepSurv: Deep Learning Survival Analysis
- GBM: Gradient Boosting Machine
- GWAS: Genome-Wide Association Studies
- IV: Instrumental Variable
- IVW: Inverse variance weighted
- LightGBM: Light Gradient Boosting Machine
- NN: Neural network
- PDP: Partial Dependence Plot
- PRS: Polygenic risk score
- SNPs: Single nucleotide polymorphisms
References
Fong TG, Inouye SK. The inter-relationship between delirium and dementia: the importance of delirium prevention. Nat Rev Neurol. 2022;18:579–96.
Heng X, Liu X, Li N, Lin J, Zhou X. Spatial disparity and factors associated with dementia mortality: A cross-sectional study in Zhejiang Province, China. Front Public Health. 2023;11:1100960.
Page A, Potter K, Clifford R, McLachlan A, Etherton-Beer C. Prescribing for Australians living with dementia: study protocol using the Delphi technique. BMJ Open. 2015;5:e008048.
Langa KM, Larson EB, Crimmins EM, Faul JD, Levine DA, Kabeto MU, et al. A Comparison of the Prevalence of Dementia in the United States in 2000 and 2012. JAMA Intern Med. 2017;177:51–8.
Wu YT, Fratiglioni L, Matthews FE, Lobo A, Breteler MM, Skoog I, et al. Dementia in western Europe: epidemiological evidence and implications for policy making. Lancet Neurol. 2016;15:116–24.
Tao M, Liu H, Cheng J, Yu C, Zhao L. Motor-Cognitive Interventions May Effectively Improve Cognitive Function in Older Adults with Mild Cognitive Impairment: A Randomized Controlled Trial. Behav Sci (Basel). 2023;13:737.
Walters K, Hardoon S, Petersen I, Iliffe S, Omar RZ, Nazareth I, et al. Predicting dementia risk in primary care: development and validation of the Dementia Risk Score using routinely collected data. BMC Med. 2016;14:6.
Park KM, Sung JM, Kim WJ, An SK, Namkoong K, Lee E, et al. Population-based dementia prediction model using Korean public health examination data: A cohort study. PLoS One. 2019;14:e0211957.
Wang L, Li P, Hou M, Zhang X, Cao X, Li H. Construction of a risk prediction model for Alzheimer’s disease in the elderly population. BMC Neurol. 2021;21:271.
Merkin A, Krishnamurthi R, Medvedev ON. Machine learning, artificial intelligence and the prediction of dementia. Curr Opin Psychiatr. 2022;35:123–9.
Li W, Lin S, He Y, Wang J, Pan Y. Deep learning survival model for colorectal cancer patients (DeepCRC) with Asian clinical data compared with different theories. Arch Med Sci. 2023;19:264–9.
Katzman JL, Shaham U, Cloninger A, Bates J, Jiang T, Kluger Y. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med Res Methodol. 2018;18:24.
Steinfeldt J, Buergel T, Loock L, Kittner P, Ruyoga G, Zu BJ, et al. Neural network-based integration of polygenic and clinical information: development and validation of a prediction model for 10-year risk of major adverse cardiac events in the UK Biobank cohort. Lancet Digit Health. 2022;4:e84-94.
Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12:e1001779.
Raisi-Estabragh Z, Petersen SE. Cardiovascular research highlights from the UK Biobank: opportunities and challenges. Cardiovasc Res. 2020;116:e12–5.
Marioni RE, Harris SE, Zhang Q, McRae AF, Hagenaars SP, Hill WD, et al. GWAS on family history of Alzheimer’s disease. Transl Psychiat. 2018;8:99.
Jansen IE, Savage JE, Watanabe K, Bryois J, Williams DM, Steinberg S, et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat Genet. 2019;51:404–13.
Leng Y, Ackley SF, Glymour MM, Yaffe K, Brenowitz WD. Genetic Risk of Alzheimer’s Disease and Sleep Duration in Non-Demented Elders. Ann Neurol. 2021;89:177–81.
Fan M, Sun D, Zhou T, Heianza Y, Lv J, Li L, et al. Sleep patterns, genetic susceptibility, and incident cardiovascular disease: a prospective study of 385 292 UK biobank participants. Eur Heart J. 2020;41:1182–9.
Sharma A, Verbeke W. Understanding importance of clinical biomarkers for diagnosis of anxiety disorders using machine learning models. PLoS One. 2021;16:e0251365.
WS Miceforest. Github. https://github.com/AnotherSamWilson/miceforest. 2021.
Wang Q, Hall GJ, Zhang Q, Comella S. Predicting implementation of response to intervention in math using elastic net logistic regression. Front Psychol. 2024;15:1410396.
Prasanna C, Realmuto J, Anderson A, Rombokas E, Klute G. Using Deep Learning Models to Predict Prosthetic Ankle Torque. Sensors (Basel). 2023;23:7712.
Huang D, Chen K, Song B, Wei Z, Su J, Coenen F, et al. Geographic encoding of transcripts enabled high-accuracy and isoform-aware deep learning of RNA methylation. Nucleic Acids Res. 2022;50:10290–310.
Kang IA, Njimbouom SN, Kim JD. Optimal Feature Selection-Based Dental Caries Prediction Model Using Machine Learning for Decision Support System. Bioengineering (Basel). 2023;10:245.
Liu Y, Fan L, Wang L. Urban virtual environment landscape design and system based on PSO-BP neural network. Sci Rep-UK. 2024;14:13747.
Rozet A, Kronish IM, Schwartz JE, Davidson KW. Using Machine Learning to Derive Just-In-Time and Personalized Predictors of Stress: Observational Study Bridging the Gap Between Nomothetic and Ideographic Approaches. J Med Internet Res. 2019;21:e12910.
Sanga P, Singh J, Dubey AK, Khanna NN, Laird JR, Faa G, et al. DermAI 1.0: A Robust, Generalized, and Novel Attention-Enabled Ensemble-Based Transfer Learning Paradigm for Multiclass Classification of Skin Lesion Images. Diagnostics. 2023;13:3159.
Yang W, Zhang X, Lei Q, Cheng X. Research on Longitudinal Active Collision Avoidance of Autonomous Emergency Braking Pedestrian System (AEB-P). Sensors (Basel). 2019;19:4671.
Nguyen TP, Cho MY. Insulator Leakage Current Prediction Using Hybrid of Particle Swarm Optimization and Gene Algorithm-Based Neural Network and Surface Spark Discharge Data. Comput Intel Neurosc. 2022;2022:6379141.
Jiao SJ, Liu LY, Liu Q. A Hybrid Deep Learning Model for Recognizing Actions of Distracted Drivers. Sensors (Basel). 2021;21:7424.
Kingma DP, Ba JL. Adam: A method for stochastic optimization. arXiv. 2014.
Harrell FJ, Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of medical tests. JAMA. 1982;247:2543–6.
Haider H, Hoehn B, Davis S, Greiner R. Effective Ways to Build and Evaluate Individual Survival Distributions. J Mach Learn Res. 2020;21:1–63.
Trigg LE, Lyons S, Mullan S. Risk factors for, and prediction of, exertional heat illness in Thoroughbred racehorses at British racecourses. Sci Rep-UK. 2023;13:3063.
Tsuzuki S, Fujitsuka N, Horiuchi K, Ijichi S, Gu Y, Fujitomo Y, et al. Factors associated with sufficient knowledge of antibiotics and antimicrobial resistance in the Japanese general population. Sci Rep-UK. 2020;10:3502.
Zhou Q, Soldat DJ. Creeping Bentgrass Yield Prediction With Machine Learning Models. Front Plant Sci. 2021;12:749854.
Salwa M, Islam S, Tasnim A, Al MM, Bhuiyan MR, Choudhury SR, et al. Health Literacy Among Non-Communicable Disease Service Seekers: A Nationwide Finding from Primary Health Care Settings of Bangladesh. Health Lit Res Pract. 2024;8:e12-20.
Zou X, Ren Y, Yang H, Zou M, Meng P, Zhang L, et al. Screening and staging of chronic obstructive pulmonary disease with deep learning based on chest X-ray images and clinical parameters. BMC Pulm Med. 2024;24:153.
Sayed HY, Ghaly RM, Mostafa AA, Hemeda MS. Cardiovascular effects and clinical outcomes in acute opioid toxicity: A case-control study from Port Said and Damietta Governorates, Egypt. Toxicol Rep. 2024;13:101756.
Johnson LL, Shih JH. CHAPTER 20 - An introduction to survival analysis. Academic Press; 2007. p. 273–82. https://doi.org/10.1016/B978-012369440-9/50024-4.
Bashiri A, Ghazisaeedi M, Safdari R, Shahmoradi L, Ehtesham H. Improving the Prediction of Survival in Cancer Patients by Using Machine Learning Techniques: Experience of Gene Expression Data: A Narrative Review. Iran J Public Health. 2017;46:165–72.
Feng J, Zhang H, Li F. Investigating the relevance of major signaling pathways in cancer survival using a biologically meaningful deep learning model. BMC Bioinformatics. 2021;22:47.
Chen JB, Yang HS, Moi SH, Chuang LY, Yang CH. Identification of mortality-risk-related missense variant for renal clear cell carcinoma using deep learning. Ther Adv Chronic Dis. 2021;12:1755284400.
Kaur G, Rana PS, Arora V. State-of-the-art techniques using pre-operative brain MRI scans for survival prediction of glioblastoma multiforme patients and future research directions. Clin Transl Imaging. 2022;10:355–89.
Surianarayanan C, Lawrence JJ, Chelliah PR, Prakash E, Hewage C. A Survey on Optimization Techniques for Edge Artificial Intelligence (AI). Sensors (Basel). 2023;23:1279.
Lin Y, Zhang W, Cao H, Li G, Du W. Classifying Breast Cancer Subtypes Using Deep Neural Networks Based on Multi-Omics Data. Genes (Basel). 2020;11:888.
Fayosse A, Nguyen DP, Dugravot A, Dumurgier J, Tabak AG, Kivimaki M, et al. Risk prediction models for dementia: role of age and cardiometabolic risk factors. BMC Med. 2020;18:107.
Xu W, Tan L, Wang HF, Jiang T, Tan MS, Tan L, et al. Meta-analysis of modifiable risk factors for Alzheimer’s disease. J Neurol Neurosur PS. 2015;86:1299–306.
Lambert SA, Abraham G, Inouye M. Towards clinical utility of polygenic risk scores. Hum Mol Genet. 2019;28:R133–42.
Chen H, Chen J, Cao Y, Sun Y, Huang L, Ji JS, et al. Sugary beverages and genetic risk in relation to brain structure and incident dementia: a prospective cohort study. Am J Clin Nutr. 2023;117:672–80.
Edland SD, Silverman JM, Peskind ER, Tsuang D, Wijsman E, Morris JC. Increased risk of dementia in mothers of Alzheimer’s disease cases: evidence for maternal inheritance. Neurology. 1996;47:254–6.
Gomez-Tortosa E, Barquero MS, Baron M, Sainz MJ, Manzano S, Payno M, et al. Variability of age at onset in siblings with familial Alzheimer disease. Arch Neurol. 2007;64:1743–8.
Oh DJ, Bae JB, Lipnicki DM, Han JW, Sachdev PS, Kim TH, et al. Parental history of dementia and the risk of dementia: A cross-sectional analysis of a global collaborative study. Psychiat Clin Neuros. 2023;77:449–56.
Shang X, Roccati E, Zhu Z, Kiburg K, Wang W, Huang Y, et al. Leading mediators of sex differences in the incidence of dementia in community-dwelling adults in the UK Biobank: a retrospective cohort study. Alzheimers Res Ther. 2023;15:7.
Zhang JJ, Wu ZX, Tan W, Liu D, Cheng GR, Xu L, et al. Associations among multidomain lifestyles, chronic diseases, and dementia in older adults: a cross-sectional analysis of a cohort study. Front Aging Neurosci. 2023;15:1200671.
Ninomiya T. Diabetes mellitus and dementia. Curr Diabetes Rep. 2014;14:487.
Chatterjee S, Peters SA, Woodward M, Mejia AS, Batty GD, Beckett N, et al. Type 2 Diabetes as a Risk Factor for Dementia in Women Compared With Men: A Pooled Analysis of 2.3 Million People Comprising More Than 100,000 Cases of Dementia. Diabetes Care. 2016;39:300–7.
Rost NS, Brodtmann A, Pase MP, van Veluw SJ, Biffi A, Duering M, et al. Post-Stroke Cognitive Impairment and Dementia. Circ Res. 2022;130:1252–71.
Minami Y, Tsuji I, Fukao A, Hisamichi S, Asano H, Sato M, et al. Physical status and dementia risk: a three-year prospective study in urban Japan. Int J Soc Psychiatr. 1995;41:47–54.
Cooper C, Lodwick R, Walters K, Raine R, Manthorpe J, Iliffe S, et al. Inequalities in receipt of mental and physical healthcare in people with dementia in the UK. Age Ageing. 2017;46:393–400.
Kochan NA, Bunce D, Pont S, Crawford JD, Brodaty H, Sachdev PS. Reaction Time Measures Predict Incident Dementia in Community-Living Older Adults: The Sydney Memory and Ageing Study. Am J Geriat Psychiat. 2016;24:221–31.
Funding
This study was funded by the Guangdong Provincial Key Laboratory of Traditional Chinese Medicine Informatization (Grant No. 2021B1212040007) and the Special Projects for Scientific and Technological Research in Chinese Medicine and Ethnomedicine (Grant No. QZYY-2024-035).
Author information
Contributions
Jun Lyu, Yitong Ling, and Qing Liu designed the research. Shiqi Yuan, Qing Liu, and Xiaxuan Huang wrote the first draft of the manuscript. Shiqi Yuan, Yitong Ling, Xiaxuan Huang, Shanyuan Tan, and Zihong Bai participated in data analysis. Juan Yu, Fazhen Lei, Huan Le, Qingqing Ye, Xiaoxue Peng, and Juying Yang provided comments and revisions to the manuscript. All authors reviewed the manuscript.
Ethics declarations
Ethics approval and consent to participate
All procedures in the UK Biobank study were conducted in accordance with ethical standards at both the institutional and national levels, as well as the Helsinki Declaration of 1975 (revised in 2008) (5). All participants provided written informed consent and were enrolled in the study only after approval from the Northwest Multicenter Research Ethics Committee (11/NW/0382).
Consent for publication
All authors gave their consent for the publication of this article.
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yuan, S., Liu, Q., Huang, X. et al. Development of an individualized dementia risk prediction model using deep learning survival analysis incorporating genetic and environmental factors. Alz Res Therapy 16, 278 (2024). https://doi.org/10.1186/s13195-024-01663-w
DOI: https://doi.org/10.1186/s13195-024-01663-w