- Research
- Open access
Variables associated with cognitive function: an exposome-wide and mendelian randomization analysis
Alzheimer's Research & Therapy volume 17, Article number: 13 (2025)
Abstract
Background
Evidence indicates that cognitive function is influenced by potential environmental factors. We aimed to determine the variables influencing cognitive function.
Methods
Our study included 164,463 non-demented adults (89,644 [54.51%] female; mean [SD] age, 56.69 [8.14] years) from the UK Biobank who completed four cognitive assessments at baseline. After a rigorous screening process, 364 variables were extracted for analysis. We performed univariate analyses to identify variables significantly associated with each cognitive function in equal-sized discovery and replication datasets. The variables identified in univariate analyses were then assessed in a multivariable model. For the variables identified in the multivariable model, we explored the associations with longitudinal cognitive decline. One- and two-sample Mendelian randomization (MR) analyses were conducted to confirm the genetic associations. Finally, the quality of the pooled evidence for the associations between variables and cognitive function was evaluated.
Results
252 variables (69%) exhibited significant associations with at least one cognitive function in the discovery dataset. Of these, 231 (92%) were successfully replicated. Subsequently, our multivariable analyses identified 41 variables that were significantly associated with at least one cognitive function, spanning categories such as education, socioeconomic status, lifestyle factors, body measurements, mental health, medical conditions, early life factors, and household characteristics. Among these 41 variables, 12 were associated with more than one cognitive domain and were also identified in all subgroup analyses. LASSO, ridge, and principal component analyses supported the robustness of the primary results. Moreover, 12 of the 41 variables were significantly associated with longitudinal cognitive decline. Furthermore, 22 were supported by one-sample MR analysis, and 5 were further confirmed by two-sample MR analysis. Additionally, the quality of the pooled evidence for the associations between 10 variables and cognitive function was rated as high. Based on these 10 identified variables, adopting a more favorable lifestyle was significantly associated with 38% and 34% decreased risks of dementia and Alzheimer’s disease (AD), respectively.
Conclusion
Overall, our study constructed an evidence database of variables associated with cognitive function, which could contribute to the prevention of cognitive impairment and dementia.
Introduction
Cognitive function encompasses a spectrum of abilities, including memory, attention, processing speed, spatial orientation, language, and problem-solving. These cognitive faculties are essential for maintaining independence and enhancing overall quality of life [1]. Both genetic [2, 3] and environmental [4, 5] variables exert a substantial influence on cognitive function. Notably, the rising global incidence of dementia—currently affecting over 55 million individuals and projected to reach 131 million by 2050—underscores the imperative to identify potentially modifiable lifestyle factors to formulate effective prevention strategies [6]. Unraveling the complex interactions among various variables and cognitive competencies is fundamental for developing interventions designed to enhance cognitive performance and prevent cognitive decline and dementia [7,8,9].
Previous hypothesis-driven research has identified some risk factors linked to cognition [4, 10, 11]. Livingston et al. recently reported 14 modifiable risk factors that could potentially prevent up to 45% of dementia cases, adding two new risks to the 12 reported in 2020 [5, 6]. However, numerous variables associated with cognitive function remain unidentified. Hypothesis testing is essential and has significantly advanced our understanding of the environmental epidemiology of cognitive function, but it has several inherent limitations. Single-exposure analysis fails to embrace the multiplicity and co-occurrence of exposures, which can result in biased effect sizes and type I errors [12]. This approach may also lead to selection bias and publication bias [13], thereby potentially underestimating the significance of certain factors. Therefore, systematic and agnostic approaches are required to identify genuine signals. The exposome-wide association study (EWAS) is a hypothesis-free strategy to systematically and agnostically explore the association between multiple exposures and a single outcome [12]. By simultaneously exploring multiple exposures, EWAS could reduce false-positive rates and bias [14]. Furthermore, EWAS can be used to validate previously established risk factors and identify novel factors. This approach has been previously applied to various outcomes such as HIV, diabetes, depression, and psychotic experiences [12, 15,16,17]. Nevertheless, there remains a dearth of systematic research on the influence of the exposome on cognitive function.
This study leveraged the UK Biobank dataset to perform an EWAS to identify variables significantly associated with the four cognitive domains, employing univariate and multivariate regression analyses in the whole population and in specific subgroups. Additionally, LASSO, ridge, and principal component analysis (PCA) were used to assess the robustness of the results. Moreover, we examined the associations between variables and longitudinal cognitive decline. Given the substantial genetic component of many lifestyle factors, we performed Mendelian randomization (MR) analysis to explore potential causal relationships [5]. Furthermore, we assessed the quality of evidence for identified variables, and for those with high-quality evidence, we examined their combined effects on dementia and Alzheimer’s disease (AD).
Materials and methods
Study participants
The study population is from the UK Biobank (UKB) database, which was initiated between 2006 and 2010 and recruited over 500,000 individuals at baseline, who were subsequently followed up [18]. At assessment centers, cognitive tests, a wide range of phenotypic, health-related, and other data were collected. Data on disease outcomes were obtained from hospital inpatient admissions, primary care records, and electronic health care records. Additionally, blood samples were collected for genetic analysis. Written consent was obtained from participants. The National Research Ethics Service Committee North West Multi-Centre Haydock gave ethical approval (MREC, https://www.ukbiobank.ac.uk/learn-more-about-uk-biobank/about-us/ethics). The current analyses were performed under UKB application number 19542.
Cognitive function assessments
During the baseline assessment, participants underwent four computer-administered cognitive tests. These bespoke tests were designed to assess cognitive functions across different domains and provide valuable insights into aging and pathology within a large population. The four brief cognitive assessments carried out in the UKB, namely the fluid intelligence test (FI), pairs matching test (PAM), reaction time test (RT), and prospective memory test (PM), correlated moderately to strongly with well-established, standard cognitive tests, supporting their reliability [19]. Out of the 164,621 individuals who completed all four cognitive tests, 158 participants diagnosed with dementia were excluded, resulting in a total of 164,463 individuals included in our analysis.
Fluid intelligence test (FI): a test that evaluates reasoning and problem-solving abilities, encompassing both fluid and crystallized intelligence. It serves as a representative measure of general intelligence within this battery. Scores range from 0 to 13, with higher numbers indicating better performance. The outcome is assessed based on the log-transformed total number of correct answers.
Pairs matching test (PAM): a test that focuses on episodic memory. The test displays 6 pairs of matching symbol cards for 5 s in a random pattern, and requires individuals to identify as many pairs as possible with the cards face down. The outcome evaluation is based on the log-transformed total number of errors made by individuals who completed the test. Higher numbers indicate poorer performance.
Reaction time test (RT): a test that assesses processing speed. Participants are asked to press a button as soon as they see two identical cards in each of the 12 rounds. The outcome evaluation is based on the log-transformed mean reaction time for correct responses. Higher numbers indicate slower performance.
Prospective memory test (PM): a test that focuses on event-based prospective memory. Before the test battery, participants were instructed that, when they were shown four coloured shapes and asked to touch the Blue Square at the end of the cognitive tests, they should instead touch the Orange Circle. The outcome was an incorrect response on the first attempt.
Cognitive decline was assessed in a subset of 16,547 to 50,287 individuals who participated in a second follow-up and were re-evaluated using the same cognitive tests during the period around 2014. The mean follow-up period was 9.36 years (SD = 2.11, range = 3.17 to 16.01 years). Cognitive decline was operationally defined as a deterioration in FI, PAM, or PM test results, or a slowing of reaction time by at least 100 milliseconds, compared with baseline measurements [20].
Risk factors
Variables from UKB with more than 20% missing values at baseline were excluded, as were methodological data fields, non-environmental factors, and variables with only one level. For variables with collinearity |r| > 0.9, we retained the one that was more important for cognitive function, easier to interpret, or more accurately measured. Finally, 364 variables were obtained, of which 258 were dichotomized and 106 were treated as continuous, with extreme values removed and values transformed into z-scores. Medical disease data defined by the International Classification of Diseases (ICD) were combined appropriately. All variables were subdivided into the following categories: (1) Education, (2) Socioeconomic status (SES), (3) Leisure activity, (4) Body measurement index, (5) Mental health, (6) Diet, (7) Sleep, (8) Physical activities, (9) Smoke, (10) Alcohol, (11) Sexual factors, (12) Early life factors, (13) Household, (14) Sun exposure, (15) Medical conditions, (16) Medical disease, (17) Medical examination, (18) Environments. The UKB field IDs and detailed information on the variables are supplied in Supplement 1 eFig. 1 and Supplement 2 eTable 1.
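The screening and standardization rules described above can be illustrated with a minimal sketch. It is not the authors' code (the analyses were run in R): the function names are hypothetical, missing values are represented as None by assumption, and the rule for removing extreme values is not specified in the text, so it is left out.

```python
from statistics import mean, stdev

def missing_fraction(values):
    """Share of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)

def keep_variable(values, max_missing=0.20):
    """Keep a variable unless more than 20% is missing or it has one level."""
    levels = {v for v in values if v is not None}
    return missing_fraction(values) <= max_missing and len(levels) > 1

def z_scores(values):
    """Standardize observed values to mean 0 and SD 1; missing stays None."""
    observed = [v for v in values if v is not None]
    mu, sd = mean(observed), stdev(observed)
    return [None if v is None else (v - mu) / sd for v in values]
```

For example, a variable with 1 missing value out of 5 (exactly 20%) would be retained under this reading of "more than 20%", whereas a variable taking a single value throughout would be dropped.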
Dementia and AD incidence
The dementia diagnoses were determined using the corresponding three-character ICD codes (F00-F03, G30), obtained from UKB health outcome datasets, which included the first instances of health outcomes (Category 1712, encompassing hospital records, death registrations and primary care data) and algorithmically defined outcomes (Category 42). Additionally, the diagnoses for AD were based on the ICD codes (F00, G30). To minimize reverse causality, we included only incident dementia cases that occurred more than three years after the baseline assessment, with follow-up until September 2023.
Statistical analyses
The statistical analyses were conducted using R version 4.0.4 and involved three main steps. Firstly, a comprehensive exposome-wide analysis was performed. We used linear regression to test the associations of the variables with three cognitive domains (fluid intelligence, pairs matching, and reaction time) and logistic regression to explore the associations with prospective memory. We first randomly divided the data into a discovery dataset and a validation dataset [12]. We conducted univariate analyses to identify variables that showed significant associations with cognitive function in both the discovery and validation datasets. A Bonferroni-corrected P value threshold (P < 1.37 × 10⁻⁴) was employed for the univariate analyses to rigorously control for false positives during the initial screening stage [12]. For the identified variables, multivariate analysis was performed to further explore the association with cognitive function, in which P values less than 0.05 after false discovery rate (FDR) correction were deemed statistically significant, thereby reducing false positives while balancing the risk of false negatives. All the above analyses were performed for the four cognitive tests, adjusting for age, gender, and APOE ε4. In addition, we repeated the above analyses in subgroups stratified by age (≥ 60 years or < 60 years), gender (female or male), APOE ε4 carrier status (carriers or non-carriers), SES (annual average total household income < £18,000 or ≥ £18,000), and education (college degree or not). To eliminate potential confounding due to the collinearity of certain factors with comorbidities, we also conducted the analyses in a subgroup of healthy individuals who were free of diabetes, cardiovascular (coronary heart disease, hypertension, disorders of lipoprotein metabolism, heart failure) and cerebrovascular (stroke) conditions, and chronic obstructive pulmonary disease.
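As an illustration of the two multiple-testing corrections named above, the following sketch computes a Bonferroni threshold and Benjamini-Hochberg adjusted P values. It is written in Python rather than the R used for the study, and the function names are hypothetical. The text does not state the Bonferroni denominator; dividing 0.05 by the 364 screened variables reproduces the reported threshold, so that is assumed here. Benjamini-Hochberg is also an assumption, since the text says only "FDR correction".

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

threshold = bonferroni_threshold(0.05, 364)  # about 1.37e-4
```

A variable would pass the univariate screen only if its P value fell below this threshold in both the discovery and the validation dataset; the adjusted P values from the second function would then be compared with 0.05 in the multivariable step.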
Moreover, to reduce the impact of multicollinearity on the results, LASSO [21], ridge regression [22], and PCA [21] were conducted with adjustment for age, gender, and APOE ε4, which could mitigate overfitting arising from collinearity and complexity among variables. Varimax orthogonal rotation was applied for PCA, and a scree plot was used to determine the number of principal components (PCs) to retain, with a cumulative variance contribution rate > 85%. Sensitivity analyses were also performed by (1) additionally adjusting for race and assessment center, (2) additionally adjusting for the above chronic diseases to control for collinearity of certain factors and avoid over-selection, and (3) imputing the missing data with a random forest approach using the “missRanger” package [23], to further validate the robustness of the findings.
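The cumulative-variance criterion for retaining PCs can be sketched as follows. This is only an illustration under assumptions: the function name is hypothetical, the eigenvalues are made up, and the authors combined this criterion with a scree plot, which is not reproduced here.

```python
def n_components_for_variance(eigenvalues, target=0.85):
    """Smallest number of leading PCs whose cumulative variance share
    exceeds the target (here, more than 85%)."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += ev
        if cumulative / total > target:
            return k
    return len(eigenvalues)

# Illustrative eigenvalues only, not values from the study.
n_keep = n_components_for_variance([5.0, 3.0, 1.0, 1.0])
```

With these illustrative eigenvalues, the first two components explain 80% of the variance and the first three explain 90%, so three components would be retained.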
Furthermore, for the variables significantly associated with cognitive function in the multivariable model, we explored the non-linear relationships between continuous variables and cognition using restricted cubic splines with four knots [24], and investigated the longitudinal associations of the variables with cognitive decline using logistic regression models. All of the above analyses were adjusted for age, gender, and APOE ε4. Additionally, we conducted a sensitivity analysis adjusting for varying follow-up durations to further validate the associations with cognitive decline.
Secondly, we conducted MR analyses to further examine the genetic associations. One-sample MR analyses were used to investigate the potential links between the significant variables identified in the EWAS analyses and cognitive function. The MRlap method was used in these analyses to address the potential bias arising from sample overlap. This recently developed and robust method generates corrected effect estimates [25, 26]. The summary statistics of both exposures and cognitive function tests were obtained from a genome-wide association study (GWAS) of the UKB population (http://www.nealelab.is/uk-biobank), with detailed protocols available (https://github.com/Nealelab/UK_Biobank_GWAS). SNPs classified as low-confidence variants were excluded from the analysis. For the one-sample MR analysis, we selected a stringent P-value threshold of 5.0 × 10⁻⁸ and a linkage disequilibrium (LD) clumping cut-off of 0.001 for the genetic instruments to minimize false positives. Subsequently, variables identified in one-sample analyses were subjected to further verification using two-sample MR analyses. For the two-sample MR, we used external GWAS data on dementia from the FinnGen study as the outcome (https://r8.finngen.fi/pheno/F5_DEMENTIA), and we opted for a more relaxed P-value threshold of 5 × 10⁻⁶ and an LD cut-off of 0.01 to enhance statistical power and capture a broader spectrum of genetic effects [27]. The inverse-variance weighted (IVW) method, in conjunction with the weighted median and MR-Egger methods, was used to produce the odds ratios (OR) and 95% CIs. Potential heterogeneity and horizontal pleiotropy were assessed by IVW Cochran’s Q test and the Egger intercept, and the IVW (random-effects model), MR-PRESSO, and leave-one-out (LOO) analyses were used to address between-variant heterogeneity and pleiotropic effects [17, 28].
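For readers unfamiliar with the IVW estimator, the sketch below shows the standard fixed-effect form, which combines per-SNP Wald ratios weighted by the inverse of their first-order variances. It is a simplified illustration, not the authors' pipeline: the function names and the summary statistics are hypothetical, and the random-effects variant, MRlap correction, weighted median, and MR-Egger methods mentioned above are not covered.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error from per-SNP
    exposure effects, outcome effects, and outcome standard errors."""
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    beta = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds estimate into an OR with a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Made-up summary statistics for two SNPs, for illustration only.
beta, se = ivw_estimate([0.1, 0.2], [-0.02, -0.04], [0.01, 0.01])
or_est, ci_low, ci_high = odds_ratio_with_ci(beta, se)
```

In this toy example both SNPs imply the same Wald ratio of -0.2, so the pooled log-odds estimate is -0.2 and the corresponding OR is below 1, which would be read as a protective association.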
Thirdly, we assessed the quality of the pooled evidence for the associations between identified variables and cognitive function. One point was assigned for each of the following conditions: (1) to be significantly associated with cognitive function in multivariate analysis and have the same direction of effect in univariate analysis; (2) to be significantly associated with more than one cognitive test in multivariate analysis; (3) to be significantly associated with longitudinal cognitive decline; (4) to have a genetic association with cognitive function in the one-sample MR analysis; (5) to have a genetic association with dementia in the two-sample MR analysis. The total score ranged from 1 to 5, with scores of 4–5, 2–3, and 1 indicating high-quality, medium-quality, and low-quality evidence, respectively. To investigate the joint effect of high-quality variables on dementia and AD, we computed a combined score. Each variable was dichotomized based on either its original binary status or, for continuous variables, the median value. A score of 1 was assigned if the variable was deemed beneficial to cognitive function, and a score of 0 otherwise. The composite score for each individual was subsequently derived by summing the scores of all high-quality variables. The longitudinal association between the combined score and the incidence of dementia and AD in the complete population was investigated using Cox proportional hazards models. We assessed the assumption of proportional hazards, and all analyses were adjusted for age, gender, and APOE ε4. P values less than 0.05 were considered statistically significant.
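The scoring scheme described above reduces to simple counting, as the following sketch shows. The function names are hypothetical and the inputs are illustrative; the sketch assumes each criterion or variable has already been evaluated as true or false.

```python
def evidence_quality(criteria_met):
    """Sum the five yes/no criteria and map the total to a quality label
    (4-5 high, 2-3 medium, otherwise low)."""
    score = sum(bool(c) for c in criteria_met)
    if score >= 4:
        return score, "high"
    if score >= 2:
        return score, "medium"
    return score, "low"

def composite_score(beneficial_flags):
    """Count how many high-quality variables are at their beneficial level
    for one individual (1 point each)."""
    return sum(1 for flag in beneficial_flags if flag)

# A variable meeting criteria 1-4 but not 5 would be rated high quality.
example = evidence_quality([True, True, True, True, False])
```

Because every variable entering this step already satisfies criterion 1, the lowest total that arises in practice is 1, which matches the 1 to 5 range given in the text.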
Results
Identification of variables in exposome-wide analysis
A total of 164,463 participants were included in our study, with a female proportion of 54.51% and a mean age of 56.69 (SD = 8.14, range = 39–70) years; 89.60% were of White ethnicity (eTable 1 in Supplement 1). Our univariate analysis revealed 231 variables that were significantly associated with at least one cognitive domain in both the discovery and validation datasets (Fig. 1, eData 1–4 in Supplement 2). Subsequently, we conducted multivariate analyses on these 231 variables and found that 46 variables remained significant (eData 5 in Supplement 2). Among them, 41 variables showed the same direction of effect on cognitive functions in univariate and multivariate analyses, spanning categories such as education, SES, lifestyle factors, body measurement index, mental health, medical conditions, early life factors, and household characteristics (Fig. 2, eFig. 2 in Supplement 1). Among the 41 variables identified, 12 were associated with multiple cognitive functions. Of these, five variables were significantly associated with better performance in all four cognitive domains: a college degree, greater right hand grip strength (HGS), more time spent using a computer, playing computer games, and driving faster than the motorway speed limit. Conversely, seven variables were significantly associated with poorer performance in at least two cognitive domains: lower household income, reduced peak expiratory flow (PEF), a tendency to be tense or highly strung, tea intake, abstaining from sugar or foods/drinks containing sugar, the age at which participants started to wear glasses or contact lenses, and increased time spent outdoors during winter. In stratified analyses, these 12 variables, which were associated with multiple cognitive domains in the whole population, were also significant in all subgroups (Fig. 2).
Variables significantly associated with cognitive functions in the whole population and subgroups. Associations between variables and cognitive functions were explored using multivariate analyses adjusting for age, sex, and APOE ε4 status. Green boxes represent variables associated with better cognitive function, while red boxes represent variables associated with poorer cognitive function. Highlighted variables were associated with multiple cognitive tests in the whole population and remained significant in all subgroups. †, fluid intelligence test; ‡, pairs matching test; §, reaction time test; ¶, prospective memory test. Edu, education; SES, socioeconomic status; Hous, household; Env, environment. NA, the variable was not examined within the specific population
LASSO and ridge regression analyses produced results similar to those of the above multivariate regression analysis, and the 12 variables significantly associated with multiple cognitive domains were verified by both analyses (Fig. 3). Moreover, the exposures in the total population were merged into 25 PCs, with 4 PCs (protective effect: PC1; deleterious effect: PC5, PC12, PC20) significantly linked to multiple cognitive domains, and 5 PCs (protective effect: PC15, PC18; deleterious effect: PC6, PC7, PC16) significantly correlated with FI after FDR correction (Table 1). PC1 was characterized by body measurement indices, including higher right and left HGS, PEF, forced vital capacity (FVC), sitting and standing height, basal metabolic rate, and whole-body impedance, among others; PC5 covered a broader set of variables, including SES, household characteristics, education, and mouth-, hearing-, and eye-related conditions, among others; PC12 was characterized by cardiovascular diseases (CVD) and related conditions, including coronary heart disease, hypertension, angina, dyslipidemia, and diabetes, among others; and PC20 included leisure activities, sun exposure, sexual factors, smoking, and alcohol intake, among others (eFig. 8 in Supplement 1). The primary variables identified by the above multivariate and LASSO analyses were mostly among those present in the significant PCs.
Variables significantly associated with cognitive functions in LASSO and ridge regression analyses. Associations between variables and cognitive functions were explored using multivariate, LASSO, and ridge regression analyses adjusting for age, sex, and APOE ε4 status. Green boxes represent variables associated with better cognitive function, while red boxes represent variables associated with poorer cognitive function. Highlighted variables were associated with multiple cognitive tests in each analytic approach; variables that were validated in both LASSO and ridge analyses are underlined. FI, fluid intelligence test; PAM, pairs matching test; RT, reaction time test; PM, prospective memory test; Edu, education; SES, socioeconomic status; Exam, medical examinations; PA, physical activity; Early, early life factors; Hous, household; Env, environment; Sun, sun exposure
Furthermore, sensitivity analyses were performed by incorporating additional adjustments for race and assessment centers (see eFig. 3 in Supplement 1), adjustments for chronic disease (see eFig. 4 in Supplement 1), and by analyzing the data following missRanger interpolation (see eFig. 5 in Supplement 1). The primary findings remained consistent with those obtained from the above multivariate analysis.
Non-linear associations between continuous variables and cognitive functions
We further explored the non-linear relationships between 13 continuous variables out of the 41 variables identified in multivariate analyses and four cognitive domains using restricted cubic splines (RCS) (Fig. 4 and eFig. 9 in Supplement 1). Our findings indicate a rapid enhancement in cognitive function as computer usage time approached the median, and then a plateau in performance was observed. Notably, several apparent inverted J-shaped associations of tea consumption, driving time and watching TV time with specific cognitive domains were revealed; these findings revealed that as these variables increased, cognitive performance exhibited an initial enhancement followed by a decline.
Non-linear relationship between continuous variables and cognitive functions. Restricted cubic splines were employed with adjustment of age, sex, and APOE ε4 status. Larger estimate of fluid intelligence test, while smaller estimate of pairs matching, reaction time, and prospective memory test indicated a better cognitive function
Variables associated with longitudinal cognitive decline
Among the 41 variables identified in the multivariate analyses, 12 exhibited significant associations with longitudinal cognitive decline (eFig. 6 in Supplement 1). In terms of the longitudinal associations with specific cognitive domains, we identified 3 variables for fluid intelligence (protective effect: right, and left HGS; deleterious effect: salt added to food), 2 variables for reaction time (protective effect: college degree education; deleterious effect: napping during day), 9 variables for prospective memory (protective effect: college degree education, higher household income, right HGS, PEF, and steady average or brisk pace; deleterious effect: being tense or highly-strung, never eat sugar or foods/drinks containing sugar, more time spent outdoors in winter, and age started wearing glasses), and no variables for pairs matching. In the sensitivity analysis, which included adjustments for varying follow-up durations, all previously identified associations remained significant, with the exception of the association between salt added to food and fluid intelligence (eFig. 7 in Supplement 1).
Identification of variables in MR analyses
Among the 41 variables identified in multivariate analyses, two variables (education score and employment score) had no available genetic data. Thus, we conducted MR analysis on the remaining 39 variables. The heritability (h2) of these variables ranged from 0.01 to 0.33 (eData 6 in Supplement 2). Firstly, our one-sample MR analysis showed that 22 variables were genetically associated with cognitive function in the UKB (Fig. 5). Among these 22 variables, 7 exhibited causal associations with multiple cognitive domains. Subsequently, two-sample MR analysis further identified a causal effect of 5 of the 22 variables on dementia, including college degree education (OR = 0.81, p = 0.029, h2 = 0.17), average total household income > £18,000 (OR = 0.87, p = 0.046, h2 = 0.10), time spent using a computer (OR = 0.76, p = 0.021, h2 = 0.10), hand grip strength (left) (OR = 0.74, p = 0.038, h2 = 0.11), and sitting height (OR = 0.94, p = 0.029, h2 = 0.33) (eData 7 in Supplement 2). No evidence of horizontal pleiotropy was found (eData 7 in Supplement 2).
Results of one-sample Mendelian randomization. Dots represent odds ratios and lines represent 95% CIs, with values exceeding 1 suggesting a positive risk association and values below 1 indicating a protective association. FI, fluid intelligence test; PAM, pairs matching test; RT, reaction time test; PM, prospective memory test
The combined impact of variables with high-quality evidence on dementia and AD
Reliability assessments were performed on the 41 identified variables, showing that 10 variables had high-quality evidence (Fig. 6A). A combined effect score (0–10 points) was calculated for these variables in the total population, with higher scores indicating a more favorable lifestyle (eTable 2 in Supplement 1). Using Cox proportional hazards analysis, we conducted a longitudinal analysis on a subset of 303,657 UKB participants who had data on the variables with high-quality evidence, with an average follow-up duration of 14.14 years. This subset was subsequently categorized into three groups based on the tertiles of the scores: unfavorable, intermediate, and favorable lifestyles. The results indicated that individuals with a favorable lifestyle exhibited significantly reduced risks of dementia (HR = 0.62, 0.54–0.71, P < 0.001) and AD (HR = 0.66, 0.54–0.82, P < 0.001) (Fig. 6B). Moreover, for each point increase in the score, the risks of developing dementia and AD decreased by 9% and 10%, respectively (Fig. 6B).
Reliability assessments and the combined impact on incident dementia and Alzheimer’s disease. Among the 41 variables for multivariate analyses, 10 were rated as high-quality reliability, 13 as moderate-quality reliability, and 18 as low-quality reliability. Utilizing the Cox proportional hazards regression model, we investigated the combined effects of the 10 highly reliable variables on dementia and Alzheimer’s disease, after adjusting for age, gender, and APOE ε4 status. SES, socioeconomic status; BMI, body measurement index; HGS, hand grip strength; MS, multiple sclerosis; HR, hazard ratio; CI, confidence intervals
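The hazard ratios reported for the favorable lifestyle group map directly onto the 38% and 34% risk reductions quoted in the abstract. A minimal arithmetic sketch, using a hypothetical helper name, is:

```python
def percent_risk_reduction(hazard_ratio):
    """Relative risk reduction implied by a hazard ratio below 1."""
    return (1.0 - hazard_ratio) * 100.0

dementia_reduction = percent_risk_reduction(0.62)  # HR reported for dementia
ad_reduction = percent_risk_reduction(0.66)        # HR reported for AD
```

Rounded to whole percentages, these give 38% for dementia and 34% for AD.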
Discussion
Our study comprehensively investigated the association between various variables and cognitive function. Our multivariate analysis identified 41 variables that were significantly related to cognitive function. Among them, 12 were associated with cognitive decline, and 22 were supported by one-sample MR analysis, of which 5 were further confirmed by two-sample MR analysis, including well-studied variables such as education and SES, and less-studied variables such as hand grip strength, sitting height, and computer usage. The quality of the pooled evidence for the associations between 10 variables and cognitive function was rated as high. These 10 variables encompassed 6 protective factors (higher education, higher pre-tax household income, higher right and left HGS, PEF, and more time spent using a computer), and 4 risk factors (being tense or highly strung, never eating sugar or foods/drinks containing sugar, spending more time outdoors during winter, and the age at which participants started wearing glasses or contact lenses). The combined effects of these 10 variables on decreasing the risks of dementia and AD were significant.
Livingston et al. have recently identified 14 potentially modifiable lifestyle factors through systematic reviews and meta-analyses [5], with which our findings show partial consistency. Firstly, our study corroborates the association of higher educational attainment with better cognitive function and slower cognitive decline; this causal relationship was further substantiated through MR analysis. Secondly, we identified a significant association between mental health and cognitive performance. Notably, subthreshold mental health symptoms, such as tension, anxious feelings, and mood swings, were more strongly linked to diminished cognitive function than diagnosed depressive, anxiety, and bipolar disorders. Thirdly, while the relationship between physical activity and cognition yielded inconsistent results, we observed a relatively stable and robust association between cognitive function and physical fitness indicators, including grip strength, PEF, sitting height, and walking speed. Fourthly, we observed a correlation between passive smoking and poorer cognitive function. In addition, the analysis revealed a significant negative correlation between alcohol-related mental and behavioral disorders and cognitive function, underscoring the risks of excessive drinking. Moreover, air pollution and social isolation were linked to diminished cognitive performance in both subgroup analyses and imputed datasets. Additionally, hearing impairment, hypertension, diabetes, and CVD emerged as significant PCs associated with cognitive function. Although low-density lipoprotein (LDL) cholesterol levels were not significantly linked to cognition, diagnosed lipid metabolism disorder was associated with poorer cognition in subgroup analyses and among the important PCs.
Regarding vision loss, we also found that the age at which participants started wearing glasses or contact lenses was associated with poorer cognitive function, highlighting the unexpected impact of eyesight on cognitive impairment among the elderly [29]. We did not find a significant association between diagnosed obesity and cognition, although related variables such as a higher waist-to-hip ratio were associated with poorer cognition. Regrettably, due to the absence of data on traumatic brain injury, we were unable to explore its relationship with cognition.
A college degree and pre-tax household income are among the 10 variables affecting cognitive function with high-quality evidence in our study. Previous studies have consistently shown that higher levels of education are associated with better cognitive performance and a reduced risk of dementia [9, 30, 31]. Furthermore, economic factors have been identified as crucial factors influencing cognition [32] and neuropsychological diseases [33]. A previous study even reported that their impact on dementia incidence outweighs that of comorbidities and lifestyle factors [34]. Both education and household income have a substantial genetic component [35], and our MR analysis confirmed their causal associations with cognition. Education has the potential to impact cognitive stimulation and reserve [9, 36], and lower socioeconomic status may result in psychological stress, diminished well-being, and decreased cognitive stimulation [33]. Thus, our study provided robust evidence validating the relationships between education, household income, and cognition.
Mental health variables such as tension, anxiety, mood swings, and sensitive feelings showed significant deleterious relationships with cognition in our study. These findings underscore the important role of subthreshold mental health symptoms in cognitive function. Increasing attention has been paid to the influence of mental health on cognitive function [11, 37]. A longitudinal study revealed significant associations between mental disorders and an increased incidence of dementia [38]. Moreover, a meta-analysis revealed associations of anxiety with an increased risk of AD and vascular dementia [39]. Unhealthy mental states can greatly hinder one’s ability to cope effectively with life stressors, and prolonged exposure to stress can lead to changes in brain structure, increasing susceptibility to depression and ultimately resulting in cognitive impairment [40, 41]. Future research and prevention efforts should therefore prioritize the role of mental health.
As for body measurements, hand grip strength, PEF, and sitting height were supported by medium- to high-quality evidence. Previous studies in other cohorts with relatively small sample sizes have indicated that greater grip strength is associated with better cognition and a reduced risk of dementia [42, 43]. Consistently, our study found that grip strength was significantly associated with four cognitive domains in a large-scale population, and the results were supported by subsequent longitudinal and MR analyses. PEF, an indicator of lung function, has recently been linked to the onset of dementia [44, 45]. Our exposome-wide analysis identified PEF as a significant variable associated with cognitive function. Moreover, moderate-quality evidence suggested that greater sitting height might confer protective effects on cognition. Sitting height has a substantial genetic component [46], and our MR analysis further demonstrated its causal relationship with cognition. Prior evidence on sitting height is mostly restricted to cross-sectional studies [47, 48, 49], which might be partly due to the design of hypothesis-driven research. Overall, our findings suggest that physical indicators may be closely associated with cognitive function, which needs further exploration.
Furthermore, engagement in cognitive activities, such as spending more time using a computer and playing computer games, was associated with better cognitive function, supported by moderate- to high-quality evidence in our study. In contrast, time spent watching TV and duration of mobile phone use showed deleterious associations with cognition, particularly affecting fluid intelligence. Studies investigating the impact of electronic and social media products on cognition and dementia have produced inconsistent findings [50, 51, 52]. For instance, one study revealed that computer use was linked to a reduced risk of dementia, whereas television watching was associated with an increased risk of dementia [52], aligning with our results. It is important to note that the observed association with computer use may partly reflect a false-positive bias, as individuals with greater computer proficiency may achieve higher scores on computer-based cognitive tests. Nonetheless, our MR analysis further confirmed its causal association with cognition. Future studies need to explore the type, content, and duration of media use in detail to better understand its impact on cognitive function.
As for diet, abstaining from sugar or foods/drinks containing sugar was negatively associated with cognitive function, supported by high-quality evidence. Moreover, the association was consistent and robust even among individuals without diabetes and other diseases. A meta-analysis of interventional studies has revealed a significantly positive effect of glucose on verbal performance [53]. Additionally, our EWAS identified the excessive addition of salt to food as a factor that negatively affects cognition, which is consistent with previous evidence [54]. The impact of salt on cognition may be influenced by factors such as blood pressure [55] and tau pathology [56]. Moreover, our restricted cubic spline model showed an inverted J-shaped curve between tea consumption and cognition, and a previous study showed that participants with a daily intake of 3 to 5 cups had the lowest risk of dementia [57]. The impact of salad and raw vegetable consumption on cognitive function varied across analytical approaches. Specifically, multivariable and LASSO regression analyses revealed a negative correlation with FI, whereas PCA highlighted a beneficial impact of a plant-based diet on FI. These seemingly discordant findings underscore the complexity of the relationship between dietary habits and cognitive function and call for further research.
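To illustrate how a restricted cubic spline can represent a non-linear curve such as the one described for tea consumption, the sketch below builds a spline basis in the commonly used truncated-power form, which is constrained to be linear beyond the outermost knots. The function name, the knot locations, and the scaling by the squared knot range are illustrative assumptions; the number and placement of knots used in the study are not stated in this section.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns [x, s_1, ..., s_(k-2)] for k knots. The non-linear terms
    are built so that the fitted curve is linear below the first knot
    and above the last knot.
    """
    if len(knots) < 3:
        raise ValueError("at least three knots are required")
    t = sorted(knots)

    def pos_cubed(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    last_gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2      # keeps terms on a scale similar to x
    terms = [float(x)]
    for j in range(len(t) - 2):
        term = (
            pos_cubed(x - t[j])
            - pos_cubed(x - t[-2]) * (t[-1] - t[j]) / last_gap
            + pos_cubed(x - t[-1]) * (t[-2] - t[j]) / last_gap
        )
        terms.append(term / scale)
    return terms


# Hypothetical knots at 1, 3, and 5 cups of tea per day (illustration only).
for cups in (0.5, 2.0, 4.0, 6.0):
    print(cups, rcs_basis(cups, [1.0, 3.0, 5.0]))
```

Each row of basis values would then enter a regression model in place of the raw exposure, so that the fitted coefficients on the non-linear terms determine the shape of the curve between the knots.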
Moreover, we discovered a strong association between spending more time outdoors during winter and poorer cognition. Regarding sleep factors, we found that napping during the day and daytime dozing were associated with poorer cognitive function, while being an evening person showed a beneficial effect on prospective memory. We also observed that early life factors, such as being part of a multiple birth and body size and height at age 10, were significantly associated with adult cognitive function. These findings need further exploration.
Our study has several strengths. Firstly, we leveraged large-scale data and EWAS methodology to systematically uncover environmental variables associated with cognitive function. Secondly, diverse analytical methods were used to ascertain the reliability of our results. Moreover, we conducted longitudinal analyses of cognitive decline and MR analyses to validate the causal associations. In addition, we assessed the reliability of the associations between variables and cognition, and developed a lifestyle score that was associated with a reduced incidence of dementia and AD.
Our study also has several limitations. Firstly, although the brief cognitive tests we used are reliable [19], we did not conduct comprehensive and standardized cognitive assessments. Secondly, our study was limited by the quality and accessibility of exposures in the UK Biobank database [58], and the dichotomization of certain variables, such as categorizing education as possessing a college degree or not, hindered our ability to conduct dose-response analyses. Thirdly, given that the primary analyses are cross-sectional and dementia is a syndrome characterized by long-term progression, there is a potential for reverse causation bias. Furthermore, although MR analysis was employed to investigate causal associations, the low heritability of some variables limits how well they can be explained by genetic factors [59]. Additionally, despite the large sample size and the logarithmic transformation applied to meet the assumptions of linear regression, heteroscedasticity may remain an issue given the complexity and variability of the numerous variables analyzed, and this warrants further investigation. In addition, our systematic methodology with rigorous protocols could lead to type II errors, causing some critical variables to miss statistical significance. Finally, the sample comprised volunteers who were predominantly young and in good health, which may reflect a more engaged and healthier population than the general public. Consequently, caution is warranted when interpreting and generalizing the findings.
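The concern about low heritability is closely related to instrument strength in MR. One common diagnostic, shown here only as an illustration and not necessarily the one used by the authors, approximates a per-variant F-statistic as the squared ratio of the variant-exposure effect to its standard error, with values below about 10 conventionally taken to indicate a weak instrument. The function names, the default threshold argument, and the example numbers are hypothetical.

```python
def instrument_f_statistics(beta_exposure, se_exposure):
    """Approximate per-variant F-statistic as (beta / se) squared."""
    if len(beta_exposure) != len(se_exposure):
        raise ValueError("inputs must have the same length")
    return [(b / s) ** 2 for b, s in zip(beta_exposure, se_exposure)]


def weak_instruments(beta_exposure, se_exposure, threshold=10.0):
    """Return indices of variants whose approximate F is below threshold."""
    f_values = instrument_f_statistics(beta_exposure, se_exposure)
    return [i for i, f in enumerate(f_values) if f < threshold]


# Hypothetical variant-exposure effects and standard errors.
betas = [0.10, 0.02, 0.05]
ses = [0.01, 0.01, 0.01]
print(instrument_f_statistics(betas, ses))
print(weak_instruments(betas, ses))
```

Variants flagged in this way are usually excluded or handled with methods that are more robust to weak instruments.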
Conclusion
In conclusion, our research leveraged a large-scale population sample and employed EWAS and MR methodologies to identify variables influencing cognitive function. We confirmed the significant associations of education and economic status with cognitive function. We also identified significant relationships between cognitive function and multiple other types of variables, including body measurements such as grip strength, PEF, and sitting height; leisure activities such as computer and television use; and mental health issues such as a tendency to be tense or highly strung. In addition, we developed a favorable lifestyle score that demonstrated significant potential to reduce the incidence of dementia and AD.
Data availability
This study used the UK Biobank Resource under application number 19542, from which the primary data were obtained. Researchers registered with UK Biobank can apply for access to the database by completing an application. The application requires a concise summary of the proposed research plan, the required data fields, any new data or variables expected to be generated, and a payment to offset the incremental costs of processing the application (https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access).
Abbreviations
- EWAS: Exposome-wide association analysis
- MR: Mendelian randomization analysis
- LASSO: The least absolute shrinkage and selection operator
- AD: Alzheimer’s disease
- PCA: Principal component analysis
- UKB: UK Biobank
- PC: Principal components
- FI: Fluid intelligence test
- PAM: Pairs matching test
- RT: Reaction time test
- PM: Prospective memory test
- LD: Linkage disequilibrium
- IVW: Inverse-variance weighted
- SES: Socioeconomic status
- SD: Standard deviation
- HGS: Hand grip strength
- PEF: Peak expiratory flow
- FVC: Forced vital capacity
- RCS: Restricted cubic splines
- OR: The odds ratios
- HR: The hazard ratio
References
Chen C-Y, et al. The impact of rare protein coding genetic variation on adult cognitive function. Nat Genet. 2023;55(6):927–38.
Savage JE, et al. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nat Genet. 2018;50(7):912–9.
Ge YJ, et al. Genome-wide meta-analysis identifies ancestry-specific loci for Alzheimer’s disease. Alzheimers Dement. 2024;20(9):6243–56.
Zhao Y-L, et al. Environmental factors and risks of cognitive impairment and dementia: a systematic review and meta-analysis. Ageing Res Rev. 2021;72:101504.
Livingston G, et al. Dementia prevention, intervention, and care: 2024 report of the Lancet standing Commission. Lancet. 2024.
Livingston G, et al. Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. Lancet (London England). 2020;396(10248):413–46.
Hou X-H, et al. Models for predicting risk of dementia: a systematic review. J Neurol Neurosurg Psychiatry. 2019;90(4):373–9.
Calvin CM, et al. Predicting incident dementia 3–8 years after brief cognitive tests in the UK Biobank prospective study of 500,000 people. Alzheimers Dement. 2019;15(12):1546–57.
Lövdén M, et al. Education and cognitive functioning across the Life Span. Psychol Sci Public Interest. 2020;21(1).
Yu JT, et al. Evidence-based prevention of Alzheimer’s disease: systematic review and meta-analysis of 243 observational prospective studies and 153 randomised controlled trials. J Neurol Neurosurg Psychiatry. 2020;91(11):1201–9.
Kivipelto M, Mangialasche F, Ngandu T. Lifestyle interventions to prevent cognitive impairment, dementia and Alzheimer disease. Nat Rev Neurol. 2018;14(11):653–66.
Lin BD, et al. Nongenetic factors Associated with psychotic experiences among UK Biobank participants: exposome-wide analysis and mendelian randomization analysis. JAMA Psychiatry. 2022;79(9):857–68.
Patel CJ, Ioannidis JP. Studying the elusive environment in large scale. JAMA. 2014;311(21):2173–4.
Ioannidis JP, et al. Researching genetic versus nongenetic determinants of disease: a comparison and proposed unification. Sci Transl Med. 2009;1(7):7ps8.
Patel CJ, et al. Systematic identification of correlates of HIV infection: an X-wide association study. Aids. 2018;32(7):933–43.
He Y, et al. Comparisons of Polyexposure, Polygenic, and clinical risk scores in risk prediction of type 2 diabetes. Diabetes Care. 2021;44(4):935–43.
Choi KW, et al. An exposure-wide and mendelian randomization Approach to identifying modifiable factors for the Prevention of Depression. Am J Psychiatry. 2020;177(10):944–54.
Bycroft C, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562(7726):203–9.
Fawns-Ritchie C, Deary IJ. Reliability and validity of the UK Biobank cognitive tests. PLoS ONE. 2020;15(4):e0231627.
Ko F, et al. Association of retinal nerve Fiber layer Thinning with Current and Future Cognitive decline: a study using Optical Coherence Tomography. JAMA Neurol. 2018;75(10):1198–205.
Saberi Hosnijeh F, et al. Association between anthropometry and lifestyle factors and risk of B-cell lymphoma: an exposome-wide analysis. Int J Cancer. 2021;148(9):2115–28.
Schüssler-Fiorenza Rose SM, et al. A longitudinal big data approach for precision health. Nat Med. 2019;25(5):792–804.
Wong KC, et al. Uncovering clinical risk factors and Predicting severe COVID-19 cases using UK Biobank Data: Machine Learning Approach. JMIR Public Health Surveill. 2021;7(9):e29544.
Bhaskaran K, et al. Association of BMI with overall and cause-specific mortality: a population-based cohort study of 3·6 million adults in the UK. Lancet Diabetes Endocrinol. 2018;6(12):944–53.
Prince C, et al. The relationships between women’s reproductive factors: a mendelian randomisation analysis. BMC Med. 2022;20(1):103.
Ye CJ, et al. Mendelian randomization evidence for the causal effects of socio-economic inequality on human longevity among europeans. Nat Hum Behav. 2023;7(8):1357–70.
Song Y, et al. Social isolation, loneliness, and incident type 2 diabetes mellitus: results from two large prospective cohorts in Europe and East Asia and mendelian randomization. EClinicalMedicine. 2023;64:102236.
Nazarzadeh M, et al. Plasma lipids and risk of aortic valve stenosis: a mendelian randomization study. Eur Heart J. 2020;41(40):3913–20.
Killeen OJ, Zhou Y, Ehrlich JR. Objectively measured visual impairment and dementia prevalence in older adults in the US. JAMA Ophthalmol. 2023;141(8):786–90.
Xu W, et al. Education and Risk of Dementia: dose-response Meta-analysis of prospective cohort studies. Mol Neurobiol. 2016;53(5):3113–23.
Stern Y, et al. Influence of education and occupation on the incidence of Alzheimer’s disease. JAMA. 1994;271(13):1004–10.
Steptoe A, Zaninotto P. Lower socioeconomic status and the acceleration of aging: an outcome-wide analysis. Proc Natl Acad Sci U S A. 2020;117(26):14911–7.
Kivimäki M, et al. Association between socioeconomic status and the development of mental and physical health conditions in adulthood: a multi-cohort study. Lancet Public Health. 2020;5(3):e140–9.
Yaffe K, et al. Effect of socioeconomic disparities on incidence of dementia among biracial older adults: prospective study. BMJ. 2013;347:f7051.
Hill WD, et al. Genome-wide analysis identifies molecular systems and 149 genetic loci associated with income. Nat Commun. 2019;10(1):5741.
Roe CM, et al. Education and Alzheimer disease without dementia: support for the cognitive reserve hypothesis. Neurology. 2007;68(3):223–8.
Moffitt TE, Caspi A. Psychiatry’s opportunity to prevent the rising burden of age-related disease. JAMA Psychiatry. 2019;76(5):461–2.
Richmond-Rakerd LS, et al. Longitudinal associations of Mental disorders with Dementia: 30-Year analysis of 1.7 million New Zealand citizens. JAMA Psychiatry. 2022;79(4):333–40.
Becker E, et al. Anxiety as a risk factor of Alzheimer’s disease and vascular dementia. Br J Psychiatry. 2018;213(5):654–60.
Marin MF, et al. Chronic stress, cognitive functioning and mental health. Neurobiol Learn Mem. 2011;96(4):583–95.
Fusar-Poli P, et al. What is good mental health? A scoping review. Eur Neuropsychopharmacol. 2020;31:33–46.
Cui M, et al. Grip strength and the risk of Cognitive decline and Dementia: a systematic review and Meta-analysis of Longitudinal Cohort studies. Front Aging Neurosci. 2021;13:625551.
Buchman AS, et al. Grip strength and the risk of incident Alzheimer’s disease. Neuroepidemiology. 2007;29(1–2):66–73.
Russ TC, Kivimäki M, Batty GD. Respiratory disease and lower pulmonary function as risk factors for dementia: a systematic review with Meta-analysis. Chest. 2020;157(6):1538–58.
Ma YH, et al. Lung function and risk of incident dementia: a prospective cohort study of 431,834 individuals. Brain Behav Immun. 2023;109:321–30.
Richard D, et al. Functional genomics of human skeletal development and the patterning of height heritability. Cell. 2024.
Liang X, et al. Short sitting height and low relative sitting height are Associated with severe cognitive impairment among older women in an Urban Community in China. Neuroepidemiology. 2015;45(4):257–63.
Heys M, et al. Childhood growth and adulthood cognition in a rapidly developing population. Epidemiology. 2009;20(1):91–9.
Kim JM, et al. Limb length and dementia in an older Korean population. J Neurol Neurosurg Psychiatry. 2003;74(4):427–32.
Cho G, Betensky RA, Chang VW. Internet usage and the prospective risk of dementia: a population-based cohort study. J Am Geriatr Soc. 2023;71(8):2419–29.
Pall ML. Low intensity Electromagnetic fields Act via Voltage-gated Calcium Channel (VGCC) activation to cause very early Onset Alzheimer’s Disease: 18 distinct types of evidence. Curr Alzheimer Res. 2022;19(2):119–32.
Raichlen DA, et al. Leisure-time sedentary behaviors are differentially associated with all-cause dementia regardless of engagement in physical activity. Proc Natl Acad Sci U S A. 2022;119(35):e2206931119.
García CR, et al. Effect of glucose and sucrose on cognition in healthy humans: a systematic review and meta-analysis of interventional studies. Nutr Rev. 2021;79(2):171–87.
Liu W, et al. Excessive Dietary Salt Intake exacerbates cognitive impairment progression and increases dementia risk in older adults. J Am Med Dir Assoc. 2023;24(1):125–129.e4.
Kendig MD, Morris MJ. Reviewing the effects of dietary salt on cognition: mechanisms and future directions. Asia Pac J Clin Nutr. 2019;28(1):6–14.
Faraco G, et al. Dietary salt promotes cognitive impairment through tau phosphorylation. Nature. 2019;574(7780):686–90.
Zhang Y, et al. Consumption of coffee and tea and risk of developing stroke, dementia, and poststroke dementia: a cohort study in the UK Biobank. PLoS Med. 2021;18(11):e1003830.
Allen NE, et al. Prospective study design and data analysis in UK Biobank. Sci Transl Med. 2024;16(729):eadf4428.
Burgess S, et al. Guidelines for performing mendelian randomization investigations: update for summer 2023. Wellcome Open Res. 2019;4:186.
Acknowledgements
We express our gratitude to all the participants and professionals who have contributed to the UK Biobank, the participants and investigators of the FinnGen study, and all the participants involved in the present study.
Funding
This study was supported by grants from the Science and Technology Innovation 2030 Major Projects (2022ZD0211600), the National Natural Science Foundation of China (82071201, 91849126), the National Key R&D Program of China (2018YFC1314702), Shanghai Municipal Science and Technology Major Project (No.2018SHZDZX01) and ZHANGJIANG LAB, Tianqiao and Chrissy Chen Institute, and the State Key Laboratory of Neurobiology and Frontiers Center for Brain Science of Ministry of Education, Fudan University.
Author information
Contributions
JTY and LT had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. JTY conceived and designed the project. All authors acquired, analyzed or interpreted data: YLZ, YNH, YJG conducted statistical analysis. YLZ, YNH wrote the initial draft of the manuscript. YLZ, YNH, YJG, YZ, LYH, YF, DDZ, YNO, XPC, JFF, WC, LT, JTY critically revised the manuscript for important intellectual content.
Ethics declarations
Ethics approval and consent to participate
UK Biobank has received ethical approval from the North West Multi-centre Research Ethics Committee (MREC, https://www.ukbiobank.ac.uk/learn-more-about-uk-biobank/about-us/ethics), and informed consent through electronic signature was obtained from study participants in accordance with the Declaration of Helsinki. This study utilized the UK Biobank Resource under application number 19542.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhao, YL., Hao, YN., Ge, YJ. et al. Variables associated with cognitive function: an exposome-wide and mendelian randomization analysis. Alz Res Therapy 17, 13 (2025). https://doi.org/10.1186/s13195-025-01670-5
DOI: https://doi.org/10.1186/s13195-025-01670-5