Quantifying the post-radiation accelerated brain aging rate in glioma patients with deep learning

Background and purpose: Changes of healthy appearing brain tissue after radiotherapy (RT) have been previously observed. Patients undergoing RT may have a higher risk of cognitive decline, leading to a reduced quality of life. The experienced tissue atrophy is similar to the effects of normal aging in healthy individuals. We propose a new way to quantify tissue changes after cranial RT as accelerated brain aging using the BrainAGE framework. Materials and methods: BrainAGE was applied to longitudinal MRI scans of 32 glioma patients. Utilizing a pre-trained deep learning model, brain age is estimated for all patients’ pre-radiotherapy planning and follow-up MRI scans to acquire a quantification of the changes occurring in the brain over time. Saliency maps were extracted from the model to spatially identify which areas of the brain the deep learning model weighs highest for predicting age. The predicted ages from the deep learning model were used in a linear mixed effects model to quantify aging of patients after RT. Results: The linear mixed effects model resulted in an accelerated aging rate of 2.78 years/year, a significant increase over a normal aging rate of 1 (p < 0.05, confidence interval = 2.54–3.02). Furthermore, the saliency maps showed numerous anatomically well-defined areas, e.g. Heschl’s gyrus among others, determined by the model as important for brain age prediction. Conclusion: We found that patients undergoing RT are affected by significant post-radiation accelerated aging, with several anatomically well-defined areas contributing to this aging.

One of the primary treatments for tumors in the brain is radiation therapy (RT), usually in combination with surgery, chemotherapy [1,2] and, occasionally, immunotherapy [3]. Due to the complex nature of changes in the brain, quantifying the effects of RT on seemingly healthy brain tissue remains a challenge. Tissue atrophy occurs both post-RT and in normal aging, with atrophy caused by normal aging occurring at a low rate of ~0.5% per year in the healthy elderly [4]. As tissue atrophy caused by RT resembles accelerated natural aging, the ability to calculate a patient's "brain age" to quantify atrophy could provide useful insights due to ease of interpretation compared to other methods such as cortical thickness and volumetric measures, which rely on pre-defined features [5][6][7]. One method to quantify brain age is the BrainAGE framework [8], which has already been widely used in describing disease-related changes of the brain, such as Alzheimer's disease [8,9] and psychiatric disorders [10,11]. BrainAGE, or brain age gap estimation, is a technique to determine the discrepancy between a person's chronological and biological brain age [12], using magnetic resonance imaging (MRI) scans. In a healthy brain, showing normal aging patterns, the chronological and biological ages are expected to be identical; however, in case of an abnormal condition or disease the biological age may differ from the chronological age. In this study, BrainAGE will be applied in glioma patients who have undergone RT.
RT plays an important role in the treatment of cranial tumors; however, the effect of radiation is not selective to cancer-affected tissue and it comes with the unintentional side effect of radiation-induced brain injury to the rest of the brain tissue, which can lead to progressive cognitive decline. Cognitive symptoms occur in approximately 50-90% of patients undergoing RT [13], and can lead to a reduced quality of life (QoL) [14]. Quantification of changes in the brain using an age-based metric is of interest, as it can be related to some of the changes in QoL using existing knowledge on brain aging [15]. By predicting the brain age of a patient before RT and comparing that age to the ages predicted for the follow-up scans, the effects of RT on brain aging can be determined in a longitudinal manner. Since the effects of RT on cancer-affected tissue are extremely destructive, a similar effect can be expected on the healthy tissue. Moreover, as the effects of RT on healthy tissue are similar to aging in a healthy brain, e.g. enlarged ventricles due to tissue atrophy [15], understanding age-related changes in a healthy brain is of importance. Two examples of healthy brain aging are shown in Fig. 1. In Fig. 1/A, an example of aging over a longer time span in cross-sectional data is shown from the Information eXtraction from Images (IXI) data set [16]. Noticeable differences are in the size of the ventricles and space between the gyri, indicating a loss of tissue volume. An example of relatively short-term aging is shown in Fig. 1/B, taken from the MyConnectome data set [17]. The time between the scans is approximately 11 months, which is similar to the mean follow-up time of 11.67 months between first and last scan within our RT data set. The short-term (non-)aging example shows that the healthy brain is not affected by tissue atrophy in a similar time span as the expected overall survival of glioma patients.
The differences between the scans, most notably the slightly darker colour of the brain stem in the second scan, are related to the different noise patterns and contrast.

Patient selection and data collection
For this study, MRI scans of 32 histologically proven glioma patients, who received RT in the UMC Utrecht in the period from 2016 to 2017, were analyzed retrospectively. The first scan of every patient was acquired during the postoperative RT planning. The scans are T1-weighted and were acquired on a 3 T Philips Ingenia scanner with the 3D turbo-spin echo sequence without gadolinium enhancement. The voxel resolution is 1 × 0.96 × 0.96 mm³, with a matrix size of 207 × 289 and 213 continuous axial slices without gap. The parameters used were TR = 8.1 ms, TE = 3.7 ms and the flip angle was 8 degrees. The minimum number of scans per patient was 2, the pre-RT scan and the first scan after RT. The maximum number of scans acquired was 9, with a mean of 4.03 and a SD of 1.96. The mean time between first and last scan is 11.67 months, with a SD of 4.29 months. See supplementary Fig. 1 for a visualization of the time between scans and Table 1 for an overview of the patient characteristics. The institutional review board waived informed consent for this retrospective study (study ID 18/274). The IXI data set, utilized for validating the saliency maps, was adjusted for this study to contain 310 healthy individuals between the ages of 42 and 82. This selection of individuals contains 134 males and 176 females, with a mean age of 59.19 and a SD of 9.27.
Fig. 1. Fig. 1/A shows cross-sectional scans from the IXI data set [16], showing 3 MRI scans at 40, 60 and 80 years, from left to right respectively. Fig. 1/B shows two longitudinal MRI scans from the MyConnectome project [17], taken approximately 11 months apart. Both MyConnectome scans were preprocessed using optiBET for brain extraction [18] and FLIRT for linear registration to the MNI152 template [19,20].

Preprocessing
To be able to utilize the model from Peng et al. [21], the data required preprocessing. Specifically, the data had to be reoriented to stereotaxic 1 mm Montreal Neurological Institute (MNI) space and non-brain tissue had to be removed. Both of these actions were performed using FSL version 6.0 [22]. First, the non-brain tissue was removed using optiBET [18], an optimized version of BET (Brain Extraction Tool) [23], using the default settings. The rigid registration to MNI space was done using the FSL FLIRT tool [19,20], with 6 degrees of freedom, using the image output from optiBET as input and the MNI152 template as reference. Finally, the transformed images were passed to the pre-trained network to obtain a predicted age for each brain. In Fig. 2, a visualization of the complete preprocessing pipeline can be found. The MRI scan in this figure is from a 42-year-old patient, which will function as an example throughout the manuscript.
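The two preprocessing steps described above can be sketched as command construction in Python. This is a minimal illustration, not the authors' script: the function name, the file names and the output naming convention of optiBET (`<stem>_optiBET_brain.nii.gz`) are assumptions, as is invoking optiBET as `sh optiBET.sh -i <image>`. The FLIRT options used (`-in`, `-ref`, `-out`, `-omat`, `-dof`) are standard FLIRT flags.

```python
from pathlib import Path


def build_preprocessing_commands(t1_path, mni_template, out_dir):
    """Return the brain-extraction and registration commands for one scan.

    The commands are returned as argument lists and are not executed here.
    """
    t1 = Path(t1_path)
    out = Path(out_dir)
    stem = t1.name.replace(".nii.gz", "").replace(".nii", "")

    # Assumption: optiBET writes its brain-extracted image next to the input
    # as <stem>_optiBET_brain.nii.gz.
    brain = t1.with_name(f"{stem}_optiBET_brain.nii.gz")
    registered = out / f"{stem}_mni152.nii.gz"   # hypothetical output name
    matrix = out / f"{stem}_to_mni152.mat"       # hypothetical output name

    # Step 1: remove non-brain tissue with optiBET (default settings).
    bet_cmd = ["sh", "optiBET.sh", "-i", str(t1)]

    # Step 2: rigid registration to the MNI152 template, 6 degrees of freedom.
    flirt_cmd = [
        "flirt",
        "-in", str(brain),
        "-ref", str(mni_template),
        "-out", str(registered),
        "-omat", str(matrix),
        "-dof", "6",
    ]
    return bet_cmd, flirt_cmd
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where FSL and optiBET are installed.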

Deep learning model
The Simple Fully Convolutional Network (SFCN) model by Peng et al. [21], which was trained on UK Biobank data [24] and selected for winning the PAC2019 contest [25], was used via Python 3.85 [26] to obtain a probability distribution for each of the MRI scans. This distribution ranged from ages 42 to 82, adding up to a total of 40 possible outcomes. The age with the highest probability was selected as the predicted age. See supplementary Fig. 2 for an example of the model output, showing the softmax layer output histogram of the 42-year-old patient pre-RT and a predicted age of 54.
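The step from the softmax output to a single predicted age can be illustrated with a short sketch. It assumes that each of the 40 outputs corresponds to a one-year bin starting at age 42 and that a bin is reported by its lower edge; the function names are hypothetical and the actual SFCN code may map bins to ages differently.

```python
import math

AGE_MIN = 42   # lower end of the age range stated in the text
N_BINS = 40    # number of possible outcomes stated in the text


def softmax(logits):
    """Convert raw network outputs into a probability distribution."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def predicted_age(probabilities):
    """Return the age whose bin has the highest probability."""
    if len(probabilities) != N_BINS:
        raise ValueError(f"expected {N_BINS} probabilities, got {len(probabilities)}")
    best_bin = max(range(N_BINS), key=lambda i: probabilities[i])
    # Assumption: bin i is reported as age AGE_MIN + i.
    return AGE_MIN + best_bin
```

With a distribution that peaks in the thirteenth bin, `predicted_age` returns 54, the value reported for the example patient before RT.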

Saliency maps
Furthermore, saliency maps were extracted from the SFCN model to visualize which parts of the brain contribute the most to the age estimation. These saliency maps were created and averaged for both the RT data set and the IXI data set. By retrieving the voxel weights the model used to predict the ages for each patient and averaging them across the whole cohort, a visualization of the contribution of all brain areas was created. The FSL tool autoaq was used to aid the anatomical interpretation of such areas using built-in atlases [27][28][29][30]. An arbitrary threshold of 0.05 was selected to eliminate the voxels with a relatively low weight, to aid visual interpretation and highlight hotspots. Finally, the saliency maps for both data sets were compared by subtracting the saliency maps from each other and removing all data points with a difference less than 0.015.
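The averaging, thresholding and comparison steps can be sketched on flattened voxel lists. This is an illustration under assumptions: the maps are taken to be already on a common scale in MNI space, the difference cut-off is applied to the absolute difference, and all function names are hypothetical. Only the two thresholds (0.05 and 0.015) come from the text.

```python
def average_maps(maps):
    """Voxel-wise mean over a cohort of equally sized saliency maps."""
    n = len(maps)
    return [sum(values) / n for values in zip(*maps)]


def threshold_map(saliency, threshold=0.05):
    """Set voxels below the threshold to zero to highlight hotspots."""
    return [v if v >= threshold else 0.0 for v in saliency]


def difference_map(map_a, map_b, min_difference=0.015):
    """Subtract two maps and zero out voxels whose difference is too small.

    Assumption: the cut-off is applied to the absolute difference.
    """
    diffs = [a - b for a, b in zip(map_a, map_b)]
    return [d if abs(d) >= min_difference else 0.0 for d in diffs]
```

A positive value in the difference map then marks a voxel weighted more strongly in the first cohort, a negative value one weighted more strongly in the second.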

Statistical modeling
To obtain the aging rate for each patient, RStudio 1.2.5019 [31] with the package 'lme4' [32] was used to implement mixed models. For the linear mixed model, the formula y = changes ~ time + (time | subject) was used, where the "changes" are the biological changes in aging in months, and the "time" is the time passed in months. The model is adjusted for normal aging, so only accelerated or decelerated aging are predicted, with 0 being normal aging. The model predicts an age, using mixed effects linear regression for each patient, with the subject being an exclusively random effect, while the time passed is both a fixed and a random effect. By using "time" as both a fixed and random effect, the average aging rate is used as predictor due to the fixed effect, and the random effect allows the aging rate to vary between patients. To correct for the baseline prediction error, the error was removed from all scans, taking the initial error per patient and subtracting it from every scan for that patient. Additionally, a model was created in which the baseline error was removed before training the model. Both models were validated with a standard leave-one-out cross validation (LOOCV).
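The baseline correction described above can be sketched for a single patient. The sketch computes the gap between predicted and chronological age for every scan (which removes normal aging) and then subtracts the gap of the first scan from all scans. It works in years rather than months for readability, the tuple layout and function name are assumptions, and it does not reproduce the mixed effects model itself.

```python
def baseline_corrected_gaps(scans):
    """Remove a patient's baseline prediction error from all of their scans.

    scans: list of (months_since_first_scan, chronological_age, predicted_age)
    tuples for one patient, ages in years.
    Returns a list of (months_since_first_scan, corrected_gap_in_years).
    """
    ordered = sorted(scans)  # earliest scan first
    gaps = [(months, predicted - chronological)
            for months, chronological, predicted in ordered]
    baseline_gap = gaps[0][1]  # prediction error of the pre-RT scan
    return [(months, gap - baseline_gap) for months, gap in gaps]
```

For the example patient (chronological age 42, predicted 54 before RT and 59 four months later), the corrected values are 0 at baseline and roughly 4.67 years at four months, i.e. the 5-year increase in predicted age minus the 4 months of normal aging.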

Results
To test the accuracy of the SFCN model before RT, the pre-RT scans were first analyzed separately. The mean absolute error (MAE) for the pre-RT scans is 6.53 years. In supplementary Fig. 3, the chronological age is compared to the predicted age. The figure shows that the ages of older patients are underestimated by the model, while the ages of younger patients are overestimated.
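For reference, the MAE used above is the mean of the absolute differences between predicted and chronological age. A minimal sketch follows; the function name is hypothetical and the ages in the usage note are illustrative values, not patient data from this study.

```python
def mean_absolute_error(chronological_ages, predicted_ages):
    """Mean absolute difference between predicted and chronological ages."""
    if len(chronological_ages) != len(predicted_ages):
        raise ValueError("both lists must have the same length")
    if not chronological_ages:
        raise ValueError("at least one scan is required")
    errors = [abs(p - c) for c, p in zip(chronological_ages, predicted_ages)]
    return sum(errors) / len(errors)
```

For instance, chronological ages of 42, 60 and 70 with predictions of 54, 58 and 66 give absolute errors of 12, 2 and 4 years, and therefore an MAE of 6 years.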
To show the model output, Fig. 3 contains three follow-up scans of the aforementioned 42-year-old male patient. The chronological age for this patient is 42, while the SFCN model predicts 54 years for the pre-RT scan. Four months after RT at the first follow-up, the scan is predicted at 59 years, indicating a five year increase in biological age, or a BrainAGE score of +5 years in four months, which indicates a 15x aging rate. Similar effects are found based on the next two follow-up scans, showing that this particular patient's brain aged a total of 8 years in the 9 months after RT. The clear upward aging trend presented for this patient is visible in the tissue changes. To analyze the changes in aging rate, the SFCN model predicted the ages for all images. Fig. 4/A shows the predicted age for all scans, minus the chronological age (adjusted for normal aging). The red curve represents a smoothed average of all predictions using a locally estimated scatterplot smoothing (LOESS) function [33]. As this curve trends upward, an increased aging rate is implied as time passes. However, this analysis does not take into account the prediction error, which was shown based on the pre-RT scans, resulting in a biased average. The averaged curve continues to trend upward towards the late follow-ups at the two year mark. The grey-coloured bands show that the confidence interval is much wider for this area, as the available data is more sparse in this time period. The confidence bands remain narrow in the first 12 months. Fig. 4/B represents the predicted aging over time, corrected for the baseline prediction error. The smoothed average LOESS curve shows a positive slope and narrow confidence bands after the bias is removed. This indicates that the changes in aging, on average, are accelerated.
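The 15x figure for the example patient follows from converting the change in predicted age to years of brain aging per calendar year. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def aging_rate(brain_age_change_years, elapsed_months):
    """Years of predicted brain aging per calendar year."""
    if elapsed_months <= 0:
        raise ValueError("elapsed_months must be positive")
    return brain_age_change_years * 12 / elapsed_months


# Example patient: +5 years of predicted brain age in 4 months.
first_follow_up_rate = aging_rate(5, 4)   # 15.0
# Same patient over the whole follow-up: 8 years in 9 months.
overall_rate = aging_rate(8, 9)           # about 10.7
```

These are raw per-patient rates from two time points; the cohort-level rate reported in this section comes from the linear mixed effects model instead.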
In Fig. 4/C, the individual aging rates per patient, as predicted by the mixed effects model based on the data in Fig. 4/A, are shown. The average predicted aging rate for all patients was 2.78, which is statistically significant compared to a baseline aging rate of 1 (p < 0.05, CI 2.54-3.02). To adjust for the bias introduced by the prediction error of the pre-RT scans in the SFCN model, as well as to show the heterogeneity of the slopes, Fig. 4/D shows the same regression slopes as Fig. 4/C with the intercepts removed. The smoothed average LOESS regression curve (red) shows an upward trend with the curve flattening as data points get sparser. All lines have a higher slope than normal aging, showing that the linear mixed effects model predicts every patient to age faster than normal, which indicates that all patients undergoing RT will show increased aging when measured with this framework. The narrow confidence bands show that the model has low uncertainty, especially up to 12 months. In supplementary Fig. 4, the results of the alternative model can be found, which has a higher predicted aging rate of 3.22 (p = 0.00029, CI 2.11-4.33), but more variance between the rates. This model utilizes the changes in age, but with the baseline error removed before creating the model. To compare these two models and validate them, a leave-one-out cross validation was performed, the results of which can be found in supplementary Table 1. In short, the adjusted model performed slightly better in terms of MAE, but the increase in MAE after LOOCV was similar to the original model. However, while the p-value of the original model increased to 0.156, the p-value of the adjusted model remained similar (1.342e-05). Finally, aging rates of different patient subgroups were compared, the results of which can be found in supplementary Figs. 5, 6 and 7, for chemotherapy-, WHO grade- and gender-based comparisons, respectively. There are no significant differences between any of the groups (p = 0.583 for male/female, p = 0.1797 for chemotherapy/no chemotherapy and p = 0.2136 for low/high WHO grade). The largest differences between the groups come from the data past 15 months, which is sparse and therefore has wide confidence intervals.
Fig. 2. The processing pipeline for applying the model, starting with A), the unprocessed MRI from the 42-year-old patient. The unprocessed MRI is processed using optiBET to obtain B), the optiBET MRI, which is then put into MNI152 space to get C), the MNI152 MRI. This MRI can then be used in the model, providing D), the model output, which can be used to perform E), the statistical analysis and obtain F), the saliency maps.
To visualize which areas of the brain had the highest contribution in the SFCN model for determining brain age, a population average saliency map was created, shown in supplementary Fig. 8. A saliency map shows which voxels of an MRI scan contributed the most to the outcome. In Fig. 5 the thresholded saliency map is shown with a cortical atlas on top. The green crosshair emphasizes a cluster of 753 voxels within the Heschl's gyrus, whose borders are shown in dark blue. The Heschl's gyrus is associated with acoustic processing [34], indicating that the model weights could be translated to specific brain functions. Other examples of contributing anatomically well-defined areas are the brain stem and the middle cerebellar peduncle clusters with more than 1500 voxels. Supplementary Fig. 9 shows the saliency maps from both the RT population and the healthy IXI population [16], to compare the difference in brain areas of importance between the two populations. The general structure of the saliency maps stays the same, although there are differences between the two, as seen in supplementary Fig. 9/C. The areas with a difference of more than 0.015 for the radiotherapy patients are the Heschl's gyrus and white matter in the cerebellum, while in the healthy cohort the brain stem had more contribution.
Fig. 5. The thresholded saliency map, on which a cortical atlas [27] is overlaid, showing which brain regions contain the high-weighted clusters, one of which is the Heschl's gyrus, under the green crosshair. The warmer (red) an area is coloured, the higher it is weighed by the model, with purple being the least weighed.
In supplementary Fig. 10/A, the MRI scan of a 52-year-old patient with a large resection cavity in the frontal part of the brain is shown, with the saliency map and the radiation dose on top. Since purple colour in the current colour scale indicates the lowest model weight, and no colour means zero weight, the model does not take the resection cavity into account for patient A. For patient B, the resection cavity does have a model weight, but the irradiated area shows a lower model weight than the rest of the brain.

Discussion
In this study, we show that cranial RT has a remarkable effect on the brain, which we conceived as post-radiation accelerated brain aging. A deep learning model was able to quantify changes in the brain post radiotherapy. After RT, the brain showed a statistically significantly accelerated aging of 2.78-3.22 times the normal rate in 32 glioma patients (p < 0.05). Overall, these results imply that radiation changes the tissue of the brain, which manifests similarly to accelerated aging. Based on studies investigating the effect of normal, healthy aging, an increased brain age after RT may also result in cognitive decline [15]. Consequently, normal appearing brain tissue should be spared as much as possible to avoid this post-radiation accelerated aging when irradiating the tumor-affected area.
The changes are quantified with an interpretable score using the entire brain without the potential bias or limitation of commonly used, pre-defined features found in neuroimaging studies, such as cortical thickness [6] and volumetric measures [7], e.g. hippocampus volume [35]. The population average saliency maps provide insights into the workings of the SFCN model. The areas weighed highest by the model are located in well-defined anatomical areas of the brain, and may encompass certain functions of the brain, showing that the saliency maps could be of interest for further research on how the brain changes after radiotherapy. The absence of model weights for brain abnormalities indicates that the model values the existing brain structure most, and bases the predictions on the existing tissue patterns. It is not unexpected that the high dose areas do not overlap with the high saliency weight areas, given the highly interconnected structure of the brain [36]. This concept, called the connectome, has been utilized in many disciplines, such as Alzheimer's disease [37], bipolar disorder [38] and brain tumor research [39]. It uncovers how disease or intervention related changes alter the underlying brain networks, which have widespread consequences. The regions that received the largest amount of radiation to fulfill the therapeutic needs show lower model weights in the saliency maps. This indicates that the brain tissue that received large amounts of radiation does not necessarily contribute more to the brain age prediction. While the saliency maps can provide insight into the understanding of brain changes, their localization utility should be interpreted with caution [40], especially when multiple models are compared. In order to fully explain the localization element of the deep learning model, there are additional requirements, such as having matched, controlled data from the same MR system and proper statistical correction for multiple comparisons. In the absence of these factors, we performed our comparison only in a qualitative manner.
Lastly, there are some limitations to this study, starting with the baseline prediction error of the deep learning model. The SFCN model has a relatively large prediction error of 6.53 years MAE for the pre-RT scans, which is higher than the 2.14 years found by Peng et al. [21]. All patients present some sort of abnormality, because of the tumor itself and treatment side-effects such as tissue scarring and oedema, which could cause this discrepancy. Since the SFCN model was trained on healthy volunteers without such abnormalities, they are not represented in the prediction, which can affect the performance of the model. Another factor is the difference between scanners, as the deep learning model was trained on UK Biobank data, which uses different scanners and scanner parameters than our cohort. Despite the baseline error introduced by the pre-trained model, the trends shown by the model are clear, indicating that while the accuracy of the pre-RT predictions may not be perfect, the aggressive effect of RT acts despite such model imperfections, measured by the increased aging rate. Furthermore, we assume that the baseline error established from the pre-RT scans applies systematically to all follow-up scans; therefore, the measured upward trend in brain ages corresponds solely to the RT-related tissue changes. We find this assumption permissible, as it is highly unlikely that the prediction errors largely coincide with the measured effects, since such an error would have to have the same direction as the effect, while a random error is expected on a population level. Additionally, the LOOCV for the mixed effects model should be interpreted with caution, as cross validation for mixed effects models remains a challenge [41]. The mixed effects model is only intended for parameterization, not for out of sample prediction, as it can only predict new scans for existing patients.
In any case, this study does not aim to provide a clinical prediction model, but a proof of concept for brain age prediction in radiotherapy. For future work, BrainAGE may provide a novel way to quantify the effectiveness and damage caused by treatment by comparing patient outcomes. Damage to healthy brain tissue could be minimized by selecting the treatment with reduced accelerated aging.

Conclusion
In conclusion, in this work we show that patients who have undergone cranial RT experience brain tissue atrophy, which can be identified as post-radiation accelerated aging using the BrainAGE method. Due to the lack of pre-defined features, BrainAGE can be used to predict aging rates in a non-biased manner. Since the saliency maps indicate that there is an aging effect occurring in the healthy tissue, a global aging effect might be present for the entire brain. This indicates that the mere presence of RT will cause post-radiation accelerated aging, potentially affecting patients' QoL. By comparing the post-radiation accelerated aging between patients and selecting the treatment with the least amount of aging, damage to healthy tissue caused by the treatment may be reduced.

Data sharing
MRI scans of the patients cannot be shared. The IXI data set can be found on the IXI website [16]. The MyConnectome data set can be found on the MyConnectome website [42]. The SFCN model code can be found on the GitHub page of Peng et al. [43]. The population average saliency maps from both the patient and the IXI groups are available as supplementary materials.

Funding
None.

Conflict of interest
None for all authors.