Anatomical change during radiotherapy for head and neck cancer, and its effect on delivered dose to the spinal cord

Highlights
• A cohort of 133 head & neck cancer patients treated with TomoTherapy was examined.
• Differences between planned and delivered maximum spinal cord dose were small.
• Substantial weight loss and anatomical change during treatment were observed.
• No link between weight loss or anatomical change and dose differences was seen.


Abstract
Background and purpose: The impact of weight loss and anatomical change during head and neck (H&N) radiotherapy on spinal cord dosimetry is poorly understood, limiting evidence-based adaptive management strategies.
Materials and methods: 133 H&N patients treated with daily mega-voltage CT image-guidance (MVCT-IG) on TomoTherapy were selected. Elastix software was used to deform planning scan spinal cord (SC) contours to MVCT-IG scans, and accumulate dose. Planned (D_P) and delivered (D_A) spinal cord D2% (SCD2%) were compared. Univariate relationships between neck irradiation strategy (unilateral vs bilateral), T-stage, N-stage, weight loss, and changes in lateral separation (LND) and CT slice surface area (SSA) at C1 and the superior thyroid notch (TN), and ΔSCD2% [(D_A - D_P) D2%] were examined.
Results: The mean value for (D_A - D_P) D2% was -0.07 Gy (95% CI -0.28 to 0.14, range -5.7 Gy to 3.8 Gy), and the mean absolute difference between D_P and D_A (independent of difference direction) was 0.9 Gy (95% CI 0.76-1.04 Gy). Neck treatment strategy (p = 0.39) and T-stage (p = 0.56) did not affect ΔSCD2%. Borderline significance (p = 0.09) was seen for higher N-stage (N2-3) and higher ΔSCD2%. Mean reductions in anatomical metrics were substantial: weight loss 6.8 kg; C1LND 12.9 mm; C1SSA 12.1 cm²; TNLND 5.3 mm; TNSSA 11.2 cm², but no relationship between weight loss or anatomical change and ΔSCD2% was observed (all r² < 0.1).
Conclusions: Differences between delivered and planned spinal cord D2% are small in patients treated with daily IG. Even patients experiencing substantial weight loss or anatomical change during treatment do not require adaptive replanning for spinal cord safety.

Radiotherapy (RT) remains a crucial treatment modality for patients with head and neck cancer (HNC), and IMRT is considered standard of care in most cases [1].
Modern linear accelerators deliver complex dose distribution geometries in 3 dimensions, with plans that include multiple dose levels and simultaneous integrated boosts, whilst respecting dose constraints to key organs at risk (OARs) [2]. Adaptive Radiotherapy (ART) is a logical step in the evolution of external beam X-ray therapy for HNC. The anatomy of both the patient's tumour and normal tissues can alter substantially during a course of treatment [3-7], and these changes may result in differences between intended or planned radiation dose to a structure (D_P), and that which is actually delivered (D_A) [8]. ART adds a fourth dimension to the complex geometry of an IMRT plan, by amending that geometry during a course of RT to account for these changes [5].
Although the concept of ART is popular, its uptake and utilisation lack uniformity, and it is often performed at the discretion of individual treating physicians [9]. Most workflows require a new simulation CT scan and a new RT treatment plan, which can be laborious and resource-intensive for the hospital, and onerous for patients [10]. Many clinical protocols are institution-specific, and other centres use this approach only in the research arena. Studies are starting to show both dosimetric [11] and clinical [12] benefit for ART in selected patients, but many patients may not require this intervention, and rational selection methods are needed [1,13].
A crucial dose-limiting OAR for HNC RT is the spinal cord (SC). With modern RT equipment and techniques, severe SC toxicity in the form of transverse myelitis is extremely rare, although Lhermitte's syndrome remains surprisingly common [14]. Nonetheless, transverse myelitis is catastrophic, and the SC is treated with great respect during planning; conservative dose constraints are given the highest priority in the treatment planning system optimiser. HNC patients experience weight loss and anatomical change during a course of RT treatment [3,4], and it could be hypothesised that all internal anatomy, including the SC, may be subject to significant differences between D_P and D_A as a result. Available literature suggests that such differences are generally small, and depend on the frequency and quality of image-guidance (IG) [7,15,16]. However, these papers study small cohorts, and there are minimal data examining potential associations between weight loss, anatomical change, and differences in SC dose.
The major objectives for this work were: firstly, to examine differences between planned and delivered SC dose in a systematic way in a large cohort; secondly, to measure weight loss and inter-fraction anatomical change in the same patients; finally, to use these data to look for factors that may predict clinically important dose differences, which could in turn be managed by ART strategies.

Patient data and treatment planning
VoxTox is an interdisciplinary research programme based at the University of Cambridge [8,17], seeking to define differences between D_P and D_A, and better understand relationships between radiation dose and toxicity. The study received ethical approval in February 2013 (13/EE/0008) and is part of the UK Clinical Research Network Study Portfolio (UK CRN ID 13716).
For this pre-planned sub-study, a cohort of 133 HNC patients treated between 2010 and 2016 was defined, with inclusion criteria as follows: squamous or salivary gland carcinomas, a minimum prescription dose of 60 Gy in 30 fractions, neck irradiation to include at least levels II and III unilaterally, and availability of all daily mega-voltage CT (MVCT) images for dose recalculation. Baseline patient characteristics and treatment protocols are summarised in Table 1. All patients in the study were immobilised with a 5-point fixation thermoplastic shell for CT-simulation and treatment. Target and OAR volume definition, as well as CTV and PTV margins, were in line with a current UK trial protocol [18]. Manual SC contours were expanded axially by 3 mm to a planning organ at risk volume (PRV), to which a dose objective of 46 Gy, and constraint of 50 Gy, was applied. All patients were treated on TomoTherapy Hi-Art units with daily MVCT image guidance (IG) and positional correction with a zero-action level approach (DIPC) [19]. Although IG-MVCTs had a smaller field of view than corresponding kVCT planning scans, all of the upper cervical SC, corresponding to the area of highest cord dose, was imaged daily for all patients. Specifics of the IG workflow used during treatment of patients in this study are detailed in Supplementary material.

Computing delivered dose
The planning kVCT scans of all patients were retrieved from archive, tokenised, and reloaded into segmentation software (Prosoma 3.3, MEDCOM, Darmstadt, Germany). To ensure consistency, the SC was manually re-contoured on all planning scans by the first author. The inter-observer consistency of this observer relative to 5 senior radiation oncologists experienced in managing HNC or CNS tumours was found to be acceptable, as previously reported [20]. All MVCT IG imaging (over 4000 MVCT scans), kVCT structure sets, planned dose cubes, and TomoTherapy delivery plans were transferred to the University of Cambridge Cavendish Laboratory for curation, and automated processing using the Ganga task-management system [17,21].
Deformable image registration was used to propagate kVCT SC contours onto daily MVCT images. This was performed using the Elastix software [22], trained and validated as previously described [20]. Daily dose was calculated using a locally implemented ray-tracing algorithm, CheckTomo [8,20,23,24], and voxel dose-histories were accumulated. Final SC delivered dose was reported as a cumulative dose volume histogram (DVH). To minimise sources of discrepancy in the process of dose calculation, planned SC dose was also re-computed using CheckTomo. As the SC is a serial organ [25], we examined maximum dose, and report D2% in line with ICRU 83 [26]. The difference, ΔSCD2%, between D_A D2% and D_P D2% was used for comparison with predictive variables. ΔSCD2% is reported as (D_A - D_P), as the clinically relevant difference in this context is higher D_A.
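The accumulation and near-maximum dose steps described above can be illustrated with a minimal Python sketch. It is not the CheckTomo or Elastix implementation: it assumes that deformable registration has already mapped every fraction onto a common set of SC voxels of equal volume, and the function names (`accumulate_dose`, `d_near_max`, `delta_scd2`) are hypothetical.

```python
import math


def accumulate_dose(fraction_doses):
    """Sum per-fraction voxel doses (Gy) into one cumulative dose per voxel.

    Assumes index i refers to the same tissue voxel in every fraction.
    """
    total = [0.0] * len(fraction_doses[0])
    for fraction in fraction_doses:
        for i, dose in enumerate(fraction):
            total[i] += dose
    return total


def d_near_max(voxel_doses, volume_percent=2.0):
    """Dose received by at least the hottest `volume_percent` of the structure.

    Assumes all voxels have equal volume, so volume fraction = voxel fraction.
    """
    ranked = sorted(voxel_doses, reverse=True)
    k = max(1, math.ceil(len(ranked) * volume_percent / 100.0))
    return ranked[k - 1]


def delta_scd2(planned_voxels, delivered_voxels):
    """Delivered minus planned D2%; positive means delivered dose was higher."""
    return d_near_max(delivered_voxels) - d_near_max(planned_voxels)
```

With this sign convention, a positive value corresponds to the clinically relevant case of delivered dose exceeding planned dose.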
The potential impact of DIPC on delivered SC dose was investigated by simulating D_A values in the absence of any IG. MVCT DICOM headers include details of daily radiographer couch shifts. These values were combined to compute an average couch shift for each patient. The spinal cord contour was translated by the inverse of this shift, relative to the planned dose cube, and D2% was recorded in this position. D_P D2% values were then subtracted to give a simulated 'no IG' ΔSCD2% value.
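The 'no IG' simulation can be sketched as follows. This is an illustrative simplification rather than the study code: the dose cube is represented as a dictionary keyed by voxel index, shifts are rounded to whole voxels instead of being interpolated, and the sign convention for couch shifts and all function names are assumptions.

```python
def mean_shift(daily_shifts_mm):
    """Average couch shift (dx, dy, dz) in mm over all treatment fractions."""
    n = len(daily_shifts_mm)
    return tuple(sum(s[axis] for s in daily_shifts_mm) / n for axis in range(3))


def translate_contour(voxels, shift_mm, voxel_size_mm):
    """Move contour voxel indices by the inverse of the mean couch shift.

    The shift is converted to whole voxels; a real implementation would
    work in millimetres and interpolate the dose cube.
    """
    offset = tuple(-round(shift_mm[a] / voxel_size_mm[a]) for a in range(3))
    return [tuple(v[a] + offset[a] for a in range(3)) for v in voxels]


def sample_dose(dose_cube, voxels):
    """Look up planned dose (Gy) at each voxel; missing voxels count as 0 Gy."""
    return [dose_cube.get(v, 0.0) for v in voxels]
```

The D2% of the sampled doses, minus the planned D2%, would then give the simulated difference for that patient.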

Predictive variables and anatomical change
To replicate previous methodologies, disease T and N staging data were examined as potential predictors of SC dose differences [27]. Binary classification was used for both metrics (T0-2 vs. T3-4, N0-1 vs. N2-3). Potential differences in SCD2% between patients undergoing unilateral neck irradiation (UNI) and bilateral neck irradiation (BNI) were examined, as was the effect of dose gradient in the vicinity of the spinal cord. To do this, the SC contour on the kVCT scan was grown axially by 6 mm, twice the PRV margin. On the CT slice with the highest SC D_P D2%, 4 point doses on this SC + 6 mm ring were measured at 0, 90, 180 and 270 degrees relative to the SC centroid. From these values, corresponding values on the same vector at the edge of the SC contour were subtracted. The four differences were summed, then divided by 24 (4 points × 6 mm) to give a mean dose gradient in Gy/mm (Supplementary Fig. 1).
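As a worked illustration of the gradient metric, the short Python sketch below reproduces the arithmetic described above. The function name and the example doses are hypothetical; only the 6 mm ring offset and the four sampling directions come from the text.

```python
def mean_dose_gradient(ring_doses, edge_doses, ring_offset_mm=6.0):
    """Mean dose gradient (Gy/mm) between the SC edge and the SC + 6 mm ring.

    ring_doses and edge_doses hold point doses (Gy) sampled along the same
    directions (0, 90, 180 and 270 degrees from the SC centroid).
    """
    differences = [ring - edge for ring, edge in zip(ring_doses, edge_doses)]
    # Four points at 6 mm each gives the divisor of 24 used in the text.
    return sum(differences) / (len(differences) * ring_offset_mm)


# Hypothetical point doses in Gy, for illustration only.
example = mean_dose_gradient([40.0, 42.0, 38.0, 44.0], [34.0, 36.0, 32.0, 38.0])
```

Here each direction rises by 6 Gy over 6 mm, so the example evaluates to 1.0 Gy/mm.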
Weight loss (WL) is a common reason to instigate ART [5,13], and previous work has directly linked weight loss to changes in SC dose [28]. Patients within VoxTox are weighed at baseline, and weekly during treatment. For this study, baseline weight and weight measured in the final week of treatment were used to calculate a difference (ΔWL). Twenty-eight patients had missing or inadequate data, leaving 105 patients for this analysis.
We hypothesised that reducing neck separation might be associated with differences in SCD2%. To test this, first and final fraction IG-MVCT images were reloaded into Prosoma. Caliper measurements of lateral neck diameter (LND) at the level of the C1 vertebra and superior thyroid notch (TN) were made on both scans (Fig. 1A-D) [27]. Automated external contours were generated on the same CT slice, and the contour (slice) surface area (SSA) was measured (Fig. 1E-H) [3]. Changes from the first to the final fraction of RT were recorded as ΔC1LND, ΔC1SSA, ΔTNLND and ΔTNSSA. One patient with very atypical setup (extreme cervical kyphosis; axial plane at C1 included maxillary sinus anteriorly, and spinous process of C3 posteriorly) was excluded, leaving 132 for this analysis.

Statistical analysis
Patient weight data were directly entered electronically into MOSAIQ data management software (Elekta, Stockholm, Sweden). Anatomical measurement and DVH data were stored in Microsoft Office Excel 2010. Statistical analysis was undertaken using Excel, and R statistical software (R Notebook, R version 3.4.0). Means and 95% confidence intervals are reported for normally distributed data. Links between categorical variables (UNI vs. BNI, T0-2 vs. T3-4, N0-1 vs. N2-3) and ΔSCD2% were analysed with two-sample t-tests; changes in anatomical variables were assessed with paired t-tests. Collinearity between changes in anatomical metrics was assessed with Pearson correlation coefficients (R, R²). Relationships between these changes and ΔSCD2% were examined as univariate relationships with linear regression models (r, r²) [1].
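The analysis itself was run in Excel and R; the Python sketch below only illustrates the statistics named above using the standard library. It computes a Pearson coefficient, a univariate least-squares fit with r², and a two-sample t statistic. The unequal-variance (Welch) form is an assumption, since the text does not state which variant was used, and p-values are omitted because they need a t-distribution that the standard library does not provide.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_fit(x, y):
    """Univariate least-squares fit; returns (slope, intercept, r squared)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx, pearson_r(x, y) ** 2


def welch_t(group_a, group_b):
    """Two-sample t statistic without assuming equal variances (an assumption)."""
    na, nb = len(group_a), len(group_b)
    ma, mb = sum(group_a) / na, sum(group_b) / nb
    va = sum((v - ma) ** 2 for v in group_a) / (na - 1)
    vb = sum((v - mb) ** 2 for v in group_b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)
```

In this setting `x` would hold a change metric such as weight loss per patient and `y` the corresponding dose difference, while the two groups passed to `welch_t` would be, for example, UNI and BNI patients.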
Simulated ΔSCD2% in the absence of IG was also normally distributed (Fig. 2B). Interestingly, the sample mean was similar (-0.47 Gy, 95% CI -0.88 to -0.05), but the distribution was substantially broader (mean difference independent of direction 1.8 Gy) and a larger range was observed, -8.1 Gy to 6.4 Gy.
In 72 (54.1%) patients planned SC dose was higher, whilst in 61 (45.9%) delivered dose was higher. Four patients in the cohort had a delivered D2% that was 2 Gy or more higher than planned D2%, and the biggest observed difference was 3.8 Gy (D_P = 31.4 Gy, D_A = 35.2 Gy). No patient in the cohort had a delivered D2% that breached tolerance dose. There was no relationship between planned D2% and whether ΔSCD2% was positive or negative (Fig. 2A).

Anatomical change
Weight loss, and start-to-end of treatment anatomical change data are shown in Table 2. In order to better understand patterns of anatomical change, and to ensure that univariate relationships between (relative) anatomical metrics and changes in SC dose were independently meaningful, correlation statistics between weight and anatomical change metrics were calculated. Statistical significance (p < 0.001) was found for all relationships, but no correlation was sufficiently strong to preclude separate analysis versus dose change. Correlations between weight loss and shape metrics were generally weaker (Pearson's product moment correlation, R 0.28 to 0.40) than relationships between shape metrics themselves (R 0.37-0.61). Full results of this analysis are shown in Supplementary Table 1.
Univariate linear regression models were used to compare relative change in anatomical metrics and ΔSCD2%. Results are shown in Table 3, and scatter plots are available in Supplementary Fig. 3. In this cohort, no meaningful association between weight loss, or any other metric of shape change, and ΔSCD2% was observed (r² < 0.05 for all models).

Spinal cord dose
This study assesses the difference between planned and delivered SC dose in a cohort of 133 patients, compared to sample sizes of 10-20 patients in previously published work [10,16,29-33]. It is the first to do so by accumulating dose from daily IG scans, whilst systematically analysing anatomical change during radiotherapy, and searching for factors that predict for higher than planned delivered dose to the spinal cord.
The magnitude of absolute differences in SC dose seen in this study (0.9 Gy, 2.5% of planned dose) is broadly similar to previously reported data (2.1-4.9%) [10,29-33]. However, these studies found SC delivered dose to be systematically higher than planned, in contrast to data presented here. Other authors have not observed such clear systematic differences; Robar and colleagues report a sample mean of 0.3% (sd 4.7%) for ΔDmax 1cc, similar to our mean ΔSCD2% of -0.07 Gy (-0.2% of mean D_P D2%, sd 3.4%) [6], and a more recent study found a mean SC ΔDmax of 0.4 Gy (in plans with a 5 mm CTV to PTV margin) [33]. Differences between planned and delivered dose on the TomoTherapy system have also been reported. Using daily MVCT-IG on a cohort of 20 HNC patients undergoing BNI, Duma et al found that 51% of treatment fractions had a Dmax higher than planned, and an overall difference of 1.2% from the plan [15]. The same authors found a 'systematic deviation' between planned and accumulated Dmax in 75% of patients [16], similar to the 74.4% (99/133) of patients in this study who had a delivered SCD2% >1% different to planned D2%.
Nonetheless, the discrepancy between data presented here, and studies in which delivered SC dose is systematically higher, merits further discussion. One possible explanation is the frequency of imaging for dose accumulation. Some researchers have accumulated dose from scheduled (kVCT) rescans, and interpolated dose between timepoints (Castadot: 4 scans; Ahn: 3 scans; Bhide: 4 scans; Cheng: 2 scans) [10,29,30,32], whilst others use weekly CBCT [31]. Interestingly, all these studies reported systematically higher delivered dose. In contrast, authors using daily IG images to accumulate dose saw smaller systematic differences (Duma et al (MVCT), D_A 0.16 Gy higher; van Kranen et al (CBCT), D_A 0.4 Gy) [16,33].
PRV margins may also be relevant, and reporting on their use is inconsistent. Graff and colleagues did not find greater dose differences for a 4 mm PRV than for the SC itself [34]. However, Castadot et al found that the difference in SC-PRV (4 mm margin) Dmax (1.9 Gy) was more than twice that seen for the cord [10], lending credence to the notion that PRV driven optimisation results in steep dose gradients away from the cord itself, and a more homogeneous 'dose-island' within. Thus anatomical change and setup error may result in significant differences to PRV dose, without substantial changes to cord dose itself. Our data support this logic; delivered SC dose was systematically higher than planned in patients with a steep dose gradient in the vicinity of the cord itself. Image guidance policy may also be important. In our simulation, we found mean (direction agnostic) ΔSCD2% to be double the calculated values where daily IG was used (1.8 Gy vs 0.9 Gy). This supports the findings of previous studies, where daily IG use is associated with smaller dose differences [16,33], and where a direct relationship between frequency of IG and the magnitude of dose difference is shown [15]. In line with these data, we believe that the small differences seen and reported are due to our policy of DIPC.

Anatomical change and predictors of dose difference
The data provide no evidence to support the initial hypothesis that patients undergoing bilateral neck treatment would be more likely to see higher delivered SC doses. Furthermore, the results show no effect of disease T-stage on ΔSCD2%, in line with previous work [27]. A possible relationship between more advanced nodal disease and higher SC dose is suggested, although statistical significance was not reached. Interestingly, N-stage is an important parameter in models that predict for the need for ART [13].
The observed mean weight loss of 7.9% is similar to previously published figures (5-11.3%) [3,4,30,31,35,36]. Crucially, no relationship was seen between weight loss, and higher than planned SC doses, a point on which the literature lacks consensus. The general notion that weight loss leads to significant dosimetric changes is commonly held [1,37], and one study has shown a link between weight loss and changes in SC dose [28]. Others have not [27,31], a finding replicated here. Our study is substantially larger than any which has previously addressed this question, and helps to clarify this point.
Patients undergoing radical RT for HNC may undergo shape change independent of weight loss; studies have shown that reducing neck diameter is common during treatment [27,36,38]. In-silico modelling suggests reducing neck diameter may lead to higher than planned dose to the SC and brainstem [39], and some clinical data have linked such shape change to higher SC dose. Capelle et al [27] found a significant correlation between reducing LND at the TN and ΔSCD2%, although no relationship for reduction at C1. Ahn and colleagues [29] found a significant correlation (R = 0.3) between reduction at the level of the 'mandibular joint' and increased SC dose, a surprising result given that this structure is superior to the foramen magnum in most patients (in the axial plane). We observed significant reduction in both lateral separation and axial surface area at the level of both the C1 vertebra and the thyroid notch. The shape change data presented here are similar in magnitude to those previously reported [27,29,38], but no relationships between these changes and a systematic increase in cord dose were seen. We offer three explanations. The first is the concept suggested by Graff and colleagues [34], that the spinal cord may be preserved from significant dosimetric change due to its central location, and the use of a PRV margin. This leads to the second point, that dosimetric differences are likely to be random, with minimal impact from systematic error [6]. Finally, most importantly, and based on our simulation of D_A in the absence of IG and the logic of Duma et al [15], we suggest that our policy of DIPC is crucial to the small differences we report.

This is the largest analysis of differences between planned and delivered spinal cord dose in patients undergoing curative radiotherapy for HNC. All patients in the study underwent daily IG with positional correction, and a zero-action level, and observed differences between planned and delivered spinal cord dose were small. No patient had a delivered D2% that breached tolerance dose. Simulated dose differences in the absence of IG were double calculated values, and patients with steep dose gradients in the vicinity of the spinal cord were more likely to have delivered spinal cord dose higher than planned. Weight loss and anatomical change were common and substantial, but had no impact on spinal cord dosimetry. This finding is novel and may assist clinicians making decisions about ART for patients with HNC who undergo significant inter-fraction weight loss and shape change. In patients treated with daily IG, weight loss and shape change do not mandate radiotherapy replanning for spinal cord safety.