From the ¹Brazilian Parkinson’s disease Rehabilitation Initiative (BPaRkI), ²Physical Therapy Postgraduate Program, Physical Therapy Department, Santa Catarina State University (UDESC), Florianópolis and ³Physical Therapy Department, Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
Objective: To investigate the measurement properties of the Timed Up and Go Assessment of Biomechanical Strategies (TUG-ABS) to determine its adequacy for use with individuals with Parkinson’s disease.
Subjects: Fifty individuals with Parkinson’s disease.
Design: Diagnostic accuracy.
Methods: The study investigated the following properties: reliability (inter-examiner, intra-examiner, test-retest, internal consistency and minimal detectable change), construct validity, and floor and ceiling effect.
Results: For the total score, the inter-examiner, test–retest and intra-examiner reliabilities were classified as excellent (0.95 ≤ intra-class correlation coefficient (ICC) ≤ 0.99). The TUG-ABS presented excellent internal consistency (α = 0.98). The minimal detectable change was 3.82 points. The correlation between the TUG-ABS and the Unified Parkinson’s Disease Rating Scale (UPDRS) – part III, used to assess construct validity, was classified as moderate (ρ = –0.62). A significant, high, positive correlation was obtained between TUG-ABS and section VI of the Balance Evaluation System Test (BESTest) (ρ = 0.72), and a significant, high, negative correlation between TUG-ABS and TUG (ρ = –0.78). The discriminant function obtained with the total score of TUG-ABS correctly classified 60% of the individuals with respect to the group (determined by performance in TUG) to which they belonged. One-way analysis of variance (ANOVA) showed that TUG-ABS discriminated between individuals with Parkinson’s disease in all stages according to Hoehn & Yahr. There was a ceiling effect of 22%.
Conclusion: TUG-ABS presented adequate measurement properties in individuals with Parkinson’s disease.
Key words: validity; reliability; Parkinson’s disease; physiotherapy; sensitivity; specificity.
Accepted May 19, 2017; Epub ahead of print Sep 26, 2017
J Rehabil Med 2017; 49: 00–00
Correspondence address: Alessandra Swarowsky, Brazilian Parkinson’s disease Rehabilitation Initiative (BPaRkI), Physical Therapy Postgraduate Program, Physical Therapy Department, Santa Catarina State University (UDESC), Florianópolis, Brazil. Rua Pascoal Simone, 358. Zip code 88080350, Brazil. E-mail: firstname.lastname@example.org
Mobility changes are an inevitable consequence of Parkinson’s disease (PD) and an important cause of reduced quality of life in this population (1). Functional mobility is defined as the ability of a person to move independently and safely in diverse environments and to carry out everyday activities, such as locomotion and postural transfers (2). In PD the postural biomechanical changes related to progression of the disease contribute greatly to compromising functional mobility (3). Amongst these changes, the most marked are a narrow base and forward inclination of the trunk, internal rotation of the shoulders, and flexion of the hips and knees (3–6). These changes reflect an increase in flexor tone and weakness of the extensor muscles, and can be associated with common negative effects on gait ability, such as reduced speed, decreased step length and decreased arm swing (3–6).
Of the tests available to evaluate compromised functional mobility in PD, the Timed Up and Go test (TUG) is one of the most used, reported and highly recommended (7–9). TUG was defined by Isles et al. (10) as a measure of the time taken by an individual to carry out a set of functional manoeuvres: getting up, walking, turning and sitting down. Although commonly used and recommended, the outcome provided by TUG is the time of execution, which is of limited value for diagnosis or treatment planning, since it fails to identify what is compromised in carrying out the task. Thus, the Timed Up and Go Assessment of Biomechanical Strategies (TUG-ABS) was developed to identify biomechanical strategies used while carrying out TUG, and its measurement properties have been investigated for hemiparetic individuals (11). The outcome provided by TUG-ABS complements the time measure obtained using TUG. With TUG-ABS, in addition to obtaining a predictive measure of falls for the individuals, one can add standardized information about the biomechanical strategies adopted by these individuals while carrying out the test (11–13). This information could improve diagnosis related to mobility limitation, enabling better clinical reasoning and treatment planning.
The aim of this study was therefore to investigate two principal domains of measurement properties of the TUG-ABS (reliability and validity, comparing accuracy between the tests) in order to determine whether the instrument is adequate for use in clinical practice in individuals with PD. We hypothesized that the TUG-ABS would be reliable and valid for assessment of biomechanical strategies in individuals with PD, during getting up, turning and sitting down performed in sequence.
This prospective study aimed to investigate the measurement properties of the TUG-ABS in 2 principal domains (reliability and validity) in individuals with PD. Both the planning of the investigation and the analysis of the measurement properties followed the recommendations of the COSMIN (COnsensus-based Standards for the selection of health Measurement INstruments) checklist (14–16).
The research project was submitted to the Ethics Committee for Research using Human Beings of the Santa Catarina State University (UDESC, Florianópolis, Santa Catarina, Brazil) according to the terms of Resolution 466/2012, and was approved under report number 1.251.231. All participants were informed about the study objectives and signed an informed consent form (Free and Clarified Term of Consent, FCTC).
Fifty individuals with PD were included, according to the recommendations of COSMIN with respect to the number of participating individuals for a good study (14–16). They were selected by convenience, were of both sexes, and were all participants in the extension project “Neurofunctional Rehabilitation in Parkinson’s Disease” carried out at the Catarinense Rehabilitation Center of the State Secretary of Health of Santa Catarina, Brazil. All evaluations were carried out in the “on” medication phase, with intervals of between 2 and 4 h after administration and always in the afternoon period.
Inclusion criteria were: individuals diagnosed with PD and confirmed by a neurologist according to UK Brain Bank criteria (17); between stages 1 and 4 of the disease (Hoehn & Yahr) (18); cognitive level in the Mini Mental State Examination (MMSE) (19) relative to education levels given as cut-off points by Bertolucci et al. (20) (score ≥ 13 for illiterate subjects, ≥18 for low/medium education level, and ≥ 26 for a high education level); and on stable medication (for at least 4 weeks). Exclusion criteria were: individuals presenting other associated neurological or orthopaedic diseases and/or joint limitations that compromise the musculoskeletal system making mobility difficult, or important and/or severe dyskinesia (greater than 2 in item 33 of the UPDRS) (21).
Participants’ sociodemographic data and clinical history were obtained with an evaluation form, developed previously for this purpose. All evaluation instruments were applied in the following order: Mini Mental State Examination (MMSE) (19), Hoehn & Yahr stage scale (18), and the motor section of the UPDRS–part III (21). The TUG test (22); section VI of the Balance Evaluation System Test (BESTest) (23) and the TUG-ABS test (11) were applied in sequence.
The TUG-ABS test is composed of 15 items that allow evaluation of the movement strategies adopted to carry out TUG, subdivided into the test activities: getting up from a seated position, gait, turning round 180º, and sitting down from a standing position. The total score varies from 15 to 45, with a higher score indicating better performance by the individual (11–13).
Standardization of investigation of the measurement properties
Two evaluators underwent a 2-week training period to familiarize themselves with the TUG-ABS instrument, following guidance from its authors (11, 12). The inter-examiner, test-retest and intra-examiner reliabilities were evaluated. Inter-examiner reliability was established by comparing the results of evaluations performed by 2 independent evaluators (EV1 and EV2) on the same day with a 1-h interval between them; the evaluators had no contact with each other during or between the evaluations. The order of evaluations was always EV1 first, followed by EV2. Test-retest reliability was based on 2 evaluations carried out by the same evaluator (EV1) with a 1-week interval. Both evaluators had 5 years of experience in the evaluation of PD. Intra-examiner reliability was also investigated using video, according to the procedures adopted by the author of the instrument (12): 3 Casio EX-FH20® video cameras were used to record the participants’ performance in TUG, while the evaluator (EV1) scored the TUG-ABS from observations in real time. The TUG test was carried out according to the protocol recommended by Podsiadlo & Richardson (22). Four weeks later the same evaluator (EV1) scored the TUG-ABS again from the previously recorded videos of the same evaluations, presented in random order. The videos were processed and edited using Adobe Premiere Pro 2.0® software, which allows the 3 viewpoints to be grouped in the same file and observed simultaneously on screen. The cameras were positioned as shown in Fig. 1. The camera focus was positioned at a height of 105 cm. All evaluations were carried out according to the procedures adopted for the development of TUG-ABS (24).
Fig. 1. Camera positioning scheme to register performance in Timed Up and Go test.
Sample characterization. The sociodemographic characteristics of the participants were described using descriptive statistics (mean and standard deviation (SD)).
Floor and ceiling effects. The floor and ceiling effects were verified by calculating the percentage of individuals who obtained the minimum and maximum scores in the TUG-ABS. When the incidence of either of these scores was greater than 15% of the sample, the floor (minimum) and/or ceiling (maximum) effects were present, and this criterion was adopted in the present study (25).
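The 15% criterion described above can be sketched in a few lines of Python. This is only an illustration: the function name `floor_ceiling_effect` and the example scores are hypothetical, and the score range of 15 to 45 is taken from the description of the TUG-ABS total score.

```python
def floor_ceiling_effect(scores, min_score=15, max_score=45, threshold=0.15):
    """Return the fraction of scores at the minimum and maximum of the scale,
    plus flags telling whether each fraction exceeds the threshold (15%)."""
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    floor_fraction = sum(1 for s in scores if s == min_score) / n
    ceiling_fraction = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_fraction": floor_fraction,
        "ceiling_fraction": ceiling_fraction,
        "floor_effect": floor_fraction > threshold,
        "ceiling_effect": ceiling_fraction > threshold,
    }


# Hypothetical sample: 11 of 50 participants at the maximum score of 45.
example_scores = [45] * 11 + [30] * 39
result = floor_ceiling_effect(example_scores)
print(result["ceiling_fraction"], result["ceiling_effect"])
```

With 11 of 50 scores at the maximum, the ceiling fraction is 0.22, which exceeds the 0.15 threshold.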
Psychometric domains: reliability and validity. The intra-class correlation coefficient (ICC) with 95% confidence interval (95% CI) was used to evaluate the inter-examiner, intra-examiner and test-retest reliabilities. In the presence of significant correlation (α = 5%), the following classification was adopted: weak agreement, ICC < 0.40; moderate agreement, 0.40 ≤ ICC ≤ 0.75; and excellent agreement, ICC > 0.75 (24).
The inter-examiner reliability was also evaluated for the scoring of each item of the TUG-ABS using the weighted kappa (kp) for the individual items. In the presence of significant agreement (α = 5%), the following classification was adopted: excellent for kp values from 0.81 to 1.0; substantial, 0.61–0.80; moderate, 0.41–0.60; fair, 0.21–0.40; and slight, 0–0.20 (27). A Bland & Altman plot analysis was used to examine in more detail any differences between the evaluators’ scores in the TUG-ABS test (25).
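The quantities behind a Bland & Altman plot can be computed as in the sketch below. It assumes the conventional limits of agreement (mean difference ± 1.96 × SD of the differences); the study does not state how its limits were computed, and the function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b, z=1.96):
    """Return (bias, lower_limit, upper_limit) for paired scores.

    bias is the mean of the paired differences; the limits of agreement are
    bias +/- z * SD of the differences (conventional choice, assumed here).
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two paired lists with at least 2 values each")
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(differences)
    sd = stdev(differences)  # sample standard deviation
    return bias, bias - z * sd, bias + z * sd


# Hypothetical total TUG-ABS scores given by two evaluators.
ev1 = [40, 38, 45, 30, 42]
ev2 = [39, 38, 44, 32, 41]
bias, lower, upper = bland_altman(ev1, ev2)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias close to zero with narrow limits indicates that the two sets of scores agree closely.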
The internal consistency, another type of reliability, was established using Cronbach’s alpha coefficient, adopting a value between 0.70 and 0.90 to classify good agreement between the items of TUG-ABS (26).
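For reference, Cronbach’s alpha can be computed from an item-by-participant score matrix as sketched below, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the small data set are hypothetical; the study itself used statistical software.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of rows (one row of item scores per participant)."""
    if len(rows) < 2:
        raise ValueError("need at least 2 participants")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("each row needs the same number (>= 2) of items")
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores of 3 participants on 2 items (each item scored 1 to 3).
print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 3))
```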
The minimal detectable change (MDC) was calculated from the z-score for the chosen confidence level, the standard deviation of the baseline evaluation (EV1) and the test-retest correlation. It refers to the minimal amount of change that is not due to random variation or measurement error on the TUG-ABS scale. The MDC was obtained using the formula $\mathrm{MDC} = z \times SD_{\text{baseline}} \times \sqrt{2\,(1 - r_{\text{test-retest}})}$, where $z$ is the z-score for the level of confidence, considering a 95% CI (25).
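The MDC formula above translates directly into code. In this sketch the function name is hypothetical, z = 1.96 corresponds to the 95% confidence level, and the baseline SD of 6.9 points is an assumed value chosen only for illustration (the baseline SD is not reported in this section); the test-retest coefficient of 0.96 is the value reported for the total score.

```python
import math


def minimal_detectable_change(sd_baseline, r_test_retest, z=1.96):
    """MDC = z * SD_baseline * sqrt(2 * (1 - r_test_retest))."""
    if sd_baseline < 0:
        raise ValueError("sd_baseline must be non-negative")
    if not 0 <= r_test_retest <= 1:
        raise ValueError("r_test_retest is expected to lie between 0 and 1")
    return z * sd_baseline * math.sqrt(2 * (1 - r_test_retest))


# Assumed baseline SD (hypothetical) with the reported test-retest ICC of 0.96.
mdc = minimal_detectable_change(sd_baseline=6.9, r_test_retest=0.96)
print(round(mdc, 2))
```

The formula is equivalent to z × √2 × SEM, where SEM = SD × √(1 − r) is the standard error of measurement.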
We hypothesized that the internal consistency of TUG-ABS would vary between 0.7 and 0.9, and the reliabilities should exceed 0.75 for ICC and 0.6 for kp value (26, 27).
Construct validity was investigated by convergence analysis between the TUG-ABS scores and the TUG results; between the TUG-ABS scores and the total score of UPDRS-III; and between the TUG-ABS scores and the total score of section VI of the BESTest, which refers to gait stability. Non-parametric Spearman’s correlation was used for these analyses. In the presence of significant correlation (α = 5%), the following classification was adopted: low 0.20 ≤ ρ ≤ 0.49, moderate 0.50 ≤ ρ ≤ 0.69, high 0.70 ≤ ρ ≤ 0.89 and very high 0.90 ≤ ρ ≤ 1 (28), applied to the absolute value of ρ. For the convergence analysis, we hypothesized that the correlations of the UPDRS-III and TUG with TUG-ABS would be negative, with a magnitude greater than 0.5, and that the correlation between BESTest section VI and TUG-ABS would be positive and greater than 0.5.
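A minimal sketch of Spearman’s correlation and of the classification bands listed above follows. The function names are hypothetical. Ties receive average ranks, and the bands are applied to the absolute value of ρ with "greater than or equal to" thresholds, which is an assumption that closes the small gaps between the published bands (for example between 0.49 and 0.50).

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least 2 values each")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def classify_correlation(rho):
    """Map |rho| onto the bands used in the text (thresholds are an assumption)."""
    magnitude = abs(rho)
    if magnitude >= 0.90:
        return "very high"
    if magnitude >= 0.70:
        return "high"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.20:
        return "low"
    return "below the lowest band"


# Hypothetical data: longer TUG times paired with lower TUG-ABS scores.
print(spearman_rho([8.0, 10.5, 12.0, 20.0], [45, 41, 38, 25]))
print(classify_correlation(-0.78))
```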
The construct validity was also investigated by discriminant analysis, a multivariate analysis that enables evaluation of how the variables differ between the groups and how well these variables classify individuals into the group to which they belong. For this analysis, the individuals with PD were divided into 3 subgroups based on their performance in the TUG test (quick, moderate or slow). For this purpose, the TUG times of the individuals were organized in increasing order, as already carried out in an earlier study with another population group (24). Tertiles, with cut-off points at 33% and 66%, were then used to divide the individuals into the 3 above-mentioned subgroups. One-way analysis of variance (ANOVA) was used to compare the TUG times of the 3 subgroups, followed by the post-hoc test (α = 5%), to determine whether the subgroups differed with respect to the TUG time (24, 27). This first analysis with ANOVA was important to add information about the construct validity of the TUG-ABS, since the development of this instrument was based on the selection of variables that could differentiate between individuals who carried out the TUG test in different times.
Discriminant analysis can be used for descriptive and predictive purposes. In the present study, discriminant analysis was carried out to investigate whether the TUG-ABS scores could predict membership of the groups determined by performance in TUG (quick, moderate and slow) (24, 29). We hypothesized that TUG-ABS could correctly classify individuals into these groups.
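The division into quick, moderate and slow subgroups can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the tertile cut-offs come from Python’s `statistics.quantiles` with its default method (the exact cut-off rule used in the study is not described), and the TUG times are invented.

```python
from statistics import quantiles


def split_by_tertiles(tug_times):
    """Label each TUG time 'quick', 'moderate' or 'slow' using tertile cut-offs."""
    if len(tug_times) < 3:
        raise ValueError("need at least 3 TUG times")
    lower_cut, upper_cut = quantiles(tug_times, n=3)  # two cut points
    labels = []
    for time_s in tug_times:
        if time_s <= lower_cut:
            labels.append("quick")
        elif time_s <= upper_cut:
            labels.append("moderate")
        else:
            labels.append("slow")
    return labels


# Hypothetical TUG times in seconds.
times = [9.0, 7.5, 12.0, 15.5, 8.2, 20.1, 10.4, 11.3, 18.0]
print(split_by_tertiles(times))
```

Because the labels depend only on where each time falls relative to the cut-offs, tied times always land in the same subgroup.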
Finally, comparison of the known groups was also used to investigate the construct validity of the TUG-ABS (to verify whether the groups determined by Hoehn & Yahr are the same as those determined by the TUG-ABS). For this purpose the known groups were determined according to the Hoehn & Yahr scale using 3 degrees of impairment: mild (stages 1 and 2), moderate (stage 3) and severe (stage 4). One-way ANOVA with Tukey’s post-hoc (α = 5%) was used to compare the TUG-ABS scores obtained by each of these 3 groups (to verify differences between the 3 groups). We hypothesized that TUG-ABS would also be able to differentiate the groups determined by Hoehn & Yahr.
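The F statistic of a one-way ANOVA, as used for the known-groups comparison, can be computed as in the sketch below. It covers only the F ratio and its degrees of freedom; the p-value and Tukey’s post-hoc test, which the study obtained from statistical software, are not implemented. The function name and the group scores are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need >= 2 non-empty groups and more scores than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared distance to grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: squared distance of each score to its group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical TUG-ABS total scores for mild, moderate and severe groups.
mild, moderate, severe = [44, 45, 43], [38, 37, 39], [28, 30, 29]
print(one_way_anova_f([mild, moderate, severe]))
```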
All the data were analysed using the MedCalc 12.5.0 and SPSS 20.0 programs, both for Windows, accepting a significance level of 5% for all these procedures.
The present study was carried out from August 2015 to July 2016.
The mean age of the individuals with PD who took part in this study was 67.38 years (SD 8.98), with no sex predominance. According to the Hoehn & Yahr scale there was a predominance of participants in the mild (1 & 2) and moderate (3) stages, of 82%. Table I shows the descriptive statistics of the clinical-demographic characteristics of the sample. No adverse events occurred during the tests.
Table I. Clinical and sociodemographic characteristics of the study participants
The TUG-ABS test presented a ceiling effect of 22% (11/50) and 10 of these 11 individuals were classified as being in the mild impairment stage (25).
In the evaluation of the inter-examiner reliability for the TUG-ABS test, the ICC (95% CI) for the total score was 0.95 (0.93–0.98). The value for kp for the total score was 0.8 (0.74–0.86) and for the individual items (95% CI) varied between 0.27 and 0.73 (Table II). The minimum and maximum values for kp refer to questions 7 and 12, respectively. The Bland & Altman plot (Fig. 2a) illustrates the inter-examiner agreement, considering the total scores. For the TUG-ABS test, the mean difference between the 2 evaluations did not differ significantly from zero, and the limits of agreement represented 6% and 10.2% of the variation of the instrument.
Table II. Inter-examiner reliability analysis for Timed Up and Go Assessment of Biomechanical Strategies (TUG-ABS)
Fig. 2. Bland & Altman plot analysis showing inter-evaluator and test-retest agreement considering the total Timed Up and Go Assessment of Biomechanical Strategies (TUG-ABS) scores. (a) Mean of the evaluations and inter-examiner differences that did not differ significantly from zero; (b) Mean of the evaluations and test-retest differences that also did not differ significantly from zero; (c) Mean of the evaluations and intra-examiner differences that also did not differ significantly from zero.
With respect to the test-retest reliability of TUG-ABS, the ICC (95% CI) for the total score was equal to 0.96 (0.93–0.98) and for the individual items varied between 0.39 (question 2) and 0.95 (question 4) (Table III). Analysis of the Bland & Altman plot (Fig. 2b) showed that the limits of agreement represented 8% and 9.1% of the scale variation.
Table III. Test-retest reliability analysis for Timed Up and Go Assessment of Biomechanical Strategies (TUG-ABS)
In the intra-examiner reliability according to the TUG-ABS video, the ICC (95% CI) for the individual items varied between 0.56 (question 2) and 0.97 (question 12), and was equal to 0.99 (0.98–0.99) for the total score (Table IV). Analysis of the Bland & Altman plot (Fig. 2c) showed that the limits of agreement represented 4.2% and 4.6% of the variation of the instrument.
Table IV. Intra-examiner reliability analysis by video for Timed Up and Go Assessment of Biomechanical Strategies (TUG-ABS)
The internal consistency of the TUG-ABS test was considered excellent, with a Cronbach’s alpha coefficient of α = 0.98.
The MDC for the total score of the TUG-ABS scale was 3.82 points.
With respect to the construct validity according to the convergence analysis, the total score of the TUG-ABS test presented a high negative correlation with the TUG test (ρ = –0.78; p < 0.001), a moderate negative correlation with the UPDRS – part III (ρ = –0.62; p < 0.001) and a high positive correlation with section VI of the BESTest (ρ = 0.72; p < 0.001).
With respect to the construct validity investigated by discriminant analysis, the individuals with PD divided into 3 groups (quick, moderate and slow performance) presented significant differences in the TUG test (Table V): the 1-way ANOVA showed significant differences between the groups (F(2) = 31.01; p < 0.001) and the post-hoc test showed significant differences between all 3 groups (p < 0.05). As can be seen in Table VI, 76.5% of the individuals of the quick group, 18.8% of the moderate group and 82.4% of the slow group were correctly classified by the TUG-ABS score for the initially grouped cases. The discriminant function obtained with the total TUG-ABS score showed a canonical correlation of 0.740 (Wilks’ λ = 0.452, χ² = 37.308, p < 0.001), with 60% of individuals correctly classified into the group to which they belonged.
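The per-group percentages and the overall 60% figure can be reconciled with a short calculation. The group sizes (17, 16 and 17) and the counts of correctly classified individuals (13, 3 and 14) used below are not stated in the text; they are inferred here as the combination that reproduces the reported percentages for a sample of 50, and should be read as an assumption.

```python
# Inferred (not reported) group sizes and correctly classified counts.
group_sizes = {"quick": 17, "moderate": 16, "slow": 17}
correctly_classified = {"quick": 13, "moderate": 3, "slow": 14}

# Per-group classification rates, as percentages.
for group, size in group_sizes.items():
    rate = 100 * correctly_classified[group] / size
    print(f"{group}: {rate:.2f}%")

# Overall accuracy: all correct classifications over the whole sample.
total_correct = sum(correctly_classified.values())
total_n = sum(group_sizes.values())
overall_accuracy = total_correct / total_n
print(f"overall: {total_correct}/{total_n} = {100 * overall_accuracy:.0f}%")
```

Under these inferred counts, the low overall accuracy is driven almost entirely by the moderate group, where only 3 of 16 individuals are placed correctly.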
Table V. Discriminant validity of known groups between the total score for the Timed Up and Go Assessment of Biomechanical Strategies (TUG-ABS) and the Hoehn & Yahr stage (n = 50)
Table VI. Results of the discriminant analysis by known groups (n = 50)
With respect to the construct validity determined by comparison of known groups using the Hoehn & Yahr scale (Table V), the 1-way ANOVA showed significant differences between the known groups (F(2) = 30.16; p < 0.05), and using Tukey’s post-hoc test it was observed that the TUG-ABS score differed significantly between all the groups: p = 0.0001 between the mild and moderate impairment stages; p = 0.003 between the moderate and severe impairment stages; and p = 0.0001 between the mild and severe impairment stages, where the higher the total score in the TUG-ABS, the less severe the impairment.
The aim of the study was to evaluate 2 measurement property domains of the TUG-ABS test in individuals with PD. The test showed adequate results for reliability (intra- and inter-examiner, test-retest and internal consistency, with the MDC also determined and classified as excellent) and for construct validity when used in these individuals.
There was a ceiling effect of 22% for the TUG-ABS test; that is, more than 15% of the participants obtained the maximum total score of 45 points, showing the best performance, and, of these 22%, 91% were classified in the mild impairment stage. Although the instrument was sufficiently sensitive to evaluate individuals with PD in all stages of the disease, there was a ceiling effect for individuals showing the best performance (normally corresponding to the mild impairment stage) apparently signifying that the TUG-ABS test was not sensitive enough to capture alterations presented at the start of the disease. Another possible explanation for these results could be that in the mild stage, the common alterations in functional mobility were not sufficient to alter the biomechanical strategies adopted during the performance of TUG.
The inter-examiner reliability for the total TUG-ABS score presented excellent agreement (ICC = 0.95), a similar result to that found for individuals with stroke (13). When the TUG-ABS items were evaluated for their kp values (0.27–0.73), it was noted that the lowest value referred to the question “Balance phase–absence of foot contact with the ground (majority of the steps)” in the “Gait” phase of TUG. The present results corroborate those found for the original version, with kp = 0.24 for the same task in post-stroke individuals (12).
The test-retest reliability was considered excellent for the total TUG-ABS score (ICC = 0.96). When the TUG-ABS items were analysed individually (0.39–0.95), the item showing the least reliability was the question referring to “Attempts to get up from a seated position associated with the use of the strategy of sitting as close as possible to the edge of the chair” in the “Seated to on foot” phase of TUG.
In the intra-examiner reliability, the total TUG-ABS score showed excellent agreement (ICC = 0.99) and, once again, the lowest reliability referred to the question “Attempts to get up from a seated position associated with the use of the strategy of sitting as close as possible to the edge of the chair”. The item showing the greatest agreement was the same for the 3 reliabilities tested: “gait-turn round-gait sequence” in the “Turning round” phase of TUG, not agreeing with the results of the original version which presented the item “Attempts to get up from a seated position associated with the use of the strategy of sitting as close as possible to the edge of the chair” as the question with greater reliability (kp of 0.73 to 1.00). Reliability was shown to be excellent for both video analysis and application of the instrument in real time, the form of evaluation remaining at the discretion of the physiotherapist according to his or her objectives. In relation to internal consistency, the TUG-ABS presented α = 0.98, this result being similar to that of the original version (α = 0.87), both being considered excellent.
The MDC of 3.82 points found for the total TUG-ABS score was considered excellent. It represents the minimum amount of change in the total score of the instrument needed for the change to be considered a true change in the period between evaluations by the same evaluator (24, 25). Thus, the MDC obtained indicated that a change in the total TUG-ABS score had a less than 8.48% chance of being due to random variation or measurement error (30), which helps the physiotherapist to interpret results obtained with the instrument at different times or with different evaluators.
When analysing the construct validity by convergence analysis, the result of the set of 3 analyses showed the adequacy of this property of the TUG-ABS in individuals with PD (27). The TUG-ABS showed a moderately negative correlation with UPDRS – part III. The items 13, 14, 15, 29 and 30 of the UPDRS, which evaluate postural and gait instability in PD, were recommended for their excellent internal consistency, although they showed a floor effect in the mild impairment stage and did not include items of gait performance. This could explain their moderate correlation with the TUG-ABS, which has items evaluating gait performance in its construct (9).
When the TUG-ABS was correlated with section VI of the BESTest, a high positive correlation was found. Bloem et al. (9) evaluated all the instruments used to evaluate posture, gait and balance in PD, according to the COSMIN guidelines, and concluded that no instrument covered all the gait characteristics specific and relevant to PD. Thus they recommended the best evaluation instruments according to the various constructs of interest. These same authors recommended the use of the MiniBESTest and of the BESTest for the functional evaluation of gait and balance in individuals with PD, since they cover a wide variety of constructs, although they require extra materials and take 10–15 min to apply (9). In the present study, section VI, related to gait stability, was correlated with TUG-ABS, and a high correlation was found, possibly because they refer to similar constructs.
TUG-ABS was developed to complement TUG, a measurement with adequate validity and reliability to evaluate functional mobility in individuals with PD (10, 22, 31, 33). When compared with the TUG time, the TUG-ABS score presented a high negative correlation (ρ = –0.78), indicating that the shorter the TUG time, the higher the TUG-ABS score, characterizing better performance. A time longer than 11.5 s in TUG is considered predictive of falls in PD (32), and the test is capable of discriminating PD fallers from non-fallers (33), as well as presenting a value for MDC of 3.5 s in these individuals (34). The only standardized measurement outcome for TUG is the time, which, although presenting a series of advantages, provides little information about the quality of the performance; the use of TUG-ABS therefore complements the information obtained with TUG. The TUG-ABS provides information about the biomechanical strategies used by these individuals while carrying out important functional activities, such as those that constitute the TUG test, and this information is important for making clinical decisions with respect to these individuals. Thus TUG-ABS can be used to complement the results of TUG in individuals with PD, as already indicated for post-stroke individuals (12, 13).
The construct validity, analysed by discriminant analysis, showed an overall classification accuracy of 60%, indicating that the TUG-ABS score predicted group membership correctly in the majority of the cases initially grouped according to the TUG results. Only the moderate-performance group was poorly classified (18.8%). Analysis of the confidence intervals of the individuals with moderate and slow performance (Table V) indicated a possible explanation for this result: a wide time interval can be observed, and hence the sample in the present study may not have covered the possible diversity in performance. Although the 3 groups presented statistically different TUG results, the groups were formed from a sample selected by convenience. It is possible that, if more individuals with different performances in TUG had been evaluated, the groups formed might have covered a greater variability in performance that could have been better identified by TUG-ABS.
In relation to the construct validity by comparison of known groups, it was verified that the groups in more advanced stages presented lower total scores in the TUG-ABS test, indicating greater impairment of the biomechanical strategies. The statistical analysis revealed that the TUG-ABS test (Table V) was capable of discriminating between the individuals with PD in all stages of the disease. However, these results should be considered with caution, since the severely impaired group consisted of a small sample (9 individuals). This discriminatory capacity of the degree of impairment by instruments is important, since the physiotherapist has difficulty in differentiating mild from moderate stages for questions that the Hoehn & Yahr scale (18) does not consider.
One of the limitations of the present study was the small number of participants in the severe stage of PD, and hence the results cannot be generalized to the population in more advanced stages of the disease. Further studies are required to confirm the results already found, including a greater variability in the individuals with respect to their performance in TUG, which could provide answers to other important questions raised by its use.
In conclusion, the TUG-ABS was shown to be an instrument with reliability and construct validity, with the accuracy to identify the biomechanical characteristics and strategies used by individuals with PD while carrying out the TUG test. The TUG-ABS is therefore of use in clinical practice.