
Original report

Test-retest reliability and validity of the comprehensive activities of daily living measure in patients with stroke

I-Ping Hsueh, MA1,2, Chun-Hou Wang, BSc3, Tsan-Hon Liou, MD, PhD4, Chia-Huang Lin, BSc4 and Ching-Lin Hsieh, PhD1,2

From the 1School of Occupational Therapy, College of Medicine, National Taiwan University, 2Department of Physical Medicine and Rehabilitation, National Taiwan University Hospital, 3School of Physical Therapy, Chung Shan Medical University and Room of Physical Therapy, Chung Shan Medical University Hospital and 4Department of Physical Medicine and Rehabilitation, Shuang Ho Hospital, Taipei Medical University, Taipei, Taiwan

OBJECTIVE: To examine the test-retest reliability, convergent validity, and predictive validity of the comprehensive activities of daily living (CADL) measure in patients with stroke.

DESIGN: A repeated-assessments design, 10–14 days apart, was used to examine test-retest reliability in 70 patients. In the validity study, a further 168 patients were assessed at 6 months and 1 year after stroke.

SETTING: Three rehabilitation units.

MAIN OUTCOME MEASURES: The CADL measure, providing Rasch-calibrated scores, assesses the entire continuum of basic and instrumental activities of daily living. Two domains (self-care and mobility) of the stroke-specific quality of life questionnaire (SS-QOL) were used to examine the convergent validity. The summary score of the SS-QOL was used as the criterion for examining the predictive validity of the CADL measure.

RESULTS: The test-retest reliability was excellent (intraclass correlation coefficient = 0.96). The CADL measure and both domains of the SS-QOL exhibited strong associations at 6 months and 1 year post-stroke (Pearson’s r ≥ 0.77). The score of the CADL at 6 months post-stroke was highly correlated with that of the SS-QOL at 1 year post-stroke (r = 0.75).

CONCLUSION: The CADL measure showed satisfactory test-retest reliability, convergent validity, and predictive validity in patients with stroke.

Key words: stroke; activities of daily living; test-retest reliability; validity.

J Rehabil Med 2012; 44: 637–641

Correspondence address: Ching-Lin Hsieh, School of Occupational Therapy, College of Medicine, National Taiwan University, 4th Floor, 17 Xuzhou Rd., Taipei 100, Taiwan. E-mail: clhsieh@ntu.edu.tw

Stroke is the most common cause of disability in activities of daily living (ADL) among elderly people. The term ADL refers to basic ADL (BADL) or overall ADL (1). However, BADL does not capture important losses in higher levels of ADL function or activities that are necessary for independence in the home and community (i.e. instrumental ADL; IADL) (2). ADL function is related to subjective well-being or quality of life in patients with stroke (3, 4). Thus, both the BADL and IADL measures are recommended as the primary outcome measures for stroke patients after hospital discharge (5).

Several authors recommend combining the BADL measure (e.g. the Barthel Index (6); BI) and the IADL measure (e.g. the Frenchay Activities Index (7); FAI) to comprehensively assess ADL function (8–10). Hsueh et al. (9) modified the BI and FAI using Rasch analysis to form a new measure to assess comprehensive ADL function (called the CADL measure). The CADL measure assesses an enhanced range and continuum of ADL function and has sufficient Rasch reliability and unidimensionality (9, 11). In addition, the CADL scores are Rasch calibrated scores that can be viewed as interval-level measurements and are useful for most statistical techniques (9, 12). However, some other important psychometric properties of the CADL measure remain largely unknown, thus limiting the utility of the measure.

The aims of this study were to examine the test-retest reliability, minimal detectable change, convergent validity, and predictive validity of the CADL measure in patients with stroke living in the community. The results of this study could lend support to the utility of the CADL in both clinical and research settings.

Methods

Subjects

The study protocol was divided into two parts. The first part examined the test-retest reliability and minimal detectable change of the CADL measure. Patients were recruited from outpatients receiving rehabilitation in 3 hospitals in Taiwan. Patients were eligible for this part of the study if they met the following criteria: (i) having had a cerebrovascular accident without other major diseases (e.g. cancer, severe rheumatoid arthritis); (ii) having had the stroke more than 6 months previously; and (iii) being able to complete the interview.

In the second part of the study, the convergent validity and predictive validity of the CADL measure were investigated. The data used for this study were collected in a prospective study partly reported elsewhere (13). Patients were eligible for this part of the study if they met the following criteria: (i) first onset of stroke without other major diseases (e.g. cancer); and (ii) admission to an acute care hospital within 14 days of the onset of a stroke. Participants were excluded if they had another stroke or other major disease/s during the follow-up period. Further details of selection and exclusion criteria were reported previously (13). Each subject was evaluated at 14 days after stroke onset and reassessed at other specific time-points (e.g. 6 months after stroke and 1 year after stroke). For the purpose of this study, we used the participants’ data at 6 months and 1 year after stroke. The study protocols were approved by the ethics committee of the hospital where the study was conducted.

Procedure

To investigate the test-retest reliability of the CADL measure, all patients in the first sample were evaluated initially using the original BI and FAI. The scores on both measures were transformed to CADL scores later on the basis of the Rasch item parameters reported earlier (9). Both measures were re-administered after a 10- to 14-day interval. Such an interval was used to prevent the rater from remembering the scoring results. A trained occupational therapist interviewed both the patient and his/her caregiver, if available, to confirm the performance and needs for assistance of the patients in doing CADL tasks in the community. If there was a discrepancy between a patient and his/her caregiver, the therapist further clarified the discrepancy with the patient and his/her caregiver to obtain the patient’s performance in daily life. All the interviews were conducted in person.

To examine the convergent validity and predictive validity of the CADL measure, all patients in the second part of the study were assessed using the BI and FAI at 6 months and 1 year post-stroke. The patients’ responses on the BI and FAI were later re-coded, combined, and transformed via the Rasch model as suggested by Hsueh et al. (9). Another therapist, rather than the therapist in the test-retest study, administered these interviews at the patients’ residences. To maintain equivalent ratings across time and patients, the primary investigator (CLH) and the rater discussed the ratings regularly and when necessary. In addition, the patients or their proxies completed the stroke-specific quality of life (SS-QOL) (14) questionnaire at both time-points (i.e. 6 months and 1 year post-stroke). The therapist first asked and encouraged each patient to complete the questionnaire by himself/herself. If there was difficulty, the patient’s caregiver was asked to complete the questionnaire for the patient. The patient-proxy agreement of the SS-QOL has been established (15).

Measures

The BI assesses BADL function in persons with neurological or musculoskeletal disorders (6). It comprises 10 items and has satisfactory psychometric properties in patients with stroke (16, 17). The FAI was developed to measure IADL function following stroke (7). It comprises 15 items and is reliable and valid in patients with stroke (7, 18).

The BI and the FAI can be combined to represent the entire continuum of BADL and IADL, or CADL (8–10). The CADL contains the 10 items of the BI and 13 items of the FAI; two FAI items (social occasions and walking outside) are excluded because they did not fit the Rasch model. These 23 items of the CADL measure assess a single unidimensional ADL function. Because the response categories of both the BI and the FAI appeared redundant, the response categories of the 23 items were simplified to a dichotomy (i.e. independence or dependence). Thus, the possible raw score of the CADL ranges from 0 to 23, and can be obtained by re-coding the original BI and FAI scores or directly from the 23 dichotomous items. The Rasch reliability for patient estimates is 0.94 (which can be interpreted similarly to Cronbach’s alpha) (9). Hsueh et al. (9) originally provided the Rasch scores (–9.46 to 6.80) for the CADL measure. For ease of interpretation, the Rasch scores were converted (linearly) to 0–100 in this study. A higher score indicates a higher level of independence in living in the community. The raw scores and transformed scores of the CADL are listed in Appendix I. Further details of the CADL measure can be found in the study by Hsueh et al. (9).
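The scoring steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the linear conversion is assumed to map the lowest Rasch score (–9.46) to 0 and the highest (6.80) to 100. The raw-score-to-Rasch-score lookup itself is given in Appendix I and is not reproduced here.

```python
# Rasch score range reported for the CADL measure by Hsueh et al. (9).
RASCH_MIN = -9.46
RASCH_MAX = 6.80


def cadl_raw_score(items):
    """Sum the 23 dichotomous items (1 = independent, 0 = dependent)."""
    if len(items) != 23 or any(i not in (0, 1) for i in items):
        raise ValueError("expected 23 items coded 0 or 1")
    return sum(items)


def rescale_rasch(rasch_score):
    """Linearly map a Rasch score onto the 0-100 scale.

    Assumption: the minimum Rasch score maps to 0 and the maximum to 100.
    """
    return (rasch_score - RASCH_MIN) / (RASCH_MAX - RASCH_MIN) * 100


# Hypothetical patient: independent in 10 items, dependent in 13.
print(cadl_raw_score([1] * 10 + [0] * 13))
# A Rasch score midway between the two extremes.
print(round(rescale_rasch(-1.33), 1))
```

Note that the raw score must first be converted to a Rasch score via Appendix I before the linear rescaling is applied; the raw score is not rescaled directly.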

The SS-QOL (14) consists of a total of 49 items in 12 domains: energy, family roles, language, mobility, mood, personality, self-care, social roles, thinking, upper extremity function, vision, and work/productivity. All 49 items use 5-level ordinal response categories. Two domains (self-care and mobility) are highly related to BADL (19) and were used for examining the convergent validity of the CADL. The self-care domain includes 5 items and the mobility domain has 6 items. We used the mean item score of each domain for data analysis. In addition, we adopted the summary score (the sum of the mean values of all domains) of the SS-QOL (14, 20, 21), representing overall health-related quality of life, in this study. The SS-QOL was chosen as the criterion for examination of predictive validity because ADL function is related to quality of life in patients with stroke (22). The SS-QOL can be completed by either patients or their proxies (15). It is reliable, valid, and responsive in stroke patients (14, 21).

Data analysis

Test-retest reliability indicates agreement between repeated assessments. The intraclass correlation coefficient (ICC(2,1)) was used to examine the level of test-retest reliability. A 2-way analysis of variance (ANOVA) (assuming both patients’ effects and trials’ effects to be random) was used to compute the variance needed to calculate the ICC(2,1) (23). ICC values > 0.8 indicate high reliability, and ICC values in the range of 0.6–0.8 represent substantial reliability (24). In addition, a paired t-test was performed to examine whether significant differences existed between test-retest assessments.
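As an illustration of the computation described above, the following Python sketch derives the mean squares of a 2-way ANOVA and combines them into a single-measure, absolute-agreement ICC. The text does not spell out the formula, so the widely used Shrout and Fleiss form of ICC(2,1) is assumed here; the function name and the example scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) for an n-patients x k-sessions table of scores.

    Assumed form: (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n),
    with patients (rows) and trials (columns) both treated as random effects.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for the 2-way layout without replication.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up scores for 3 patients assessed twice (not study data).
example = [[1, 2], [3, 4], [5, 6]]
print(round(icc_2_1(example), 3))
```

Because this form penalizes systematic shifts between sessions (through the MSC term), a constant one-point increase at retest, as in the made-up example, lowers the coefficient below 1 even though the rank order of patients is unchanged.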

The minimal detectable change (MDC) indicates the smallest change between repeated assessments that reflects real change rather than measurement error at a certain confidence level (e.g. 90% or 95%) (25). The MDC, based on the standard error of measurement (SEM), was calculated using the following formulae (25):

$$\mathrm{MDC} = z_{\text{level of confidence}} \times \sqrt{2} \times \mathrm{SEM} \tag{1}$$

$$\mathrm{SEM} = \mathrm{SD}_{\text{baseline}} \times \sqrt{1 - r_{\text{test-retest}}} \tag{2}$$

In formula (1), the z-score is the value of the standard normal distribution that corresponds to the chosen confidence level (i.e. 1.96 for the 95% confidence level in this study). The multiplier of √2 accounts for the additional uncertainty introduced by the use of difference scores from measurements at two sessions. Thus, the MDC (1.96 × SEM × √2) was used to determine whether the change score of an individual patient was real (beyond random measurement error) at the 95% confidence level in this study. In formula (2), ICC(2,1) was used in place of Pearson’s r for test-retest reliability to calculate the SEM because the ICC is more commonly adopted as an indicator of test-retest reliability (25).
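Formulae (1) and (2) can be checked with a short Python sketch. The function names are hypothetical, and the inputs are the rounded baseline SD (23.0) and ICC (0.96) reported in Table II. With these rounded inputs the sketch gives an SEM of about 4.6 and an MDC of about 12.8, slightly above the 4.4 and 12.3 reported in Table II, presumably because the reported values were calculated from SD and ICC at 3 decimal places.

```python
import math


def sem(sd_baseline, reliability):
    """Standard error of measurement, formula (2)."""
    return sd_baseline * math.sqrt(1 - reliability)


def mdc(sem_value, z=1.96):
    """Minimal detectable change, formula (1); z = 1.96 for 95% confidence."""
    return z * math.sqrt(2) * sem_value


def exceeds_mdc(score_first, score_second, mdc_value):
    """True if the change between two assessments is larger than the MDC."""
    return abs(score_second - score_first) > mdc_value


sem_cadl = sem(23.0, 0.96)   # rounded values from Table II
mdc_cadl = mdc(sem_cadl)
print(round(sem_cadl, 1), round(mdc_cadl, 1))

# Hypothetical patient whose score rises from 40 to 55 points.
print(exceeds_mdc(40, 55, mdc_cadl))
```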

Convergent validity represents the degree to which a measure correlates with other measures assessing related entities (26). Predictive validity shows the extent to which a measure correlates with other health-related measures administered at follow-up (27). The convergent validity and predictive validity of the CADL measure were examined using the Pearson’s correlation coefficient (r). Convergent validity was determined by the strength of association between the scores of the CADL measure and both domains of the SS-QOL (self-care and mobility) at 6 months and 1 year after stroke, respectively. Predictive validity was determined by the strength of association between the score of the CADL measure at 6 months and the summary score of the SS-QOL at 1 year after stroke. A Pearson’s r between 0.4 and 0.74 was considered as a moderate association, and a Pearson’s r ≥ 0.75, as a high association (28). We hypothesized that the scores of the CADL measure would be highly correlated with those of both self-care and mobility domains of SS-QOL, and that the score of the CADL measure at 6 months after stroke would be moderately associated with the summary score of the SS-QOL at one year after stroke.
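The correlation analysis and the interpretation thresholds above can be sketched as follows. The function names are hypothetical. Two choices are assumptions rather than part of the stated criteria: values between 0.74 and 0.75 are treated as moderate, and the absolute value of r is classified; the label for values below 0.4 is also not given in the text.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def classify_association(r):
    """Label the strength of association using the cut-offs cited from (28)."""
    strength = abs(r)  # assumption: only the magnitude is classified
    if strength >= 0.75:
        return "high"
    if strength >= 0.4:
        return "moderate"
    return "below moderate"  # assumption: the text does not name this band


# Made-up paired scores (not study data).
cadl = [20, 35, 50, 65, 80]
self_care = [2.0, 3.1, 3.4, 4.2, 4.8]
r = pearson_r(cadl, self_care)
print(round(r, 2), classify_association(r))
```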

Results

Seventy patients with chronic stroke participated in the test-retest study. Approximately 24% of their caregivers were not available for interview, so in such cases, only the patients were interviewed. The mean time since the patients’ most recent strokes was 46 months. The CADL scores of the patients were distributed throughout most of the range of the measure (0–86.7). Table I shows further characteristics of the patients.

Table II shows the test-retest reliability was excellent (ICC = 0.96, 95% CI = 0.94–0.98). The mean difference (–0.1) between two assessments was trivial (p = 0.93). The MDC of the CADL measure was 12.3 out of 100 points.

Table II. Test-retest reliability indices of the Comprehensive Activities of Daily Living (CADL) measure (n = 70)

| Measure | First test, mean (SD) | Second test, mean (SD) | Difference, mean (SD) | ICC (95% CI) | SEM(a) | MDC(a) |
|---|---|---|---|---|---|---|
| CADL | 51.4 (23.0) | 51.3 (22.8) | –0.1 (6.3) | 0.96 (0.94–0.98) | 4.4 | 12.3 |

(a) We calculated both SEM and MDC on the values of SD and ICC at 3 decimal places.

ICC: intraclass correlation coefficient; CI: confidence interval; SEM: standard error of measurement; MDC: minimal detectable change.

In addition, 168 patients who had had stroke for 6 months participated in the validity study. Forty-two patients were lost to the second assessment at 1 year after stroke because of recurrent stroke or relocation. The characteristics (i.e. gender, age, side of lesion, and diagnosis) of the 42 drop-outs were not statistically different from those of the patients followed (p > 0.15). However, the drop-outs had lower CADL scores (p = 0.048) and SS-QOL scores (p = 0.018). The 168 patients had a wide range of ADL function, as shown by the CADL scores ranging from 0 to 95. In addition, some patients (24% and 22% of the participants at 6 months and 1 year after stroke, respectively) could not complete the SS-QOL, so proxy ratings were used instead. Table I shows details of the patients.

Table I. Demographic and clinical characteristics of subjects in this study

| Characteristic | Test-retest reliability study (n = 70) | Convergent validity study at 6 months after stroke (n = 168) | Predictive validity study (n = 126) |
|---|---|---|---|
| Age, years, mean (SD) | 59.0 (12.0) | 65.4 (10.3) | 67.0 (11.0) |
| Diagnosis, n | | | |
| Cerebral haemorrhage | 30 | 43 | 33 |
| Cerebral infarction | 40 | 125 | 93 |
| Sex, male/female, n | 46/24 | 107/61 | 76/50 |
| Side of hemiplegia, right/left, n | 37/33 | 67/99 | 49/77 |
| Months after stroke at 1st evaluation, mean (SD) | 45.8 (138.8) | | |
| Days between 2 evaluations, mean (SD) | 11.9 (3.8) | | |
| CADL score (0–100), mean (SD) | | | |
| At 6 months after stroke | | 55.8 (22.2) | |
| At 1 year after stroke | | | 58.0 (22.4) |
| SS-QOL self-care score (0–5), mean (SD) | | | |
| At 6 months after stroke | | 3.8 (1.1) | |
| At 1 year after stroke | | | 3.9 (1.1) |
| SS-QOL mobility score (0–5), mean (SD) | | | |
| At 6 months after stroke | | 3.9 (0.9) | |
| At 1 year after stroke | | | 3.9 (1.0) |
| SS-QOL summary score (0–60), mean (SD) | | | |
| At 1 year after stroke | | | 43.1 (10.3) |

SD: standard deviation; CADL: comprehensive activities of daily living; SS-QOL: stroke-specific quality of life questionnaire.

In the convergent validity investigation, the score of the CADL measure was highly correlated with that of both domains of the SS-QOL at both time-points (Pearson’s r = 0.81 (self-care), 0.77 (mobility) at 6 months after stroke (n = 168); r = 0.87 (self-care), 0.82 (mobility) at 1 year after stroke (n = 126)).

In the predictive validity investigation, the score of the CADL at 6 months post-stroke was highly correlated with that of the SS-QOL questionnaire at 1 year post-stroke (Pearson’s r = 0.75).

In addition, the BI showed a notable ceiling effect at both time-points (39.9% and 44.8% of the patients achieving highest possible score of the BI). The FAI had a notable floor effect (20.8% and 22.1% of the patients achieving lowest possible score of the FAI). The CADL showed very limited floor or ceiling effects (≤ 2.5%).
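Floor and ceiling effects of this kind are simply the percentages of patients at the lowest and highest possible scores. A minimal Python sketch is given below; the function name, the score range, and the example scores are hypothetical and are not taken from the study data.

```python
def floor_ceiling_percent(scores, min_score, max_score):
    """Return (floor %, ceiling %): share of patients at the extreme scores."""
    n = len(scores)
    floor = 100 * sum(1 for s in scores if s == min_score) / n
    ceiling = 100 * sum(1 for s in scores if s == max_score) / n
    return floor, ceiling


# Made-up scores on an assumed 0-20 scale.
example_scores = [0, 20, 20, 15, 10]
print(floor_ceiling_percent(example_scores, 0, 20))
```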

Discussion

Our results showed that the test-retest reliability of the CADL measure was excellent, with a near-zero mean difference between the two assessments. More importantly, the MDC of the CADL measure was found to be 12.3 points on the 0–100 scale. This indicates that only a change greater than that amount between two consecutive assessments scored by the same rater can be interpreted as a real change at the 95% confidence level (25). In addition, the MDC can be used as a threshold for determining whether an individual patient has made a statistically significant improvement (p < 0.05) (25). The excellent test-retest reliability and small MDC support repeated use of the CADL measure in both clinical and research settings.

Validity indicates whether a measure assesses the concept that is to be measured (27). In the absence of a gold standard (e.g. for CADL function), validity can be established by assessing the extent to which the measure is associated with other measures assessing theoretically related constructs (convergent validity) (26). The score of the CADL measure was highly correlated with that of the self-care and mobility domains of the SS-QOL (Pearson’s r ≥ 0.77). The CADL assesses actual performance of daily functioning (including self-care and mobility), whereas the two domains of the SS-QOL assess the patient’s perception of his/her functioning in self-care and mobility, respectively. Although these measures assess daily functioning from different perspectives, the patients’ perceptions of self-care and mobility were highly correlated with their comprehensive ADL function. These results support the convergent validity of the CADL measure.

Predictive validity indicates the measure’s ability to predict relevant attributes (e.g. future health-related quality of life in this study) (27). We found that the score of the CADL measure at 6 months post-stroke was highly correlated with that of the SS-QOL questionnaire at 1 year post-stroke. This finding indicates that the CADL measure has satisfactory predictive ability for health-related quality of life. The good evidence of predictive ability makes the CADL measure practical for clinicians to manage ADL function for promoting quality of life in patients with stroke. In addition, the findings suggest that early assessment of CADL is clinically useful for stroke patients after hospital discharge.

The validity findings might be threatened by our use of proxy ratings on the SS-QOL. Nearly a quarter of the patients could not complete the SS-QOL, so proxy ratings were used as substitutes. Proxies may report more dysfunction in multiple domains of the SS-QOL than stroke patients themselves (15). Thus, we might have underestimated the level of association between the CADL measure and the two domains, as well as the summary score, of the SS-QOL (i.e. the validity of the CADL measure). Nevertheless, because the associations between the measures remained high, the evidence for the convergent validity and predictive validity of the CADL is unlikely to have been compromised.

The BI and FAI, as expected, showed notable ceiling and floor effects, respectively, at 6 months and 1 year after stroke. On the other hand, the CADL showed negligible floor and ceiling effects. These findings are similar to previous findings (8, 9). Thus, these observations further support the discrimination power of the CADL over both the BI and FAI.

The CADL measure is a new measure and has at least two characteristics that may be of concern to prospective users. First, the CADL measure may be useful for patients living in an institution (e.g. a long-term care centre or hospital) or in the community. However, the CADL measure contains both BADL and IADL. Because IADL is not commonly performed by patients living in hospitals, the IADL items of the CADL measure may be redundant for these patients. Thus, the CADL measure is most useful for assessing patients living in the community and may be useful for monitoring patients at the time of transition when returning to the community from the hospital. Secondly, the Rasch-calibrated 23-item CADL is useful for data interpretation. For example, previous results have shown that the 10-item BADL is easier to endorse than the 13-item IADL, according to characteristics of item difficulty among the 23 items of the CADL measure (9). Thus, if a patient independently performs all 10 BADL tasks, he/she will obtain a score of 60 or more. If a patient scores above 60, he/she is very likely to be independent in performing BADL. However, the utility of the cut-off score (60) of the CADL measure remains to be examined in future studies to provide empirical evidence for both clinicians and researchers.

The generalization of our findings may be limited because of 3 concerns. First, we excluded patients with other major diseases who were likely to have low BADL function. Because of the strict selection criteria used in this study, our findings may not generalize to stroke patients who have major comorbidities. Secondly, 25% of the patients could not be followed in the predictive validity study. Thirdly, we tried to interview both the patient and his/her caregiver to obtain the patient’s real ADL performance in daily life. However, when the patient’s and caregiver’s reports differed, the discrepancy had to be clarified by the interviewer. In addition, some caregivers were not available for interview, making our data sources inconsistent. These factors might have introduced bias to the CADL score, which might have lessened the test-retest reliability and the level of association with the other measures. In addition, the minimal important difference (MID) (29) represents a threshold of change that is meaningful to patients. The MID is critical in decision-making in clinical settings and serves as a benchmark for clinical trials (29). Future research to estimate the MID for the CADL measure is suggested in order to further promote the utility of the measure.

In brief, our results suggest that the CADL measure has satisfactory test-retest reliability, minimal detectable change, convergent validity, and predictive validity in patients with stroke. These results support the utility of the CADL measure in both clinical and research settings.

Acknowledgements

This study was supported by research grants from the National Science Council (NSC96-2314-B-002-168-MY2) and the National Health Research Institutes (NHRI-EX96-9512PI) in Taiwan.

References
