From the ¹Department of Clinical Neuroscience, Rehabilitation Medicine, Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, and ²Department of Occupational Therapy and Physiotherapy, Sahlgrenska University Hospital, Gothenburg, Sweden
Objective: To explore the concurrent validity, responsiveness, and floor- and ceiling-effects of the 2 items of Action Research Arm Test (ARAT-2) in comparison with the original ARAT and the Fugl-Meyer Assessment for Upper Extremity (FMA-UE) during the first 4 weeks post-stroke.
Design: A prospective longitudinal cohort study.
Subjects: A non-selected cohort of 117 adults with first-ever stroke and impaired upper extremity function.
Methods: Activity capacity and motor function were assessed with ARAT and FMA-UE at 3 days, 10 days and 4 weeks post-stroke.
Results: Correlation between ARAT-2 and the other assessment scales was high (r = 0.92–0.97) and ARAT-2 showed statistically significant changes between all time-points (effect size, r = 0.31–0.48). The effect sizes for the change in ARAT and FMA-UE varied from 0.44 to 0.53. ARAT-2, similarly to ARAT, showed a floor effect at all time-points. The ceiling effect was reached earlier using ARAT-2 than with ARAT and FMA-UE.
Conclusion: ARAT-2 appears to be a valid and responsive short assessment for upper extremity activity capacity, and suitable for use in the acute stage after stroke. However, when the highest score has been reached, the assessment needs to be complemented with other instruments.
Key words: stroke rehabilitation; motor function; upper extremity; activity capacity; patient outcome assessment; validation studies; behaviour rating scale.
Accepted Jan 30, 2019; Epub ahead of print Feb 15, 2019
J Rehabil Med 2019; 51: 00–00
Correspondence address: Margit Alt Murphy, Rehabilitation Medicine, Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, Per Dubbsgatan 14, SE-413 45 Gothenburg, Sweden. E-mail: margit.alt-murphy@neuro.gu.se
After a stroke most people may have difficulty using their affected arm and hand in daily life. Appropriate outcome measures should be used to evaluate meaningful improvements in arm function. This study investigated how well a short version of a standardized and recommended clinical test on arm function (ARAT-2) can be used in acute clinical settings. The results showed that ARAT-2, which includes 2 tasks (pour water from glass to glass, and place hand on top of the head), was able to measure limitations in arm function. ARAT-2 was also able to capture improvements over the first 4 weeks after the stroke. The ARAT-2 can be recommended as an outcome measure early after stroke. However, when the highest score is reached in ARAT-2, other assessments may be needed to evaluate minor deficits or improvements in arm function.
Approximately 22,000 persons experience acute stroke each year in Sweden (1). Upper extremity impairment, reported in 48–77% of patients in the acute phase, is one of the most common sequelae after stroke (2, 3). Impaired function limits voluntary, well-coordinated effective movements (4) and can lead to activity limitations (5), reduced independence and participation in social and physical environment (6). Improvement in upper extremity occurs mainly during the first 4 weeks (7, 8). This recovery can be explained by resorption of cellular oedema of the non-infarcted penumbral areas around the infarcted area, and cortical as well as subcortical reorganization (8–10). Functional improvement can still be achieved even after the early sub-acute stage, although to a lesser degree (6, 11).
The median stay in hospital according to the Swedish Stroke Register (12) was 8 days in 2016 (1). A short hospital stay requires an early assessment so that an appropriate plan can be made for discharge and rehabilitation in the short and long term (2). Initial motor function during the first 4 weeks after stroke is an important factor predicting upper extremity recovery (13, 14).
Appropriate outcome measures should be used to discover meaningful improvements in motor function and activity. In addition to validity and reliability, a standardized clinical assessment needs to be sensitive to changes over time (15). Action Research Arm Test (ARAT) (16, 17) and Fugl-Meyer Assessment for upper extremity (FMA-UE) (18) are 2 recommended assessments to evaluate upper extremity activity capacity and impairment, respectively (19). Both scales are considered time consuming and therefore rarely used in acute clinical settings. The ARAT also requires special equipment. Thus, there is a clinical need for a short assessment for the upper extremity in the acute stage after stroke.
Clinical assessment in the acute stage should ideally be easy to administer, time effective, not require special equipment, be valid for severe to mild stroke impairments and preferably provide useful information on recovery prediction (20). A short version of the ARAT, the ARAT-2, which contains 2 items from ARAT (pour water from glass to glass and place hand on top of the head), has been shown to predict functional outcome well early after stroke (21). It does not require any special equipment, is quick and easy to use and has potential to contribute valuable predictive and clinical information.
There is a need to investigate the psychometric properties of a short assessment, such as ARAT-2, in the acute stage after stroke. Thus, the aim of this study was to determine the concurrent validity, responsiveness, floor and ceiling effect of the ARAT-2 in comparison with the original ARAT and the FMA-UE in a non-selected cohort of patients with stroke assessed at 3 days, 10 days and 4 weeks after stroke onset.
Data for this study were extracted from the Stroke Arm Longitudinal study at the University of Gothenburg (SALGOT-study, ClinicalTrials.gov NCT01115348) (22), which aimed to investigate upper extremity functioning, recovery and consequences of stroke on activity and participation in a non-selected sample during the first year after stroke. The SALGOT study comprised 117 patients who were included from a stroke unit at the Sahlgrenska University Hospital in Gothenburg, Sweden during a period of 18 months (2009–10). Inclusion criteria for the SALGOT study were: (i) diagnosed first-ever clinical stroke according to the World Health Organization (WHO) (23); (ii) impaired upper extremity function, defined as < 57 points on ARAT, at day 3 (± 1 day) after stroke onset; (iii) admitted to the stroke unit within 3 days after stroke onset; (iv) 18 years or older; (v) resident in Gothenburg urban area within 35 km of the hospital. The exclusion criteria were: (i) injury or condition prior to the stroke that limits the upper extremity function; (ii) short life-expectancy, e.g. less than 12 months due to other illness (cardiac disease, malignancy); (iii) non-Swedish speaking. The flowchart of the inclusion process is shown in Fig. 1.
Fig. 1. Flowchart of the inclusion process in the Stroke Arm Longitudinal study at University of Gothenburg (SALGOT study).
All participants received individually adjusted functional task-specific rehabilitation from the first day at the stroke unit according to the Swedish national guidelines. The participants followed an individually adjusted standardized routine for rehabilitation after the hospital discharge that commonly included interventions at community care with a physiotherapist and/or occupational therapist. In the SALGOT study the participants were assessed with a battery of assessments on 8 occasions during the first year post-stroke: 3 and 10 days, 3, 4 and 6 weeks, 3, 6 and 12 months post-stroke (22).
In the current study, data from the assessment time-points at 3 and 10 days, as well as 4 weeks post-stroke, were used. All assessments at these time-points were performed by 2 experienced physiotherapists, who had undergone a training period for the assessment battery prior to the study (24). Most of the assessments were performed at the hospital. If the patient had been discharged and was unable to travel, the assessment was conducted in the patient’s home or nursing home. Ethical approval for the SALGOT study was granted by the Regional Ethics Committee, Gothenburg (225-08), and informed written consent was obtained from all participants.
Stroke severity was determined by the National Institutes of Health Stroke Scale (NIHSS) (25) and the type and location of stroke were collected from the patients’ medical charts. The total score of the NIHSS varies from 0 to 42 points and a higher score indicates a more severe stroke.
Upper extremity activity capacity was assessed by the ARAT (16), which is a standardized observational rating scale constructed to assess manual ability to grasp and handle different objects after stroke. The assessment contains 19 items divided into 4 subscales: grasp, grip, pinch and gross movement. Each item is scored on a 4-point ordinal scale (0 = unable to complete any part of the task within 60 s, 1 = the task is partially performed within 60 s, 2 = the task is completed, but with great difficulty or takes an abnormally long time (6–60 s), 3 = the task is performed normally within 5 s), giving a total score of 0–57 points. The ARAT is valid and responsive to change of activity capacity over time in patients with stroke, and has good intra- and inter-rater reliability (ICC = 0.99 and 0.95, respectively) (19, 26). The intra- and inter-rater reliability at item level shows good agreement (percentage agreement, PA ≥ 70%), although minor systematic disagreements have been shown for a few items (grasping a large block, pinch grip of a 6-mm ball-bearing between 3rd finger and thumb, hand to mouth) (24).
The ARAT-2 (17, 21, 24) comprises 2 items from the ARAT: pour water from glass to glass (item 7) and place hand on top of the head (item 18). Each item is scored in a similar way as in the original ARAT and the total score of the 2 items ranges from 0 to 6 points. The construction of ARAT-2 was based on a standardized procedure (21): (i) the items that did not require special standardized equipment were selected; (ii) principal components analysis was used to identify the minimum number of items needed to capture most of the variance in the ARAT; and (iii) item difficulty established with Mokken analysis (27) was used to guide the selection of items that would cover a wide range of activity capacity limitation. The intra- and inter-rater reliability evaluated by percentage of agreement for the pour water item varied between 89% and 97%, and for the hand on top of the head item between 77% and 91% (24). Neither systematic nor random individual disagreements were detected for these 2 items (24). The 2 items in the ARAT-2 have been shown to cover a broad range of activity limitations by explaining 95% of the total variance in ARAT (21).
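The relationship between the full ARAT ratings and the ARAT-2 score can be illustrated with a minimal Python sketch. It assumes that the 19 item ratings are stored in test order, so that items 7 and 18 sit at the 7th and 18th positions; the function names are hypothetical and are not part of the study's analysis.

```python
from typing import Sequence

ARAT_ITEM_COUNT = 19
# 1-based item numbers used in ARAT-2:
# item 7 = pour water from glass to glass, item 18 = place hand on top of the head
ARAT2_ITEMS = (7, 18)


def _validate(item_scores: Sequence[int]) -> None:
    """Check that there are 19 ratings, each on the 0-3 ordinal scale."""
    if len(item_scores) != ARAT_ITEM_COUNT:
        raise ValueError(f"expected {ARAT_ITEM_COUNT} item scores, got {len(item_scores)}")
    for score in item_scores:
        if score not in (0, 1, 2, 3):
            raise ValueError(f"item score must be 0, 1, 2 or 3, got {score!r}")


def arat_total(item_scores: Sequence[int]) -> int:
    """Total ARAT score (0-57): the sum of all 19 item ratings."""
    _validate(item_scores)
    return sum(item_scores)


def arat2_total(item_scores: Sequence[int]) -> int:
    """ARAT-2 score (0-6): the sum of the ratings for items 7 and 18."""
    _validate(item_scores)
    return sum(item_scores[i - 1] for i in ARAT2_ITEMS)


# Hypothetical patient: every item rated 1 except items 7 and 18.
example = [1] * ARAT_ITEM_COUNT
example[7 - 1] = 2   # pour water: completed with great difficulty
example[18 - 1] = 3  # hand on top of the head: performed normally
print(arat_total(example), arat2_total(example))
```

Because the ARAT-2 ratings are read directly from the full ARAT ratings, no separate testing session is needed to obtain both scores in a sketch like this.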
Upper extremity motor function was assessed by the FMA-UE (18), which is a standardized observational rating scale constructed to assess sensorimotor impairments after stroke. It consists of 33 items, each scored on a scale of 0–2, giving a maximum total score of 66 points; higher scores indicate better motor function. In addition to pure motor items, the FMA-UE also includes 3 reflex activity items, measuring a different construct (28). In this study, a pure motor score of the FMA-UE, excluding the 3 reflex items, was also calculated. The FMA-UE without the reflex items had a maximum score of 60 points. The FMA-UE has shown excellent intra- and inter-rater reliability (ICC = 0.99 and 0.96, respectively) (26) and validity for individuals with stroke (19, 29). The non-motor domains of the FMA-UE, assessing sensory impairment and pain during passive joint motion, at 3 days post-stroke were used for background data.
Descriptive statistics were used to summarize demographics and clinical characteristics. The ratings of the ARAT-2 were retrieved from the ratings on the original 19-item ARAT. At the 10-day assessment, the FMA-UE scores were missing for 20 patients due to administration problems. An estimated score for each of these patients was calculated by using the mean change from day 3 to day 10 of all patients (n = 97), as described previously (21). This mean change was then added to the day 3 FMA-UE score of each of these 20 patients. The estimated score for day 10 could not exceed the FMA-UE score at day 3. The scores for the FMA-UE without reflex items of these 20 patients at day 10 were only constructed when the scores of the reflex items were unchanged from a previous and following test occasion. Data analyses were performed in parallel with observed and imputed data to ensure that the imputed data did not influence the results.
Concurrent validity of ARAT-2 was examined in comparison with the original ARAT (0–57), FMA-UE (0–66) and the FMA-UE without reflex items (0–60) separately at day 3, 10 and week 4. Correlation between the scales was examined visually using scatterplots and by using the Spearman’s correlation coefficient rho (r). The significance level p < 0.05 was used and the strength of correlation was interpreted as follows: < 0.26 (little if any), 0.26–0.49 (low), 0.50–0.69 (moderate), 0.70–0.89 (high) and ≥ 0.90 (very high) (30).
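As an illustration of the correlation analysis described above, the following Python sketch computes Spearman's rho as the Pearson correlation of average ranks and maps the coefficient to the interpretation bands given in the text. This is a minimal standard-library sketch, not the SPSS procedure used in the study; the function names and the example scores are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(sxx * syy)


def correlation_strength(rho: float) -> str:
    """Interpretation bands as listed in the text (applied to |rho|)."""
    a = abs(rho)
    if a < 0.26:
        return "little if any"
    if a < 0.50:
        return "low"
    if a < 0.70:
        return "moderate"
    if a < 0.90:
        return "high"
    return "very high"


# Hypothetical ARAT-2 (0-6) and ARAT (0-57) scores for four patients.
arat2_scores = [0, 0, 6, 6]
arat_scores = [0, 10, 50, 57]
rho = spearman_rho(arat2_scores, arat_scores)
print(round(rho, 3), correlation_strength(rho))
```

Average ranks are used because a 7-level scale such as ARAT-2 inevitably produces many tied scores.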
The responsiveness of ARAT-2 was examined by using the Wilcoxon signed-rank test to evaluate the change between each time-point. Bonferroni correction of the p-value (p < 0.016) was used to correct for multiple testing between 3 time-points (31). The effect size for change scores was calculated by dividing the z value obtained from the Wilcoxon signed-rank test by the square root of the number of observations. Effect size values < 0.30 indicate small effect, 0.30–0.49 medium effect and ≥ 0.50 large effect (31). The percentage of patients with a positive, negative and tie outcome was also investigated. The change between each time-point was similarly calculated for the ARAT (0–57), FMA-UE (0–66) and the FMA-UE without reflex items (0–60). A floor or ceiling effect was considered present when more than 20% of the patients scored the minimum or maximum score of the scale, respectively (32, 33). All statistical analyses were performed with SPSS version 22.
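The effect size calculation and the Bonferroni-adjusted significance level described above can be sketched in Python as follows. The sketch takes the z value and the number of observations as given inputs rather than recomputing the Wilcoxon test. The function names are hypothetical, the example z value and number of observations are invented for illustration, and treating values between 0.49 and 0.50 as medium is an assumption.

```python
from math import sqrt

ALPHA = 0.05
N_COMPARISONS = 3  # 3 days vs 10 days, 10 days vs 4 weeks, 3 days vs 4 weeks


def bonferroni_alpha(alpha: float = ALPHA, n_comparisons: int = N_COMPARISONS) -> float:
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


def effect_size_r(z: float, n_observations: int) -> float:
    """Effect size r = |z| / sqrt(N), with z from the Wilcoxon signed-rank test."""
    if n_observations <= 0:
        raise ValueError("n_observations must be positive")
    return abs(z) / sqrt(n_observations)


def effect_size_label(r: float) -> str:
    """Bands from the text: < 0.30 small, 0.30-0.49 medium, >= 0.50 large.

    Assumption: values between 0.49 and 0.50 are treated as medium.
    """
    if r < 0.30:
        return "small"
    if r < 0.50:
        return "medium"
    return "large"


# Hypothetical z value and number of observations (not study data).
r = effect_size_r(z=-6.0, n_observations=144)
print(round(bonferroni_alpha(), 4), r, effect_size_label(r))
```

With 3 comparisons the adjusted level is 0.05 / 3 ≈ 0.0167, which the text reports as p < 0.016.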
The demographic and clinical characteristics of the 117 patients assessed at 3 days post-stroke are summarized in Table I. The number of patients included in the analysis at each time-point is reported in Table II. The main reasons for missing data were being too tired to perform an assessment (n = 4), difficulties in cooperating or understanding the instructions needed to perform the assessment (n = 2), moving away from the Gothenburg urban area (n = 1), not wanting to come to the assessment (n = 2), death (n = 2) or being dismissed from the study (n = 3). The ARAT-2 showed high correlation (r = 0.92–0.97) with ARAT, FMA-UE and FMA-UE without reflex items (Fig. 2).
Table I. Characteristics of the included patients at 3 days post-stroke
Table II. Ceiling and floor effects of the assessment scales at 3 and 10 days and 4 weeks after stroke onset
Fig. 2. Correlation of Spearman’s rho (r) between the ARAT-2 and the ARAT and FMA-UE, respectively, at 3 days (A), 10 days (B) and 4 weeks (C) post-stroke. ARAT: Action Research Arm Test; ARAT-2: short version of Action Research Arm Test; FMA-UE: Fugl-Meyer Assessment of Upper Extremity.
All scales showed medium to large effect sizes for detecting changes during the first 4 weeks after stroke (p < 0.001, effect size 0.31–0.53, Table III). The ARAT-2 showed a medium effect (< 0.49) at all time-points, and the smallest effect size for the change was detected for the time between 10 days and 4 weeks post-stroke (0.31). The effect sizes for the observed time-points for the ARAT and FMA-UE were in the range 0.44–0.48. The largest effect sizes were noted for the change from 3 days to 4 weeks post-stroke (0.48–0.53) for all clinical scales. The proportions of patients showing improvement, deterioration or no change in assessment scores are shown in Fig. 3.
Table III. Median values for all assessment scales at 3 time-points together with the z-value and effect size for the change
Fig. 3. Proportion of patients showing positive, negative or no changes in assessment scores between the 3 time-points. ARAT-2: short version of Action Research Arm Test; ARAT: Action Research Arm Test; FMA-UE: Fugl-Meyer Assessment for Upper Extremity.
The ARAT-2 and the ARAT both showed a floor effect at 3 days (both 38%), 10 days (31% and 30%) and 4 weeks (both 24%) post-stroke (Table II). No floor effect was observed in the FMA-UE, but similarly to ARAT-2 and ARAT, the floor effect was also present in the FMA-UE without reflex items at 3 days (35%) and 10 days (27%), but not at 4 weeks (12%) post-stroke. There was a ceiling effect detected for ARAT-2 at 10 days (22%) and 4 weeks (32%) in contrast to the ARAT that showed a small ceiling effect only at 4 weeks (21%). The FMA-UE and FMA-UE without reflex did not show any ceiling effect within the first 4 weeks post-stroke.
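The 20% criterion for floor and ceiling effects can be made concrete with a short Python sketch that applies it to the floor percentages reported above. The function names are hypothetical, and the small list of raw scores used to demonstrate the percentage calculation is invented for illustration.

```python
from typing import Dict, List, Sequence

THRESHOLD_PERCENT = 20.0  # an effect is present when MORE than 20% score at the extreme


def percent_at(scores: Sequence[int], value: int) -> float:
    """Percentage of patients whose score equals the given extreme value."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100.0 * sum(1 for s in scores if s == value) / len(scores)


def has_effect(percent_at_extreme: float, threshold: float = THRESHOLD_PERCENT) -> bool:
    """True when the share of patients at the minimum (or maximum) exceeds the threshold."""
    return percent_at_extreme > threshold


# Floor percentages as reported in the text (Table II).
floor_percent: Dict[str, Dict[str, float]] = {
    "ARAT-2": {"3 days": 38, "10 days": 31, "4 weeks": 24},
    "ARAT": {"3 days": 38, "10 days": 30, "4 weeks": 24},
    "FMA-UE without reflex items": {"3 days": 35, "10 days": 27, "4 weeks": 12},
}

# Time-points at which each scale meets the floor-effect criterion.
floor_flags: Dict[str, List[str]] = {
    scale: [tp for tp, pct in by_time.items() if has_effect(pct)]
    for scale, by_time in floor_percent.items()
}
print(floor_flags)

# Hypothetical ARAT-2 scores (0-6) for five patients, showing the raw calculation.
print(percent_at([0, 0, 3, 6, 6], 0))
```

The same check applies to the ceiling side by passing the scale maximum (6 for ARAT-2, 57 for ARAT) to the percentage calculation.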
This study investigated the concurrent validity, responsiveness, floor and ceiling effects of the ARAT-2 in comparison with the original ARAT and the FMA-UE within the first 4 weeks after stroke onset. The ARAT-2 showed a strong correlation with the original ARAT and FMA-UE and was, similarly to the other scales, sensitive to change between all tested time-points (3 days, 10 days and 4 weeks post-stroke). The ARAT-2 had a similar floor effect to the ARAT at all time-points, but showed a ceiling effect already at 10 days post-stroke, whereas the ARAT first showed a ceiling effect at 4 weeks post-stroke.
In order to improve the research methodology of rehabilitation and recovery trials after stroke, an international consensus group, the Stroke Recovery and Rehabilitation Roundtable (SRRR), developed recommendations for standardized assessment (34). The SRRR lists the time-points and measurements that should be included in stroke rehabilitation and recovery trials. These time-points were based on what is known about the neural repair process and the measurement tools were identified through existing recommendations. The SRRR recommended using the FMA-UE and ARAT as assessments for impairment and activity limitation, respectively. The assessments should, according to the SRRR, be performed within 7 days after stroke onset and followed up at set time-points until at least 3 months post-stroke. Both FMA-UE and ARAT are, however, rarely used in acute settings since they are considered to be time consuming, require training and, in the case of ARAT, require special equipment (26, 35). Similarly to our study, there have been other suggestions for shorter tests. A short version of the FMA-UE (S-FM), including 6 items, showed good concurrent validity with the original FMA-UE (≥ 0.93) at subacute and chronic stages after stroke (35). The responsiveness of the S-FM was, however, moderate and should be interpreted with caution, as the calculations did not take into account the ordinal nature of the data (35).
ARAT-2 is a short assessment and, according to the present study, suitable for use in stroke units early after stroke. The ARAT-2 consists of items that require some shoulder abduction and finger extension, which are important early signs to predict UE activity capacity at 6 months post-stroke (36). A previous study has also shown that ARAT-2 is a good predictor of the UE function required for use of the affected arm when drinking from a glass at later time-points (21). For example, an ARAT-2 score of 2 or more points, assessed at 3 days post-stroke, has shown a high probability for prediction of arm function at 10 days as well as at 12 months post-stroke (21). Similarly to other clinical scales, the accuracy for prediction of long-term outcome for those with no or very little initial arm and hand function was less precise (21). The results of the current study are, however, promising and suggest that a shorter version of an established clinical scale might be useful in acute clinical settings after stroke.
In the present study, ARAT-2 and ARAT both showed a floor effect up to 4 weeks post-stroke. Similarly to our results, previous studies have reported a floor effect of the ARAT at 2 weeks post-stroke (26, 37). The floor effect in our sample was also detected for the pure motor FMA-UE without reflexes at 3 and 10 days, but not at 4 weeks post-stroke. On the other hand, the FMA-UE including the reflex items showed neither a floor nor a ceiling effect during the first 4 weeks post-stroke, which has also been reported by others (26, 37). The floor effect in scales assessing pure motor function and activity capacity (ARAT-2, ARAT and FMA-UE without reflex items) can be expected in the acute stage after stroke in a non-selected cohort at a stroke unit (6, 8). The ceiling effect for ARAT-2 was reached at 10 days (22% of patients reached a full score), and for the ARAT at 4 weeks (21%). These results are comparable with previous studies that also demonstrated a ceiling effect at 4 weeks post-stroke in ARAT, but not in FMA-UE (26, 37).
In the current study, the ARAT-2, along with the established clinical scales ARAT and FMA-UE, was able to detect changes in motor function and activity capacity between the selected early time-points. The ARAT-2 showed the lowest effect size (0.31) for detecting a change between 10 days and 4 weeks post-stroke, which might be connected to lower sensitivity to smaller changes due to the fewer grading categories of the ARAT-2 compared with the ARAT. An advantage of using a short assessment in the acute stage is that it will facilitate a wider use of standardized testing in clinical settings and be less demanding for the patient. The ARAT and the FMA-UE should, however, be considered as better alternatives when the highest score has been reached with the ARAT-2, and when smaller specific improvements are important to capture, e.g. to determine the effect of specific targeted interventions to improve the upper limb function or activity capacity.
One limitation of this study was that the patients were assessed with a battery of assessments, which may have caused tiredness and subsequently affected the scoring of the assessments. Tiredness is also one of the consequences particularly evident in the acute stage after stroke. On the other hand, there were only 4 patients who had missing assessments due to tiredness 3 days post-stroke. In addition, in the SALGOT protocol, the order of assessments was predefined and ARAT was always assessed first, prior to FMA-UE and before other assessments. During the assessments, breaks were also allowed at any time and the testing could also be split into 2 parts. The test procedure was constructed in order to minimize testing bias caused by tiredness. Cognitive impairments are common after stroke and may be particularly evident in the early stages after stroke onset (38). In the present study the patients were not excluded due to cognitive deficits, which may have influenced the scoring due to communication problems. Cognitive deficit was also a reason for missing assessment data in 2 participants. On the other hand, the cognitive demand to perform the clinical activity capacity and motor assessments included in the present study is low. The strength of the current study is that the cohort represents an ecologically valid unselected sample of patients with a wide range of motor deficits that are commonly seen in an acute clinical stroke setting.
There is a clinical need for a short valid and reliable assessment for upper extremity activity capacity in the acute setting of stroke. The ARAT-2 offers several benefits for assessing upper extremity activity capacity early after stroke. First, it consists of 2 items that can be administered quickly by a physiotherapist or occupational therapist. Thus, ARAT-2 can be an efficient way to assess activity capacity in patients with low endurance without getting scores confounded by fatigue. Secondly, the ARAT-2 does not rely on understanding the language or any complex instructions. Thirdly, the concurrent validity and responsiveness of the ARAT-2 were found to be satisfactory.
The results of the current study indicate that ARAT-2 is a valid and responsive tool for assessment of upper extremity activity capacity in the acute stage after stroke. However, when the highest score has been reached in ARAT-2, the assessment needs to be complemented with other assessments in order to evaluate minor deficits or improvements in upper extremity activity capacity.
The authors thank all participants and staff at Sahlgrenska University Hospital who have contributed to the data collection.
This work was supported by the Local Research and Development Board for Gothenburg and Southern Bohuslän, the Swedish Foundation for Neurological Disabilities (Neuroförbundet) and the Swedish Stroke Association.
The authors declare no conflicts of interest.