Spinal Cord Independence Measure, version III: Applicability to the UK spinal cord injured population

Clive A. Glass, PhD1, Luigi Tesio, MD2, Malka Itzkovich, MD3,4, Bakul M. Soni, FRCS1, Pedro Silva, MD1, Munawar Mecci, FRCS5, Raymond Chadwick, PhD5, Waghi el Masry, FRCS6, Aheed Osman, MFRCS6, Gordana Savic, MD7, Brian Gardner, FRCS7, Ebba Bergström, MPhil5 and Amiram Catz, MD, PhD3,4

From the 1North West Regional Spinal Injuries Centre, Southport, UK, 2Institute of Human Physiology II, Università degli Studi, and the Research Laboratory in Neuromotor Rehabilitation, IRCCS Istituto Auxologico Italiano, Milan, Italy, 3Spinal Department, Loewenstein Rehabilitation Hospital, Raanana, 4Tel Aviv University, Tel Aviv, Israel, 5The North of England Spinal Cord Injuries Centre, Middlesbrough, 6Midlands Centre for Spinal Injuries, RJ & AH Orthopaedic Hospital, Oswestry and 7National Spinal Injuries Centre, Stoke Mandeville Hospital, UK

OBJECTIVE: To examine the validity, reliability and usefulness of the Spinal Cord Independence Measure for the UK spinal cord injury population.

DESIGN: Multi-centre cohort study.

SETTING: Four UK regional spinal cord injury centres.

SUBJECTS: Eighty-six people with spinal cord injury.

INTERVENTIONS: Spinal Cord Independence Measure and Functional Independence Measure on admission analysed using inferential statistics, and Rasch analysis of Spinal Cord Independence Measure.

MAIN OUTCOME MEASURES: Internal consistency, inter-rater reliability, discriminant validity; Spinal Cord Independence Measure subscale match between distribution of item difficulty and patient ability measurements; reliability of patient ability measures; fit of data to Rasch model; unidimensionality of subscales; hierarchical ordering of categories within items; differential item functioning across patient groups.

RESULTS: Scale reliability (kappa coefficients range 0.491–0.835; p < 0.001), internal consistency (Cronbach’s alpha 0.770 and 0.780 for raters), and validity (Pearson correlation; p < 0.01) were all significant. Spinal Cord Independence Measure subscales compatible with stringent Rasch requirements; mean infit indices high; distinct strata of abilities identified; most thresholds ordered; item hierarchy stable across clinical groups and centres. Misfit and differences in item hierarchy identified. Difficulties assessing central cord injuries highlighted.

CONCLUSION: Conventional statistical and Rasch analyses justify the use of the Spinal Cord Independence Measure in clinical practice and research in the UK. Cross-cultural validity may be further improved.

Key words: spinal cord injuries, statistics, rehabilitation.

J Rehabil Med 2009; 41: 723–728

Correspondence address: Clive A. Glass, North West Regional Spinal Injuries Centre, District General Hospital, Town Lane, Southport, Merseyside, PR8 6PN, UK. E-mail: clive.glass@southportandormskirk.nhs.uk

Submitted May 7, 2008; accepted April 27, 2009

INTRODUCTION

The Spinal Cord Independence Measure (SCIM) is a disability profile containing 3 sub-scales developed specifically for people with spinal cord injury (SCI). Through measures across its distinct scales, the profile describes patients’ ability to undertake activities of daily living. Three versions of the SCIM (I–III) have been developed consecutively since 1997.

The developers of the scale (1) initially undertook conventional and then Rasch analyses of SCIM-II (2) in order to validate the scale. After consultation with colleagues from various countries, they developed the scale further (3, 4) (SCIM-III), with 19 tasks organized into 3 domains represented by 3 sub-scales: Self-care, Respiration and sphincter management, and Mobility. The combined scores on all 19 tasks together allow for an individual to attain a score ranging from 0 to 100, with higher scores reflecting greater ability. Linear properties of the scores, however, were looked for only within each sub-scale.

The psychometric properties of SCIM-III have been further defined and refined through conventional descriptive and inferential statistical analysis and Rasch analysis in 2 international, multi-centre studies (3, 4). The present investigation uses the UK sub-set of the data from these multi-centre international SCIM-III investigations in order to:

• examine the validity and reliability of SCIM-III for people with SCI in the UK;

• review the specifics of the Rasch findings for the UK population in isolation;

• compare the UK and combined data from the other countries to identify commonalities and differences in Rasch response patterns.

PATIENTS AND METHODS

Patients

Patients involved in rehabilitation immediately after their SCI, from 4 UK SCI centres, were enrolled consecutively over a 12-month period. Inclusion criteria required each patient to have experienced a recent spinal cord lesion (American Spinal Injury Association (ASIA) impairment scale (AIS) grade A, B, C or D) and be ≥ 18 years of age. Patients with co-morbidities (e.g. traumatic brain injury or significant mental health difficulties) or any other condition that might influence their everyday functional ability were excluded.

Methods

Instruments. The SCIM is a disability scale developed specifically for people with SCI in order to describe their ability to accomplish activities of daily living and to make functional assessments of this population more sensitive to changes (3). SCIM-III comprises 19 items in 3 subscales: (i) self-care (sub-score 0–20); (ii) respiration and sphincter management (sub-score 0–40); and (iii) mobility (sub-score 0–40). The mobility subscale is further sub-divided to assess mobility “in room and toilet” and for “indoors and outdoors, on an even surface”. The total score ranges from 0 to 100.
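
The scoring structure described above can be sketched in a few lines of Python. This is an illustration only: the dictionary keys and the function name are hypothetical, and only the sub-score ranges are taken from the text.

```python
# Maximum sub-scores as stated in the text: self-care 0-20, respiration and
# sphincter management 0-40, mobility 0-40. Key names are illustrative.
SCIM3_MAX = {"self_care": 20, "respiration_sphincter": 40, "mobility": 40}


def scim3_total(subscores):
    """Return the SCIM-III total (0-100) after range-checking each sub-score."""
    total = 0
    for name, maximum in SCIM3_MAX.items():
        value = subscores[name]
        if not 0 <= value <= maximum:
            raise ValueError(f"{name} must lie in 0..{maximum}, got {value}")
        total += value
    return total


# Hypothetical patient, not study data.
example = {"self_care": 12, "respiration_sphincter": 30, "mobility": 18}
print(scim3_total(example))  # 60
```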

The Functional Independence Measure (FIM™) was developed to evaluate minimum functional abilities and burden of care of a disability. The FIM™ emerged from a thorough developmental process overseen by a National Task Force of rehabilitation research in the USA. It was not designed specifically for SCI. It evaluates 6 areas of function, based on 18 tasks. Scoring of each task ranges from 1 to 7, with 1 requiring full assistance and 7 being complete independence. The scale reflects the time, energy, effort, and equipment that are used to achieve the task.

Scoring procedure. All patients were evaluated with both the SCIM III and the FIM™ questionnaires (see above) within a week after the beginning of the rehabilitation programme and within a week of discharge from inpatient rehabilitation. Each SCIM III and FIM™ item was scored by direct observation, by 3 expert professionals selected at each unit (a physician, nurses, occupational therapists, or physiotherapists). In exceptional cases, for example bowel habits, for which direct rater observation was troublesome, some specific tasks could be scored according to information obtained from a staff member who had been observing the patient during routine care. Patient data and SCIM III and FIM™ scores were collected in each participating UK unit; anonymized data were entered into Excel files, e-mailed to 2 of the authors, and pooled for analysis.

Conventional statistics

Descriptive statistics. Summary statistics were computed for sample age, gender, level and aetiology of injury, and ASIA classification.

Inferential statistics. Internal validity: agreement, reliability, internal consistency. Kappa coefficients were computed between raters. Intra-class correlation coefficients (ICC) (model 2,1) were computed within the 4 sub-domains of the scale. Cronbach’s alpha was computed as an index of SCIM-III internal consistency.
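
As an illustration of the inter-rater agreement index named above, the following minimal sketch computes an unweighted Cohen's kappa for 2 raters. The text does not state whether a weighted or unweighted kappa was used, so the unweighted form is an assumption, and the ratings shown are invented rather than study data.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters scoring the same patients."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater1)
    # Observed proportion of exact agreement.
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    p_expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # no variability in scoring: kappa is undefined
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical item scores from two raters for 8 patients.
rater_a = [0, 1, 2, 2, 3, 1, 0, 2]
rater_b = [0, 1, 2, 1, 3, 1, 0, 2]
print(round(cohen_kappa(rater_a, rater_b), 3))  # 0.83
```

When both raters give every patient the same score, chance agreement equals 1 and the function returns NaN, which mirrors the situation in which a kappa coefficient cannot be computed because of invariability in scoring.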

External-concurrent validity (SCIM to FIM™ comparison). Pearson product moment correlation was used to establish the level of correlation between SCIM-III and FIM™ scores. The McNemar test was used to establish the sensitivity to change within each of the 4 sub-domains of SCIM in comparison with FIM™.
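
The McNemar comparison can be illustrated with a minimal sketch of the exact (binomial) form of the test, which uses only the discordant pairs: patients in whom one scale detected a change and the other did not. Whether the exact or the chi-square form was used in the analysis is not stated, so the exact form is an assumption, and the counts below are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: pairs where only the first scale detected a change.
    c: pairs where only the second scale detected a change.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Binomial tail probability under H0 (each discordant pair is 50/50).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical counts: 9 patients with a change detected by SCIM-III only,
# 3 patients with a change detected by FIM only.
print(round(mcnemar_exact(9, 3), 3))  # 0.146
```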

Rasch modelling. The technicalities of Rasch analysis (4) go beyond the scope of this paper (5), and its application with this patient group has been explored in earlier publications by the present authors (2, 4, 6) and will not be replicated here. The purpose of this paper is to evaluate, for a sample of patients treated in the UK, the following metric properties of SCIM-III:

• patient-scale item match (Rasch ruler);

• reliability of patient ability measures, separability, and discernible strata;

• fit between observed and expected scores;

• unidimensionality (“internal consistency”);

• category ordering;

• differential item functioning (DIF).

It is proposed that differences may exist between the UK and wider international samples used in previous publications by the group, for a number of practical and organizational reasons.

The UK sample of 86 patients was the second largest national group included in the original SCIM-III international papers (3, 4), and the sample characteristics differed from the other national groups in a number of areas. The group contained the largest percentage of male participants (83.7% compared with a mean of 69.3% for all other countries), the highest percentage of trauma cases (82% compared with a mean of 56% for all other countries), the highest percentage of tetraplegic patients (46.5% compared with a mean of 44% for all other countries), and the lowest percentage of patients with lower levels of impairment (AIS grade D). The non-traumatic aetiologies reported were comparable to those reported in the international Rasch study (4), both in terms of condition and proportion; the UK group may therefore be considered the group with the highest proportion of traumatic injuries and the highest mean level of disability within the international sample, which may impact on Rasch item match and DIF. Furthermore, as the case-mix and management policies (e.g. admission and discharge criteria) differ between the UK and many of the other countries included in the international data-set, it is possible that such differences might emphasize the limitations of the basic scale, producing differences in, for example, floor-ceiling boundaries and, more subtly, the hierarchy of item difficulties.

Statistical analyses

Software programs. Rasch and factor analyses were performed using dedicated software packages (Winsteps™, Rasch Measurement Software (Version 3.55) and FACETS™, Many-facet Rasch analysis, both released by M. J. Linacre, www.winsteps.com, Chicago, 2005). The partial credit version of Rasch modelling was adopted.
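
For readers unfamiliar with the partial credit model, the following minimal sketch shows how category probabilities follow from a person ability and a set of item step difficulties (thresholds), all in logits. It illustrates the general model only; it is not the Winsteps implementation, and the function name and numerical values are hypothetical.

```python
from math import exp


def pcm_probabilities(theta, thresholds):
    """Category probabilities under the partial credit model.

    theta: person ability in logits.
    thresholds: step difficulties delta_1..delta_M in logits.
    Returns the probabilities of categories 0..M.
    """
    cumulative = [0.0]  # the empty sum for category 0
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 3-category item with thresholds at -1 and +1 logits,
# answered by a person of ability 0 logits.
print([round(p, 3) for p in pcm_probabilities(0.0, [-1.0, 1.0])])
```

In this example the middle category is the most probable, because the ability lies between the 2 thresholds.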

RESULTS

Descriptive statistics

Demographic and clinical features. Demographic and clinical data for patients at each of the 4 participating centres in the UK are shown in Table I. For comparison, the corresponding values for all 13 participating centres in the original multi-centre international collaborative study are also included.

Table I. Patient demographic data for each participating centre
Centre	n	Age, years Mean (SD)	Gender, n (%) Male/female	Aetiology, n (%) Trauma/non-trauma	Level, n (%) Tetraplegia/paraplegia	Initial ASIA grade A, n (%)	Initial ASIA grade B, n (%)	Initial ASIA grade C, n (%)	Initial ASIA grade D, n (%)
Stoke-Mandeville	16	38.3 (14.4)	14 (87.5)/2 (12.5)	13 (81.2)/3 (18.8)	8 (50.0)/8 (50.0)	10 (62)	2 (13)	2 (13)	2 (13)
Middlesbrough	10	51.5 (13.5)	8 (80.0)/2 (20.0)	7 (70.0)/3 (30.0)	3 (30.0)/7 (70.0)	4 (40)	1 (10)	5 (50)	0
Oswestry	23	43.8 (17.0)	18 (78.2)/5 (21.8)	16 (69.6)/7 (30.4)	7 (30.4)/16 (69.6)	13 (56)	3 (13)	4 (17)	3 (13)
Southport	37	42.8 (17.4)	32 (86.5)/5 (13.5)	33 (89.2)/4 (10.8)	22 (59.4)/15 (40.6)	14 (38)	7 (18)	8 (22)	8 (22)
UK – All	86	43.2 (16.5)	72 (83.7)/14 (16.3)	69 (80.2)/17 (19.8)	40 (46.5)/46 (53.5)	41 (48)	13 (15)	19 (22)	13 (15)
All countries (3, 4)	425	46.9 (18.2)	309 (72.7)/116 (27.3)	261 (61.4)/164 (38.6)	188 (44.2)/237 (55.8)	151 (36)	59 (14)	92 (22)	119 (28)
ASIA: American Spinal Injury Association impairment scale; SD: standard deviation.

Eighty-six patients were included in the study; 72 men and 14 women; 40 with tetraplegia and 46 with paraplegia. Mean age of the sample was 43.2 years (SD 16.5), age range 18–82 years. AIS grade was A in 41 patients (47.7%), B in 13 (15.1%), C in 19 (22.1%) and D in 13 (15.1%). Lesion aetiology was traumatic in 69 patients (80.2%) and non-traumatic in 17 (19.8%).

The non-traumatic aetiologies were benign tumour in one patient (1.2%), disc protrusion in one patient (1.2%), myelopathy of unknown origin in 3 cases (3.5%), vascular impairment in 4 cases (4.7%), and other in 8 patients (9.3%).

Inferential statistics

Reliability. Kappa coefficients for each of the SCIM tasks (n = 19) are shown in Table II. The total agreement between raters is greater than 80% on 15 of the 19 SCIM-III tasks. For single items, kappa coefficients range from 0.491 (stair management) to 0.835 (mobility outdoors) and are all statistically significant (p < 0.001). Floor effect was evident in the item “transfers ground/wheelchair”, which was scored zero for 53 patients by both raters. The reduced variance explains why the agreement is high, yet non-significant.

Table II. Total agreement and kappa coefficients of Spinal Cord Independence Measure tasks (n = 84)
Task	Total agreement, %	Kappa
Feeding	86.9	0.810
Bathing upper body	75.0	0.629
Bathing lower body	75.0	0.627
Dressing upper body	85.5	0.786
Dressing lower body	86.9	0.634
Grooming	83.3	0.868
Respiration	86.9	0.791
Sphincter management – bladder	83.3	0.684
Sphincter management – bowel	86.9	0.777
Use of toilet	83.3	0.555
Mobility in bed	77.4	0.631
Transfers bed/wheelchair	91.7	0.815
Transfers wheelchair /toilet/tub	85.7	0.639
Mobility indoors	88.1	0.812
Mobility moderate distance	86.9	0.775
Mobility outdoors	91.7	0.835
Stair management	97.6	0.491
Transfers wheelchair/car	92.8	0.595
Transfers ground/wheelchair (n = 53)	100.0	NC
All kappa coefficients are significant (p < 0.001). NC: not computed because of invariability in scoring.

ICC scores for the SCIM-III total and the 4 sub-domain scores were 0.956 (SCIM total), 0.941 (self-care), 0.844 (respiratory and sphincter management), 0.945 (Mobility “in”) and 0.956 (Mobility “out”). Values greater than 0.75 are usually considered acceptable.
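
The ICC (model 2,1) can be computed from the mean squares of a two-way analysis of variance. The sketch below follows the standard formula for a two-way random-effects, absolute-agreement, single-rater coefficient; the ratings are illustrative and are not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per patient and one column per rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares for patients (rows), raters (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)                 # between-patients mean square
    jms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Illustrative ratings: 6 patients (rows) scored by 4 raters (columns).
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(demo), 2))  # 0.29
```

The low value in this example arises because the illustrative raters differ systematically in their mean scores, which ICC(2,1) penalizes.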

SCIM-III internal consistency scores were assessed (Cronbach’s alpha). Item scores should correlate with each other, giving rise ideally to Cronbach’s alpha values above 0.7 (7). The UK results show SCIM-III total Cronbach’s alpha scores of 0.770 and 0.780 for raters 1 and 2, respectively. However, the areas “respiration and sphincter management” (alpha 0.600 and 0.645) and “mobility in the room and toilet” (alpha 0.652 and 0.656) both show an unsatisfactory alpha level.
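
Cronbach's alpha compares the sum of the item variances with the variance of the total score. A minimal sketch follows, using invented scores for 3 items rather than study data; the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; scores holds one row per patient, one column per item."""
    k = len(scores[0])
    # Sample variance of each item across patients.
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the per-patient total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores for 5 patients on 3 items.
demo_scores = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 3],
    [0, 1, 1],
    [2, 2, 3],
]
print(round(cronbach_alpha(demo_scores), 3))  # 0.953
```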

Validity. The Pearson correlation values, r, between SCIM-III and FIM™ scores were calculated for each of the 2 raters and were 0.798 (p < 0.01) and 0.782 (p < 0.01), respectively.

The McNemar test was used to compare the ability of each of the 4 areas of SCIM-III with that of the total FIM™ score to identify a 1-point change (admission to discharge). SCIM-III detected more changes than FIM™ in 3 of the 4 areas (self-care, respiration and sphincter management, and mobility indoors and outdoors), but not in mobility in the room and toilet. The differences between the 2 scales’ responsiveness to change were not statistically significant.

Rasch analyses

Patient-scale item maps. The results of the “person-item map” were calculated for self-care, respiration and sphincter management, and mobility, respectively. These identified that in all SCIM III subscales, the spread of item difficulties matched the distribution of person ability measurements. The density of the item difficulty levels shows a considerable gap in the “respiration and sphincter” sub-domain, and this is reproduced in Fig. 1.

Fig. 1. Respiration and sphincter management (n = 82).

Reliability and separability of ability estimates. The person reliability index (“real” version, see Bond and Fox (5)) was 0.88 for the SCIM III self-care subscale, 0.61 for the respiration and sphincter management subscale, and 0.81 for the mobility subscale. The measurement process succeeded in distinguishing 4 strata of person abilities for self-care, 3 strata for mobility, but only 2 strata for the respiration-sphincter sub-domain.
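
The number of strata can be related to the person reliability index through the separation index G = sqrt(R / (1 - R)) and the conventional relation strata = (4G + 1) / 3. The text does not state which formula was applied, so its use here is an assumption; the sketch below simply shows that the reported reliabilities are consistent with the reported strata under this relation.

```python
from math import sqrt


def separation_and_strata(reliability):
    """Return (separation index G, number of strata) for a reliability R."""
    g = sqrt(reliability / (1.0 - reliability))
    strata = (4.0 * g + 1.0) / 3.0
    return g, strata


# Person reliability values reported in the text for each subscale.
reported = {
    "self-care": 0.88,
    "respiration and sphincter management": 0.61,
    "mobility": 0.81,
}
for subscale, r in reported.items():
    g, strata = separation_and_strata(r)
    print(f"{subscale}: G = {g:.2f}, strata = {strata:.2f}")
```

Rounded to whole numbers, the computed values are 4, 2 and 3 strata, respectively.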

Fit between observed and expected scores. The mean item fit (infit mean square, ideal = 1) was 1.01, 1.18 and 0.97, respectively. Individual item and category fit was also satisfactory in general. Problematic fit values (> 1.4) were most prevalent for the mobility sub-scale and this is shown in Table III. Within the categories (the grades within items), high fit values between observed and expected scores were found in respiration and sphincter management for categories 2 and 4 in use of toilet, and in mobility for category 2 in the set of 3 transfer items, for category 5 in the set of 3 mobility items, and for categories 0, 1 and 2 in the stair management item.
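
The infit mean square is an information-weighted index: the sum of squared residuals (observed minus expected score) divided by the sum of the model variances. The following minimal sketch uses a hypothetical dichotomous item, for which the expected score is the success probability p and the model variance is p(1 - p); the values are invented and the 1.4 cut-off is the one quoted in the text.

```python
def infit_mean_square(observed, expected, variances):
    """Information-weighted (infit) mean square for one item."""
    squared_residuals = sum((x - e) ** 2 for x, e in zip(observed, expected))
    return squared_residuals / sum(variances)


# Hypothetical responses of 4 patients to a dichotomous item.
observed = [1, 0, 1, 1]
expected = [0.8, 0.3, 0.6, 0.5]               # model-expected scores
variances = [p * (1 - p) for p in expected]   # model variances

infit = infit_mean_square(observed, expected, variances)
print(round(infit, 3), "misfit" if infit > 1.4 else "acceptable")
```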

Unidimensionality. Factor analysis of the residual variance supported the unidimensionality of the SCIM III subscales. Most of the observed score variance was explained by the model-expected measurements of ability: 99.2% for self-care, 98.2% for respiration and sphincter management, and 95.6% for mobility. Within the small “unexplained” percentages (residual variance), 0.3%, 0.7% and 1.6% of the total variance were explained by the strongest extractable factor for self-care, respiration and sphincter management and mobility, respectively.

Category ordering. Category thresholds were in correct hierarchical ordering for the majority of items in each of the 3 areas of SCIM-III (self-care, respiration and sphincter management and mobility). The majority of the categories for each of the 3 subscales were emergent, but disordered categories were observed in some of their items. These are highlighted for bowel management (Fig. 2). Fig. 2 shows that category 8 is “flattened” (non-emergent); the transition from score 0 to 10 occurs at a lower level of ability than the transition from score 5 to 8, thus contradicting the intended meaning of the categories.

Fig. 2. Respiration and sphincter management: Category probability curve of bowel management (n = 82).
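
Threshold ordering can be checked mechanically once the step difficulties of an item have been estimated. In the sketch below, the threshold values for an item scored 0, 5, 8 and 10 are hypothetical and are chosen only to reproduce the kind of disordering described above; they are not the estimates from this study.

```python
def disordered_steps(thresholds):
    """Return the indices i at which threshold i + 1 is not above threshold i."""
    return [
        i
        for i in range(len(thresholds) - 1)
        if thresholds[i + 1] <= thresholds[i]
    ]


# Hypothetical thresholds (logits) for the steps 0->5, 5->8 and 8->10.
categories = [0, 5, 8, 10]
thresholds = [-1.2, 0.9, 0.4]

for i in disordered_steps(thresholds):
    # The category entered at step i is never the most probable one.
    print(f"category {categories[i + 1]} is non-emergent")
```

With these values the step into score 10 lies below the step into score 8, so category 8 is flagged.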

Differential item functioning. DIF was of minor relevance for all items across contrasted subgroups comparing the UK data with the other 5 countries’ data from the international study. In all contrasts, item difficulty values in the X-Y plots lay within the 95% confidence bands surrounding the identity line. Fig. 3 highlights this trend for the mobility sub-scale items.

Fig. 3. Differential item functioning: UK vs other countries (Mobility). SCIM III: The Spinal Cord Independence Measure version III.
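
One common way to express the 95% confidence band in such a plot is to flag an item when the difference between its 2 difficulty estimates exceeds 1.96 joint standard errors. The sketch below applies this rule; it is an assumption about how the bands may be constructed, not a description of the Winsteps procedure, and the item names, difficulties and standard errors are invented.

```python
from math import sqrt


def dif_flags(group_a, group_b, z=1.96):
    """Flag items whose difficulty differs between two groups.

    group_a, group_b: dicts mapping item name to
    (difficulty in logits, standard error).
    """
    flags = {}
    for item, (d_a, se_a) in group_a.items():
        d_b, se_b = group_b[item]
        joint_se = sqrt(se_a ** 2 + se_b ** 2)
        flags[item] = abs(d_a - d_b) > z * joint_se
    return flags


# Hypothetical estimates for 3 mobility items in 2 groups.
uk = {"stairs": (2.1, 0.30), "indoors": (-0.4, 0.20), "car": (1.5, 0.25)}
others = {"stairs": (1.8, 0.15), "indoors": (-0.5, 0.10), "car": (0.4, 0.12)}
print(dif_flags(uk, others))
```

In this invented example only the third item falls outside the band.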

DISCUSSION

The UK sample of 86 patients was the second largest national group included in the original SCIM-III international Rasch paper (4), and the sample characteristics differed from the other national groups in a number of areas highlighted in the Methods section of this paper. Most pertinent perhaps is that the UK group contains the largest percentage of male participants (83.7% vs 69.3% for all other countries) and the greatest percentage of trauma cases (82% vs 56%), as well as the highest percentage of tetraplegic patients and the lowest percentage of patients with AIS grade D.

Inter-rater reliability estimates were high, both for the scale overall, and for each of the 4 areas assessed (self-care, respiration and sphincter management, mobility indoors and outdoors, and mobility in room and toilet). The Pearson correlation values between SCIM-III and FIM™ are high, suggesting that both scales tackle the disability of the UK SCI population, although from different perspectives. The FIM™ assesses disability from the perspective of the burden of care, whereas SCIM-III takes more into account the patient’s performance; a considerable difference in emphasis and purpose (8). The results of the McNemar test further indicate that SCIM-III identified greater changes in functioning than FIM™ in 3 of the 4 areas of the scale, although floor effects prevented the achievement of statistical significance. Were further refinement of SCIM-III to be considered, in particular when reviewing mobility in the room and toilet items, this might theoretically allow for clearer identification of the specificity of SCIM-III, with respect to FIM™. In any case, the results suggest that improvement in motor functions not leading to significant decrease in burden of care, yet meaningful for the patient’s health and satisfaction, may be better detected by SCIM-III specifically developed for people with SCI.

Rasch analysis confirms, on a sounder basis, the findings from conventional statistics, with respect to reliability, validity and usefulness in UK patients. Furthermore, the DIF findings indicate substantial metric equivalence of the instrument across all the countries engaged in the international Rasch study, whilst the Rasch analysis highlighted flaws affecting items and categories that systematically elicited unexpected responses, thus leading to significant misfit. Concerns remain with respect to mobility for long distances and stair management and, specifically in this study, they are also raised for transfer from wheelchair to car. As suggested in the international Rasch study, this may indeed reflect a pattern of functioning specific to patients with high-level tetraplegia (impairment of upper limb functions, not only of lower limbs), a group of patients most highly represented in the UK sample. The 2 groups of patients most likely to present “misfit” to mobility items are those with tetraplegia who are able to control an electric wheelchair, or patients with central cord lesions, who may retain considerable lower limb function yet experience significant loss of upper limb function. Whilst this may be a peculiarity of the UK data-set, it is probably a conceptual challenge deserving close consideration in future SCIM-III studies. Considering this matter further, the UK data sample contains the lowest number of non-traumatic cases (18%) of any country from the international data-set (4). Furthermore, recent SCI research has shown not only that non-traumatic cases differ remarkably from traumatic cases (in so far as spinal cord lesions are more often incomplete and with more variable distribution) (9), but also that non-traumatic aetiology is related to a lower incidence of cervical and complete lesion (10), and equally importantly, that ASIA grading itself may be less reliable in incomplete lesions (11). Whilst none of these variables can be assumed to affect the international SCIM-III data-set, or the present data sample, the UK data-set is the most “consistently traumatic” group and may add weight to the SCIM-III findings from this sample for the UK generally.

The large number of patient ability levels clustered around the bottom of the self-care and mobility “rulers” is an additional problem for these 2 subscales. The floor effect might be decreased at discharge, but it might also reflect the higher number of people with tetraplegia in the present UK sample.

Two categories of the respiration and sphincter management subscale were also troublesome. It had been suggested in the international Rasch study that problems may stem from difficulty in direct observation of people during toilet use, and that ambiguous interpretations of the categories may also contribute to misfit and disordering. Given the relatively high level of dependence of the current UK patient population, an observational bias seems unlikely, providing support to the hypothesis of misinterpretation of the wording of these items. The scale authors have commented that the respiration and sphincter management measures were included in the same subscale during development of the SCIM only for convenience, and the possibility of separation into separate subscales will be considered when any new version of the SCIM is developed.

As this UK study utilizes, in part, data from the international Rasch study, a potential bias exists towards favourable reporting of results from co-authors of the parent study. However, data analysis and interpretation were primarily supported by the second author, who was not involved either in developing the SCIM-III, coordinating the study, or collecting the data.

The various sources of specificity of the sample analysed in this study (country and clinical pictures) represent a challenge to generalizability of the results. The UK sample contained the highest numbers of higher level injuries, which might reasonably explain some of the mobility and self-care floor effects, and the higher levels of DIF. Furthermore, the modest UK sample size might, in part, obscure the statistical significance of other DIF findings. Of particular interest in a future UK investigation would be an assessment of how SCIM-III scores change between admission and discharge. Beyond the interest of measuring these changes themselves, this might show lower DIF values, given the more homogeneous status reached by patients at discharge. However, the UK sample provides an insight into a unique aspect of the international data-set, as they are the only group who in total receive their acute care and rehabilitation within the same care facility and under the care of the same clinical and rehabilitation team. Similarly, they are a non-selected, consecutively admitted group that reflects a realistic approximation of “real-time” incidence and severity in 4 major geographical areas of the UK, serving a population of almost 50% of the country. On this basis the study results can be considered to support a contention of generalizability of the SCIM results across all UK facilities.

In conclusion, conventional inferential and Rasch analyses justify the use of the SCIM-III in future research in the UK. The nature of the UK sample utilized in the study, and the high level of validity and reliability with the SCI population, would also support the contention that the scale may reasonably be used in clinical practice. The cross-cultural validity of the instrument may be improved further by fine-tuning parts of the scale. It is proposed that the SCIM-III instrument has sufficient merit for its use to be extended to a larger international data-set, incorporating recording at initial mobilization and (as a minimum) at completion of rehabilitation.

Acknowledgement

Professor Tesio was funded by Università degli Studi (FIRST 2006) and Ministero della Salute (Ricerca strategica 2008).

REFERENCES
