
Original report

Criteria for validating comprehensive ICF Core Sets and developing brief ICF Core Set versions

Eva Grill, DrPH, PhD1,2 and Gerold Stucki, MD, MS2,3,4

From the 1Institute for Health and Rehabilitation Sciences (IHRS), Ludwig-Maximilians-Universität München, Munich, Germany, 2ICF Research Branch of WHO Collaborating Centre for the Family of International Classifications in German, 3Swiss Paraplegic Research, Nottwil and 4Seminar of Health Sciences and Health Policy, University of Lucerne, Switzerland

OBJECTIVE: To describe the empirical processes used to (i) validate the comprehensive International Classification of Functioning, Disability and Health (ICF) Core Sets, and (ii) develop brief ICF Core Sets from the ICF Categories of these more comprehensive ICF Core Sets.

DESIGN: Prospective multi-centre cohort study.

PATIENTS: Patients receiving rehabilitation interventions for musculoskeletal, neurological or cardiopulmonary injury or disease in acute hospitals or early post-acute rehabilitation facilities.

METHODS: Functioning was coded using the ICF. For validation, absolute and relative frequencies (prevalences) of impairment, limitation or restriction were reported at admission and end-point (discharge or 6 weeks after admission). Aspects not covered were extracted and translated into the best corresponding ICF category. The criterion for selecting candidate categories for the brief ICF Core Sets was based on their ability to discriminate between patients with high or low functioning status. Discrimination was assessed using multivariable regression models, the independent variables being all of the ICF categories of the respective comprehensive ICF Core Set. Analogue ratings of overall functioning as reported by patients and health professionals were used as dependent variables.

CONCLUSION: We present an algorithm to identify candidate categories for brief ICF Core Sets extracted from the comprehensive acute and post-acute ICF Core Sets.

Key words: ICF; rehabilitation; health status measurements; classification; regression analysis; outcome assessment.

J Rehabil Med 2011; 43: 87–91

Correspondence address: Eva Grill, Institute for Health and Rehabilitation Sciences, Ludwig-Maximilians-Universität München, DE-81377 Munich, Germany. E-mail:


Human functioning and its converse notion disability are universal experiences, which must be understood in the context of an individual’s personal resources, particular health conditions and expectations, and in interaction with the environment (1). Transient or permanent disability may arise from any acute injury or disease, interfering in the individual’s engagement in normal function. Indeed, the World Health Assembly in its resolution on disability, its prevention, management and rehabilitation, has called for the timely identification of disability in the clinical setting (2). Consequently, obtaining the means for objective measurement of functioning is a necessary first step towards recognizing and ameliorating the course of disability following acute illness. As Lord Kelvin said in his defence of empiricism, “when you can measure what you are speaking about, … you know something about it; but when you cannot measure it, … your knowledge is of a meagre and unsatisfactory kind” (3). This principle drawn from the physical sciences generalizes to the case of disability, the understanding and management of which requires the use of appropriate measuring scales or instruments (4).

Healthcare professionals in the acute hospital should be able to make a brief assessment of their patients’ functioning, and set in motion timely strategies for meeting their subsequent rehabilitation needs. Care providers have first to identify especially vulnerable patients, such as the aged, or those with co-morbidity. In order to communicate their patients’ particular needs with rehabilitation professionals, there must be a standard system of describing human functioning and rating disability. In situations entailing post-acute and long-term rehabilitation, professionals specialized in rehabilitation management should share this understanding of functioning, and utilize clinical assessment instruments that are based on a standard model of functioning. While a multitude of measuring instruments has been used in post-acute rehabilitation settings, typical instruments vary with respect to their underlying models and scales, and are tailored for specific populations. Accordingly, the methods differ in their sensitivity to discover incremental gains in recovery of functioning (5). Thus, there is urgent need for implementing improved and standardized outcome measurement in rehabilitation (6).

The International Classification of Functioning, Disability and Health (ICF), a part of the international family of classifications of the World Health Organization (WHO), was established as just such an approach to standardizing the assessment of functioning of individuals and populations. The ICF endeavours to organize all domains of functioning and their contextual factors that are encountered in human life, and may thus arguably constitute the prototypical framework for all medicine. It also provides the potential framework for transition along the continuum of care. For example, assessment of functioning in acute care cannot be carried over to other episodes of care, such as rehabilitation, unless there is a common assessment scheme. An assessment must be exhaustive by its very nature and becomes very complex in daily use unless it is transformed into practice-friendly tools. Comprising over 1400 categories, the entire volume of the ICF cannot be applied by the clinicians to all their patients. In daily practice clinicians will need only a fraction of the categories found in the ICF. Although there are generic instruments based on the ICF that are designed as practical translations of the ICF and are usable across a wide range of applications, the generic character may be a drawback in specific settings. Thus, in this trade-off between generalizability and the need to capture detail, the ICF must be adapted to the perspectives and needs of different users. The need to tailor ICF to the needs of particular contexts is the primary motivation behind the ICF Core Set project, which aims to extract selections of ICF categories from the entire classification that are relevant to specific health conditions or care situations.

In general, the ICF Core Set project defines a category as relevant, on an empirical basis, if it describes a problem frequently encountered in typical patients, measures an end-point in clinical trials, or emerges as relevant in discussion among health professionals. The resultant information is then summarized and implemented as the basis for a formalized consensus process involving expert health professionals (7). By including all potentially relevant categories, the selection process is comprehensive, omitting only those factors that proved to be irrelevant to designing treatment strategy or assessing outcome. Early feedback from health professionals suggested that the definition of ICF Core Sets was a step in the right direction towards establishing evidence-based measurement in rehabilitation. Due to the consensus process, the comprehensive ICF Core Sets in their present version are applicable for the assessment of individual problems and needs, for the estimation of prognosis and the potential for rehabilitation, and for assessment of functioning in the acute and post-acute situation. As such, the comprehensive ICF Core Sets can be used to coordinate rehabilitation interventions, e.g. at the intensive care unit, or to communicate, e.g. in a rehabilitation team conference. However, a minimally sufficient data-set that is feasible in clinical practice may encompass no more than about 20 different concepts or topics, considerably fewer than the comprehensive ICF Core Sets contain. Thus, subsets can be extracted from the comprehensive Core Sets according to specific needs of the individual user.

In order to identify abbreviated ICF Core Sets, i.e. brief ICF Core Sets, suited for use in particular contexts, one must possess an adequate understanding of the methodological framework used for creating measures. The Outcome Measures in Rheumatology project identifies 3 different properties relevant to the applicability of measures, namely truth, discrimination and feasibility (8). The criterion truth refers to the question of what should be measured. As noted above, the process for the development of comprehensive ICF Core Sets assured that all the relevant aspects of functioning were included, but the empirical validation of the choice of categories remains to be completed. The criterion discrimination refers to the ability of a measure to discriminate between different states of functioning or medical conditions. A discriminating measure must distinguish between different patient groups in a cross-sectional manner, and assess change in functioning over time. Finally, the criterion feasibility is satisfied when a measure can, in practical terms, be applied by health professionals, given circumstances of restricted time and resources. Given this consideration, we settled on defining practical and applicable brief ICF Core Sets with no more than 20 items or ICF categories. Setting this upper limit was based on the precedent of generic health status measures, e.g. the SF-12 (9) with 12 items, or the Stanford Health Assessment Questionnaire (10) with 20 items. The categories must be selected with care, so as to remain representative of the comprehensive ICF Core Sets.

Therefore, to satisfy the criteria truth, discrimination and feasibility for these comprehensive and brief ICF Core Sets, we make a point of validating the comprehensive ICF Core Sets and identifying candidate categories for practical and applicable subsets, the brief ICF Core Sets.

The first objective of the present study was to describe the empirical process used for validating the comprehensive ICF Core Sets. A further objective of this study was to propose general methods for identifying candidate categories for brief ICF Core Sets, selected from the comprehensive acute and post-acute ICF Core Sets.


Study design and population

The study design was a prospective multi-centre cohort study conducted from May 2005 to August 2008. The study population was recruited from 5 acute hospitals and 9 early post-acute rehabilitation facilities, including 5 facilities specialized in geriatric rehabilitation (Appendix I). Patients were eligible if they were at least 18 years of age and received rehabilitation interventions for musculoskeletal, neurological or cardiopulmonary injury or disease. On the basis of these inclusion criteria, participants were selected consecutively by the study centre coordinators. Informed consent was obtained from the patients or from the patient’s care giver in cases where the patient was unable to make an informed decision. Approval was obtained from institutional ethics committees from all involved institutions prior to starting the study.

Appendix I. Participating institutions

Participating acute hospitals

University Hospital Vienna, Department of Physical Medicine and Rehabilitation, Vienna, Austria

Kaiser-Franz-Josef-Spital, Institute for Physical Medicine and Rehabilitation, Vienna, Austria

University Hospital Zurich, Department of Rheumatology and Institute for Physical Medicine, Zurich, Switzerland

Hannover Medical School, Department of Rehabilitation Medicine, Hannover, Germany

Orthopaedic University Hospital, Heidelberg, Germany

Participating rehabilitation facilities

University Hospital Munich, Department of Physical and Rehabilitative Medicine, Munich, Germany

General Hospital Schwabing, Clinic for Physical Medicine und Early Rehabilitation, Munich, Germany

Nuremberg Hospital, Clinic and Institute for Physical and Rehabilitation Medicine, Nuremberg, Germany

Ingolstadt Hospital, Institute for Physical and Rehabilitative Medicine, Ingolstadt, Germany

Sophienspital, Institute for Physical Medicine and Rehabilitation, Vienna, Austria

Kaiser-Franz-Josef-Spital, Institute for Physical Medicine and Rehabilitation, Vienna, Austria

Malteser Hospital Bonn, Clinic for Geriatrics, Bonn, Germany

Schön Klinik Rosenheim, Centre for Geriatric Rehabilitation, Rosenheim, Germany

Arbeiterwohlfahrt Clinic for Geriatric Rehabilitation, Würzburg, Germany


ICF Core Sets. The ICF is a multipurpose classification belonging to the WHO family of international classifications. The ICF provides a comprehensive framework for quantifying and depicting functioning, health and health-related domains (11), and was designed to facilitate communication between different users, including healthcare workers, researchers, policymakers and the public. The classification is organized in a hierarchical structure consisting of two main parts, each with separate components. The first part encompasses functioning and disability with 3 components: “Body Functions” (coded b), “Body Structures” (s) and “Activities and Participation” (d). The second part of the ICF covers contextual factors, and has two components: “Environmental Factors” (e) and “Personal Factors” (not coded). The ICF categories of each component, with the exception of the “Personal Factors”, which are not yet classified, have a further hierarchical taxonomy, with as many as 4 levels, divided into dimensions and chapters. The hierarchical code system is represented as an abbreviation of the component, with an extension for the chapter number (e.g. b2 Sensory functions and pain), and further extensions for the second (e.g. b210 Seeing functions), third (e.g. b2100 Visual acuity functions) and fourth levels (e.g. b21000 Binocular acuity of distant vision).
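The code structure described above can be illustrated with a short sketch. It is written in Python for illustration only (the analyses in this study were carried out in R), and the function name `parse_icf_code` and the returned field names are hypothetical. The sketch assumes the code lengths implied by the examples in the text: one digit for the chapter, two further digits for the second level, and one digit each for the third and fourth levels.

```python
import re

# Component letters as listed in the text; Personal Factors have no codes.
COMPONENTS = {
    "b": "Body Functions",
    "s": "Body Structures",
    "d": "Activities and Participation",
    "e": "Environmental Factors",
}

# Letter, 1-digit chapter, then an optional 2-digit second level and
# optional 1-digit third and fourth levels (assumed from the examples).
_CODE = re.compile(r"^([bsde])(\d)(?:(\d{2})(?:(\d)(\d)?)?)?$")


def parse_icf_code(code):
    """Return the component, hierarchy level and ancestor codes of an ICF code."""
    match = _CODE.match(code)
    if match is None:
        raise ValueError(f"not a recognised ICF code: {code!r}")
    letter, chapter, second, third, fourth = match.groups()
    # Build the chain of codes from the chapter down to the code itself.
    chain = [letter + chapter]
    for part in (second, third, fourth):
        if part is None:
            break
        chain.append(chain[-1] + part)
    return {
        "component": COMPONENTS[letter],
        "level": len(chain),
        "ancestors": chain[:-1],
    }


print(parse_icf_code("b21000"))
```

Here the chapter counts as level 1, so b21000 is reported at level 4 with b2, b210 and b2100 as its ancestors.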

We have developed the comprehensive ICF Core Sets in order to facilitate and encourage the use of the ICF in clinical practice and research. The comprehensive ICF Core Sets are selections from the entire list of ICF categories, which emerged from a multi-stage consensus process seeking to identify those aspects of functioning most relevant for patients in specific settings or with specific health conditions. The consensus approach integrated evidence from empirical studies and input from experts. In particular, a consortium consisting of the ICF Research Branch of the WHO Collaborating Center of the Family of International Classifications (Deutsches Institut für Medizinische Klassifikation und Information, DIMDI, Germany) at the University of Munich, Germany, the Classifications, Assessments and Survey Team and its partner organizations, developed 6 comprehensive ICF Core Sets for patients with neurological, cardiopulmonary and musculoskeletal conditions in the acute and post-acute situation, and one comprehensive ICF Core Set for aged patients (12–18).

For scoring of the Core Sets, the ICF suggests using qualifiers ranging from 0 to 4 for each category. Because the properties of all qualifiers are not yet sufficiently evaluated, in the present study we used a simplified qualifier, defined as follows. Each category of the components Body Functions and Activities and Participation was graded with the qualifiers 0 for “no impairment/limitation”, 1 for “moderate impairment/limitation”, and 2 for “severe impairment/limitation”. The categories of the component Body Structures were graded with the qualifiers 0 for “no impairment” and 1 for “impairment”. The categories of the component Environmental Factors were graded either as facilitator or barrier, or both, with 0 for “no barrier/facilitator” and 1 for “barrier/facilitator”. Impairments of body functions or structures, and limitations or restrictions of activities and participation were recorded if they were directly associated with the condition necessitating rehabilitation. In order to investigate the completeness of the comprehensive ICF Core Sets, the interviewers were asked to identify any aspects of functioning relevant to their patients not covered by the comprehensive ICF Core Sets.

Visual analogue scale for functioning. To describe an overall view of functioning, the patients were asked to appraise their personal limitations in overall functioning using a horizontal visual analogue scale, ranging from 0, for complete limitation in all aspects of functioning, to 10, for no limitation in functioning. “Overall functioning” was defined as encompassing all aspects of physical or mental state, of daily living, mobility and interaction with the environment and with others. Patients were asked to relate to their current health condition and their present state. Independently, and blinded to the patients’ responses, the health professionals were asked to appraise their patients’ functioning on the same analogue scale, also for the current health condition and the present state.

Data collection procedures

Patients were recruited and interviewed by health professionals trained in the application and principles of the ICF. Interviewers were trained during a structured one-day meeting, and were provided with a comprehensive manual. Ongoing supervision of interviewers was ensured by periodic telephone calls between each interviewer and the responsible member of their research team. Data were collected primarily from patients’ medical record sheets, by interview with health professionals in charge of the patients, and by patient interviews. ICF Core Set categories from the components Body Functions, Body Structures and Activities and Participation were assessed within the first 24 h after admission (baseline) and within the last 36 h before discharge or, if length of stay was longer than 6 weeks, at 6 weeks after admission (end-point). ICF categories from the component Environmental Factors were assessed only at baseline, since no change was to be expected during the hospital stay. The incoming case record forms were checked for conspicuous errors by a member of the research team before being entered in the database, with consultation of the responsible interviewer as required to resolve discrepancies.

Statistical analysis

Validation of the comprehensive ICF Core Sets. For the categories of the ICF components Body Functions, Body Structures and Activities and Participation, we calculated the absolute and relative frequencies (prevalences) of impairment, limitation or restriction at baseline and end-point. For the categories of the ICF component Environmental factors, we calculated the absolute and relative frequencies (prevalences) of persons who regarded a specific category as constituting either a barrier or facilitator. Relative frequencies of persons for whom the ICF category changed during the study period were calculated, along with their 95% confidence intervals.
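As an illustration of the frequency calculation, the following Python sketch computes the absolute and relative frequency of a problem in one ICF category, together with a 95% confidence interval. The text does not state which interval method was used; the Wilson score interval is an assumption made here, and the function name `prevalence_with_ci` is hypothetical.

```python
import math


def prevalence_with_ci(qualifiers, z=1.96):
    """Prevalence of a problem (qualifier > 0) with a Wilson 95% interval.

    `qualifiers` holds one simplified qualifier per patient for a single
    ICF category; missing assessments are passed as None and skipped.
    """
    rated = [q for q in qualifiers if q is not None]
    n = len(rated)
    if n == 0:
        raise ValueError("no rated patients for this category")
    k = sum(1 for q in rated if q > 0)  # absolute frequency
    p = k / n                           # relative frequency (prevalence)
    # Wilson score interval (an assumed choice; the paper names no method).
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return k, p, max(0.0, centre - half), min(1.0, centre + half)


# Toy data: qualifiers 0, 1 or 2 for six patients, one not assessed.
count, prev, lower, upper = prevalence_with_ci([0, 1, 2, 0, None, 1])
print(count, round(prev, 2), round(lower, 2), round(upper, 2))
```

The same function could be applied to an indicator of change between baseline and end-point to obtain the relative frequency of change and its interval.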

Aspects of functioning not covered by the comprehensive ICF Core Sets but identified as relevant were extracted and translated into the best corresponding ICF category (19). Absolute and relative frequencies of occurrence of those ICF categories were reported; any such category with prevalence below 5% and not showing significant change over time was considered as not relevant. Significance was evaluated using binomial tests, with significance level set at 0.05. Because of the exploratory nature of the test procedure, we refrained from correcting for multiple testing.
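The relevance rule above can be sketched as follows. The text does not specify how the binomial test was set up, so this Python sketch makes two assumptions: prevalence is taken at baseline, and change is tested with an exact two-sided sign test on the number of improved versus worsened patients. The function names are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial p-value: sum of outcomes no likelier than k."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # tolerance for floating-point ties
    return min(1.0, sum(prob for prob in pmf if prob <= cutoff))


def is_relevant(baseline, endpoint, min_prevalence=0.05, alpha=0.05):
    """Keep a category unless it is rare AND shows no significant change."""
    pairs = [(b, e) for b, e in zip(baseline, endpoint)
             if b is not None and e is not None]
    # Prevalence at baseline (assumed reference time point).
    prevalence = sum(1 for b, _ in pairs if b > 0) / len(pairs)
    improved = sum(1 for b, e in pairs if e < b)
    worsened = sum(1 for b, e in pairs if e > b)
    changed = improved + worsened
    # Sign test on the direction of change (assumed form of the binomial test).
    p_change = binomial_two_sided_p(improved, changed) if changed else 1.0
    return prevalence >= min_prevalence or p_change < alpha


# Toy category: rare at baseline (2%) and unchanged at end-point.
print(is_relevant([1] * 2 + [0] * 98, [1] * 2 + [0] * 98))
```

A category that is rare at baseline is still retained under this rule if the direction of change between baseline and end-point is significant at the 0.05 level.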

Decision rules for candidate categories for brief ICF Core Sets. The criterion for selecting candidate categories for the brief ICF Core Sets was based on their ability to discriminate between patients with high or low functioning status. Discrimination was assessed using multivariable regression models, in which the independent variables were all of the ICF categories of the respective comprehensive ICF Core Set. Analogue ratings of overall functioning as reported by patients and health professionals were used as dependent variables. To improve prediction accuracy, and to derive small subsets of independent variables having the strongest effects on the dependent variable, we used the least absolute shrinkage and selection operator (LASSO) (20). This procedure minimizes the residual sum of squared errors with a bound on the sum of the absolute values of the coefficients. To avoid large variance, as often occurs in ordinary least squares regression, the LASSO sets some regression coefficients to zero and shrinks others based on a pre-set regularization parameter, the so-called penalty. Thus, the method acts recursively to select valid subsets with adequate discrimination. The number of variables, i.e. ICF categories, included in the subsets can be increased or decreased by changing the penalty. It can be interpreted that those categories included in the model with a high penalty value have stronger effects than those entering later in the process, when the penalty is relaxed.
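The effect of the penalty on the selected subset can be illustrated with a minimal sketch. It is written in Python rather than R, uses a plain coordinate-descent solver for the penalized (Lagrangian) form of the LASSO, which is equivalent to the bounded form described above, and runs on invented toy data. The function names, the penalty values and the data are assumptions for illustration, not the software or data used in the study.

```python
from itertools import product


def soft_threshold(value, penalty):
    """Shrink value towards zero by penalty; return 0 inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso(X, y, penalty, sweeps=200):
    """Coordinate-descent LASSO for (1/2n)*RSS + penalty * sum(|beta_j|)."""
    n, p = len(X), len(X[0])
    # Centre outcome and predictors so that no intercept is needed.
    y_mean = sum(y) / n
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    resid = [yi - y_mean for yi in y]
    beta = [0.0] * p
    col_ss = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # a constant column carries no information
            # Covariance of column j with the partial residual.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, penalty) / col_ss[j]
            delta = new - beta[j]
            if delta:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new
    return beta


# Toy data: three categories scored 0/1/2; only the first two affect the rating.
X = [list(row) for row in product([0, 1, 2], repeat=3)]
y = [10 - 3 * a - 1 * b for a, b, _ in X]
for penalty in (3.0, 1.0, 0.1):
    beta = lasso(X, y, penalty)
    print(penalty, [j for j, b in enumerate(beta) if b != 0.0])
```

In this toy example the category with the strongest effect enters the model first as the penalty is relaxed, the weaker one enters later, and the category unrelated to the rating stays at zero.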

To validate the approach for selection of ICF Core Sets described above, we additionally used the Random Forest algorithm, which is based on Classification and Regression Trees (CART) non-parametric regression techniques. CART divides a population into several subpopulations depending on certain characteristics defined by successive binary splits in predictor variables. Successive subpopulations emerge as homogeneous as possible with regard to the outcome variable, in this case the overall functioning as reported by patients and health professionals. Of the many different ways to construct CART, we employed the technique proposed by Breiman (21) and Breiman et al. (22).
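A single binary split of the kind used by CART can be sketched as follows, assuming that homogeneity is measured by the within-group sum of squared deviations of the outcome (the usual choice for regression trees). The Python function `best_split` and the toy data are hypothetical and serve only to illustrate the idea.

```python
def best_split(rows, outcome):
    """Find the binary split that minimises within-group squared error.

    Returns (score, predictor index, threshold); patients with a value
    <= threshold go to the left subgroup, the others to the right.
    """
    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for j in range(len(rows[0])):
        # Every observed value except the largest is a candidate threshold,
        # so neither subgroup can be empty.
        for threshold in sorted({row[j] for row in rows})[:-1]:
            left = [y for row, y in zip(rows, outcome) if row[j] <= threshold]
            right = [y for row, y in zip(rows, outcome) if row[j] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, threshold)
    return best


# Toy data: two categories; the rating depends only on the first one.
rows = [[0, 1], [0, 2], [2, 1], [2, 2]]
outcome = [9, 9, 3, 3]
print(best_split(rows, outcome))
```

Growing a full tree would repeat this search within each resulting subgroup until a stop criterion is met.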

A brief description of the CART procedure follows. All predictor variables are considered for possible splits, with selection of the split leading to the two most homogeneous subgroups with regard to the outcome. The data-set is then partitioned according to the predictor variable that yields the most homogeneous subgroups with regard to the outcome by using a single binary split. After initial partitioning, the subsets are considered for re-partitioning based on the remaining predictor variables applied in random sequence. This algorithm is repeated until a pre-set stop criterion is reached. The recursive partitioning strategy results in a tree, wherein the root is the whole data-set, and the leaves are the final subsets, which are selected so as to be as homogeneous as possible with regard to the dependent outcome variable. Using the Random Forest algorithm, 10,000 different trees were then calculated, for each of which n cases were randomly drawn with replacement, where n equalled the original sample size of patients. Observations that were not used in the fitting process of each tree were then used to validate the same tree. Thus, we calculated for each predictor variable two mean square errors: one for the original values, and a second after randomly permuting each predictor variable. The first mean square error estimate stands for the population value with the observed association to the outcome; the second estimate from the random permutation stands for a population wherein predictor and outcome are only randomly associated. The difference of these two mean square errors yields the so-called variable importance measure. The optimization is based on the expectation that the random permutation of an informative predictor variable, i.e. a predictor variable associated with the outcome, should highly increase the mean square error, while random permutation of a non-informative predictor variable should have little effect on the mean square error. The difference of the two mean square errors can thus be interpreted as variable importance, such that greater difference indicates greater importance of the variable in determining the outcome.
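The permutation-based variable importance can be illustrated with a deliberately simplified sketch. To keep it short, each "tree" is a single split (a stump) rather than a fully grown tree, only 200 trees are drawn, and the data are invented. All names and settings (`n_trees`, `mtry`, `seed`) are assumptions, so the sketch shows the principle rather than the Random Forest implementation used in the study.

```python
import random
from itertools import product


def fit_stump(X, y, features):
    """Depth-1 regression tree: best single split among the given features."""
    best = None
    for j in features:
        for t in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:  # no sampled feature varies: predict the mean
        mean = sum(y) / len(y)
        return (None, None, mean, mean)
    return best[1:]


def predict(stump, row):
    j, t, ml, mr = stump
    return ml if j is None or row[j] <= t else mr


def mse(stump, X, y):
    return sum((predict(stump, r) - yi) ** 2 for r, yi in zip(X, y)) / len(y)


def permutation_importance(X, y, n_trees=200, mtry=2, seed=0):
    """Mean out-of-bag increase in MSE after permuting each predictor."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals, used = [0.0] * p, 0
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]  # n draws with replacement
        oob = sorted(set(range(n)) - set(bag))      # cases not drawn
        if not oob:
            continue
        stump = fit_stump([X[i] for i in bag], [y[i] for i in bag],
                          rng.sample(range(p), mtry))
        X_oob, y_oob = [X[i] for i in oob], [y[i] for i in oob]
        base = mse(stump, X_oob, y_oob)             # MSE with original values
        for j in range(p):
            shuffled = [row[j] for row in X_oob]
            rng.shuffle(shuffled)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_oob, shuffled)]
            totals[j] += mse(stump, X_perm, y_oob) - base
        used += 1
    return [t / used for t in totals]


# Toy data: three categories scored 0/1/2; the first has the strongest effect.
X = [list(row) for row in product([0, 1, 2], repeat=3)]
y = [10 - 3 * a - 1 * b for a, b, _ in X]
print(permutation_importance(X, y))
```

With these toy data the first category, which has the strongest effect on the rating, should receive the largest importance value.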

All data analyses were carried out with R 2.9.0 (23).


In this report we have described the empirical and theoretical process used to validate the comprehensive ICF Core Sets for the acute hospital and for post-acute rehabilitation and by extension propose a selection method for defining candidate categories for brief ICF Core Sets. The development of comprehensive ICF Core Sets has become highly standardized and straightforward. Thus, it is timely and appropriate to develop an equally standardized algorithm for their empirical validation and for the selection of briefer ICF Core Sets. Three criteria were applied to the comprehensive ICF Core Set categories, namely truth, discrimination and feasibility.

To validate the comprehensive ICF Core Sets, truth was the foremost criterion. Analysis of frequency eliminates those candidate categories that are impaired or restricted only in a minority of patients. This process surely reduces the occurrence of floor effects, notwithstanding that frequency is not synonymous with relevance, and that the 5% threshold employed for “sufficiently frequent” is arbitrary. Since even an initially infrequent aspect of functioning may become important over the time course of therapy, we additionally reported significant change as an important characteristic to monitor. The resulting comprehensive ICF Core Sets consequently contain categories that are either prone to change, or are impaired in more than 5% of the cases, or both. Including patients’ expressed goals for rehabilitation is another validation criterion, and serves to indicate categories that should not be omitted from consideration.

To propose valid candidates for ICF Core Sets that are relatively briefer and thus more practical tools, we used the second criterion, discrimination. We included categories indicating the initial (admission) and the final (discharge) status of functioning so as to apprehend those categories accounting for disability at the beginning and conclusion of rehabilitation. By using both initial and final status and by considering the perspectives both of patients and health professionals we tried to minimize bias.

By restricting the number of categories for the brief ICF Core Sets we made a concession to the third criterion, i.e. feasibility. We are well aware that one or the other relevant aspect may then be missing from the brief ICF Core Sets. However, since comprehensive ICF Core Sets are already available, they might serve as default tools for a more comprehensive assessment.

Selecting categories by 3 empirical criteria, truth, discrimination and feasibility, however, also has several limitations. First, it is important to recall that the ICF was first developed as a reference classification and not as a tool for assessment. Thus, any direct application of the ICF categories in a clinical context may be called into question. There is, however, limited evidence that ICF categories can in fact be used reliably for assessment in the hands of experienced health professionals (24). Secondly, the process of selecting categories is data driven. The frequency of any given symptom or problem is therefore dependent on the choice of the sample, and is thus subject to selection bias. We contend that a sufficiently representative sample was studied, recruited from 13 institutions, such that selection bias was minimized. Thirdly, discriminative validity also depends on the sample, such that regression models can deliver highly unstable results that should undergo further validation in a different independent sample or by split sample techniques, such as cross-validation. By using several outcomes and two different regression techniques, both of which are inherently more stable than conventional linear regression, we hope to have stabilized results. Nonetheless, any selection has limitations. Specifically, scale building techniques such as Rasch analysis can serve to assure that the categories represent the whole spectrum of functioning. Further attempts to validate the brief ICF Core Sets in different samples are in progress.

We present here an algorithm to identify candidate categories for brief ICF Core Sets extracted from the comprehensive acute and post-acute ICF Core Sets. The algorithm furthermore validates the ICF Core Set categories for implementation in a clinical context. Appropriate selection and validation processes will ultimately result in the formulation of sets of categories that are useful for health professionals in acute and post-acute situations.


The authors thank Dr Paul Cumming for critical reading of the manuscript. The project was supported by the German Ministry of Health and Social Security (BMGS) grant no. 124-43164-1/501 and by the LMUinnovativ project Münchner Zentrum für Gesundheitswissenschaften (TP 1).


