Ngoc Quan Phan1, Christine Blome2, Fleur Fritz3, Joachim Gerss4, Adam Reich5, Toshi Ebata6, Matthias Augustin2, Jacek C. Szepietowski5 and Sonja Ständer1
1Clinical Neurodermatology, Department of Dermatology and Competence Center Chronic Pruritus, 3Department of Medical Informatics, 4Institute of Biostatistics and Clinical Research, University Hospital Münster, 2Center for Dermatological Research (CeDeF), Health Economics and QoL Research Group, German Center for Health Services Research in Dermatology (CVderm), University Hospital Eppendorf, Germany, 5Department of Dermatology, Venereology and Allergology, Wroclaw Medical University, Poland, and 6Department of Dermatology, The Jikei University School of Medicine, Tokyo, Japan
The most commonly used tool for self-report of pruritus intensity is the visual analogue scale (VAS). Similar tools are the numerical rating scale (NRS) and verbal rating scale (VRS). In the present study (initiated by the International Forum for the Study of Itch) assessing the reliability of these tools, 471 randomly selected patients with chronic itch (200 males, 271 females, mean age 58.44 years) recorded their pruritus intensity on VAS (100-mm line), NRS (0–10) and VRS (four-point) scales. Re-test reliability was analysed in a subgroup of 250 patients after one hour. Statistical analysis showed a high reliability and concurrent validity (r>0.8; p<0.01) for all tools. Mean values of all scales showed a high correlation. In conclusion, high reliability and concurrent validity were found for VAS, NRS and VRS. On re-test, higher correlations and fewer missing values were observed. A training session before starting a clinical trial is recommended.
Key words: itch; measurement tools; clinical trial; International Forum for the Study of Itch; concurrent validity.
(Accepted August 3, 2011.)
Acta Derm Venereol 2011; 91: XX–XX.
Sonja Ständer, Competence Center Chronic Pruritus, Department of Dermatology, University Hospital Münster, Von-Esmarch-Str. 58, DE-48149 Münster, Germany. E-mail: sonja.staender@uni-muenster.de
Chronic pruritus is a frequent symptom, with a prevalence of approximately 17% in adults, and occurs in dermatological, systemic, neurological and psychiatric diseases (1). In recent years, new findings in the neurobiology of pruritus have enabled the development of new therapies, leading to a growing number of clinical trials worldwide (2). To date, there is no clear definition or straightforward recommendation of measurement tools for the study of pruritus. Although it is still difficult to objectively assess all the attributes of pruritus, obtaining information on the intensity, severity and course of pruritus in a consistent way is essential for the baseline assessment of the symptom, evaluation of treatment efficacy and comparability of studies. Although various methods have been described to evaluate pruritus (Table I), validation of these instruments in chronic pruritus is still pending. The International Forum for the Study of Itch (IFSI) established a special interest group (SIG) for the evaluation and harmonization of measurement tools for clinical trials (www.itchforum.net). In this first study, the aim was to investigate the reliability and validity (criterion, concurrent and construct validity) and the internal consistency (Cronbach’s alpha) of three pruritus intensity scales, namely the visual analogue scale (VAS), numerical rating scale (NRS) and verbal rating scale (VRS), in patients with chronic pruritus.
Table I. Referenced assessment of pruritus: patient self-reporting (scales and questionnaires) and scratching measurement tools
| Category | Name | Author |
| Scales of pruritus intensity | | |
| Multidimensional scale | Pruritus grading system; 5-D Pruritus Scale; Itch Severity Scale | Szepietowski & Schwartz (18); Elman et al. (19); Majeski et al. (20) |
| Unidimensional scales | Visual analogue scale (VAS); Numeric rating scale (NRS); Verbal rating scale (VRS) | Wahlgren (21), Reich et al. (17), Phan & Ständer (22); Jenkins et al. (23); Wahlgren et al. (24, 25); Jenkins et al. (23) |
| Questionnaires | | |
| Pruritus questionnaires (PQ) | Eppendorf PQ; The short-form of McGill PQ; Heidelberg PQ; NeuroDerm PQ | Darsow et al. (26); Yosipovitch et al. (27); Weisshaar et al. (28); Ständer (29) |
| Quality of life | DLQI; Itchy-QoL | Finlay & Khan (30); Desai et al. (31) |
| Anxiety, depression | HADS | Zigmond & Snaith (32) |
| Patients’ needs | Patient benefit index – Pruritus (PBI-p) | Blome et al. (33) |
| Measurement of scratching | | |
| Observation of excoriations and lichenifications | (scratch symptom score, under development) | Ständer, Augustin (unpublished) |
| Movement measurement; Wrist movement; Forehand movement; Scratch movements of the hand; Limb movement; Whole body movement; Fingernail vibration transducer | Accelerometer; Actigraphy; DigiTrac; ActiTrac; Electromyogram; Paper gauge; Pressure sensor; Scratch radar; Movement sensors; Movement sensors; Infrared video recording; Piezo film technology; Pruritometer 2 (Piezo sensor) | Benjamin et al. (34); Bringhurst et al. (35); Hon et al. (36); Ebata et al. (37); Savin et al. (38); Aoki et al. (39); Endo et al. (40); Mustakallio (41); Summerfield & Welch (42), Felix & Shuster (43); Felix & Shuster (43); Ebata et al. (44); Talbot et al. (45), Molenaar et al. (46); Bijak et al. (47) |
| Measurement of itch using technical devices | | |
| Perceptual matching; Assessment reminder | Symtrack | Stener-Victorin et al. (48); Hägermark & Wahlgren (49) |
PATIENTS AND METHODS
Over a period of 7 months, a consecutive cohort of 471 randomly selected patients (200 males, 271 females; age range 16–92 years; mean age ± standard deviation (SD) 58.44 ± 15.68 years) with chronic pruritus (> 6 weeks) of any origin was included in the study. According to the classification of the IFSI (3), patients were grouped by the clinical appearance of the skin as follows: pruritus on non-inflamed skin (n = 272); pruritus on inflamed skin (n = 83); pruritus with chronic scratch lesions (n = 116).
Patients were asked to record their current pruritus intensity (over the last 24 h) on a questionnaire containing a VAS (horizontal 100-mm line), an NRS from 0 to 10 and a four-point VRS (Fig. 1) at visit 1 (V1). For test-retest reliability, 250 of the 471 patients (102 males, 148 females; age range 16–91 years; mean age ± SD 55.95 ± 16.68 years) recorded their pruritus intensity again one hour later on a questionnaire presenting the scales in a different order (V2). Fifty-two of these 250 patients (25 males, 27 females; age range 24–88 years; mean age ± SD 59.08 ± 14.61 years) completed the pruritus intensity scales again after 3–8 weeks (V3). A scale that a patient did not complete was defined as a “missing value”.
The study was approved by the local ethics committee of the University of Münster. Patients gave written informed consent for clinical data collection and analysis.
Assessment scales
The VAS, first developed in 1921 by Hayes & Patterson (4), is commonly used to measure, for example, panic, depression, fatigue and pain (4–8). It is also the most commonly used tool for assessing the intensity of pruritus; for example, a VAS is part of the SCORAD (SCORing Atopic Dermatitis) index in atopic dermatitis (9). The VAS is a graphic tool consisting of a 100-mm horizontal line with the left end marked “no symptom” and the right end marked “worst imaginable symptom” (Fig. 1). The patient is asked to draw a vertical mark on the line at the point that corresponds to the intensity of the symptom. The distance from the left end to the patient’s mark is measured in millimetres. Separation in one-hundredths is regarded as sufficiently sensitive (10). The NRS is a similar tool and has also been validated for the measurement of pain (8). Patients are asked to assign a numerical score representing the intensity of their symptoms on a scale from 0 to 10, with 0 representing no symptoms and 10 the worst imaginable symptoms. The VRS consists of a list of adjectives describing different, usually four, levels of symptom intensity, e.g. 0 = none, 1 = mild, 2 = moderate and 3 = severe/intense (8).
Fig. 1. Assessment scales: visual analogue scale (VAS), numerical rating scale (NRS) and verbal rating scale (VRS).
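As an illustration of how ratings on the three scales could be recorded and checked, the following minimal Python sketch stores one patient's answers for one visit. The class name, field names and the rescaling of the VAS from millimetres to a 0–10 score are assumptions made for this example and are not part of the study protocol.

```python
from dataclasses import dataclass
from typing import Optional

# VRS categories as listed in the text (0 = none ... 3 = severe/intense).
VRS_LABELS = {0: "none", 1: "mild", 2: "moderate", 3: "severe/intense"}


@dataclass
class PruritusRating:
    """One patient's ratings at one visit; None marks a scale left blank."""

    vas_mm: Optional[float] = None  # distance from the left end of the 100-mm line
    nrs: Optional[int] = None       # integer from 0 to 10
    vrs: Optional[int] = None       # category from 0 to 3

    def __post_init__(self) -> None:
        if self.vas_mm is not None and not 0 <= self.vas_mm <= 100:
            raise ValueError("VAS must lie between 0 and 100 mm")
        if self.nrs is not None and self.nrs not in range(11):
            raise ValueError("NRS must be an integer from 0 to 10")
        if self.vrs is not None and self.vrs not in VRS_LABELS:
            raise ValueError("VRS must be one of 0, 1, 2, 3")

    def vas_score(self) -> Optional[float]:
        """VAS rescaled to 0-10 (hypothetical helper, assumed for comparison with the NRS)."""
        return None if self.vas_mm is None else self.vas_mm / 10.0

    def missing_scales(self) -> list:
        """Names of the scales the patient did not complete."""
        values = {"VAS": self.vas_mm, "NRS": self.nrs, "VRS": self.vrs}
        return [name for name, value in values.items() if value is None]
```

A record such as `PruritusRating(nrs=3)` would then report the VAS and VRS as missing values.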
Statistical analysis
SPSS 18.0 was used for statistical analysis of data.
Criterion validity. Concurrent validity, a form of criterion validity, measures how well a scale correlates with other (ideally gold standard) measures of the same variable (11, 12). As no gold standard was available for this analysis, correlation coefficients were estimated between the three instruments used to measure pruritus intensity. Inspection of QQ-plots and histograms showed that none of the VAS, NRS or VRS data were normally distributed. Therefore, Spearman’s correlation coefficients were estimated between all three instruments. In addition, Cronbach’s alpha was determined.
Construct validity. The extent to which a particular measure performs in accordance with theoretical expectations is known as construct validity (12, 13). It can be expected that the scores of VAS, NRS and VRS all increase with pruritus intensity, and that this holds similarly in different subgroups of patients with chronic pruritus. We therefore expected similar correlations, and scores that increase with pruritus intensity, in the 3 clinical groups defined by the IFSI (3). Spearman’s correlation coefficients were estimated between all 3 instruments. In addition, Cronbach’s alpha was determined.
Re-test reliability. If a patient’s status with respect to the parameter being measured does not change between two time-points, measurements taken at these times should be the same, or very similar. Given that pruritus intensity varies over the day and is influenced by factors such as mood, treatment and activity level, we chose to estimate re-test reliability one hour after the first assessment.
Because the VRS is an ordinal scale whereas the VAS and NRS are metric scales, intraclass correlation coefficients (ICC) were determined for the reliability of all three scales after one hour and, for reasons of comparability, the kappa coefficient was determined in addition for the VRS. In general, test-retest reliability coefficients above 0.9 are considered high, and those between 0.7 and 0.8 acceptable for research tools (12).
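The two statistics named above can be sketched in plain Python as follows. This is only an illustration of the standard definitions (Spearman's coefficient as the Pearson correlation of tie-averaged ranks, and Cronbach's alpha from item and total-score variances); it is not the SPSS procedure used in the study, the function names are chosen for this example, and patients with a missing value on either scale are assumed to have been removed beforehand.

```python
from statistics import mean, variance


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for position in range(start, end + 1):
            ranks[order[position]] = tied_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cronbach_alpha(*items):
    """Cronbach's alpha for k item columns rated by the same patients."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))
```

Because Spearman's coefficient uses only ranks, it is unaffected by the different ranges of the scales (0–10 versus 0–3), whereas Cronbach's alpha computed on raw scores depends on the item variances.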
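The text does not state which form of the ICC was used. Purely as an illustration, the sketch below computes the one-way random-effects ICC for two ratings per patient (often written ICC(1,1)); the choice of this form and the function name are assumptions of the example.

```python
def icc_oneway(first, second):
    """One-way random-effects ICC(1,1) for two ratings per patient."""
    n, k = len(first), 2
    means = [(a + b) / k for a, b in zip(first, second)]
    grand_mean = sum(means) / n
    # Between-patient mean square: spread of patient means around the grand mean.
    ms_between = k * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    # Within-patient mean square: spread of the two ratings around each patient mean.
    ss_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for a, b, m in zip(first, second, means)
    )
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

With identical ratings at both time-points the within-patient mean square is zero and the coefficient equals 1; any change between the two assessments lowers it.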
RESULTS
Missing values
Most patients completed the NRS and VRS (Table II). The highest number of missing values was observed for the VAS. At V1, 12.5% of the 471 patients did not record their pruritus intensity on the VAS, 4.2% on the NRS and 7.2% on the VRS. On repeat assessment, fewer missing values were observed at V2 and V3. Age-dependent analysis of missing values at V1 in patients < 60 years compared with patients ≥ 60 years showed nearly twice as many missing values in the VAS and NRS assessments in elderly patients as in patients under the age of 60 years (Table II). Interestingly, the VRS showed a lower number of missing values in the elderly population.
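The age comparison for the VAS can be reproduced from the counts given in Table II (20 of 229 patients under 60 years and 39 of 242 patients aged 60 years or older). The short Python sketch below is only a worked calculation; the function and variable names are chosen for this example.

```python
def missing_percentage(missing, total):
    """Share of patients who left a scale blank, in percent (one decimal)."""
    return round(100 * missing / total, 1)


# (patients with a missing VAS value, patients in the age group) at V1, from Table II.
vas_missing = {"< 60 years": (20, 229), ">= 60 years": (39, 242)}

rates = {group: missing_percentage(m, n) for group, (m, n) in vas_missing.items()}
ratio = rates[">= 60 years"] / rates["< 60 years"]
overall = missing_percentage(20 + 39, 229 + 242)

print(rates, round(ratio, 2), overall)
```

The two group rates are 8.7% and 16.1%, a ratio of about 1.85, and pooling both groups gives the overall VAS rate of 12.5% reported for V1.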
Assessment of pruritus intensity using VAS, NRS and VRS
Of the 471 randomly selected patients with chronic pruritus, 36 (7.6%) reported currently having no pruritus on the VRS (“0”), which corresponded to a mean VAS value of 0.18 points and a mean NRS value of 0.10 points (Fig. 2). A total of 189 patients (40.1%) reported having low intensity (“1”) pruritus (mean VAS/mean NRS: 1.90/2.28), 174 patients (37.0%) moderate (“2”) pruritus (mean VAS/mean NRS: 5.12/5.52), and 38 patients (8.1%) severe (“3”) pruritus (mean VAS/mean NRS: 8.57/8.93), while 34 patients (7.2%) did not complete the VRS (Table II). NRS values were slightly higher than VAS values (Fig. 2). Comparison of VAS and NRS with VRS showed a high correlation, with similar mean values of VAS and NRS. Comparison of pruritus ratings according to gender and age (patients < 60 years vs. ≥ 60 years) showed no significant difference either between men and women (VAS, p = 0.340; VRS, p = 0.496; NRS, p = 0.841) or between older (≥ 60 years) and younger (< 60 years) patients (VAS, p = 0.934; VRS, p = 0.201; NRS, p = 0.335).
Table II. Missing values: percentage of patients with chronic pruritus who did not complete visual analogue scale (VAS), numerical rating scale (NRS) or verbal rating scale (VRS)
| n | Visit | Missing values (%), VAS | Missing values (%), NRS | Missing values (%), VRS |
| 471 | V1 | 12.5 | 4.2 | 7.2 |
| | < 60 years | 20/229, 8.7 | 12/229, 5.2 | 11/229, 4.8 |
| | ≥ 60 years | 39/242, 16.1 | 22/242, 9.1 | 9/242, 3.7 |
| 250 | V1 | 13.6 | 4.0 | 7.6 |
| | V2 | 8.0 | 2.4 | 5.2 |
| 52 | V1 | 17.3 | 5.8 | 5.8 |
| | V2 | 9.6 | 7.7 | 11.5 |
| | V3 | 13.5 | 0.0 | 1.9 |
Mean VRS values were almost identical in the three clinical groups (Fig. 3). Interestingly, NRS and VAS values were slightly higher in patients with pruritus on inflamed skin (i.e. dermatoses) than in the two other groups. In patients with pruritus on non-inflamed skin and pruritus with chronic scratch lesions, NRS values were slightly higher than VAS values, as observed also in the analysis of the total cohort (Fig. 2); the opposite was the case in patients with pruritus on inflamed skin (i.e. dermatoses).
Fig. 2. Correlation of verbal rating scale (VRS) with mean numerical rating scale (NRS) and visual analogue scale (VAS) (all patients n = 471; V1).
Fig. 3. Assessment of pruritus intensity in different clinical groups of chronic pruritus at V1: pruritus on non-inflamed skin (normal skin); pruritus on inflamed skin (dermatoses) and pruritus with chronic scratch lesions (e.g. prurigo nodularis).
Concurrent validity
Spearman’s correlation coefficients between VAS, NRS and VRS were high and statistically significant. In particular, VAS and NRS were highly correlated (r > 0.8; p < 0.01) at each visit (V1–V3). Correlations were higher on repeat assessment (Table III). Cronbach’s alpha was likewise high for the VAS–NRS pair (α = 0.899–0.980), but lower for the pairs including the VRS (α = 0.411–0.655) (Table III).
Table III. Concurrent validity: Spearman’s correlation coefficients and Cronbach’s α between visual analogue scale (VAS), numerical rating scale (NRS) and verbal rating scale (VRS)
| n | Visit | VAS–NRS, Spearman’s correlation coefficient | VAS–NRS, Cronbach’s α | VAS–VRS, Spearman’s correlation coefficient | VAS–VRS, Cronbach’s α | NRS–VRS, Spearman’s correlation coefficient | NRS–VRS, Cronbach’s α |
| 471 | V1 | 0.865* | 0.935 | 0.752* | 0.541 | 0.847* | 0.604 |
| 250 | V1 | 0.827* | 0.899 | 0.699* | 0.481 | 0.809* | 0.571 |
| | V2 | 0.884* | 0.936 | 0.811* | 0.584 | 0.837* | 0.615 |
| 52 | V1 | 0.829* | 0.920 | 0.644* | 0.411 | 0.732* | 0.487 |
| | V2 | 0.892* | 0.945 | 0.819* | 0.538 | 0.768* | 0.515 |
| | V3 | 0.960* | 0.980 | 0.854* | 0.624 | 0.888* | 0.655 |
*p < 0.01.
Re-test reliability
The test-retest coefficients over the one-hour interval were high, ranging from 0.74 to 0.80. The NRS showed the best reliability, with an intraclass correlation coefficient (ICC) of 0.801. The ICC of the VAS was 0.749 and that of the VRS 0.740. We also calculated the kappa coefficient of agreement for the ordinal-scaled VRS, which was 0.643.
Nevertheless, correlation of the scales and their values after one hour was not very high (r < 0.900), possibly due to slightly different evaluation/rating after reflecting on pruritus intensity.
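The text does not specify which kappa statistic was calculated for the VRS. As a hedged illustration, the sketch below computes the unweighted Cohen's kappa for two ratings per patient, that is, the observed agreement corrected for the agreement expected by chance; the choice of the unweighted form and the function name are assumptions of this example.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two categorical ratings per patient."""
    n = len(first)
    # Proportion of patients choosing the same category at both time-points.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal category frequencies of each assessment.
    freq_first, freq_second = Counter(first), Counter(second)
    expected = sum(freq_first[c] * freq_second[c] for c in freq_first) / (n * n)
    return (observed - expected) / (1 - expected)
```

Unlike the ICC, the unweighted kappa counts a change from "mild" to "severe" exactly as it counts a change from "mild" to "moderate", which is one reason the two coefficients can differ for the same data.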
DISCUSSION
Pruritus is a subjective symptom with multiple dimensions that cannot be measured objectively to date. Also, scratch lesions cannot serve as a mirror for pruritus severity, since a broad inter-individual variability can be observed. Therefore, the best option is to let the patient report the symptoms, for example pruritus intensity, as he or she evaluates them. In our study, a total of 471 chronic pruritus patients were asked to record their pruritus intensity on the VAS, NRS and VRS. Statistical analysis showed a high reliability and concurrent validity (r > 0.8; p < 0.01) for all tools. Mean values of all scales showed a high correlation. Low pruritus (VRS = 1) was equivalent to a mean VAS value of 1.9 and a mean NRS value of 2.3; moderate pruritus (VRS = 2) was equivalent to a mean VAS value of 5.1 and a mean NRS value of 5.5; severe pruritus (VRS = 3) was equivalent to a mean VAS value of 8.6 and a mean NRS value of 8.9. These data show a high discrimination sensitivity of VAS and NRS values. However, a tendency towards the middle of the VAS and NRS scales can be observed in the category moderate pruritus (VAS/NRS values of around 5). This tendency is frequently observed in daily routine and hampers interpretation of the pruritus intensity. In our study, all patients were Caucasians. It is speculated that other ethnic groups experience other itch intensities (e.g. lower itch ratings in Japanese patients compared with Caucasians; Reich A et al. unpublished observation; 14). A comparative study concerning the various intensity scales between different ethnic groups is pending. Moreover, in our study, we did not observe differences in monitoring itch intensity related to age, gender or clinical patient group, except for the observation that men tend to rate itch intensity slightly higher than women.
Patients assessed the different scales repeatedly. A high reproducibility of the scales with consistent values after a short interval of assessment is desirable. This can be tested if patients complete the scales twice within a short period of time (re-test reliability). Re-test reliability testing was performed in 250 patients. VAS, NRS and VRS were repeated one hour after the first assessment. The intraclass correlation coefficient for the three scales varied between 0.740 and 0.801. In acute pain studies, a correlation coefficient between 0.97 and 0.99 was achieved, showing a high reliability (15). The authors scored pain at an interval of one minute instead of one hour, possibly explaining the higher correlation coefficient (15). In general, pruritus intensity can be influenced by a variety of external and internal factors, such as worsening by stress or weather. Changes may occur quickly, explaining variations even within one hour. Given that in our study 24.4% of patients with chronic pruritus were over 70 years of age, cognitive impairment cannot be ruled out in individual cases. However, we found a high correlation between the first and second assessment in all scales. The sensitivity of VAS, NRS and VRS to detect clinically relevant changes and the minimal clinically important difference (MCID) in pruritus intensity, either worsening by bothersome factors or improvement by therapies, has not been investigated and no conclusions can be drawn on this issue from this study. The assumption behind the use of VAS is that it is possible to grade a phenomenon on a linear scale from one extreme to another. However, it has to be assumed that the VAS is not linear but exponential. In a study investigating rheumatic pain, comparison of the VAS scores with improvement in quality of life demonstrated that a reduction of even one VAS level was of benefit (16).
Studies investigating the MCID of the different scales in patients with chronic pruritus are currently being performed. In pain studies, VAS is ascribed high sensitivity, but VRS is thought to have too few categories to measure small changes (15). Reich et al. (17) therefore introduced one more category for pruritus assessment with VRS and could demonstrate that the cut-off levels of VAS and NRS correspond very well with the new VRS.
A total of 52 patients completed all scales at three time-points, so that missing values, i.e. the number of scales that were not completed by patients, could be investigated. Interestingly, VAS showed the highest number of missing values at all time-points (8.0–17.3% of patients), depending on the age of patients (under 60 years of age: 8.7%; over 60 years of age: 16.1%). The missing values were lowest in NRS. Also, a decrease in missing values at visits 2 and 3 could be observed, probably due to a learning effect with repeated completion of the tools. This is of high relevance for clinical trials. Missing values seem to occur because the scales are complex and not self-explanatory, and because patients are unfamiliar with these tools. In particular, a VAS presented only as a line without a landmark can be misunderstood. It seems that rating a subjective sensation on a line or as a number is a more complex process, especially in elderly people. Explanation of the diary and a training session before the start of the study are recommended to increase data integrity.
In conclusion, high reliability and concurrent validity in pruritus intensity assessment were shown not only by VAS, a traditional and widely used instrument, but also by VRS and NRS. Discrimination of pruritus intensity by VAS is more sensitive than by NRS or VRS. In data evaluation, physicians have to be aware of confounding factors, such as the tendency of patients to rate the middle of the scales and patients’ unfamiliarity with the tools provided. In particular, VAS showed a high rate of missing values, so that data integrity in clinical studies must be carefully checked. After repeat assessment, there were fewer missing values. We therefore also recommend using a combination of different scales to evaluate pruritus intensity, and a training session on using the VAS before starting a clinical trial. The sensitivity and required change for the tools being used remain to be investigated. However, these tools can be recommended for use in clinical trials and daily routine to assess the course of pruritus intensity.
ACKNOWLEDGEMENTS
We thank Rajam Csordas-Iyer for assistance in preparation of the manuscript.
The authors declare no conflicts of interest.
REFERENCES