Tuesday, May 26, 2009

The Measurement of Pain and the Assessment of People Experiencing Pain (Turk & Melzack, 2001)

Turk, D. C., & Melzack, R. (2001). The measurement of Pain and the Assessment of People Experiencing pain. In D. C. Turk & R. Melzack (Eds.), Handbook of Pain Assessment (pp. 3-11). New York: The Guilford Press.

"Just as 'my pain' belongs in a unique way only to me, so I am utterly alone with it. I cannot share it. I have no doubt about the reality of the pain experience, but I cannot tell anybody what I experience. I surmise that others have 'their' pain, even though I cannot perceive what they mean when they tell me about them." (Illich, 1976 cited in D.C. Turk & R. Melzack, 2001)

The associations between reported pain and physical abnormality are fairly weak. Studies have found that some patients with significant pathology experience little or no pain (e.g., Boden, Davis, Dina, Patronas & Wiesel, 1990; Jensen, Brant-Zawadski, Wiesel, Tsourmas, Malkasian, & Ross, 1994), while other patients suffer a disproportionate amount of pain despite limited identified pathology (e.g., White & Gordon, 1982). In many patients, the physical pathology underlying their pain cannot be identified at all.

Some experts have suggested that the results of lab tests and imaging techniques should be used as the basis of pain assessment. Yet even with techniques as sophisticated as MRI scans, research studies continue to find weak associations between physical pathology and pain.

In addition, weak associations have also been reported between the degree of physical disability and functionality, return to work, and treatment outcomes. It has been suggested that the observed weak correlation between "pathology, symptoms and outcome" might have something to do with the reliability of the examination procedure.

The weak correlations between pathology and pain led experts to consider whether personal factors, such as a "pain-prone personality" or "psychogenic pain", might have contributed to such a weak association. Existing research provides little evidence for this argument, although this area of research may still hold promise.

The differences among nociception, pain, pain behavior and suffering

  • Nociception: the processing of stimuli that are defined as related to the stimulation of nociceptors and capable of being interpreted as pain.
  • Pain: A perceptual process involving multiple perspectives.
  • Suffering: the interpretation and subsequent response to the perception of pain.

An interesting study by Reesor and Craig (1988): cognitive processes seem to distort or amplify patients' pain experiences and suffering. Unfortunately, since I have not read this paper, I can't yet tell you how they reached this conclusion about "distortion" and "amplification". One naive question I have now is: how do you establish a baseline to make such an argument?

It is important to realize that disability is not solely a function of the extent of physical pathology or the level of pain. Disability is a complex phenomenon shaped by multiple factors, such as physical pathology and environmental factors. (Check the disability-related articles for more information.)

Like the instruments used to assess functional disability, there is currently no single standard measure of pain, and there are too many competing instruments. These instruments have often been developed for a particular group of people or a particular diagnosis. In addition, even when an instrument has been validated for a certain population, researchers and practitioners often select items of interest from existing instruments to build their own assessment. This practice creates problems with scale validity and reliability, and it also makes it difficult to compare results across studies.

Friday, May 15, 2009

Dimensionality and hierarchical structure of disability measurement. (van Boxel, Roest, Bergen, & Stam, 1995)

van Boxel, Y. J., Roest, F. H., Bergen, M. P., & Stam, H. J. (1995). Dimensionality and hierarchical structure of disability measurement. Archives of Physical Medicine and Rehabilitation, 76(12), 1152-1155.

Institute of Physical Medicine and Rehabilitation, Erasmus University Rotterdam, The Netherlands.

Since the D-code of the International Classification of Impairments, Disabilities, and Handicaps (ICIDH) in its full form has proven to be impractical, an instrument based on a selection of 28 items is used to measure disability in Dutch patients undergoing rehabilitation. The items are categorized into 5 domains of physical, activities of daily living (ADL), social, psychological, and communicative activity. Measurement is made on a 4-point scale ranging from 0 (not disabled) to 3 (severely disabled). As a result of the ordinal character of the rating, statistical and mathematical manipulations of the scores are complicated. The aim of this study was to obtain more insight in the dimensionality and hierarchical structure of the items, to overcome problems in comparing disability between items, between patients, and within patients between different moments in time. Mokken scale analysis of the disability scores from 1,967 rehabilitation inpatients showed that the 28 items constitute hierarchical scales. However, categorization of the items into the 5 original domains was not replicated. Five other scales or dimensions were investigated, measuring the level of extended ADL, extended psychological, fine motoric, work/leisure, and hearing/seeing activity, respectively. The number of items per dimension ranges from 14 in the extended ADL dimension to 2 each in the work/leisure and hearing/seeing dimensions. Although each disability item may be of importance in clinical case management, a reduced set of extended ADL items suffices to describe the disability level in this dimension for epidemiological research purposes. The other dimensions need further specification to provide reliable and sensitive measuring of disability.




  • The authors thought that the concept of disability should have breadth and needs to incorporate indicators from multiple domains. For instance, in this study, the authors included 28 items from ICIDH (International Classification of Impairments, Disabilities, and Handicaps); these items fall into 5 domains including physical activity, activity in daily life, social activity, psychological activity and communication.
  • This instrument uses a 4-point ordinal scale to rate an individual's disability level, with 0 indicating not disabled and 3 indicating severely disabled.
  • Dimensionality is defined as the extent to which scale items can be combined to provide information on the same dimension. (Wright & Linacre, 1989)

    1. Wright BD, Linacre JM. Observations are always ordinal; measurements, however, must be interval. Arch Phys Med Rehabil 1989;70:857-60.

Physical activity

  • Transfer lying-sitting
  • Transfer sitting
  • Walking indoors
  • Walking outdoors
  • Climbing stairs
  • Reaching
  • Manipulating
  • Endurance
  • Bending
  • Lifting

Daily life activity (ADL)

  • Feeding
  • Using lavatory
  • Bathing
  • Clothing

Psychological activity

  • Orientation
  • Memory
  • Attention
  • Behavior
  • Mood
  • Learning abilities


Communicative activity

  • Understanding speech
  • Talking
  • Hearing
  • Seeing
  • Writing

Social activity

  1. Transport
  2. Housing*
  3. Employment
  4. Family role
  5. Recreation


  • Dutch inpatients during hospitalization
  • The authors used a computer program called MSP to conduct Mokken scale analysis for polychotomous items.
  • Given the ordinal nature of the ratings, Spearman's rank correlation coefficient (Spearman's rho) was used to assess the reliability of the scales (SPSS).
  • The results of this study failed to reconstruct the original 5-factor structure. The new categorization consists of 5 new subdomains: extended ADL, extended psychological activities, fine motoric activities, work/leisure activities, and hearing/seeing activities.
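
To make the idea of a "hierarchical scale" more concrete, here is a minimal sketch of Loevinger's H, the scalability coefficient that Mokken scale analysis is built on. It is not the MSP program and it only handles dichotomous (0/1) scores, whereas the study analyzed 4-point items; the function name and the toy data are my own assumptions. A commonly quoted rule of thumb treats H of at least 0.3 as the minimum for a usable scale.

```python
from itertools import combinations


def loevinger_h(data):
    """Scale-level Loevinger's H for dichotomous (0/1) item scores.

    data: list of respondents, each a list of 0/1 scores (one per item).
    H = sum of observed inter-item covariances divided by the sum of the
    largest covariances the item marginals would allow.
    Assumes no item is answered identically by everyone.
    """
    n = len(data)
    k = len(data[0])
    p = [sum(row[i] for row in data) / n for i in range(k)]
    cov_sum = 0.0
    cov_max_sum = 0.0
    for i, j in combinations(range(k), 2):
        p_ij = sum(row[i] * row[j] for row in data) / n
        cov_sum += p_ij - p[i] * p[j]
        # Largest possible joint proportion is min(p_i, p_j).
        cov_max_sum += min(p[i], p[j]) - p[i] * p[j]
    return cov_sum / cov_max_sum


# Hypothetical toy data: a perfect Guttman pattern gives H = 1.
perfect = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
# Two unrelated items give H = 0.
unrelated = [[0, 0], [0, 1], [1, 0], [1, 1]]
```

H near 1 means respondents who are disabled on a "hard" item are almost always disabled on the "easier" ones too, which is what a hierarchical ordering of items requires.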

Archives of Physical Medicine and Rehabilitation

Articles in some back issues of the journal "Archives of Physical Medicine and Rehabilitation" are free.


Combining Activities of Daily Living with Instrumental Activities of Daily Living to Measure Functional Disability (Spector & Fleishman, 1998)

Spector, W. D., & Fleishman, J. A. (1998). Combining Activities of Daily Living with Instrumental Activities of Daily Living to Measure Functional Disability. Journal of Gerontology: Social Sciences, 53B(1), S46-S47.

Nagi's framework: differentiating between

  1. Impairments: e.g., bowel and bladder incontinence (Jagger, Clarke, & Davies, 1986; Linacre et al., 1994)
  2. Functional limitations: e.g., walking up stairs, bending, and reaching (Clark et al., 1997; Haley, McHorney, & Ware, 1994; Jette, 1980; Linacre et al., 1994; Wolinsky & Johnson, 1991)
  3. Disability: focusing only on a core set of ADL and IADL activities (Clark et al., 1997; Fitzgerald, Smith, Martin, Freedman, & Wolinsky, 1993).

Three approaches to ADL and IADL measures:

  1. Treating ADL and IADL as distinct concepts and maintaining separate ADL and IADL measures (e.g., Jette et al., Stern, 1995; Sloane, Hoerger, & Picone, 1996)
  2. Including information on IADL only for those without ADL disabilities (e.g., Altman & Walden, 1993; Lagorge, Spector, and Sternberg, 1992; Tennstedt, Crawford, & McKinlay, 1993; Spector & Kemper, 1994)
  3. Combining ADL and IADL into one measure

The issues to be addressed when considering whether it is legitimate to combine ADL and IADL

  1. Unidimensionality and local independence
  2. How to combine responses and turn them into a composite score

Research on dimensionality

  1. The multidimensionality of functional disability (Clark, Stump, & Wolinsky, 1997; Fitzgerald et al., 1993; Wolinsky & Johnson, 1991)
  2. The unidimensionality (Kempen & Suurmeijer, 1990; Spector et al., 1987; Suurmeijer et al., 1994)

IRT model: The authors did a good job in providing an introduction for IRT models

  1. Other research using IRT to assess health status measures (Haley, McHorney, & Ware, 1994; Granger et al., 1993; Linacre et al., 1994; Silverstein et al., 1992; Teresi, Cross, & Golden, 1989)
  2. Previous studies included items for ADLs, IADLs, impairments, and functional limitations together. This study looks only at ADLs and IADLs.
  3. Θ or theta as the latent dimension of disability
  4. ICCs, or item characteristic curves, show the probability of a positive response as a function of theta
  5. The advantage of IRT over CTT

    1. IRT item parameters are invariant to the population distribution of the trait being measured
    2. When analyzing dichotomous items, nonlinear models such as IRT models are preferable.
    3. IRT provides a reliability estimate for each of the items
    4. IRT provides a framework to assess item bias
  6. IRT model and the parameters

    1. Beta: difficulty and location
    2. Alpha: discriminating ability
  7. Existing studies using 1 PL without double-checking its appropriateness (Haley, McHorney, & Ware, 1994; Heinemann et al., 1993)
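
The parameters in the outline above can be tied together with the standard logistic form of the item characteristic curve. This is a minimal sketch rather than anything taken from the paper: the function names are my own, and fixing the 1 PL slope at 1 is an assumption (a program may instead estimate one common slope).

```python
import math


def icc_2pl(theta, alpha, beta):
    """Probability of a positive (disabled) response under the 2 PL model.

    theta: latent disability level; alpha: discrimination (slope);
    beta: location (difficulty), the theta at which the probability is 0.5.
    """
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


def icc_1pl(theta, beta):
    """1 PL: all items share one slope (fixed at 1 here by assumption)."""
    return icc_2pl(theta, 1.0, beta)
```

At theta equal to beta the probability is exactly 0.5, and a larger alpha makes the curve steeper around that point, which is what "discriminating ability" refers to.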


  • Data are from the 1989 National Long-Term Care Survey (NLTCS)

    • Purpose of the survey: to provide a national sample of functionally disabled people
    • To estimate the change in functional disability between 1989 and the previous years
  • Measures of functional disability include Katz's Activities of Daily Living Scale (1963) and the Instrumental Activities of Daily Living Scale by Lawton and Brody (1969)
  • Positive or disabled response:

    • ADL: the receipt of human help to perform task
    • IADL: respondents did not or could not perform the task, and the disability is the result of a health problem.
  • Katz et al (1969): the differentiation between dependence and independence
  • Bilog-MG (Zimowski et al., 1996) is used to perform IRT analyses.

    • The item parameters are estimated using the maximum likelihood procedure
    • The person parameters are estimated using a Bayesian approach, EAP (expectation a posteriori; Bock & Aitkin, 1981). The EAP method can provide estimates for people with all-correct or all-incorrect responses.
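
To see why EAP still works for people with all-zero or all-one response patterns, here is a minimal sketch of the idea: theta is estimated as the mean of the posterior distribution, so the prior keeps the estimate finite even when the likelihood alone would push it to infinity. This is not Bilog-MG's implementation; the evenly spaced grid, the standard normal prior, and the item parameters in the example are all assumptions for illustration.

```python
import math


def eap_estimate(responses, items, n_points=61, lo=-4.0, hi=4.0):
    """EAP estimate of theta for one respondent under the 2 PL model.

    responses: list of 0/1 scores; items: list of (alpha, beta) pairs.
    Uses a standard normal prior evaluated on an evenly spaced grid.
    """
    step = (hi - lo) / (n_points - 1)
    num = 0.0
    den = 0.0
    for q in range(n_points):
        theta = lo + q * step
        weight = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        like = 1.0
        for x, (alpha, beta) in zip(responses, items):
            p = 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))
            like *= p if x == 1 else 1.0 - p
        num += theta * like * weight
        den += like * weight
    return num / den


# Hypothetical item parameters (alpha, beta), not values from the paper.
example_items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
all_zero = eap_estimate([0, 0, 0], example_items)
all_one = eap_estimate([1, 1, 1], example_items)
```

Both extreme patterns get finite estimates (negative for all zeros, positive for all ones), whereas a maximum likelihood estimate would not exist for either of them.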


  • Sample data: age, frequency of positive response
  • Exploratory and confirmatory factor analysis:

    • Tetrachoric correlations
    • The scree method (Rummel, 1970)
    • If the first eigenvalue is large relative to the second, the item set is relatively unidimensional (Lord, 1980)
    • The number of eigenvalues greater than 1
    • Bentler's fit index (1990), with a criterion of .90
    • Root Mean Square Error of Approximation (RMSEA): RMSEA < .05 indicates an acceptable model (1993)
  • IRT Analysis

    • 1 PL and 2 PL
    • Model comparison: the difference in χ² (calculated from the -2 log likelihood) and the ratio of that difference to the χ² for the 1 PL
    • A large sample size makes it easier to obtain significant χ² statistics
    • Criteria to evaluate model fit

      • The meaningfulness of the χ² test
      • The size of the ratio
      • The differences in the magnitude of the slopes
      • The correlation between the estimated theta for 1 PL and 2 PL
      • Adding more parameters adds more complexity
    • Item location Parameter

      • the breadth
      • How well an item differentiates between ability levels within a given range
      • Large gaps identify the points at which the scale is less precise and suggest including new items within that range
    • DIF found in gender
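
The 1 PL versus 2 PL comparison in the notes above comes down to simple arithmetic on the two -2 log likelihood values. The sketch below is my own illustration, not the authors' code: the function name is hypothetical, the example numbers are made up, and the degrees of freedom assume the 1 PL estimates one common slope so that the 2 PL adds one slope per item minus one.

```python
def compare_nested_models(neg2ll_1pl, neg2ll_2pl, n_items):
    """Likelihood-ratio comparison of a 1 PL against a 2 PL model.

    Returns (chi-square difference, degrees of freedom, relative reduction).
    """
    chi2_diff = neg2ll_1pl - neg2ll_2pl
    # Assumption: the 1 PL has one common slope, the 2 PL has n_items slopes.
    df = n_items - 1
    # Share of the 1 PL value that the extra slopes remove.
    ratio = chi2_diff / neg2ll_1pl
    return chi2_diff, df, ratio


# Made-up values for a 16-item scale.
chi2_diff, df, ratio = compare_nested_models(20000.0, 19900.0, 16)
```

With a large sample the difference can be highly significant even when the ratio is tiny, which is why the size of the ratio is listed as a separate criterion.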


  • Unidimensionality
  • 1 PL is sufficiently good
  • People with all zeros: we need to have items that are more sensitive or milder
  • Lawton and Brody (1969) suggest IADLs to be more complicated than ADLs. As a result, the IADLs should have a lower value in the location parameter.
  • Studies showing that ADL and IADL overlap (Spector et al., 1987; Suurmeijer et al., 1994; Kempen, Myers, & Powell, 1995)

Fitzgerald, J. F., Smith, D. M., Martin, D. K., Freedman, J. A., & Wolinsky, F. D. (1993). Replication of the multidimensionality of activities of daily living. Journal of Gerontology, 48(1), S28-S32.

Wolinsky, F. D., & Johnson, R. J. (1991). The Use of Health Services by Older Adults. Journal of Gerontology, 46(6), S345-S357.

Spector and Fleishman (1998) used exploratory and confirmatory factor analyses as well as Item Response Theory to analyze data from the combined ADL/IADL scale, which included 16 items and was collected through the NLTCS project. This study used only data from individuals who reported disability on at least one of the disability indicators. The exploratory and confirmatory factor analyses supported the unidimensionality of the combined scale. In addition, the authors compared the goodness-of-fit of the one-parameter and two-parameter IRT models. The one-parameter model yielded a sufficiently good fit, which was taken as evidence for the feasibility of using a composite score to summarize the ADL/IADL data.
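
For the confirmatory factor analysis criteria noted earlier (a fit index of at least .90 and RMSEA below .05), the following sketch shows how these two indices are commonly computed from model chi-square values. The formulas are the textbook ones, assuming the fit index is Bentler's comparative fit index; some programs use N instead of N - 1 in the RMSEA denominator, and the example numbers used to check it are made up rather than taken from the paper.

```python
import math


def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation from a model chi-square."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0.0 else 1.0 - d_model / d_base
```

For instance, a model with chi-square 204 on 104 degrees of freedom in a sample of 1,001 people gives an RMSEA of about .031, and against a baseline chi-square of 5,120 on 120 degrees of freedom the fit index is .98.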

Tuesday, May 5, 2009

Aggregated measures of functional disability in a nationally representative sample of disabled people: analysis of dimensionality according to gender and severity of disability. (Cabrero-García & López-Pina, 2008)

Cabrero-García, J., & López-Pina, J. A. (2008). Aggregated measures of functional disability in a nationally representative sample of disabled people: analysis of dimensionality according to gender and severity of disability. Quality of Life Research, 17(3), 425-436.

Department of Nursing, University of Alicante, Campus de San Vicente del Raspeig, Ap. 99, 03080 Alicante, Spain. julio.cabrero@ua.es

OBJECTIVE: To determine (i) the dimensional invariance of instrumental and basic activities of daily living (IADL/ADL) by gender subgroups, and (ii) the extent to which ADL dimensionality varies with the inclusion or exclusion of nondisabled people.

METHODS: Data were taken from the 1999 Spanish Survey on Disability, Impairment and State of Health. The analysis focused on 6,522 people aged over 65 years who received help to perform or were unable to perform IADL/ADL items. Unidimensional and multidimensional item response theory (IRT) models were applied to this sample.

RESULTS: In the female sample, IADL/ADL items formed a scale with sufficient unidimensionality to fit a two-parameter logistic IRT model. In the male sample, the structure was bidimensional: self-care and mobility, and household activities. When the sample was composed of IADL/ADL disabled people, ADL items formed a unidimensional scale; when it was composed only of ADL disabled people, they formed a bidimensional structure: self-care and mobility.

CONCLUSIONS: IADL/ADL items can be combined in a single scale to measure severity of functional disability in females, but not in males. Separate aggregated scores must be considered for each subdomain, basic mobility and self-care, in order to measure the severity of ADL disability.



  • Knowledge about the prevalence and incidence of functional disability is important in estimating the service to be provided as well as the planning for accommodating programs at the population level
  • At the individual level, on the other hand, outcome assessment concerning functional disability is essential in deciding a person's eligibility to participate in long-term care programs as well as other types of care and assistance.
  • The subdomains for ADL are basic mobility and self-care (Avlund, 1997; Lindeboom, Vermeulen, Holman, & De Haan, 2003; Kempen, Miedema, Ormel & Molenaar, 1996), while those for IADL are not clearly specified (Lindeboom et al., 2003; Coster, Haley, Andres, Ludlow, Bond & Ni, 2004).
  • The most commonly identified subdomains for IADL are household activities, outdoor mobility and cognitive activities (Lindeboom et al., 2003; Lawton & Brody, 1969).
  • Spector et al. (1987) were the first to combine ADL and IADL into one measure.
  • Yet further research is still needed to examine the practice of using one single composite score to summarize ADL and IADL (Spector & Fleishman, 1998; Coster, Haley, Andres, Ludlow, Bond & Ni, 2004; Breithaupt & McDowell, 2001).
  • Studies of unidimensional IRT models: Spector & Fleishman, 1998; Saliba, Orlando, Wenger, Hays & Rubenstein, 2002; Kempen, Myers & Powell, 1995
  • Studies supporting the multidimensionality of the scale: Breithaupt & McDowell, 2001; Thomas, Rockwood & McDowell, 1998; Johnson & Wolinsky, 1994; Ng, Niti, Chiam & Kua, 2006
  • There have also been concerns regarding the comparability of the IADL items for men and women, as examined in DIF studies (see the paper for the gender-related literature; I do not list it here because it is not my focus).
  • Lazaridis (1994) questioned the appropriateness of implementing the deterministic Guttman scale, which could be considered as a limited version of the probabilistic Rasch model, to summarize the ADL data.
  • On page 626, in the first paragraph of the right column, the author provides a good explanation of "extreme scores" such as a zero score.


  • Instrument: The 1999 Spanish Survey on Disability, Impairment and State of Health (EDDES). Based on interview information about whether the interviewee depends on others to perform the IADL/ADL tasks, each interviewee is assigned a disabled or not disabled status for each of the 14 IADL/ADL items.

    Analysis method

  • 1 PL and 2 PL
    • The fit of the two models is compared using the difference between the -2 log likelihood values of the two models
    • The percentage reduction in the chi-square is calculated to evaluate model fit
    • Infit and outfit statistics were used to evaluate item fit for the Rasch model, with a critical range of 0.7 to 1.3 (Linacre & Wright, 1994)
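
As a reminder of what the infit and outfit statistics in the list above measure, here is a minimal sketch for a single dichotomous item under the Rasch model. It assumes person measures and the item difficulty are already estimated (in logits); the function names are hypothetical, and real Rasch software also reports standardized versions that this sketch leaves out.

```python
import math


def rasch_p(theta, beta):
    """Rasch probability of a positive response."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def item_fit(responses, thetas, beta):
    """Outfit and infit mean squares for one dichotomous item.

    responses: 0/1 scores on the item, one per person;
    thetas: person measures in logits; beta: item difficulty in logits.
    """
    sq_resid = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_p(theta, beta)
        variances.append(p * (1.0 - p))
        sq_resid.append((x - p) ** 2)
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(sq_resid)
    # Infit: information-weighted mean square.
    infit = sum(sq_resid) / sum(variances)
    return outfit, infit


def within_range(mnsq, lo=0.7, hi=1.3):
    """Check a mean square against the 0.7 to 1.3 range noted above."""
    return lo <= mnsq <= hi
```

Both statistics have an expected value of 1. Outfit is sensitive to surprising responses from people far from the item's location, while infit weights responses from people near the item more heavily.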


  • An important finding for me in this paper is that the dependency of the likelihood ratio chi-square statistic on sample size is very evident. The authors found more misfitting items when the sample size was 4618 than when it was 500. No wonder all items misfit in my analyses, since my sample size is something like 21574.
  • Page 429 gives an example of how to write up an IRT report
  • Table 5 on Page 431 shows an example of presenting item parameter estimations.
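
The sample-size point above can be illustrated with a toy calculation. This uses a plain Pearson chi-square on two cells rather than the item-fit likelihood ratio statistic from the paper, so it shows the same principle, not the same statistic; the proportions are made up.

```python
def pearson_chi2(observed_props, expected_props, n):
    """Pearson chi-square for fixed cell proportions at sample size n."""
    return n * sum(
        (o - e) ** 2 / e for o, e in zip(observed_props, expected_props)
    )


# Same small discrepancy (52% observed vs 50% expected) at three sample sizes.
results = {
    n: pearson_chi2((0.52, 0.48), (0.50, 0.50), n)
    for n in (500, 4618, 21574)
}
```

The statistic is roughly 0.8, 7.4 and 34.5 for the three sample sizes. With one degree of freedom the .05 critical value is 3.84, so an identical two-point discrepancy is "fine" at 500 and a clear misfit at 4618 or 21574.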


  • Different results were obtained when analyses included data from only people with ADL disability, people with ADL/IADL disability, and the whole sample.
  • Also, for females, the items are sufficiently unidimensional, while for males they are bidimensional (self-care and mobility, and household activities).

    Avlund, K. (1997). Methodological challenges in measurements of functional ability in gerontological research. A review. Aging, 9(3), 164-174.

    Cabrero-García, J., & López-Pina, J. A. (2008). Aggregated measures of functional disability in a nationally representative sample of disabled people: analysis of dimensionality according to gender and severity of disability. Quality of Life Research, 17(3), 425-436.

    Lindeboom, R., Vermeulen, M., Holman, R., & De Haan, R. J. (2003). Activities of daily living instruments: optimizing scales for neurologic assessments. Neurology, 60(5), 738-742.

The dimensionality and validity of the Older Americans Resources and Services (OARS) Activities of Daily Living (ADL) Scale. (Doble & Fisher, 1998)

Doble, S. E., & Fisher, A. G. (1998). The dimensionality and validity of the Older Americans Resources and Services (OARS) Activities of Daily Living (ADL) Scale. Journal of Outcome Measurement, 2(1), 4-24.

School of Occupational Therapy, Dalhousie University, Halifax, Nova Scotia, Canada.

The psychometric properties of the OARS ADL scale, comprised of seven physical activities of daily living (PADL) and seven instrumental activities of daily living (IADL) items, were examined using a Rasch measurement approach. Two of the PADL items failed to demonstrate acceptable goodness-of-fit with the measurement model but the remaining 12 items could be combined into a single measure of ADL ability. Although the OARS ADL scale was designed to identify those community-dwelling elderly who need supports and services to continue to live in the community, the scale items were found to be poorly targeted to community-dwelling elderly since almost half of our sample received maximal scores. Rasch analysis identified how we might improve the sensitivity of the OARS ADL scale but its utility in outcome and longitudinal studies remains questionable.


Literature review

  1. The Older Americans Resources and Services (OARS) Activities of Daily Living (ADL) Scale

    1. 7 Physical Activities of Daily Living (PADL)
    2. 7 Instrumental Activities of Daily Living (IADL)
  2. The importance of PADL and IADL: dependency in PADL and IADL is associated with

    1. Poor quality of life
    2. Increased risk of nursing home placement
    3. Increased risk of death
  3. 2 major problems associated with self- and proxy-based assessment of PADL and IADL

    1. Evaluating PADL and IADL with few items

      1. Ceiling and floor effect
      2. Compromised reliability
      3. In a study conducted by Suurmeijer et al. (1994), it was found that when PADL and IADL items are included in a single scale, the unidimensionality of the scale items is compromised. However, results from other researchers supported the unidimensionality of the combined scale (Finch, Kane, & Philp, 1995; Kempen & Suurmeijer, 1990; Siu, Reuben & Hays, 1990; Silverstein, Fisher, Kilgore, Harley, & Harvey, 1992; Spector et al., 1987).
    2. Summing ordinal rating

      1. Compromise the true measure
      2. Guttman scaling is a deterministic model, and it requires the data to maintain an absolutely rigid structure, which is rarely the case in social science. In addition, this model holds that people will pass all items lower than their ability level and fail all items higher than their ability level. As a result, the differences between item difficulties have to be large for a scale to conform to this expectation. The resulting trade-off is that it might reduce the sensitivity to changes within an individual over time or to small differences across individuals.
  4. Rasch model

    1. The difficulty level of an item is based on the likelihood of the item being passed
    2. Rasch analysis generates item-related statistics

      1. a calibrated difficulty level for each test item
      2. Mean square residual (MnSq) with an expected value of 1. MnSq greater than 1.4 is no good
      3. A standardized goodness-of-fit statistic (z) with an expected value of 0. z greater than or equal to 2 is no good.
    3. The Rasch model also provides an estimate of the person-related parameter, theta, and the associated statistics. Greater separation suggests higher sensitivity.
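
The contrast between the deterministic Guttman model and the probabilistic Rasch model in the outline above can be sketched in a few lines. The function names and the convention that 1 means "passed" are assumptions for illustration; nothing here comes from BIGSTEPS or the paper.

```python
import math


def guttman_response(theta, beta):
    """Deterministic Guttman rule: pass every item at or below the ability."""
    return 1 if theta >= beta else 0


def rasch_probability(theta, beta):
    """Rasch model: passing becomes more likely as theta exceeds beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def guttman_errors(pattern):
    """Count item pairs where a harder item is passed but an easier one failed.

    pattern: 0/1 scores ordered from the easiest to the hardest item.
    """
    errors = 0
    for easy in range(len(pattern)):
        for hard in range(easy + 1, len(pattern)):
            if pattern[easy] == 0 and pattern[hard] == 1:
                errors += 1
    return errors
```

Under the Guttman rule a pattern such as 1, 0, 1, 0 is simply an error, whereas the Rasch model assigns it a probability, which is small when item difficulties are far apart and larger when they are close together.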


Elderly participants, at least 60 years old, with at least one health condition affecting daily living


  1. Conducted using BIGSTEPS:

    1. Item separation statistics: whether subjects separate items into levels of difficulty
    2. Person separation statistics: whether items separate subjects into levels of ability
    3. Item difficulty measure
    4. Person ability measure
    5. Goodness-of-fit statistics
    6. Unidimensionality of the ADL scale was determined by examining MnSq and associated z fit statistic for each item
  2. Greater separation suggests increased sensitivity of the scale
  3. If more than 5% of the subjects fail to demonstrate goodness-of-fit with the measurement model when z is set at 2, then there might be a problem with the validity of the scale


  1. Fit of items to the measurement model p. 12-17

    1. Item separation statistic
    2. Items located within .25 logits → actual order might be different because the distance is too small
    3. Item order (Finch et al., 1995, Fillenbaum, 1988; Kempen & Suurmeijer, 1990; Siu et al., 1990; Silverstein et al., 1992; Spector et al., 1987; Suurmeijer et al., 1994; Lawton and Brody, 1969)
    4. Check for large gaps to see whether we might be missing items to measure certain ability level
    5. Check for items with large error estimates
    6. Check goodness-of-fit statistics
    7. Get rid of the bad items and redo the analyses

  2. Fit of the persons to the measurement model p. 17-20

    1. Person separation statistics

    2. Check the number of groups
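
For the last two notes, a separation statistic can be turned into a number of distinguishable groups. The sketch below uses relations that are commonly quoted in the Rasch literature (separation G from reliability R, and strata as (4G + 1) / 3); whether Doble and Fisher computed the number of groups exactly this way is not recorded in these notes, so treat it as an assumption.

```python
import math


def separation_from_reliability(reliability):
    """Separation index G = sqrt(R / (1 - R)), for 0 <= R < 1."""
    return math.sqrt(reliability / (1.0 - reliability))


def strata(separation):
    """Number of statistically distinct levels: (4G + 1) / 3."""
    return (4.0 * separation + 1.0) / 3.0
```

For example, a person separation of 2 corresponds to a reliability of 0.8 and to 3 distinct ability groups, so a scale with much lower separation can only tell "high" from "low".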

Doble and Fisher (1998) conducted a study to evaluate the psychometric properties of the Older Americans Resources and Services (OARS) Activities of Daily Living (ADL) Scale, which contains 7 PADL and 7 IADL items. Participants were community-dwelling elderly adults in either Canada or the United States, aged 60 and above. Most subjects were interviewed directly through self-report, while for certain participants, such as those with dementia or memory impairments, information was obtained through proxy report. All items except the continence item were rated on a 3-point scale, with 0 indicating the subject is completely incapable of performing the task and 2 indicating no problem at all. The authors used the program BIGSTEPS to run the Rasch model analysis. They found that, after dropping two misfitting items (i.e., bathing and continence), the remaining items were hierarchically ordered and measured the same underlying construct (i.e., perceived ADL ability). However, based on the estimates of the person ability scores, the authors concluded that the OARS ADL items have limited application for the community-dwelling individuals they were designed for.

Saturday, May 2, 2009


