Improving Evaluation of the CanMEDS Collaborator Role:
Reliability of the Interprofessional Collaborator Assessment Rubric (ICAR) and Gender Bias in Multi-Source Feedback
by
© Mark Hayward B.Sc. B.Ed.
A thesis submitted to the School of Graduate Studies
in partial fulfillment of the requirements for the degree of
Master of Science
Faculty of Medicine
Memorial University of Newfoundland
May 2014
St. John’s Newfoundland
Abstract
Since the inception of the Royal College of Physicians and Surgeons of Canada (RCPSC) CanMEDS framework, there has been inequality between the assessment of the Medical Expert role and the six non-Medical Expert roles. The purpose of the study was to
evaluate the reliability of the use of the Interprofessional Collaborator Assessment Rubric
(ICAR) in a multi-source feedback (MSF) approach for assessing post-graduate medical
residents’ CanMEDS Collaborator competencies. A secondary investigation attempted to
determine whether characteristics of raters (i.e., experience, gender, or frequency of
interaction with the resident) had any influence on the overall ICAR score. The ICAR is a
17-item (and global score) assessment tool using a 9-point scale and two open-text
responses. The study involved medical residents receiving ICAR assessments from three
(3) rater groups (physicians, nurses, and allied health professionals) over a single four-
week rotation. Residents were recruited from four (4) unique medical disciplines. Of
those participating residents, sixteen (16) residents were randomly chosen. Six (6) of
those received at least two (2) assessments from each rater group and were included in the
analysis. All nurses and allied health professionals in participating medical / surgical
units were invited to participate and were excluded from analysis if they were absent for
at least one week of normal shift work or explicitly stated that they did not interact with
the resident. Physician raters were selected by the residents themselves. Statistical analysis used
Cronbach’s alpha, one-way and two-way repeated measures ANOVA to compare overall ICAR
scores, and logistic regression. Missing data were imputed using a single-imputation
stochastic regression method, and the proportion of missing data was compared with that from a pilot study using a paired-samples t-test. Results revealed a high response rate (76.2%) and a statistically significant difference in gender distribution across rater groups: physicians were predominantly male (81.8%), whereas nurses (92.5%) and allied health professionals (88.4%) were predominantly female, p < .001. Missing data decreased from 13.1% using daily assessments to 8.8%
utilizing an MSF process, p = .032. An overall Cronbach’s alpha coefficient of α = .981 indicated high internal consistency reliability. Each ICAR domain also demonstrated high internal consistency, with alpha ranging from .881 to .963. The profession of the rater had no significant effect, with a very small effect size (F(2,5) = 1.225, p = .297, η² = .016). The only significant main effect on overall ICAR score was the gender of the rater (F(1,5) = 7.184, p = .008, η² = .045). Female raters scored residents significantly lower than male raters (6.12 vs. 6.82). Logistic regression analysis revealed that male raters were 3.08 times more likely than female raters to provide an overall ICAR score above 6.0 (p = .013) and 3.28 times more likely to score above 7.0 (p = .005). A two-way repeated measures ANOVA revealed a significant interaction effect involving the frequency of interaction between raters and residents across items (F = 2.103, p = .025, η²