Wednesday, October 15, 2008

GLOBAL RATING SCALES: USELESS AS A TOOL FOR IMPROVING PERFORMANCE

Probably, all of us can agree that the ACGME Six Core Competencies are important measures of a good physician. They have resonated with medical schools so well that a large number of schools have incorporated these competencies into their expectations for graduates. Who could argue with the concept that we want our graduates to be knowledgeable, clinically proficient, professional, good communicators, lifelong learners, and good stewards of the healthcare system in service to their patients?

The problem, as I see it, is not the competencies, but how we assess them. For the first five years after the introduction of the Competencies, new Global Rating Forms (GRFs) were introduced as the "answer" for assessing the competencies. In this approach, students or residents are assigned a number from a scale, for example, a "5" out of a possible "10" on one of the competencies. The faculty member has "done his/her duty", but how satisfying is the process for our teachers, and what in the world does that student or resident do with that "5"?

Our medical school (CCLCM) is a competency-based school that approaches assessment systematically by emphasizing formative narrative feedback organized through a portfolio system. Students are never assigned a number; rather, they receive narrative feedback about their "strengths" and "areas needing improvement". This seems to be working really well, although there is a bit of a "learning curve".

So... what do you think? Are there some strengths to GRFs that I am missing? Have you found ways to make those "numbers" tell a story that leads to improvement? Let us know what you think. Any residents or students reading this link? What do you think?

13 comments:

Gary Williams said...

Global rating scales alone, without comments, are of limited value. Without comments, particularly descriptions of skills or behaviors that were directly observed by the faculty supervisor, they provide little to help the learner advance or improve, or to recognize what they have done well.
The global rating, when paired with such direct observation, can be useful if used to provide learners with some idea of performance and progress compared to their peers. This can often be a powerful motivational tool for improvement.
The addition of brief descriptions of what the learner was observed to do well, as well as one or two targeted areas for improvement (TAFI) as used in the evaluation process for our Cleveland Clinic Lerner College of Medicine students, would be an effective addition to the global rating forms. We are trying this approach in pediatrics, and hope to improve the quality of useful feedback we provide to our residents. This will require faculty development. I believe the efforts to improve Faculty evaluations and feedback for our students can serve as the standard for our residency program.

Steve said...

Personally, I think the numbers give us an indication if the resident is having any major problems in one of the competency areas. That is all you can really expect from an evaluation system. If one of my residents gets a 2 from one of the staff, then I go find out what the story is and work with the resident.

Neil said...

My cynical take on the rating scales is that they are done to satisfy RRC requirements (that residents are getting evals), and to help program directors crunch numbers quickly (as opposed to reading narrative feedback).
Maybe we use these because we assume faculty/attendings are not going to provide more detailed feedback - so why even try?

With the limited duty hours, staggered call hours, and brief hospital rotations (2 weeks as opposed to 1 month), there are fewer contact hours between attendings and residents. Providing such numeric summative feedback in the absence of sufficient evidence seems inappropriate.

There is also the big question of comparing the resident performance to peers vs. standards.

The other issue is faculty development - some attendings rate all residents in all competencies as 7 or 8. So the rating scale data, which is of questionable value to start with, becomes even more useless!

What we need to do is study how often residency programs change something (curriculum or other intervention) based on data from these scales. Identify what kind of data prompts intervention....

Moises said...

I agree with all previous comments. By themselves, without written feedback and comments, the GRS are of poor academic value.
The student requires a specific notion of what should be improved, and therefore a numeric value will be useless if there are no specific comments that will provide further feedback and insight.
Numbers are useful to compare between peers, but as evaluators are different, some subjective component may affect the overall score. In comparison, written feedback will individualize specific things that should be continued or improved.
I agree. A formal study comparing the effect of the GRS in curriculum changes is required.

Alta Chavez said...

The previous comments all provide useful points for consideration. I would emphasize the notion that a brief description of what the learner does well, as well as what needs improvement, provides more useful feedback for the learner than a simple numerical rating. The simple numerical rating is, of course, subjective. Adding narrative containing concrete examples can help more effectively communicate what I mean when I give a rating.

Steve said...

Hmm... If everyone agrees that these are useless and maybe even dangerous considering that these arbitrary numbers are "meaned" and residents are categorized based on this, why do most programs still use them??

Anonymous said...

I'm guilty of using the number ratings and not providing any other feedback because it's there on the form, it's easy and fast, and then I'm done. I have provided verbal comments about the trainee if there's a meeting and open discussion, and this feedback does get recorded and given back to the trainee.
So what's the study to quote to provide the evidence that narrative feedback is better than GRS and that narrative feedback produces meaningful changes in medical education? BEME Guide #7, 2006? Cochrane review 2003?

Christine said...

I'm interested in responding to the last "anonymous post". It seems as if what you describe as "meeting with the resident" and having verbal comments transcribed for the resident is exactly what we are proposing in documenting narrative comments from the beginning. I doubt that there would be a study that would find differences between the two. However, if there are only numbers, the learner suffers. Do I have a study to support this? There are a number of studies in the self-regulation literature that support formalizing the steps of self-regulation as a means to "internalize the process". Great question, by the way. I think we need to have our "feet held to the fire".

amy n. said...

I agree that a global rating scale number such as "7" alone provides little information to a student about performance. Descriptive feedback is definitely better at elucidating areas of strength and weakness. This type of feedback allows the student to internalize and re-assess their performance with opportunity to adapt to expectations and/or standards. I do however want to point out that global rating scales do provide a simple and extremely easy way to identify outliers with poor performance (or outstanding performance, but usually evaluators tend to rate everyone more favorably than not) where additional action, investigation or intervention is warranted. Thus, I tend to think that a system which utilizes both the descriptive and the numerical rating scale is best. Plus, if you are already taking the time to write the descriptive feedback, it will only take an extra 30 seconds to provide a rating number.

Jeanne said...

I agree that there are pros and cons to these forms. However, for competency assurance, there must be objective evidence for the individual. Besides our portfolio system, these types of ratings could provide a numerical rating for the individual that can be compared to others and can be tracked over time to become information for curriculum development and future planning programs also.

Anonymous said...

I agree that there are pros and cons to these forms. However, for competency assurance, there must be objective evidence for the individual. Besides our portfolio system, these types of ratings provide a numerical rating for the individual that can be compared to others.

Gretchen Lovett said...

Yes, specific comments, either delivered in a one-on-one feedback session, or via the GRF itself, enhance the learner's ability to use the feedback for improvement. At our medical school, we use GRFs to measure communication skills; the GRFs are completed by our Standardized Patients. We have 200 students going through a 12-station OSCE, so that's 2400 GRFs. That's a lot of information. I think that the information from these GRFs becomes more meaningful when there are a number of raters. In this case, if a student consistently is scoring much lower than his or her peers (across several stations), then we can intervene with that student and provide targeted input, correction, remediation -- whatever you want to call it. Yet, the students say that they learn the most from the narrative comments that the SPs put on each section of the form, noting what the student did well and what could be improved. I don't think that it's really a question of GRFs vs. narrative feedback; we need both as they meet different aims - the former is more summative and the latter is more formative in nature.

Collin said...

I agree with the points being given, and offer 2 added points that come to mind with things like this:

1) Just like with Likert scales, things are given a numerical assessment and each number is exactly the same distance from the other, yet what these numbers mean is not uniform in distance. For example, the distance between what someone would rate a 5 and an 8 on these scales is much smaller than the difference between 1 and 4. In fact I don't even know if 1 exists! This renders the scale even more arbitrary than what is just personal point of view of what an 8 means, etc. This has direct implications on using the numbers in any way - for example averaging and comparing.

2) At an APDIM conference a couple of years ago, Eric Holmboe showed us all a video of a resident doing counseling, and asked us to give it a global rating. This group of educational "experts" gave the resident anywhere from a 3 to a 10, which of course was the point. His solution was that each point on the rating meant something more than the traditional anchors used, and I agree. Now someone just needs to define all of those discrete competencies for everything residents and students do and we should be all set.

But of course, as one educator says often in these settings: "show me a great resident with a global rating of 3, and I'll stop using them".