Grade deflation

By Eric Crampton 18/08/2014


Results from Princeton’s grade deflation experiment are well worth reading. Catherine Rampell reports on it here and here. Princeton aimed both to reduce the proportion of As awarded and to reduce variability in grading across departments. Some departments responded more vigorously than others did, but mostly in their 100- and 200-level offerings.

At Canterbury, we always had a measure called the Difficulty Index. It wasn’t perfect, but it was a lot better than just targeting the proportion of As. It worked as follows. Each student’s grade in a particular class was compared to that student’s average across all other courses that year. So if a student in my Public Choice class got a B+ but got an A in all the other classes, my class would get a notch up in the difficulty index for that student. Do that for all students in all courses and you have a difficulty index. Where there are enough students who do courses across different departments, you can get cross-departmental measures of difficulty.
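As a rough illustration of the calculation described above, here is a minimal Python sketch. The 9-point grade scale (A+ = 9, A = 8, A- = 7, B+ = 6), the course and student names, and the sign convention (positive means harder) are all assumptions made for illustration; this is not the actual Canterbury implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (student, course, grade points on an assumed 9-point scale).
RECORDS = [
    ("s1", "PUBLIC_CHOICE", 6.0),  # B+ here...
    ("s1", "MICRO", 8.0),          # ...but an A everywhere else
    ("s1", "STATS", 8.0),
    ("s2", "PUBLIC_CHOICE", 5.0),
    ("s2", "MICRO", 7.0),
    ("s2", "STATS", 6.0),
]


def difficulty_index(records):
    """Per course: mean of (student's average in OTHER courses - grade in this course)."""
    by_student = defaultdict(dict)
    for student, course, points in records:
        by_student[student][course] = points

    gaps = defaultdict(list)
    for courses in by_student.values():
        if len(courses) < 2:
            continue  # no other courses to compare against
        total = sum(courses.values())
        for course, points in courses.items():
            others_avg = (total - points) / (len(courses) - 1)
            gaps[course].append(others_avg - points)

    return {course: mean(values) for course, values in gaps.items()}


index = difficulty_index(RECORDS)
print(index["PUBLIC_CHOICE"])  # 1.75: students did worse here than elsewhere
```

A positive number means students earned lower grades in that course than in their other courses; a negative number means the course was the easy one in their mix.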

Every year, the Department would get a report of which courses were more than a standard deviation away from average difficulty, on either side. The Teaching Committee or HoD would have a chat with the lecturers of those courses about it, unless there were some particular reason to expect that a course would prove consistently difficult or easy. We aimed to keep our big first-year courses right in the middle and didn’t mind so much if our core math-based second-year theory papers proved difficult. I think one year my Current Policy Issues came up as being too difficult: students in that course earned lower grades than they did, on average, across their other courses. So I adjusted my grading a bit.
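The flagging step might look something like the following Python sketch. The index values and course names are made up, and the use of the population standard deviation is an assumption; the text above does not say how the spread was computed.

```python
from statistics import mean, pstdev

# Hypothetical difficulty-index values per course (positive = harder than average).
INDEX = {
    "COURSE_A": 0.0,
    "COURSE_B": 0.1,
    "COURSE_C": 0.2,
    "COURSE_D": 1.5,
    "COURSE_E": -0.1,
    "COURSE_F": -1.3,
}


def flag_outliers(index, threshold=1.0):
    """Return courses more than `threshold` standard deviations from the mean index."""
    centre = mean(index.values())
    spread = pstdev(index.values())  # assumed: population standard deviation
    if spread == 0:
        return {}  # every course is equally difficult, nothing to flag

    flagged = {}
    for course, value in index.items():
        z = (value - centre) / spread
        if abs(z) > threshold:
            flagged[course] = "harder" if z > 0 else "easier"
    return flagged


print(flag_outliers(INDEX))  # {'COURSE_D': 'harder', 'COURSE_F': 'easier'}
```

The output is only a list of courses worth a conversation; whether a flagged course has a good reason to sit where it does is still a judgment call.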

I always wished that we could do away with straight Grade Point Average measures and provide instead difficulty-adjusted GPAs. A student earning straight A+s in courses where everybody gets an A+ would then get something like a B unless the students taking those courses also did exceptionally well in their other courses.
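One simple way to build such an adjusted GPA, sketched here in Python, is to add each course's difficulty index to the grade earned in it before averaging. This is only one possible adjustment, not a scheme that was ever implemented; the 9-point scale (A+ = 9, B+ = 6), the course names and the index values are assumptions.

```python
def adjusted_gpa(grades, difficulty):
    """Average of (grade points + course difficulty); unknown courses get no adjustment."""
    if not grades:
        raise ValueError("student has no grades")
    adjusted = [points + difficulty.get(course, 0.0) for course, points in grades.items()]
    return sum(adjusted) / len(adjusted)


# Hypothetical student with straight A+s (9.0) in two courses that the
# difficulty index rates as very easy (-3.0 each).
grades = {"EASY_1": 9.0, "EASY_2": 9.0}
difficulty = {"EASY_1": -3.0, "EASY_2": -3.0}

print(adjusted_gpa(grades, difficulty))  # 6.0, versus a raw GPA of 9.0
```

If the students in those courses also earned top marks everywhere else, the index for the courses would sit near zero and the adjustment would largely vanish.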

I think the Princeton experiment shows that edicts from on high can help a bit, but I wonder whether those gains can be sustained. I also wonder whether it doesn’t unduly penalise the really hard courses that are only taken by the brilliant. If everybody in some upper-division crazy-hard math course gets an A+, and they all deserve it because they’re brilliant, the Department shouldn’t be penalised for that. Strict targeting of the proportion of As penalises students in those courses. A difficulty index would instead show that course to be of average difficulty if its students also earned an A+ in every other course, and of above-average difficulty if some lower-ability students came in and did badly.*

GPA-seekers would flip from looking for easy courses to looking for those courses where they’d do really well relative to other students. That’s not perfect, as there are plenty of students who could be very well advised to take an essay-based course because they’re not very good at writing essays and need to learn it (or math courses for the less mathy), but it’s surely better than having them seek the ones where they have a relative advantage and where the grading is easy.

These kinds of measures can also be really helpful in setting University-wide scholarships where GPAs across Departments are otherwise not comparable. I’d always worried that our Econ students suffered in University-wide scholarships for graduate study because we kept a much tighter lid on top-end grades than did some other Departments. Difficulty-adjusted GPAs could likely have solved that problem.

And now a bit of chart-porn from the Princeton study. The x-axis has the percentage of As awarded in the early period; the y-axis has the percentage during the deflation period. Economics maintained a pretty low proportion of As in both periods. The top chart has 100- and 200-level courses; the bottom, upper-division.

* We can imagine more complicated versions of the difficulty index where it’s adjusted for these kinds of composition effects, so that the difficulty of each of the courses in which a student is enrolled is factored into the calculation of any particular course’s difficulty.
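One way such a composition-adjusted version could work, sketched in Python, is to treat each grade as roughly "student ability minus course difficulty" and alternate between estimating the two. This additive model, the alternating-averages fitting procedure, and all names and numbers below are assumptions for illustration, not a method described in the Princeton study or used at Canterbury.

```python
from collections import defaultdict

# Hypothetical grades built from ability (s1=8, s2=6, s3=7) minus
# difficulty (A=1, B=-1, C=0), with each student taking two of three courses.
RECORDS = [
    ("s1", "A", 7.0), ("s1", "B", 9.0),
    ("s2", "B", 7.0), ("s2", "C", 6.0),
    ("s3", "A", 6.0), ("s3", "C", 7.0),
]


def fit_difficulty(records, iterations=50):
    """Alternate between estimating abilities and difficulties for grade = ability - difficulty."""
    difficulty = {course: 0.0 for _, course, _ in records}
    ability = {}
    for _ in range(iterations):
        # Ability: a student's grades with each course's difficulty added back.
        per_student = defaultdict(list)
        for student, course, grade in records:
            per_student[student].append(grade + difficulty[course])
        ability = {s: sum(v) / len(v) for s, v in per_student.items()}

        # Difficulty: how far below their ability students score in the course.
        per_course = defaultdict(list)
        for student, course, grade in records:
            per_course[course].append(ability[student] - grade)
        raw = {c: sum(v) / len(v) for c, v in per_course.items()}

        # Pin the average difficulty at zero so the scale is identified.
        centre = sum(raw.values()) / len(raw)
        difficulty = {c: value - centre for c, value in raw.items()}
    return ability, difficulty


ability, difficulty = fit_difficulty(RECORDS)
print(round(difficulty["A"], 3), round(difficulty["B"], 3), round(difficulty["C"], 3))
```

On this toy data the estimates settle on the difficulties used to build it, whereas the simple index would rate course A harder than it is because one of its students took the easy course B alongside it.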