# The curious problem of assessing physics

Here’s a bit more on the NCEA Physics assessments that I heard about at the NZ Institute of Physics Conference last week. I alluded to it very briefly in a previous post. This comes from my notes of the presentation given by David Lillis, a statistician at the NZ Qualifications Authority.

Unsurprisingly, NZQA throw lots of statistical analyses at the various exams. That’s quite reassuring – they are looking at whether each exam is doing its job properly. That is, does it provide a fair and reasonable assessment of a candidate’s performance in a particular area? They look for gender or cultural bias in particular questions. They check for redundancy (i.e. two questions on the exam asking basically the same thing, which shows up as a high correlation between candidates’ scores on those questions). They check whether a question or exam succeeds in distinguishing ‘not achieved’, ‘achieved’, ‘merit’ and ‘excellence’. They estimate the relative ‘difficulty’ of an exam (e.g. by looking at how candidates in one exam fared in other exams), and so on.
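To make the redundancy check concrete, here is a minimal sketch in Python. It uses made-up marks, not NZQA data, and the 0.9 threshold and the variable names are my own assumptions rather than anything from the talk. Two of the three simulated questions test almost exactly the same thing, and the correlation matrix picks them out.

```python
import numpy as np

rng = np.random.default_rng(0)
n_candidates = 500

# Hypothetical synthetic marks (NOT real exam data).
# q1 and q2 both measure the same underlying ability with little noise,
# so they are near-duplicates; q3 depends on it only weakly.
ability = rng.normal(size=n_candidates)
q1 = ability + 0.2 * rng.normal(size=n_candidates)
q2 = ability + 0.2 * rng.normal(size=n_candidates)
q3 = 0.3 * ability + rng.normal(size=n_candidates)

# One row per candidate, one column per question.
scores = np.column_stack([q1, q2, q3])
corr = np.corrcoef(scores, rowvar=False)

# Assumed cut-off for calling two questions redundant.
THRESHOLD = 0.9
n_questions = scores.shape[1]
redundant_pairs = [
    (i, j)
    for i in range(n_questions)
    for j in range(i + 1, n_questions)
    if corr[i, j] > THRESHOLD
]

print(np.round(corr, 2))
print("Possibly redundant question pairs:", redundant_pairs)
```

A real analysis would have to cope with graded marks rather than continuous scores, so treat this only as an illustration of the idea.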

But the analysis that was most interesting for me was the Principal Component Analysis on the students’ responses to the physics assessment standards.

Principal Component Analysis is a pretty mathematical thing to work through, but, in non-mathematical terms, it’s about finding out how many ‘dimensions’ are required to explain the variation in the results, and what those dimensions are. So, an example – if you look at how the contestants do on ‘Masterchef’ or some equivalent competition, you might find that most of their performance is down to how well they can cook (the first, most important dimension), but part of their performance (the second dimension) might be attributable to their ability to select combinations of flavours that work, and a little bit (a third dimension) to their ability to present their dishes. PCA on an exam or assessment helps to draw out what the assessment is really doing. One would hope that it is assessing what it is meant to be assessing.

Now, here’s the thing with physics. For nearly all NCEA assessment standards EXCEPT physics, there is only ever one important dimension. That is, students who do well on (say) question 1 will also do well on question 2 and on question 3, because, fundamentally, all aspects of all questions are addressing the same dimension. However, for physics assessments, there are usually two important dimensions. So, for example, if we take the standard ‘Understanding waves’, we see that the first dimension, the one that explains most of the variation in the students’ scores, is how well the students understand waves. And that’s good. But there is another dimension that’s apparent, and that’s to do with the quantitative or qualitative nature of the questions, in other words, how much mathematical work is in a question. To get really good marks in the standard, you have to not only understand waves, but be able to undertake quantitative calculations too. So really there are two different aspects being examined at the same time.

My understanding from the talk was that nearly all physics assessments show this, but it is rare in other subjects (even other science subjects).
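For anyone who wants to see how a second dimension like this can show up, here is a toy PCA in Python. Everything in it is assumed for illustration: the simulated students, the two hidden traits (`understanding` and `maths`), the six questions and the weights are all invented, and this is not the analysis NZQA ran. Three ‘qualitative’ questions depend on understanding alone, while three ‘quantitative’ ones also depend on maths. The PCA is done by eigen-decomposition of the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(42)
n_students = 2000

# Two hypothetical hidden traits per student (invented for illustration).
understanding = rng.normal(size=n_students)
maths = rng.normal(size=n_students)

def noise():
    """Question-specific random variation."""
    return 0.5 * rng.normal(size=n_students)

# Qualitative questions depend on understanding only;
# quantitative questions depend on understanding AND maths.
qualitative = [understanding + noise() for _ in range(3)]
quantitative = [understanding + 0.8 * maths + noise() for _ in range(3)]

# One row per student, columns = 3 qualitative then 3 quantitative questions.
scores = np.column_stack(qualitative + quantitative)

# PCA via the eigenvalues and eigenvectors of the correlation matrix.
corr = np.corrcoef(scores, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)  # ascending order

# Sort components from most to least important.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Fraction of the total variation explained by each component.
explained = eigenvalues / eigenvalues.sum()
second_loadings = eigenvectors[:, 1]

print("Explained variance ratios:", np.round(explained, 3))
print("Loadings on 2nd component:", np.round(second_loadings, 2))
```

In this toy setup the first component carries most of the variation (overall understanding), and the second is clearly larger than the remaining four. Its loadings have one sign on the qualitative questions and the opposite sign on the quantitative ones, which is the qualitative/quantitative contrast described above. Which group gets the positive sign is arbitrary.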

So, the conclusion to draw from this is that physics exams don’t just assess physics; to some extent they also assess maths. This probably won’t come as a surprise to most physics teachers, but it came as a surprise to me to see it so clearly evidenced through a careful statistical analysis. Is this two-dimensional nature a good or bad thing? I’m not sure about that one. My gut feeling is that physics should be about physics, not mathematics, and our assessments should reflect that, but on the other hand there’s no getting away from the fact that physics is a quantitative subject. It is good, though, to see that in NCEA at least the ‘Understanding the Physics’ is the primary dimension, NOT the ‘doing maths’. I’d love to analyze some of the exams we set here to see if that remains the case at university.

## One Response to “The curious problem of assessing physics”

It was definitely an interesting talk. There was also a third ‘dimension’ that was statistically significant (by the speaker’s own definition) in some of the standards.

The first dimension was whether the student could demonstrate understanding.

The second dimension was whether the question was qualitative or quantitative in nature.

I’m not sure he suggested what the third dimension could have been. Did anyone catch what it was?