Critical mass or is mass critical?

By Shaun Hendy 20/10/2010

In research and development, it’s often taken for granted that teams require a certain critical mass to be successful.  Indeed, in a recent paper [1] two European researchers claim to have seen the effects of critical mass in the UK Research Assessment Exercise (RAE) and its French equivalent (HT: Mark Wilson).  However, I think that their findings may be an artifact of the assessment process, rather than evidence for critical mass.

By looking at how group research quality depended on the size of a group, the researchers observed a linear relationship between size and assessed quality. Indeed, I have seen a similar correlation between inventor network size and productivity in patenting.

However, in their study they found that this correlation holds only up to a certain group size, which they hypothesize is the ‘critical mass’ for scientific collaboration.  Above this critical mass, quality grows less rapidly with group size.  This would seem to suggest there is an optimal group size for scientific collaboration.
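The shape described here, linear growth up to a breakpoint and slower growth beyond it, can be written as a piecewise-linear function. This is only a sketch of that shape: the function name, the parameter names, the numbers and the assumption that the two segments join continuously are all illustrative, and none of them are taken from [1].

```python
def expected_quality(group_size, critical_size, slope_below, slope_above, intercept=0.0):
    """Piecewise-linear quality: steeper up to critical_size, shallower beyond it.

    All parameters are hypothetical; the two segments are assumed to meet
    at critical_size so that the curve has a kink but no jump.
    """
    if group_size <= critical_size:
        return intercept + slope_below * group_size
    quality_at_break = intercept + slope_below * critical_size
    return quality_at_break + slope_above * (group_size - critical_size)


# Invented numbers: quality rises by 2 per member up to 20 members,
# then by only 0.5 per additional member.
for size in (10, 20, 30):
    print(size, expected_quality(size, critical_size=20, slope_below=2, slope_above=0.5))
```

With these made-up values, the second block of ten members adds 20 points of quality, while the third adds only 5.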

On the other hand, I’ve not found any evidence for such an effect in my patent studies, where some of the most productive groups of inventors are collaborating in networks of tens of thousands.  So is there such a thing as critical mass?

Research Assessment

If you are part of the university system, you will be familiar with the idea of research assessment.  In New Zealand, we have the Performance Based Research Fund (PBRF), which allocates funding to universities on the basis of the aggregate research performance of their staff.

In the New Zealand system, each university researcher is assessed by a panel on the basis of their research portfolio, and given a grade A, B, C or R (where you want to be an A, not an R). This grade is based on a weighted average of the assessed quality (a score from 1 to 7, with 7 being world-leading) of research output (70%), peer esteem (15%) and contributions to the research environment (15%). In the UK, a similar system is used with quality levels assessed on a scale of 0 to 4.
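To make the weighting concrete, here is a minimal sketch of the weighted average using the 70/15/15 split given above. The function name, the dictionary keys and the example scores are invented for illustration. The cut-offs that turn the result into a grade of A, B, C or R are not given here, so they are left out.

```python
# Weights from the text: research output 70%, peer esteem 15%,
# contribution to the research environment 15%.
WEIGHTS = {"research_output": 0.70, "peer_esteem": 0.15, "research_environment": 0.15}


def weighted_score(scores):
    """Return the weighted average of the three component scores (each 1 to 7)."""
    for name in WEIGHTS:
        if not 1 <= scores[name] <= 7:
            raise ValueError(f"{name} must be between 1 and 7, got {scores[name]}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical portfolio: world-leading output, weaker on the other two criteria.
portfolio = {"research_output": 7, "peer_esteem": 5, "research_environment": 4}
print(round(weighted_score(portfolio), 2))  # 6.25
```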

One problem with this approach is that each of these three criteria is scored on a scale from 1 to 7. Although research output is given the heaviest weighting, such a scoring system cannot discriminate between two researchers whose research output is world-leading: each would be scored a 7, even if one researcher was much further ahead of the ‘world’ than the other.

Instead, such researchers might only be distinguished through their peer esteem or contribution to the research environment scores, even though these criteria are weakly weighted. The scoring system used by FRST in its assessment of research proposals also suffers from this inability to discriminate on the basis of its most heavily weighted criterion.
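The ceiling effect is easy to reproduce with a toy calculation. The sketch below assumes an invented, unbounded "latent" measure of research output that gets clipped to the 1-to-7 scale before weighting. The two researchers and all of their scores are hypothetical, and the latent measure is not part of the PBRF.

```python
# Weights from the text; every component is scored on a 1-to-7 scale.
WEIGHTS = {"research_output": 0.70, "peer_esteem": 0.15, "research_environment": 0.15}


def clip_to_scale(value, low=1, high=7):
    """Force a score onto the 1-to-7 scale; anything beyond 7 is recorded as 7."""
    return max(low, min(high, value))


def assessed_total(latent_scores):
    """Weighted average after every component has been clipped to the scale."""
    return sum(WEIGHTS[name] * clip_to_scale(latent_scores[name]) for name in WEIGHTS)


# Hypothetical researchers. Y's research output is well ahead of X's on the
# invented latent measure, but Y scores one point lower on peer esteem.
researcher_x = {"research_output": 7, "peer_esteem": 6, "research_environment": 6}
researcher_y = {"research_output": 9, "peer_esteem": 5, "research_environment": 6}

total_x = assessed_total(researcher_x)
total_y = assessed_total(researcher_y)

# Both outputs are recorded as 7, so the lightly weighted peer esteem score
# decides the ranking: X finishes ahead of Y despite Y's stronger output.
print(total_x > total_y)  # True
```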

Measuring Excellence

I suspect this inability to discriminate may be behind the ‘discovery’ of critical mass in [1].  Indeed, to their credit the authors acknowledge that such a bias in the RAE may explain their findings.
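One way such an artifact could arise is sketched below. It is a toy model, not the analysis in [1] and not real data. It assumes that the underlying quality of a group keeps growing linearly with size, that individual researchers are spread around the group average, and that each individual score is capped at the top of the scale (4, as in the UK system). Every parameter name and value is invented.

```python
def assessed_group_quality(group_size, members_per_point=10, base=1.0,
                           cap=4.0, offsets=(-1.0, 0.0, 1.0)):
    """Mean of capped individual scores for a group of the given size.

    Hypothetical model: the uncapped group average rises by one point per
    `members_per_point` members, and individuals sit at fixed offsets
    around that average before their scores are clipped to [0, cap].
    """
    latent_mean = base + group_size / members_per_point
    capped = [min(cap, max(0.0, latent_mean + offset)) for offset in offsets]
    return sum(capped) / len(capped)


sizes = [10, 20, 30, 40]
scores = [assessed_group_quality(size) for size in sizes]

# Gain in assessed quality for each extra ten members.
gains = [later - earlier for earlier, later in zip(scores, scores[1:])]
print(scores)
print(gains)
```

In this model the uncapped quality rises by the same amount for every ten extra members, yet each successive gain in assessed quality is smaller than the last, because more and more individuals have already reached the top score. The assessed curve therefore bends over even though nothing changes in the underlying relationship.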

If scoring systems like these suffer from an inability to discriminate at the top end, why are they used by our PBRF, the UK RAE and FRST? Chiefly, I believe, it is for the benefit of the assessors: it is much easier to score each criterion on a scale of 1 to 7 than to use a more fine-grained scoring system that might have a different range for each criterion. Having reviewed more than 100 research proposals this year, I certainly appreciate the benefits of such a simple system.

I once heard Sir Peter Gluckman suggest that the PBRF needed an extra grade for top researchers called A*. If we wanted to implement such a scheme (I am not saying we actually need to!) then I think we would need to revise our approach to ranking research portfolios. For instance, we could allow reviewers to add an extra decimal place to the scores of portfolios or proposals with a 6 or a 7 (e.g. 6.5). This would allow reviewers to distinguish between top-ranked research on the basis of the most heavily weighted criterion, not the most lightly weighted ones.

Critical mass? 

So is there such a thing as critical mass? Well, if there is, I don’t think this study [1] is likely to have found it. I also haven’t seen anything like a critical mass in looking at the OECD’s patent databases: networks of tens of thousands are more productive than networks of thousands. Personally, I think that the skills and knowledge base of scientists and engineers are becoming increasingly specialized, requiring them to work in larger and larger teams. If this is the case, then it is mass that is critical, not the other way round!

[1] Kenna, R., & Berche, B. (2010). The extensive nature of group quality. EPL (Europhysics Letters), 90 (5). DOI: 10.1209/0295-5075/90/58002