Posts Tagged bibliometrics

CRI bibliometric performance: Part III Shaun Hendy Feb 19


Last week, John Key signalled in a speech to Parliament that there would be changes to the way the Crown Research Institutes are funded.  Indeed, the debate over CRI funding has continued pretty much unabated since they were created.  In earlier posts, we looked at the growth in the total bibliometric output of the CRIs and at the increase in their citation impact relative to the rest of New Zealand.  In this post, I will look at the relationship between CRI funding and bibliometric output.  The data suggest to me that the growth in bibliometric output has been driven by the development of new revenue sources.

[Figure: CRI total revenue]

First, I want to look at CRI revenues since 1994. It is clear that CRI revenue has increased by about 30% over this period, once adjusted for inflation (figures are given in 2008 $). Not all CRIs seem to report their levels of public good science funding (or PGSF, which I will define here as the level of FRST and capability funding), but for those that do (most), I also plot PGSF revenue after adjusting for inflation. Note that the PGSF revenue, at least for those CRIs that report it, has remained static over this period.
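For readers who want to see what "adjusted for inflation" means in practice, here is a minimal sketch of converting nominal revenue into 2008 dollars with a price index. The function name, the index values and the revenue figures are all hypothetical and chosen only to make the arithmetic easy to follow; they are not the CRI data or an official price index.

```python
def to_real_dollars(nominal, year, price_index, base_year=2008):
    """Convert a nominal amount from `year` into `base_year` dollars."""
    return nominal * price_index[base_year] / price_index[year]


# Hypothetical price index (base year 2008 = 100), for illustration only.
price_index = {1994: 75.0, 2008: 100.0}

# Hypothetical nominal revenues in $ millions, for illustration only.
real_1994 = to_real_dollars(300.0, 1994, price_index)
real_2008 = to_real_dollars(520.0, 2008, price_index)

# Real growth over the period, as a fraction of the 1994 value.
real_growth = real_2008 / real_1994 - 1.0
print(f"1994 revenue in 2008 $: {real_1994:.0f}M")
print(f"Real growth 1994-2008: {real_growth:.0%}")
```

With these made-up inputs the nominal revenue rises by more than 70%, but most of that is inflation, and the real increase is much smaller.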

This is especially interesting given statements made when the CRIs were established.  Here is Sir James Stewart, Chair of the CRI Implementation Steering Committee:

“The science staff surpluses are not an outcome of the restructuring, but in part stem from chronic underfunding of science … Science departments had carried too many people for the money available.”

So how have staffing levels changed at the CRIs?  Statistics NZ collects FTE data from the CRIs, assigning research FTEs to the categories of researcher, technician and support staff.  Here is how Statistics NZ defines the different categories:

Researchers
Researchers are defined as those staff engaged in the conception and/or creation of new knowledge/products; personnel involved in the planning or management of scientific and technical aspects of R&D projects, and software developers.

Technicians
Technicians are defined as staff engaged in technical tasks in support of R&D, normally under the direction and supervision of a researcher.

Other Supporting Staff
Other Supporting Staff are described as staff providing specific information acquisition and treatment (for example drafting, typing, maintaining libraries etc. or specific administrative support such as bookkeeping, personnel services etc.)

[Figure: CRI staff ratios]

The COMU website reports that just over 80% of CRI staff were involved in research in 2008. On the right, the plot shows how the numbers of these research staff in each of the Stats NZ categories have changed according to the Stats NZ R&D survey. (Note – in an earlier post, I reported on the numbers of researchers at CRIs, but there I used government sector researchers as a proxy, as not all the CRI data has been published. The data on the right is the actual CRI data kindly supplied to me by MoRST.) From the plot, we see that research staff FTEs have steadily increased at the expense of technical and support staff. The decline in support staff since the mid-1990s is particularly dramatic. This is something that has been very noticeable to me during my time as a CRI scientist.

[Figure: CRI publications per dollar]

Now let’s look at how the revenues above scale with staff FTE and bibliometric output. In the plot on the left, I give the total revenue (in 2008 $) per Researcher FTE (not research staff). This has remained relatively stable since 1994, fluctuating at around $400k per Researcher FTE. On the other hand, revenue per paper published declined sharply in the 1990s, but then stabilised at roughly $500k per paper over the last decade. Of course, a good fraction of the research conducted in the CRIs will not lead to a publication, so this number does not reflect the true cost of a publication.
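The two ratios in this plot are simple divisions, sketched below. The inputs are round illustrative numbers that I have picked to land on the approximate $400k and $500k levels quoted above; they are assumptions for the example, not the actual sector totals.

```python
def revenue_ratios(total_revenue, researcher_fte, papers):
    """Return (revenue per researcher FTE, revenue per paper)."""
    return total_revenue / researcher_fte, total_revenue / papers


# Illustrative inputs only: $600M revenue (2008 $), 1500 researcher FTEs,
# 1200 papers in the year.
per_fte, per_paper = revenue_ratios(600e6, 1500, 1200)
print(f"Revenue per researcher FTE: ${per_fte / 1e3:.0f}k")
print(f"Revenue per paper: ${per_paper / 1e3:.0f}k")
```

Note that revenue per paper divides all revenue by the paper count, which is why it overstates what a single publication costs.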

As we have seen, the CRIs’ bibliometric output has risen since their creation, and their citation impact has grown faster than that of the rest of New Zealand. It also seems that they have become much less dependent on PGSF funding since they were created, with total revenue growing by 30% while PGSF revenue remained static. Researcher FTE levels have risen, albeit at the expense of support and technical staff (although this may be typical of many businesses?), while the revenue per researcher FTE has remained static. Thus, the data suggest that the generation of revenue from non-PGSF sources has led to increases in researcher staffing levels, which have in turn lifted the bibliometric output of the CRIs. To go any further, we will need to look more closely at the performance of individual CRIs.

CRI bibliometric performance: Part II Shaun Hendy Feb 10


In a post a few weeks ago, I looked at the total published output of the CRIs from 1993. Now I want to look at the citations to CRI papers. I will use two citation measures. The first is a two-year impact factor, which is a measure that is often used to rank journals. The impact factor of a CRI in 2008, for example, is the average number of citations in 2008 for papers published by authors at that CRI in 2006 and 2007. The second measure I will use is a 5-year impact factor: the 2008 5-year impact factor is the average number of citations in 2008 to papers that were published between 2003 and 2007.
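To make the two definitions concrete, here is a small sketch of the calculation. The data layout (a dictionary of citation counts keyed by publication year and citing year) and all of the numbers are assumptions made up for this example; the real figures come from the Web of Science.

```python
def impact_factor(citations, papers, year, window=2):
    """Average citations received in `year` by papers published in the
    `window` years immediately before `year`.

    citations[(pub_year, cite_year)] -> citations received in cite_year
                                        by papers published in pub_year
    papers[pub_year]                 -> number of papers published
    """
    pub_years = range(year - window, year)
    total_cites = sum(citations.get((y, year), 0) for y in pub_years)
    total_papers = sum(papers.get(y, 0) for y in pub_years)
    if total_papers == 0:
        raise ValueError("no papers published in the window")
    return total_cites / total_papers


# Made-up counts for a single institute, for illustration only.
papers = {2003: 100, 2004: 100, 2005: 100, 2006: 100, 2007: 100}
citations = {
    (2003, 2008): 300,
    (2004, 2008): 300,
    (2005, 2008): 250,
    (2006, 2008): 200,
    (2007, 2008): 100,
}

if_2yr = impact_factor(citations, papers, 2008, window=2)  # 2006-2007 papers
if_5yr = impact_factor(citations, papers, 2008, window=5)  # 2003-2007 papers
print(if_2yr, if_5yr)
```

In this toy data the 5-year figure is higher than the 2-year figure because the older papers have had more time to be noticed and cited.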

Now, the analysis I am going to give below is somewhat naive. I should really be breaking down the citations by subject area (as pointed out by Crikey Creek’s Daniel Collins in a comment last year). This is important because rates of citations differ considerably between disciplines – unfortunately I haven’t had the time to do this, except in a few special cases such as my own Institute. Thus, differences in impact factor between Institutes will depend on the areas in which they work. Changes in that difference over time may reflect changes in focus within Institutes, rather than changes in impact of the research conducted.

Why do citation rates differ between disciplines? At least part of the difference comes from the degree of empiricism within a discipline. Medical science frequently makes use of meta-analysis, the aggregation of data from many studies, some of which may be too small to have statistical significance on their own. So if your small study suggests that smoking is a risk factor for diabetes, it will be important to cite as many other studies of smoking and diabetes as possible to give your reader context. Mathematics on the other hand relies on mathematical proof. To prove the Riemann hypothesis, you may only need to cite a handful of papers that contain results you rely on in your proof. You hardly need to cite every paper on the Riemann hypothesis that has appeared in print. Not surprisingly, journals in medical science typically have much higher impact factors than mathematics journals.

[Figure: CRI Impact vs NZ]

On to the results. I have plotted the CRI (2-year) impact factor from 1995 to 2008 (on the right) against the New Zealand impact factor as calculated from the Thomson Reuters database. Firstly, we note that both data series show large increases over this time period. However, in 1995 the CRIs trail New Zealand as a whole, whereas in 2008 the CRIs lead New Zealand. The data is sufficiently noisy, however, that one can’t assert with much confidence that the CRIs are significantly different from the rest of the country.

[Figure: CRI 5yr Impact]

However, with the 5-year impact factor, the trend seems clearer: the 5-year impact factor of the CRIs is below that of New Zealand as a whole at the end of the 1990s, but by the mid 2000s it surpasses that of the rest of the country. In other words, CRI citations per paper have grown faster than those of New Zealand as a whole. As I mentioned above, there could be a number of explanations for this. For example, I wonder if this could reflect a diversification of research activities at Universities, where disciplines with lower impact factors have started publishing more, perhaps as a result of the Performance Based Research Fund.

Unfortunately, without breaking down citations by discipline we can’t really tell whether this does reflect an increase in relative impact by CRI researchers. However, the data does suggest that this would be a worthwhile exercise: why has CRI impact surpassed that of the rest of New Zealand in the last decade?

CRI bibliometric performance: Part I Shaun Hendy Jan 26


In a post last year, I looked at New Zealand’s bibliometric productivity in the university and government research sectors using data from the SCImago bibliometric site. Over the next few weeks, I will report on some further bibliometric analyses using the Thomson Reuters Web of Science. While providing substantially the same New Zealand-wide results as SCImago, the Web of Science database also allows me to break down publication data by research institute (and by individual author if needed). Unfortunately, it is not freely accessible – I have institutional access through Victoria University of Wellington.

I’ll start this series of posts by looking at the total published outputs of the Crown Research Institutes (CRIs).  The CRIs were established in 1992 by scientists from the Department of Scientific and Industrial Research (DSIR), the research division of the then Ministry of Agriculture and Fisheries, and the New Zealand Forestry Service.  Shortly thereafter, a significant portion of the Crown funding for science became contestable through the Public Good Science Fund, open to the CRIs, universities, and businesses or other organisations conducting research and development.

At the time, the restructuring of government science into the CRIs was highly controversial. ‘Is New Zealand shooting itself in the brain?’ wrote New Scientist magazine. ‘A small country does something like this at its peril’ said John Stocker, chief executive of Australia’s main research organisation, the CSIRO*. The DSIR had given the world Marlborough Sauvignon Blanc, earthquake resistant lead-rubber bearings for building foundations and high-temperature superconductors, yet the Government of the day thought that the new Institutes would be better placed to contribute to New Zealand’s economic growth.

Almost two decades later, our new Government is wondering how the experiment went. While the government scrutinises CRI balance sheets closely, other aspects of CRI performance receive very little attention. This is surprising, since the reason the Crown owns such research institutes has nothing to do with their balance sheets at all.

[Figure: CRI total publications]

Here I will look at how the CRIs have performed bibliometrically, starting with their total published output from the year following their establishment. The figure on the left shows the number of papers in the Web of Science database published by scientists at the CRIs since 1993. It can be seen that the annual number of publications doubled from 600 in 1993 to 1200 in 1997, a level where it has remained to the present.  The increase in output from 1993 to 1997 was substantial, but how was it achieved?

[Figure: CRI productivity]

The next figure shows the total researcher FTEs in the CRI sector from 1994-2006, and the corresponding productivity (in papers per FTE) over the same time period. Researcher FTEs increased from 1996 to 2002, but have since declined by 20% from their peak in 2002. Note that the productivity of researchers, in papers per FTE, remains relatively static over the period in question. This largely reflects the New Zealand situation as a whole, where productivity has remained steady, and changes in levels of published outputs have been driven by changes in FTEs.

(Update: I received some better FTE data from Statistics NZ so the figure above was replaced on 18 March 2010. The data is similar to that shown in the original figure, but with the addition of the 2008 data, we see there was a large jump in researcher FTE from 2006 to 2008, reversing the decline since 2002.)

In my next post on the CRIs, I will look at how the number of citations of their papers has changed over time. I will then look at how the CRIs have been able to lift their researcher FTEs from 1300 in 1994 to over 1800 in 2006. After that I will move on to the Universities.

* Australia still has the CSIRO, and although some reforms have taken place since John Stocker made his comments, Australia has resisted introducing the direct competition between CSIRO and university scientists that has characterised our science system.

What’s ahead in 2010? Shaun Hendy Dec 29


2010 is shaping up to be a defining year for New Zealand’s RS&T system. We will be hearing how the Government will set its RS&T priorities and what these priorities will be. The CRI task force will be reporting back, and we will find out how the Government is going to encourage R&D in the business sector. As I discussed in a post last month, the money spent by the business sector on R&D is strongly correlated with patenting.

What will I be blogging about in 2010? I have devoted quite a bit of time this year to discussing data that we have extracted from the OECD patent database. There is still a lot of information to be mined from this database and I will be continuing to discuss this in the New Year. For instance, I will have more to say about inventor networks and how their structure changes with network size. I also want to look at some specific networks in more detail, comparing their size and structure between countries.

I will also follow up on some of my earlier posts on bibliometrics. It took me a while to get permission from Thomson Reuters to start publishing citation data, but this has now come through, so I will be looking at how the impact factors of New Zealand institutions have changed over the last 20 years. I also want to follow up with more detail on the surprisingly large co-author networks that exist within the New Zealand science community.

Of course, from time to time, I will blog on other matters that interest me throughout the year. I have been following the progress of some new types of collaborative research: mathematicians have been learning how to use the mass collaboration that blogging allows to prove theorems and solve original research problems. This is, after all, the reason that the web was created in the first place.

Collaboration, whether through blogs or other means, may be the key to New Zealand taking its own R&D to scale.

In the meantime, I hope you are all enjoying your holidays!

The University co-author network Shaun Hendy Oct 27


[Figure: Uni coauthor network]

In an earlier post I looked at the 2008 CRI co-author network. Now let’s turn to the University network. Using the Thomson Reuters Web of Science again, I found 5116 publications in 2008 with authors from New Zealand universities. In total 13930 authors contributed to these papers. The network is shown on the right.

Again, a remarkably large fraction of authors belong to the giant component. In the 2008 CRI co-author network, 2325 of the 4496 authors belonged to the largest connected component. Here, 9771 of the 13930 authors belong to the largest component – that’s a remarkable 70%.

We can make some other comparisons between the CRI and the university networks. In the university network, on average each author has 8.4 collaborators; in the CRI network, each author has 5.1 collaborators. Apparently, university authors are more collaborative.

[Figure: Degree distribution]

However, just comparing the average numbers of co-authors is misleading. I’ve graphed the distribution of co-author numbers for the universities and the CRIs on the left, i.e. the proportion of authors with certain numbers of co-authors. From the graph it’s apparent that the difference between the university and CRI networks lies in the tails of the distributions. There are a number of university authors that participate in very large collaborations. For instance, there are a dozen or so authors in the network whose only published work in 2008 was one with 343 co-authors. Big science!

It is probably not surprising that university researchers are more likely than those in a CRI to participate in very large overseas collaborations. This skews the average number of co-authors for university researchers relative to CRI researchers, making the mean number of co-authors larger.
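The effect of a heavy tail on the mean is easy to demonstrate. The sketch below computes the proportion of authors with each number of co-authors, then compares the mean and the median before and after adding a couple of authors from one very large collaboration. The co-author counts are invented for illustration (only the figure of 343 echoes the example above); they are not the 2008 network data.

```python
from collections import Counter
from statistics import mean, median


def degree_distribution(degrees):
    """Proportion of authors having each number of co-authors."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: counts[k] / total for k in sorted(counts)}


# Invented co-author counts for eight "typical" authors.
typical = [2, 3, 3, 4, 5, 5, 6, 8]
# The same group plus two authors from a single 343-co-author paper.
with_big_science = typical + [343, 343]

print(degree_distribution(with_big_science))
print("mean:  ", mean(typical), "->", mean(with_big_science))
print("median:", median(typical), "->", median(with_big_science))
```

The mean jumps by more than an order of magnitude while the median barely moves, which is why a median (or the full distribution) is the safer comparison between the two networks.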

New Zealand’s RS&T priorities Shaun Hendy Oct 23


MoRST have just released a discussion document which introduces a new structure for RS&T investment, aimed at allowing for greater strategic priority setting. Submissions are due by November 18th. Get it here.

What do you think of the feedback document and where would you put our rather modest science dollar?

[Figure: World bibliometric output]

While you are pondering this, here is where the rest of the world puts its intellectual grunt: the pie chart below shows the proportion of papers published by subject area over the last ten years. The physical and medical sciences account for two thirds of the world’s publications.

In contrast, here is where New Zealand puts its efforts:

[Figure: NZ bibliometric output]

Setting priorities is clearly nothing new for New Zealand – as the charts show, our science system strongly emphasises agricultural and environmental sciences at the expense of physical sciences. Have we got the balance right?

The CRI co-author network Shaun Hendy Oct 19


[Figure: CRI coauthor network]

To what extent do scientists at Crown Research Institutes (CRIs) collaborate? Using the Thomson Reuters Web of Science, I have constructed the CRI co-author network for 2008. As best I can determine, the Web of Science database contains 1271 papers from 2008 with CRI authors. In total, 4496 authors contributed to this set of papers – not all these authors are from CRIs of course, but they have all co-authored a paper with someone from a CRI. The network is shown on the left: the green dots are authors, with blue links between pairs of authors indicating co-authorship on at least one paper.

What surprises me is the extent of the largest set of authors that can be connected to each other by co-authorship. This largest connected component can be seen sitting in the centre of the 2008 network diagram, containing 2325 of the 4496 authors (52%). It contains authors from many of the CRIs (including me and a number of my colleagues at IRL) and from a number of Universities, both in New Zealand (including many from the MacDiarmid Institute) and overseas. The next largest connected component contains only 31 authors.

[Figure: Connected component]

If you look at the size of the largest connected component in the CRI co-author networks each year, it is largest in 2008. Just after the CRIs were established, in 1994, the largest component contained only 195 authors, occupying only 12% of the network. One reason for the growth of the largest component is that since 1994, the average number of co-authors each author has in a given year has risen from two to five. In other words, CRI scientists are collaborating more extensively in 2008 than they were in 1994.
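For anyone curious how a co-author network and its largest connected component can be computed, here is a minimal sketch in plain Python. It assumes each paper is available as a list of author names, which glosses over the hard part of real bibliometric data (deciding when two name strings are the same person). The toy author lists are invented, and this is not necessarily the code used to produce the figures above.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_graph(papers):
    """Build an undirected graph: author -> set of co-authors."""
    graph = defaultdict(set)
    for authors in papers:
        unique = set(authors)
        for author in unique:
            graph[author]  # make sure sole authors appear as nodes
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return the components as sets of authors, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:  # depth-first search from `start`
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Invented author lists, one per paper.
papers = [["A", "B", "C"], ["C", "D"], ["E", "F"], ["G"]]
graph = coauthor_graph(papers)
components = connected_components(graph)

largest_share = len(components[0]) / len(graph)
mean_coauthors = sum(len(nbrs) for nbrs in graph.values()) / len(graph)
print(f"largest component: {len(components[0])} of {len(graph)} authors")
print(f"share: {largest_share:.0%}, mean co-authors: {mean_coauthors:.2f}")
```

Here author C links the first two papers into one component of four authors, which is the same mechanism by which a few well-connected scientists can join many small groups into one giant component.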

New Zealand’s recent bibliometric productivity Shaun Hendy Oct 02


As discussed in an earlier post, there are a number of sources for bibliometric data. SCImago Journal and Country Rank is a freely accessible bibliometric analysis site developed by a Spanish research group using Elsevier’s Scopus bibliometric database, which holds country and journal summary information. For example, there is a New Zealand summary statistic page that has data on publications each year since 1996.

[Figure: NZ total pubs]

The first thing that leaps out at you on the NZ summary page is the large increase in publications per year evident from 2003, as I have replotted on the right. This increase is substantial: NZ has gone from publishing 5000 scientific articles per year to more than 8000.

Actually, 2003 was the year in which the first Performance-Based Research Fund (PBRF) assessment round was held. This was part of a change in the way university funding was allocated, from a system where funding levels were set largely by full-time student numbers, to a system where levels are partially determined by research performance. The performance measures used were based on the quantity and quality of research performed by individual researchers, with assessments taking place in 2003 and 2006.

[Figure: NZ FTE productivity]

It is tempting to attribute the growth in annual publications to the PBRF exercise, with researchers responding to this assessment by increasing their output. However, I mentioned in an earlier post that Statistics NZ provides an estimate of the number of full-time equivalent (FTE) researchers in the university, government and business sectors every two years. This allows us to calculate the number of papers per FTE researcher, which is a measure of researcher productivity. On the left I’ve plotted the number of university and government FTE researchers (including post-grads), and the productivity in papers per FTE researcher from 1996-2006.

This shows that while total publication output has increased significantly, so has the number of FTE researchers, leaving productivity in papers per FTE surprisingly static. Most of this increase in FTE researchers comes from a large expansion in post-graduate student numbers. In many disciplines, we are now training more post-graduate students than ever before. This is good news (especially given the discussion here), but as I’ll discuss in a later post, this growth in post-grad numbers is not uniform across the disciplines.
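The argument rests on a simple identity: papers = FTE researchers × papers per FTE, so growth in output splits into growth in headcount times growth in productivity. A quick sketch follows. The paper counts are the round figures quoted earlier in this post, while the FTE numbers are hypothetical values chosen so that productivity comes out flat; they are not the Statistics NZ estimates.

```python
def papers_per_fte(papers, fte):
    """Productivity measured as papers per FTE researcher."""
    return papers / fte


# Paper counts: round figures from the post. FTEs: hypothetical.
papers_before, fte_before = 5000, 10000
papers_after, fte_after = 8000, 16000

productivity_before = papers_per_fte(papers_before, fte_before)
productivity_after = papers_per_fte(papers_after, fte_after)

output_growth = papers_after / papers_before
fte_growth = fte_after / fte_before
productivity_growth = productivity_after / productivity_before

# output_growth equals fte_growth * productivity_growth by construction.
print(f"output x{output_growth:.2f} = "
      f"FTE x{fte_growth:.2f} * productivity x{productivity_growth:.2f}")
```

If the FTE count had stayed flat instead, the same rise in papers would show up entirely as a productivity gain, which is the pattern one would expect if the PBRF alone were driving output.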

Thus on the face of it, the introduction of the Performance-Based Research Fund has not led to an increase in bibliometric productivity. However, there are claims that the PBRF has led to an increase in research quality, as measured by citations. One way to test this is to compare university citations with those from the CRIs – this will be the subject of a later post.

A measure of science Shaun Hendy Sep 26


As a theoretical physicist and applied mathematician, I’m interested in using numbers to describe all sorts of phenomena. And as a researcher in the MacDiarmid Institute, I’m also interested in innovation. So for me, it’s natural to try to study innovation quantitatively. One of the goals of this blog will be to look at science innovation using tools developed to study complex systems, drawing on quantitative data sources and statistics.

What is already out there? In a New Zealand context, MoRST publishes an RS&T scorecard and also commissions a national bibliometric report from time to time – there is one due out this year. The Ministry of Education also recently published a bibliometric analysis of the Universities in order to assess the impact of the Performance-Based Research Fund. Similarly, the Marsden Fund commissioned a bibliometric study to look at the impact of scientific papers that were produced as a result of its funding.

A bibliometric study counts or analyses the scientific journal articles that scientists are publishing. It can give information on the subject areas scientists are working in, and can provide an assessment of the impact that those scientists are having in their field. One way to assess impact is through citations – i.e. looking at where and how often a particular journal article is being referenced by later journal articles. The value of bibliometric studies is controversial, particularly when they are used to rank individual scientists who are competing for funding. Nonetheless, as journals are the most important forum for communication of scientific ideas and results, bibliometrics is here to stay.

Another measure that is frequently used is the number of patents produced by a country. Patents are principally produced by researchers in the private sector, so they complement scientific publications, which are mainly authored by researchers in the public sector. MoRST’s RS&T scorecard has some interesting information on the patents produced by New Zealand. Further information can be obtained from national patent offices:  New Zealand’s Intellectual Property Office has a searchable database of patents. Counting patents has its drawbacks too. Assessing the value of a patent is a difficult task.

The OECD is another organisation that monitors scientific performance. Its studies are interesting because they put New Zealand’s scientific output into an international context. The OECD reviewed the New Zealand innovation system in 2007 – in this document you will find a large amount of financial data:  business expenditure in research and development, dollars spent on basic versus targeted research, etc. In fact, much of the quantitative discussion on innovation focuses on the dollars spent.

Another piece of the puzzle is provided by Statistics New Zealand. It publishes a report every two years on the number of scientists and researchers in New Zealand. The Ministry of Education also tracks the number and subject areas of advanced degrees (such as PhDs) granted at Universities in New Zealand.

What will I add to these sources? There is a wealth of data available, but it is held in diverse locations. One thing I’ll try to do is pull some of this information together. For example, I’ll look at how the number of scientific papers and the dollars per researcher have changed over the past 20 years. I’ll also try to use new tools for looking at the data, particularly some of the methods that have been developed recently for studying complex systems. New Zealand’s innovation system is, if nothing else, complex.