Dr Spendlittle and the Pushmi-pullyu: a terrible tale of research evaluation
Quality metrics offer a cheap technological fix that heads off in two directions at once, at the expense of accurate assessment, argues Claire Donovan.
I recently edited a special journal issue on the future of national research evaluation schemes and was struck by how the UK's dogged pursuit of simple quality metrics goes against international expert opinion.
The proposed research excellence framework relies on a simple citation metric to inform funding decisions and dispenses with peer review of scientific research. After intense lobbying, light-touch peer judgement will be retained for the social sciences, humanities, arts, mathematics and statistics, for which quality metrics are apparently being sought.
Yet there is an elephant in the room: it is well known in the scientometrics community and science policy circles that what we call quality metrics, such as citation data or research income generated, simply do not measure research quality. Best practice uses quantitative indicators sensitive to varying disciplinary norms of scientific communication and practice. Crucially, this combines a variety of metrics with expert peer judgement as a check and balance on each other. As one of my colleagues put it, metrics are "a trigger to the recognition of anomalies".
There is a gathering international momentum towards combining "quality" assessments of scientific excellence with evaluating the wider societal benefits or "impact" of research, which the REF does not clearly address.
So while international research evaluation practice is undergoing fundamental reorientation, UK policy is retrograde and divorced from innovation and so looks set to keep the bathwater and throw out the baby.
But why is there such a gulf between expert opinion and the current preferred REF form? One reason is a detachment from advances in scientometrics and a lack of policy learning from research evaluation around the world.
Another reason is that the politics of designing research evaluation exercises resembles the Pushmi-pullyu of Dr Dolittle fame, a two-headed llama that attempts to travel in opposite directions at once.
The concerns of Government and academic interests interact to produce a variety of tensions. A push towards external audit is offset by a pull towards internal peer-based appraisal. A push towards broader relevance is met with a pull towards scientific autonomy. A push towards business interests is counterbalanced by a pull towards broader public benefits. But what kind of beast are we left with?
There has been a great deal of consultation with the higher education sector about the REF's shape, with the latest round due to close on 14 February. So far this process has led to a quantum based on research income being ditched and to peer review being retained for some subjects.
Yet I wonder whether the final form of the REF will have clear direction or remain a confused Pushmi-pullyu. There is no clear logic connecting the broad aims of the UK's public policy to the raison d'être of publicly funded research, and in turn to how that research is best accounted for. Am I alone in thinking the tail has been wagging the dog, and cheap technological solutions have been driving the policy?
A fundamental problem is the mania for metrics. It is assumed that when compared with peer review, metrics are "scientific": objective, independent, simple, less time-consuming. But quality metrics are as infused with human values as peer review. A reliance on, for example, citations based on journal publishing alone creates a nascent scientism that devalues not only the social sciences, humanities and arts, but also core science disciplines. This produces circular metrics that reward an imagined hierarchy of science at the expense of other research fields.
A similar fate has befallen impact metrics, which traditionally seek economic returns from research in the form of funding attracted from business and industry, patents, commercialisation and spin-off companies created. These measures reveal low-level benefits from research and privilege private over public interests.
The Higher Education Funding Council for England was allocated £60 million for 2007-08 to distribute on the basis of maximising the economic impact of research, or "the relative amount of research universities undertake with business". The REF is not aflame with the desire to seek out research "impact" beyond this simple economic rationalisation.
Yet the latest turn in impact evaluation is towards contextual qualitative frameworks. This connects with the idea of "triple bottom-line accounting" to find the social, economic and environmental public value of research. In this light, traditional quantitative impact metrics are increasingly becoming disconnected from science policy.
Qualitative, contextual approaches are essential to both "quality" and "impact" evaluation, which can helpfully be informed by appropriate quantitative metrics. The future of research evaluation is qualitative, particularly if governments wish to connect research policy to public value. This entails more messy and time-consuming processes that capture complexity and diversity and make visible the academic and public value of research in all fields.
While simple quality or impact metrics are relatively easy to collect, the REF must decide between what is easy and what is right, not only for the higher education sector but for the accountability of science to society.
Claire Donovan is a research fellow in the Research Evaluation and Policy Project at the Australian National University, and is currently a visiting fellow at the Science and Technology Policy Research Unit, University of Sussex.
The special edition of Science and Public Policy (vol. 34 no. 8) Future Pathways for Science Policy and Research Assessment: Metrics vs. Peer Review, Quality vs. Impact is available at www.ingentaconnect.com/content/beech/spp.