World University Rankings subject tables: Robust, transparent and sophisticated

16 September 2010

Phil Baty explains how in-depth consultation with the global academic community has produced the most exact and relevant world rankings yet devised

It is, of course, rather crude to reduce universities to a single number.

We are aware that higher education institutions are extraordinarily complex organisations. They do many wonderful, life-changing and paradigm-shifting things that cannot be measured. Data on some of their most valuable endeavours either do not exist or cannot be meaningfully compared on a global scale, and many of the proxies commonly used are less than satisfactory.

The 2010-11 Times Higher Education World University Rankings have been compiled with these limitations very much in mind.

The tables' methodology was determined only after 10 months of detailed consultation with leading experts in global higher education: more than 50 senior figures across every continent provided extensive feedback on our plans, amounting to more than 250 pages of commentary. The wider university community also had its say via more than 300 postings on our website.

So, despite the inherent limitations, these tables represent the most comprehensive and sophisticated exercise ever undertaken to provide transparent, rigorous and genuinely meaningful global-performance comparisons for use by university faculty, strategic leaders, policymakers and prospective students.

The aim over the past 10 months has been to create a genuinely useful tool for the global higher education community and beyond, not just an annual headline-driven curiosity.

So what is the result of perhaps the largest consultation exercise ever undertaken to produce world university rankings?

The tables use 13 separate indicators (up from just six under our old system) designed to capture a broad range of activities, from teaching and research to knowledge transfer.

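These elements are brought together into five categories: teaching (the learning environment); research (volume, income and reputation); citations (research influence); industry income (innovation); and international mix (staff and students).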

The weightings for the five categories, and the 13 indicators within them, vary considerably. High weightings are given where consultation has shown unmistakable enthusiasm for the indicator as a valuable proxy and clear confidence in the data we have. Lower weightings are employed where confidence in the data or the usefulness of the indicator is less pronounced.

The future

This is the first year of a highly ambitious new rankings system. In all such systems, compromises must be made, proxies must be applied and data-collection issues will arise.

However, we are confident that by creating our methodology in open and detailed consultation over the past 10 months, we have produced a robust and evidence-based ranking that paints a realistic picture of the global landscape.

But the tables are just the start of the global conversation — please, come join the debate.


Rankings: the methodology

Weighting scheme for rankings scores

To calculate the overall ranking score, "Z-scores" were created for all datasets.

This standardises the different data types on a common scale and allows fair comparisons between the different types of data — which is essential when combining diverse information into a single ranking.

Each data point is given a score based on its distance from the average (mean) of the entire dataset, measured in units of the dataset's standard deviation. The Z-score is then turned into a "cumulative probability score" to give the final totals.

Simon Pratt, project manager for institutional research at Thomson Reuters, who analysed the data for the World University Rankings, says: "A cumulative probability score indicates for any real value what the probability is that a normally distributed random value will fall below that point."

For example, if University X has a score of 98, then a random institution from the same distribution of data will fall below this university 98 per cent of the time.
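The two steps described above can be sketched in a few lines of Python. This is an illustration only, not Thomson Reuters' implementation: the raw values are invented, the function names are hypothetical, and the choice of the population standard deviation and a 0-100 scale are assumptions.

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Distance of each value from the mean, in units of standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return [(v - mu) / sigma for v in values]


def cumulative_probability_score(z):
    """Standard normal cumulative probability at z, scaled to 0-100."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Invented raw values for one indicator across five institutions.
raw = [12.0, 15.0, 18.0, 21.0, 24.0]
scores = [cumulative_probability_score(z) for z in z_scores(raw)]
```

An institution sitting exactly on the mean scores 50 on this scale, and a score of 98 corresponds to a Z-score of a little over 2.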

Exclusions

Universities were excluded from the World University Rankings tables if they do not teach undergraduates; if their research output amounts to fewer than 50 articles per year; or if they teach only a single narrow subject.
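As a rough sketch, the three exclusion rules amount to a simple eligibility test. The record type and field names below are hypothetical; how "a single narrow subject" is judged in practice is not specified here, so it is represented as a plain flag.

```python
from dataclasses import dataclass


@dataclass
class Institution:
    # All field names are illustrative assumptions.
    name: str
    teaches_undergraduates: bool
    articles_per_year: int
    single_narrow_subject: bool


def is_eligible(inst: Institution) -> bool:
    """Apply the three exclusion rules stated in the text."""
    return (
        inst.teaches_undergraduates
        and inst.articles_per_year >= 50  # fewer than 50 articles is excluded
        and not inst.single_narrow_subject
    )
```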

Data sign-off

Each institution listed in these rankings opted in to the exercise and verified its institutional data. Where institutions did not provide data in a particular area (which occurred in only some very low-weighted areas), the column has been left blank.

A very important principle of the new Times Higher Education World University Rankings, in the first year of a brand new system, is that every university we list has actively cooperated with the exercise and signed off its data. The rankings are designed to be a useful and rigorous tool for the global higher education community, and we are delighted that the vast majority of universities around the world have embraced this exercise and actively participated. Unfortunately, some institutions did not respond to repeated invitations from Thomson Reuters, by e-mail and telephone, to participate in the Global Institutional Profiles Project and the World University Rankings, and therefore could not be included. Others declined to participate.

Reputational surveys

A worldwide Academic Reputation Survey was carried out during spring 2010. Some 13,388 responses were gathered across all regions and subject areas. The results make up a total of 34.5 per cent of the overall ranking score (15 per cent for teaching and 19.5 per cent for research).


Weighting scheme for ranking scores

Industry income — innovation

This category is designed to cover an institution's knowledge-transfer activity. It is determined by just a single indicator: a simple figure giving an institution's research income from industry scaled against the number of academic staff.

We plan to supplement this category with additional indicators in the coming years, but at the moment we feel that this is the best available proxy for high-quality knowledge transfer. It suggests the extent to which users are prepared to pay for research and a university's ability to attract funding in the commercial marketplace — which are significant indicators of quality.

However, because the figures provided by institutions for this indicator were patchy, we have given the category a relatively low weighting for the 2010-11 tables: it is worth just 2.5 per cent of the overall ranking score.

Teaching — the learning environment

This broad category employs five separate indicators designed to provide a clear sense of the teaching and learning environment of each institution, from both the student and academic perspective.

The flagship indicator for this category uses the results of a reputational survey on teaching.

Thomson Reuters carried out its Academic Reputation Survey — a worldwide poll of experienced scholars — in spring 2010. It examined the perceived prestige of institutions in both research and teaching. There were 13,388 responses, statistically representative of global higher education's geographical and subject mix.

The results of the survey with regard to teaching make up 50 per cent of the score in the broad teaching environment category, and 15 per cent of the overall rankings score.

This broad category also measures the number of undergraduates admitted by an institution scaled against the number of academic staff. Essentially a form of staff-to-student ratio, this measure is employed as a proxy for teaching quality — suggesting that where there is a low ratio of students to staff, the former will get the personal attention they require from the institution's faculty.

As this measure serves as only a crude proxy, and our consultation exposed some concerns about its use, it receives a relatively low weighting: it is worth 15 per cent of the teaching category and just 4.5 per cent of the overall ranking scores.

This contrasts with the 20 per cent weighting the measure was given in our previous rankings.

The teaching category also examines the ratio of PhD to bachelor's degrees awarded by each institution. We believe that institutions with a high density of research students are more knowledge-intensive, and that the presence of an active postgraduate community is a marker of a research-led teaching environment valued by undergraduates and postgraduates alike.

The PhD-bachelor's ratio receives a 7.5 per cent weighting in its category and is worth 2.25 per cent of the overall ranking scores.

The teaching category also uses data on the number of PhDs awarded by an institution, scaled against its size as measured by the number of academic staff.

As well as giving a sense of how committed an institution is to nurturing the next generation of academics, a high proportion of postgraduate research students also suggests teaching at the highest level that is attractive to graduates and good at developing them.

Undergraduate students also tend to value working in a rich environment that includes postgraduates. Worth 20 per cent of the teaching environment category, this indicator makes up 6 per cent of the overall score.

The final indicator in this category is a simple measure of institutional income scaled against academic staff numbers.

This figure, adjusted for purchasing-power parity so that all nations compete on a level playing field, indicates the general status of an institution and gives a broad sense of the general infrastructure and facilities available to students and staff.

This measure is worth 7.5 per cent of the category and 2.25 per cent overall.
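The arithmetic behind the teaching figures can be checked with a short sketch. The category's overall weight of 30 per cent is not stated directly; it is inferred here from the survey figures (15 per cent overall being 50 per cent of the category). The indicator labels are shorthand, not official names.

```python
# Inferred, not quoted: 15 per cent overall / 50 per cent of category = 30.
TEACHING_CATEGORY_WEIGHT = 30.0

# Share of the teaching category given to each indicator, in per cent.
teaching_shares = {
    "reputational survey (teaching)": 50.0,
    "undergraduates admitted per academic staff": 15.0,
    "PhD to bachelor's awards ratio": 7.5,
    "PhDs awarded per academic staff": 20.0,
    "institutional income per academic staff": 7.5,
}

# Contribution of each indicator to the overall ranking score, in per cent.
teaching_overall = {
    name: share * TEACHING_CATEGORY_WEIGHT / 100.0
    for name, share in teaching_shares.items()
}
```

The shares sum to 100 per cent of the category, and the resulting overall contributions (15, 4.5, 2.25, 6 and 2.25) match the figures quoted for each indicator.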

Citations — research influence

A university's research influence, as measured by the number of times its published work is cited by academics, is the largest of the broad rankings categories, worth 32.5 per cent of the overall score, or just under a third.

This weighting reflects the relatively high level of confidence the global academic community has in the indicator as a proxy for research quality.

The use of citations to indicate quality is controversial — their use in distributing more than £1.5 billion a year in UK research funding under the forthcoming research excellence framework, for example, has been dramatically scaled back after lengthy consultation.

Nevertheless, there is clear evidence of a strong correlation between citation counts and research performance.

The data are drawn from the 12,000 academic journals indexed by Thomson Reuters' Web of Science database. The figures are collected for every university, with data aggregated over a five-year period from 2004 to 2008 (there has been insufficient time for the accumulation of such data for articles published in 2009 and 2010).

Unlike the approach employed by the old rankings system, all the citations impact data are normalised to reflect variations in citation volume between different subject areas. This means that institutions with high levels of research activity in subjects with traditionally very high citation counts will no longer gain an unfair advantage.

Research — volume, income and reputation

As with the teaching category, the most prominent indicator in research volume, income and reputation is based on the results of our reputational survey.

Consultation with our expert advisers suggested that confidence in this indicator was higher than in the teaching reputational survey, as academics are likely to be more knowledgeable about the reputation of research departments in their specialist fields. For this reason, it is given a higher weighting: it is worth 65 per cent here and 19.5 per cent of the overall score.

Some 17.5 per cent of this category — 5.25 per cent of the overall ranking — is determined by a university's research income, scaled against staff numbers and normalised for purchasing-power parity. This is a controversial measure, as it can be influenced by national policy and economic circumstances. But research income is crucial to the development of world-class research, and because much of it is subject to competition and judged by peer review, our experts suggested it was a valid measure.
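One plausible reading of "scaled against staff numbers and normalised for purchasing-power parity" is sketched below. The function name, the form of the conversion factor (local currency units per international dollar) and the figures in the example are all assumptions for illustration; the exact procedure used for the rankings is not described here.

```python
def income_per_staff_ppp(income_local, ppp_rate, academic_staff):
    """Income per academic staff member in PPP-adjusted terms.

    income_local   -- income in local currency
    ppp_rate       -- local currency units per international dollar (assumed form)
    academic_staff -- number of academic staff
    """
    if ppp_rate <= 0 or academic_staff <= 0:
        raise ValueError("ppp_rate and academic_staff must be positive")
    return income_local / ppp_rate / academic_staff


# Invented example: 70m in local currency, PPP rate 0.7, 1,000 academic staff.
example = income_per_staff_ppp(70_000_000, 0.7, 1_000)
```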

The research environment category also includes a simple measure of research volume scaled against staff numbers. We count the number of papers published in the academic journals indexed by Thomson Reuters per staff member, giving an idea of an institution's ability to get papers published in quality peer-reviewed journals. This indicator is worth 15 per cent of the category and 4.5 per cent overall.

Some 2.5 per cent of the category — worth just 0.75 per cent overall — is a measure of public research income against an institution's total research income. This has a low weighting to reflect concerns about the comparability of self-reported data between countries.

International mix — staff and students

Our final category looks at diversity on campus — a sign of how global an institution is in its outlook.

The ability of a university to attract the very best staff from across the world is key to global success. So in this category we give a 60 per cent weighting to the ratio of international to domestic staff, making up 3 per cent of the overall score.

The market for academic and administrative jobs is international in scope, and this indicator suggests global competitiveness. However, as it is a relatively crude proxy, and as geographical considerations can influence performance, the weighting has been reduced from the 5 per cent used under our old rankings system.

The other indicator in this category is based on the ratio of international to domestic students. Again, this is a sign of an institution's global competitiveness and its commitment to globalisation. As with the staff indicator, our consultation revealed concerns about the inability to gauge the quality of students and the problems caused by geography and tuition-fee regimes. So the measure receives a 40 per cent weighting and is worth 2 per cent of the final score.
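Putting the categories together, an overall score can be sketched as a weighted sum. Two assumptions are made here: the category weights for teaching, research and international mix are inferred from the indicator figures quoted above (for example, 3 per cent overall being 60 per cent of the international category), and the citations weight is taken as the remainder once the other four categories are accounted for. The function name and the category scores in the example are hypothetical.

```python
# Overall weights in per cent; all but industry income are inferred from the text.
CATEGORY_WEIGHTS = {
    "teaching": 30.0,
    "research": 30.0,
    "industry income": 2.5,
    "international mix": 5.0,
}
# Citations takes whatever remains of the 100 per cent (an inference).
CATEGORY_WEIGHTS["citations"] = 100.0 - sum(CATEGORY_WEIGHTS.values())


def overall_score(category_scores):
    """Weighted sum of category scores, each on a 0-100 scale."""
    return sum(
        weight * category_scores[name]
        for name, weight in CATEGORY_WEIGHTS.items()
    ) / 100.0


# Invented category scores for a single institution.
example = overall_score({
    "teaching": 80.0,
    "research": 90.0,
    "citations": 70.0,
    "industry income": 40.0,
    "international mix": 60.0,
})
```

With these weights the citations category comes out at 32.5 per cent, consistent with its description as just under a third of the overall score.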


Citation impact: it's all relative

Citations are widely recognised as a strong indicator of the significance and relevance — that is, the impact — of a piece of research.

However, citation data must be used with care as citation rates can vary between subjects and time periods.

For example, papers in the life sciences tend to be cited more frequently than those published in the social sciences.

The rankings this year use normalised citation impact, where the citations to each paper are compared with the average number of citations received by all papers published in the same field and year. So a paper with a relative citation impact of 2.0 is cited twice as frequently as the average for similar papers.
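A minimal sketch of the idea follows, using a handful of invented papers. The baseline here is computed from the sample itself purely for illustration; in the rankings it would come from all papers in the same field and year in the database. The function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Invented papers: (field, year, citations received).
papers = [
    ("life sciences", 2006, 30),
    ("life sciences", 2006, 10),
    ("social sciences", 2006, 8),
    ("social sciences", 2006, 2),
    ("social sciences", 2006, 2),
]


def field_year_baselines(papers):
    """Average citations per paper for each (field, year) pair."""
    groups = defaultdict(list)
    for field, year, cites in papers:
        groups[(field, year)].append(cites)
    return {key: mean(values) for key, values in groups.items()}


def relative_citation_impact(paper, baselines):
    """Citations to one paper divided by its field-and-year average."""
    field, year, cites = paper
    return cites / baselines[(field, year)]


baselines = field_year_baselines(papers)
impacts = [relative_citation_impact(p, baselines) for p in papers]
```

In this toy data the social sciences paper with eight citations has a relative impact of 2.0, higher than the life sciences paper with 30 citations (1.5), because each is judged against the norm for its own field.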

The data were extracted from the Thomson Reuters resource known as Web of Science, the largest and most comprehensive database of research citations available.

Its authoritative and multidisciplinary content covers more than 11,600 of the highest-impact journals worldwide. The benchmarking exercise is carried out on an exact level across 251 subject areas for each year in the period 2004 to 2008.

For institutions that produce few papers, the relative citation impact may be significantly influenced by one or two highly cited papers and may therefore not accurately reflect their typical performance. For this reason, institutions publishing fewer than 50 papers a year have been excluded from the rankings.
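The sensitivity of small producers to a single highly cited paper can be illustrated with invented numbers. The sketch assumes that an institution's figure is the simple average of its papers' relative citation impacts, which is an assumption made for illustration rather than a documented detail of the method.

```python
from statistics import mean


def mean_impact(n_papers, outlier_impact=50.0, typical_impact=1.0):
    """Average relative impact when one paper is an extreme outlier.

    All other papers are assumed to sit exactly at the field average (1.0).
    """
    impacts = [typical_impact] * (n_papers - 1) + [outlier_impact]
    return mean(impacts)


small_producer = mean_impact(10)    # 10 papers, one of them an outlier
large_producer = mean_impact(500)   # 500 papers, one of them an outlier
```

Under these assumptions the same outlier lifts the ten-paper institution to an average of 5.9, but the 500-paper institution only to about 1.1.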

There are occasions where a groundbreaking academic paper is so influential as to drive the citation counts to extreme levels — receiving thousands of citations. An institution that contributes to one of these papers will receive a significant and noticeable boost to its citation impact, and this reflects such institutions' contribution to globally significant research projects.

Simon Pratt is project manager, institutional research, Thomson Reuters