PLoS One. 2015 Aug 19;10(8):e0134794.

doi: 10.1371/journal.pone.0134794. eCollection 2015.

The Pagerank-Index: Going beyond Citation Counts in Quantifying Scientific Impact of Researchers

Upul Senanayake¹, Mahendra Piraveenan¹, Albert Zomaya²

Affiliations

¹ Centre for Complex Systems Research, Faculty of Engineering and IT, The University of Sydney, NSW 2006, Australia.
² Centre for Distributed and High Performance Computing, School of Information Technologies, The University of Sydney, NSW 2006, Australia.

PMID: 26288312
PMCID: PMC4545754
DOI: 10.1371/journal.pone.0134794

Upul Senanayake et al. PLoS One. 2015.

Abstract

Quantifying and comparing the scientific output of researchers has become critical for governments, funding agencies and universities. Comparison by reputation and direct assessment of contributions to the field is no longer possible, as the number of scientists increases and traditional definitions of scientific fields become blurred. The h-index is often used for comparing scientists, but has several well-documented shortcomings. In this paper, we introduce a new index for measuring and comparing the publication records of scientists: the pagerank-index (symbolised as π). The index uses a version of the pagerank algorithm and the citation networks of papers in its computation, and is fundamentally different from the existing variants of the h-index because it considers not only the number of citations but also the actual impact of each citation. We adopt two approaches to demonstrate the utility of the new index. Firstly, we use a simulation model of a community of authors, in which we create various 'groups' of authors that differ from each other in inherent publication habits, to show that the pagerank-index is fairer than the existing indices in three distinct scenarios: (i) when authors try to 'massage' their index by publishing papers in low-quality outlets primarily to self-cite other papers; (ii) when authors collaborate in large groups in order to obtain more authorships; (iii) when authors spend most of their time producing genuine but low-quality publications that would massage their index. Secondly, we undertake two real-world case studies: (i) the evolving author community of quantum game theory, as defined by Google Scholar; (ii) a snapshot of the high energy physics (HEP) theory research community in arXiv. In both case studies, we find that the list of top authors varies significantly depending on whether the h-index or the pagerank-index is used for comparison. We show that in both cases, authors who have collaborated in large groups and/or published less impactful papers tend to be comparatively favoured by the h-index, whereas the pagerank-index highlights authors who have made a relatively small number of definitive contributions, or written papers which served to highlight the link between diverse disciplines, or typically worked in smaller groups. Thus, we argue that the pagerank-index is an inherently fairer and more nuanced metric for quantifying the publication records of scientists than existing measures.

Conflict of interest statement

Competing Interests: Since all authors are professional scientists, a fair research productivity index for scientists, such as the pagerank-index presented in the paper, is in the common interest of all authors. This does not alter the authors’ adherence to PLOS ONE policies on sharing data and materials.

Figures

**Fig 1. A citation network of documents with differing impacts.**
Document ‘A’ is a document with seemingly high impact: therefore citation from document A to document X could be weighed by the ‘impact’ of document A. However, it is clear that document A receives its high impact status from documents P, Q, R and S, which are themselves low impact documents. The documents P, Q, R and S could have been deliberately created to give more credibility to A. Similarly, document B which is also a high impact document receives its high impact status from low impact documents. Therefore, the citation counts of documents cannot be directly used to weigh the citations, since these weights themselves could be manipulated. A more nuanced approach is therefore necessary.

**Fig 2. The process of computing pagerank-index.**
Stage I involves running pagerank algorithm on the citation network and obtaining a pagerank value for each paper. Stage II involves assigning a suitably weighted proportion of each pagerank value to the authors of the corresponding paper. Stage III involves summing all pagerank value ‘shares’ an author has obtained, comparing it with other authors in the community and assigning a percentile value to that author accordingly. The considered community could be the world scientific community at large, or a subset thereof.
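The three stages described in the Fig 2 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses textbook power-iteration pagerank with a damping factor of 0.85, splits each paper's value equally among its co-authors, and defines the percentile as the share of authors whose total is less than or equal to the author's own. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def pagerank(citations, damping=0.85, iterations=100):
    """Textbook power-iteration pagerank on a citation graph.

    `citations` maps each paper to the list of papers it cites.
    """
    papers = set(citations)
    for cited_list in citations.values():
        papers.update(cited_list)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Papers that cite nothing spread their rank evenly over all papers.
        dangling = sum(rank[p] for p in papers if not citations.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in papers}
        for citing, cited_list in citations.items():
            for cited in cited_list:
                new_rank[cited] += damping * rank[citing] / len(cited_list)
        rank = new_rank
    return rank


def pagerank_index(citations, authors_of, damping=0.85):
    """Percentile score per author; every paper in `authors_of` must be in the graph."""
    # Stage I: a pagerank value for every paper.
    paper_rank = pagerank(citations, damping)
    # Stage II: share each paper's value among its authors (equal split is an assumption).
    totals = defaultdict(float)
    for paper, authors in authors_of.items():
        for author in authors:
            totals[author] += paper_rank[paper] / len(authors)
    # Stage III: percentile of each author's total within the community.
    n = len(totals)
    return {
        author: 100.0 * sum(1 for t in totals.values() if t <= total) / n
        for author, total in totals.items()
    }


# Toy community: paper P cites A; A and B both cite X.
citations = {"X": [], "A": ["X"], "B": ["X"], "P": ["A"]}
authors_of = {"X": ["alice"], "A": ["bob", "carol"], "B": ["carol"], "P": ["dave"]}
index = pagerank_index(citations, authors_of)
print(index)
```

In this toy community the sole author of the most-cited paper ends up at the top percentile, while a co-author who only holds half of a modestly ranked paper ends up at the bottom.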

**Fig 3. Spread of the h-index for each manipulative and non-manipulative author (as absolute values) in the first simulation scenario.**
For authors of similar seniority, the ‘manipulative’ author group has a clear advantage.

**Fig 4. Spread of the pagerank-index for each manipulative and non-manipulative author (as absolute values) in the first simulation scenario.**
Neither group of authors has a clear advantage here.

**Fig 5. Variation of average h-index and pagerank-index for non-manipulative and manipulative authors at each timestep for simulation scenario 1.**
The difference is much smaller between the two groups when pagerank-index is considered.

**Fig 6. Variation of h-index and pagerank-index for highest ranking non-manipulative and manipulative authors at each timestep for simulation scenario 1.**

**Fig 7. The spread of h-index for collaborative and non-collaborative authors (as absolute values) in scenario 2.**
For authors of a similar level of seniority (as indicated by their IDs), the ‘collaborative’ authors have a clear advantage.

**Fig 8. The spread of pagerank-index for collaborative and non-collaborative authors (as absolute values) in scenario 2.**
Neither group of authors has a clear advantage over the other.

**Fig 9. Variation of average h-index and pagerank-index for collaborative and non-collaborative authors at each timestep in simulation scenario 2.**
The difference between groups is much smaller when pagerank-index is used.

**Fig 10. Variation of h-index and pagerank-index for highest ranking collaborative and non-collaborative authors at each timestep in simulation scenario 2.**
The difference between groups is much smaller when pagerank-index is used.

**Fig 11. The spread of h-index for quantity oriented authors and quality oriented authors (as absolute values) in scenario 3.**
Considering authors with the same level of seniority (as indicated by the IDs), the ‘quantity-oriented’ authors have a clear advantage over ‘quality-oriented’ authors.

**Fig 12. The spread of pagerank-index for quantity oriented authors and quality oriented authors (as absolute values) in scenario 3.**
Neither group of authors has a clear advantage over the other.

**Fig 13. Variation of average h-index and pagerank-index for quantity oriented authors and quality oriented authors at each timestep in scenario 3.**
The difference between groups is much smaller when pagerank-index is used.

**Fig 14. Variation of h-index and pagerank-index for highest ranking quality oriented authors and quantity oriented authors at each timestep in scenario 3.**
The difference between the highest ranking authors is much smaller when pagerank-index is used.

**Fig 15. Part of the collaboration network highlighting authors *R. Han* (h-index: 8, pagerank-index: 79.6%), *X. Xu* (h-index: 8, pagerank-index: 95.1%), *J. Wu* (h-index: 1, pagerank-index: 89.3%), and *M. Shi* (h-index: 3, pagerank-index: 79.1%) in the field of Quantum Game Theory.**

**Fig 16. Part of the collaboration network highlighting the prominent authors from the Quantum Game Theory Google Scholar profile.**
The details of the highlighted authors are listed in Table 2.

**Fig 17. Part of the collaboration network highlighting P. Frackiewics (h-index: 1, h-index percentile: 55.5%, pagerank-index: 98.4%).**
It is clear that this author plays an important role in the field by being the ‘bridge’ between two sets of authors who work perhaps in two sub-fields. Note that though the collaboration network is not an input in computing the pagerank-index, the pagerank-index is able to recognize and reward authors who perform such an important role in the development of the field, as indicated by the relatively high pagerank-index of this author. The h-index, being a relatively simplistic citation count measure, fails to recognize this fact.

**Fig 18. The h-index and pagerank-index of the best 5% authors (according to h-index) in the field of quantum game theory.**
Since the pagerank-index is a percentile, percentile values were used for the h-index as well, rather than actual h-index values. Note here that the pagerank-index value varies from 70% to 100%. That is, some authors who are among the top 5% in terms of h-index are not even among the top 25% when pagerank-index is considered.
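To make the comparison in this caption concrete, the sketch below computes an h-index from per-paper citation counts (the standard Hirsch definition: the largest h such that h papers have at least h citations each) and then converts raw scores into percentiles. The percentile convention used here (the share of authors scoring less than or equal to the author in question) is an assumption, as are the function names and the sample numbers.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for position, count in enumerate(sorted(citation_counts, reverse=True), start=1):
        if count >= position:
            h = position
        else:
            break
    return h


def percentile_ranks(scores):
    """Map each author to the percentage of authors scoring at or below them."""
    n = len(scores)
    return {
        author: 100.0 * sum(1 for s in scores.values() if s <= score) / n
        for author, score in scores.items()
    }


# Hypothetical citation counts per paper for three authors.
records = {
    "author_1": [25, 8, 5, 3, 3],
    "author_2": [10, 8, 5, 4, 3],
    "author_3": [2, 1],
}
h_values = {author: h_index(counts) for author, counts in records.items()}
h_percentiles = percentile_ranks(h_values)
print(h_values)
print(h_percentiles)
```

Putting both indices on a percentile scale is what allows them to be plotted against each other for the same set of authors.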

**Fig 19**
(A) The variation of h-index and pagerank-index for two groups of authors during the evolution of the quantum game theory field. The x-axis corresponds to each new paper added, and the timeline of the evolution runs from 1955 to 2014. One group of authors is classified as ‘collaborative’ and the other as ‘non-collaborative’; the classification is explained in the text. It is clear that while the h-index favours the ‘collaborative’ authors, the pagerank-index, in general, tends to favour the ‘non-collaborative’ authors. (B) The average ‘papershare’ of collaborative and non-collaborative authors during the evolution of the quantum game theory field. The ‘papershare’ is calculated as the summation of proportional contributions made to papers. For example, if an author has contributed two papers, each with two other co-authors, that author has a total of 2/3 paper-shares (1/3 from each paper). It is clear that the ‘non-collaborative’ authors work harder and have more ‘paper-shares’ than collaborative authors. Contrasting with part (A), we may see that the pagerank-index highlights this fact by favouring the ‘non-collaborative’ authors, while the h-index arguably unfairly favours collaborative authors, who on average produce fewer ‘paper-shares’.
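The ‘papershare’ sum can be illustrated with a few lines of code. This is a minimal sketch that assumes each paper's credit is split equally among its authors, which is one reading of "proportional contributions"; the function name and the sample author lists are hypothetical.

```python
from fractions import Fraction


def papershare(author, papers):
    """Sum of 1/(number of authors) over the papers the author appears on."""
    return sum(
        (Fraction(1, len(authors)) for authors in papers if author in authors),
        Fraction(0),
    )


# Each inner list is the author list of one paper.
papers = [
    ["ann"],           # solo paper: ann gets a full share
    ["ann", "ben"],    # two authors: half a share each
    ["ben", "cy", "dee"],
]
ann_share = papershare("ann", papers)
ben_share = papershare("ben", papers)
print(ann_share, ben_share)
```

Exact fractions are used so the shares can be compared without floating-point rounding.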

**Fig 20. The h-index and pagerank-index of the best 5% authors (in terms of h-index) in the HEP-TH dataset.**
Since the pagerank-index is a percentile, percentile values were used for the h-index as well, rather than actual h-index values. Note here that the pagerank-index value varies from 65% to 100%. That is, some authors who are among the top 5% in terms of h-index are not even among the top 25% when pagerank-index is considered.
