Replicability Ranking? Not Quite There Yet
A few days ago, the correspondent of a prominent scientific journal drew my attention to a recent Replicability-Ranking of 100 Social Psychology Departments. “Do you think it is worth reporting?” I was asked in earnest. My answer was brief and very conflicted, and here I'll articulate why.
The idea per se has my full support. We should forget about the h-index, citations, or other myopic measures of “impact” and embrace metrics of genuine scientific quality. Public rankings based on such measures might be the most powerful, and yet the least intrusive, levers for nudging the system in the right direction.
Kudos, therefore, to Ulrich Schimmack for trying to do so. Too bad that, in my humble opinion, his replicability ranking doesn't quite deliver on its promises.
The very name “Replicability-Ranking” seems a little misleading. This ranking is based on what the author calls a “Replication-Index”, which is not, disappointingly, about replication. At least not directly. Put very simply, the R-index takes a set of published studies and compares the number of statistically significant results they report with the number that would be expected given their average statistical power.
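To make that comparison concrete, here is a minimal sketch of the general idea as I have just described it, not the author's actual procedure. The function name, the inputs and the toy numbers are all my own assumptions for illustration; in particular, I simply assume that a power estimate is already available for each study.

```python
from statistics import mean


def significance_gap(p_values, powers, alpha=0.05):
    """Compare the observed share of significant results with the share
    expected from average power.

    Hypothetical helper: it illustrates the comparison described in the
    text, not the published R-index computation.
    """
    if not p_values or len(p_values) != len(powers):
        raise ValueError("need one power estimate per p-value")
    # Share of results that reached significance at the chosen alpha.
    observed = sum(p < alpha for p in p_values) / len(p_values)
    # With average power as the expected success rate, this is the share
    # of significant results one would expect to see.
    expected = mean(powers)
    # A positive gap means more significant results than power predicts.
    return observed, expected, observed - expected


# Toy numbers, invented for illustration only.
p_values = [0.01, 0.03, 0.04, 0.20]
powers = [0.5, 0.6, 0.4, 0.5]
observed, expected, gap = significance_gap(p_values, powers)
print(observed, expected, gap)
```

A gap well above zero would suggest that the set of studies reports more significant results than its power can account for. Note that nothing in this calculation involves an actual replication attempt.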
Anyone hoping to see a count of replication successes and failures will be disappointed, and rightly so. Only such a metric, it seems to me, could legitimately boast the title of “Replication-Index”.
Other indices are based on this same idea, but the R-index claims to overcome their limitations. Based on the description provided by the authors, this might be true to some extent. However, the index remains a statistical tool, which relies on theoretical assumptions (e.g. are the included p-values independent of one another?) and methodological choices (e.g. which statistics do we include in the analysis?) whose effects are essentially untested.
To be accepted as an “Index”, let alone a “Replication-Index”, this metric and its assumptions need to be convincingly validated. That it has not yet been published in a peer-reviewed journal does not help the index’s cause much, either.
It took only a hint of these doubts to make the journalist decide against writing about it. All in all, I agree with that choice. Yet I look forward to the day when a Replicability Ranking will be the talk of the town.