Using citations to measure science is insidious

Some arguments for using citations as a way to measure science

  • Great work becomes highly cited eventually
    • Is this actually true?
    • Selection bias
  • We don’t have anything better

Comparing metrics across fields

Citations are not a normalized or consistent metric across fields. Citation traditions vary: in some domains you only cite a paper if you are building directly on its arguments, while in others citations are a hat tip to anyone who has done similar work. The most cited paper of all time is a biological methods paper (the Lowry protein assay), simply because biology has a tradition of citing where an experimental method originated. Is the most cited paper the most important paper ever? The answer is pretty inarguably no.^1

The straightforward response to different citation traditions is to only use citations to compare work within a field. But then where do you draw the lines between fields? How can you compare fields? I would also worry that forcing citations to live within fields would further reinforce the divisions between fields. Categories that are coupled to metrics tend to take on a life of their own.
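To make the within-field comparison concrete, here is a minimal sketch of what "only compare within a field" looks like in practice. All the numbers, field names, and the `percentile_in_field` helper are hypothetical, invented purely for illustration:

```python
from bisect import bisect_left

# Hypothetical citation counts for papers in two fields with very
# different citation traditions (all numbers are made up).
field_citations = {
    "molecular_biology": [5, 12, 40, 85, 150, 320, 900],
    "pure_mathematics": [0, 1, 2, 4, 7, 15, 60],
}

def percentile_in_field(citations: int, field: str) -> float:
    """Fraction of papers in `field` with strictly fewer citations."""
    counts = sorted(field_citations[field])
    return bisect_left(counts, citations) / len(counts)

# The same raw count of 40 citations is middling in one field
# and near the top of the other.
print(percentile_in_field(40, "molecular_biology"))  # 2/7, ~0.29
print(percentile_in_field(40, "pure_mathematics"))   # 6/7, ~0.86
```

Even this toy version surfaces the essay's objection: the ranking is only as good as the `field_citations` partition, and where you draw those field boundaries is exactly the unresolved question.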

Great work is cited eventually

One of the most compelling arguments for citations as a metric is that “the most important papers get cited eventually.”

Nobody cites Newton when they calculate a force in a paper, so why should we expect other papers that have become part of the intellectual milieu to be highly cited? Instead of importance, citations capture "perceived importance," which is a different thing, entangled with salience and novelty. Perceived importance is incredibly path dependent, favoring work that is recent enough to be salient in areas where people are publishing many other papers; this roughly describes well-known work done in an area just as it enters a hype cycle.

The "most important papers get cited eventually" argument smells a little like "the best startups will win in the end."

Citations are subject to the Malcolm Gladwell effect: often the canonical, most-cited paper on a concept is the most useful, most interesting, or most famously authored formulation of it, not the original appearance of the idea. Call the former the 'popularizing paper' and the latter the 'originating paper.' There is nothing wrong with people citing the paper they actually read! There is something wrong with then turning around and using those citations as evidence that funding for the originating paper was poorly spent and funding for the popularizing paper was well spent.
You could even argue that the 'popularizing paper' truly is much more important than the 'originating paper.' "Ideas are cheap!" But that is an unsettled argument, so quietly embedding that philosophical stance in every citation-based piece of metascience seems insidious indeed.

This distinction between 'popularizing' and 'originating' papers suggests one mechanism by which citations fall prey to Goodhart's Law: they push everyone towards popularizing papers. If funding organizations are judged on whether they fund highly cited papers, they will bias towards research and researchers likely to produce popularizing papers, and researchers in turn will shift towards work that could become popularizing in order to get funded. This critique does assume that it is possible to know a priori which researchers and work will produce highly cited papers. I suspect this is an area where people's sense of prior probabilities is unevenly distributed; some parts of science carry much more uncertainty than others. The insidious part is that (I suspect) people are better at predicting which work will be highly cited because the researcher is a rockstar or the research is riding a hype cycle than at predicting which work will be highly cited because it is a significant discovery. If citations are a quality metric, this shifts support towards the former category and away from the latter.

Citations as a metric ignore the fact that there are multiple mechanisms for changing the way we see the world. Some people are impactful not because many people listen to them but because a few powerful people do. Similarly, some work is important because it influences a few pieces of subsequent work that are themselves significant. We stand on the shoulders of too many giants to give them all credit.

Citations as a metric screw up citations as a knowledge tool. Ideally, a citation is a pointer to an encapsulated argument that someone has made elsewhere; turning citations into a score gives authors reasons to cite that have nothing to do with the argument.

I wonder how citations per paper have changed over time. My hunch is that as citations became more important as a measurement tool, total citations per paper skyrocketed. Of course, over the same period the number of researchers and papers that could do the citing also skyrocketed. It's probably impossible to disentangle those effects, which emphasizes that at best, citations are a relative measure of quality at a single point in time. And it's not as though you can only cite a fixed number of papers per paper: you could cite zero or you could cite a hundred. Citation printer go brrr. So citing a paper costs the citing author almost nothing. They don't even need to have read it.
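One way to see why the two effects are hard to disentangle is to try the obvious normalization: divide citations per paper by the size of the citing population. A toy sketch, with every figure invented purely for illustration:

```python
# Toy illustration of citation inflation vs. growth of the citing
# population. All numbers are invented; they are not real bibliometrics.
yearly = {
    # year: (mean citations received per paper, papers published that year)
    1980: (10, 100_000),
    2000: (25, 400_000),
    2020: (60, 2_000_000),
}

for year, (cites_per_paper, papers_published) in yearly.items():
    # Scale by the pool of potential citing papers.
    normalized = cites_per_paper / papers_published
    print(year, cites_per_paper, f"{normalized:.2e}")
```

In this made-up data, raw citations per paper rise sixfold while the population-normalized rate actually falls, so the two stories ("citation practices inflated" vs. "there are simply more citers") point in opposite directions depending on which number you look at. Real data would face the same ambiguity, plus the complication that papers accumulate citations for decades.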

The upshot of all of this is that I believe the burden of proof is on people who want to use citations as evidence for statements about science or science management: they should show why citations are better than no metric at all.


^1: I don’t know if it’s even possible to say what the most important paper is, but it is possible to say which ones aren’t it.
