Why Do We Cite Papers?

Excerpts from a conversation with a PhD student in 2005.

First, definitions.

A reference is the publication information about a paper. It should have enough information for a reader to find the paper.

A citation appears in the paper and points to the reference, which is usually at the end of the paper.

When a paper does not cite a key reference (or several), there is cause for concern. There are actually several possible issues:

  1. The references should be there just as a matter of record.
  2. The references tell the reader that the author knows the field.
  3. If the author does not know the key papers, he or she may well be making mistakes in the work. There are at least four categories of mistakes:
    1. Repeating work that was already done
    2. Finding solutions to problems that are not as good as already published solutions
    3. Finding solutions that are less complete than previously published solutions
    4. Going in the wrong direction

If the author is lucky, then the only issue is number (1). Issue (2) will make it harder to get the paper accepted, for example if the reviewer doubts that the author is sufficiently prepared to work in the area. If the problem is (3), the paper should not be published, and if it is published, it makes the author look dumb and the conference or journal irresponsible. If references are missing but the work is still sound, the paper should be accepted and the author should be told of the missing references. That is, a lack of references in and of itself should not be a reason for rejecting a paper.

Of course, I have omitted an all-too-common issue: The author omitted one of the reviewer's papers and missed the chance to stroke the reviewer's ego. Judging a paper on whether it makes our egos happy is unscientific and unprofessional. The fact that software engineering authors have to worry about it is an unfortunate comment on the lack of maturity of our field.