When we at PunditTracker initially discussed the notion of creating a website to hold pundits accountable, the first question on our minds was: “Why doesn’t this exist already?” There is clearly demand for such a service: the Google search “pundits accountability” yields more than one million results.
As we began developing the site, we quickly realized the answer to our question: Because it is more difficult than it sounds.
Predictions are rarely black-and-white and therefore fail to fit a neat scoring system. Consider the following examples:
- Pundit: “Gold will go up 20% this year.”
  Outcome: Gold goes up 19%.
  Should the pundit receive some (partial) credit?
- Pundit: “Donald Trump will probably run for president.”
  Outcome: Trump does not run.
  Was the pundit wrong, or does the hedge “probably” provide an escape hatch?
- Pundit: “Dwight Howard will get traded.”
  Outcome: Dwight Howard is still on the Magic.
  Given that there is no specified end date, when can this call officially be marked wrong?
- Pundit: “The new iPhone will be groundbreaking.”
  Outcome: The new iPhone is released to rave reviews.
  How do you define groundbreaking? Isn’t it purely subjective?
From close calls to hedged calls to unbounded calls to subjective calls, pundit predictions are typically bursting with shades of grey. A cynic (realist?) would ascribe this to a deliberate ploy by pundits to garner media attention while evading blame. A recent Wall Street Journal article discussed the brilliance of the “40% rule”: call something a 40% chance and you can claim credit if it happens, while noting that you never said it was likely if it doesn’t. (Our running joke internally is that there is a 49.99% chance that PunditTracker will be a smashing success.)
Moreover, even when predictions are black-and-white, scoring them is not necessarily so. A fairly obvious scoring system employs what’s called a “hit rate” or “batting average” approach: take the number of correct calls and divide it by the total number of calls. If I make ten calls and get seven right, my hit rate is 70%. The problem is that this figure is meaningless without context. If I predict each day that the sun is going to rise tomorrow, I am (hopefully) going to have a perfect hit rate. Under this system, I receive a score twice as high as the pundit who predicted the 2008 financial collapse but then missed a trivial call the following year. That hardly seems fair, which suggests that predictions should somehow be calibrated for “boldness.”
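To make the contrast concrete, here is a small sketch (not PunditTracker’s actual scoring system) comparing a plain hit rate with one hypothetical boldness adjustment, where a correct call earns credit proportional to how unlikely it seemed beforehand. The pundits, calls, and prior probabilities below are all invented for illustration:

```python
def hit_rate(calls):
    """Fraction of calls that were correct.

    Each call is a (correct, prior) pair: whether the prediction came
    true, and its perceived probability at the time it was made.
    """
    return sum(correct for correct, _ in calls) / len(calls)

def boldness_score(calls):
    """Average credit per call: a correct call earns (1 - prior),
    so a surprising correct call counts far more than an obvious one.
    A miss earns nothing. (One illustrative scheme among many.)
    """
    return sum(1 - prior for correct, prior in calls if correct) / len(calls)

# Pundit A: predicts sunrise every day -- trivially correct, zero boldness.
sun_riser = [(True, 0.999)] * 10

# Pundit B: one bold correct call (a crash few saw coming), one easy miss.
bold_pundit = [(True, 0.05), (False, 0.90)]

print(hit_rate(sun_riser))      # 1.0
print(hit_rate(bold_pundit))    # 0.5
print(boldness_score(bold_pundit))  # 0.475 -- hundreds of times Pundit A's
```

By hit rate alone, the sunrise-caller looks twice as good; weight each call by its boldness and the ranking flips. Any real system would also have to decide how to penalize bold misses, which this sketch deliberately leaves out.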
We have wrestled with all these issues—and more—while designing our scoring system. We will share some of our solutions with you over the coming weeks. As always, we welcome any feedback.