Scoring Challenges

January 5, 2012  |  Finance, Politics, Sports

When we at PunditTracker initially discussed the notion of creating a website to hold pundits accountable, the first question on our minds was: “Why doesn’t this exist already?” There is clearly demand for such a service: the Google search “pundits accountability” yields more than one million results.

As we began developing the site, we quickly realized the answer to our question: Because it is more difficult than it sounds.

Predictions are rarely black-and-white and therefore fail to fit a neat scoring system. Consider the following examples:

  • Pundit: “Gold will go up 20% this year.”
    Outcome: Gold goes up 19%.
    Should the pundit receive some (partial) credit?
  • Pundit: “Donald Trump will probably run for president.”
    Outcome: Trump does not run.
    Was the pundit wrong, or does the hedge “probably” provide an escape hatch?
  • Pundit: “Dwight Howard will get traded.”
    Outcome: Dwight Howard is still on the Magic.
    Given that there is no specified end date, when can this call officially be marked wrong?
  • Pundit: “The new iPhone will be groundbreaking.”
    Outcome: The new iPhone is released to rave reviews.
    How do you define groundbreaking? Isn’t it purely subjective?

From close calls to hedged calls to unbounded calls to subjective calls, pundit predictions are typically bursting with shades of grey. A cynic (realist?) would ascribe this to a deliberate ploy by pundits to garner media attention while evading blame. A recent Wall Street Journal article discussed the brilliance of the “40% rule.” (Our running joke internally is that there is a 49.99% chance that PunditTracker will be a smashing success.)

Moreover, even when predictions are black-and-white, scoring them is not necessarily so. A fairly obvious scoring system employs what’s called a “hit rate” or “batting average” approach: take the number of correct calls and divide it by the number of total calls. If I make ten calls and get seven right, my hit rate is 70%. The problem is that this figure is useless without context. If I predict each day that the sun is going to rise tomorrow, I am (hopefully) going to have a perfect hit rate. Using this system, I receive a score twice as high as the pundit who predicted the 2008 financial collapse but then missed a trivial call the following year. That hardly seems fair, which suggests that predictions should somehow be calibrated for “boldness.”
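The hit-rate approach, and one way boldness calibration might address its flaw, can be sketched in a few lines of Python. This is purely an illustrative sketch: the function names and the boldness weights are our assumptions for the example, not PunditTracker's actual scoring system.

```python
# Illustrative sketch of "hit rate" scoring and a hypothetical
# boldness-weighted variant. Weights here are made up for the example.

def hit_rate(calls):
    """calls: list of booleans, True if the prediction was right."""
    return sum(calls) / len(calls)

def weighted_hit_rate(calls):
    """calls: list of (correct, boldness) pairs, boldness > 0.

    A bold correct call earns more credit than a trivial one.
    """
    total = sum(boldness for _, boldness in calls)
    earned = sum(boldness for correct, boldness in calls if correct)
    return earned / total

# The sun-will-rise pundit: ten trivially correct calls.
sun_pundit = [(True, 1)] * 10

# The pundit who called the 2008 collapse (bold) but missed a trivial call.
crisis_pundit = [(True, 10), (False, 1)]

print(hit_rate([c for c, _ in sun_pundit]))     # 1.0
print(hit_rate([c for c, _ in crisis_pundit]))  # 0.5
print(weighted_hit_rate(sun_pundit))            # 1.0
print(weighted_hit_rate(crisis_pundit))         # ~0.91
```

Under the naive hit rate, the crisis pundit scores half as well as the sun pundit; once calls are weighted by boldness, the single bold correct call restores most of the credit.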

We have wrestled with all these issues—and more—while designing our scoring system. We will share some of our solutions with you over the coming weeks. As always, we welcome any feedback.

3 Comments


  1. Another aspect of boldness to consider is *when* a prediction is made and whether it runs with or against the prevailing conventional wisdom.

    For example, predicting Romney as the GOP nominee today is nearly worthless compared to predicting it in 2011 or, better still, right after the passage of the ACA (which many pundits at the time claimed would doom Romney’s primary chances).

  2. Perhaps a Brier score might be of assistance. It would require pundits to state a degree of confidence in each prediction, though.

    http://en.wikipedia.org/wiki/Brier_score

  3. The reason this hasn’t been done already is probably because it’s a very labor-intensive undertaking. Accuracy is also a concern: if you make a mistake, the pundit in question could get very angry.

    I’ve selectively pointed out some pundits’ and gurus’ predictions over the years. I recall Peter Schiff sending me a comment because I didn’t include all his stock picks, but only mentioned one (which turned out to be a dud).

    I wish you the best of luck. I will definitely be visiting from time to time to see what you come up with.
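The Brier score suggested in the second comment can be sketched as follows. This is a minimal illustration assuming each pundit attaches a probability to a yes/no prediction; the function name and data shapes are our assumptions for the example, not anything PunditTracker has announced.

```python
# Minimal Brier score sketch: mean squared error between the stated
# probability and the realized outcome (0 or 1). Lower is better; 0 is perfect.

def brier_score(forecasts):
    """forecasts: list of (probability, outcome) pairs, outcome in {0, 1}."""
    return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)

# A confident correct call scores near 0; a confident miss scores near 1.
print(brier_score([(0.9, 1)]))  # ~0.01
print(brier_score([(0.9, 0)]))  # ~0.81
```

Note how this directly handles the hedging problem: a pundit who says “probably” is effectively stating something like a 70% probability, and the score rewards or penalizes that confidence level proportionally.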

Trackbacks

  1. PunditTracker’s Scoring System | PunditTracker
  2. NFL Pundits: 2011-12 Report Card | PunditTracker
