On the Mark! Our Predictive Scoring Model Wins The Marathon

As data geeks, there is nothing quite as thrilling as building a predictive model that nails the EXACT outcome of an event or client initiative. We even do it just for fun.

Take this past Sunday's Pittsburgh Marathon, for example. A group of us signed up to run, and then decided, "Let's try and predict the average finish time this year!" So, we built a simple model based on eight years' worth of race data and boldly published our prediction last week: that, based on the forecasted race-day temperature of 54 degrees, the average Pittsburgh Marathon finish time would be 4 hours and 33 minutes. This week, we've got the results.

How disappointed we were to find we were off by 10 minutes using the forecasted temperature, until we punched in the actual race-day temperature of 48° and found our model nailed it EXACTLY, at 4 hours and 23 minutes!

So congrats, marathon runners, for running 26.2 miles through the city at what ended up being the fastest finish since the race started back up in 2009. And thanks for the thrill: you validated our predictive model!



Wouldn't it be great to be able to predict nearly exact results nearly all the time? Sometimes very simple models (like ours, in this case) can be advantageous, and the data to feed them is often at your fingertips. Using just a single predictor variable (in our model, average race-day temperature) turned out to be highly effective. More complex objectives can, of course, layer in additional variables, as we could have done with equally accessible data points, such as trends in the age, gender, and country of origin of each runner.
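For the curious, here is a rough sketch of what a single-predictor model can look like in Python. It is not our actual model: we are assuming a plain straight-line (least-squares) relationship between temperature and average finish time, and the function names (fit_line, predict, as_h_mm) are made up for illustration. To avoid inventing race data, it is fed only the two temperature and finish-time pairs mentioned above; a real fit would use every year of race history.

```python
def fit_line(temps, minutes):
    """Ordinary least squares with one predictor:
    minutes = intercept + slope * temp."""
    n = len(temps)
    mean_t = sum(temps) / n
    mean_m = sum(minutes) / n
    sxx = sum((t - mean_t) ** 2 for t in temps)
    sxy = sum((t - mean_t) * (m - mean_m) for t, m in zip(temps, minutes))
    slope = sxy / sxx
    intercept = mean_m - slope * mean_t
    return slope, intercept


def predict(slope, intercept, temp):
    """Predicted average finish time in minutes at a given temperature."""
    return intercept + slope * temp


def as_h_mm(total_minutes):
    """Format a number of minutes as H:MM."""
    hours, mins = divmod(round(total_minutes), 60)
    return f"{hours}:{mins:02d}"


# The only two (temperature, average finish) pairs quoted in this post.
# ASSUMPTION: the relationship is linear; the post does not say so.
temps = [54, 48]                        # degrees
minutes = [4 * 60 + 33, 4 * 60 + 23]    # 4:33 and 4:23, in minutes

slope, intercept = fit_line(temps, minutes)
print(f"Each extra degree adds about {slope:.2f} minutes")
print("Predicted average at 48 degrees:", as_h_mm(predict(slope, intercept, 48)))
```

With only two points the line passes through both by construction, so this shows the mechanics rather than proving anything. Under the straight-line assumption, those two numbers imply that each additional degree costs the average runner a bit under two minutes.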

While the idea of predictive analytics may sound intimidating to some, the premise is simple: solving a problem using data to find patterns and trends in order to predict behaviors and outcomes. In fact, predictive algorithms are used in everyday activities that you probably aren’t even aware of. For instance, when you go to purchase something online, retailers are collecting and analyzing your shopping behavior and purchase history data to predict your future purchases.

In today’s “Data Do-or-Die” world, data is being collected at an unprecedented rate, and the ability to understand and leverage this data is more important than ever. IBM reports that 90% of the data in the world today has been created in the last two years alone. Predictive analytics is just one way to take advantage of the rich insights that can be drawn from your data, and as our marathon prediction shows, just a few simple data points can lead to accurate and powerful insights.