Extracting SIPs from Historical Data
By Pace Murray and Dr. Sam L. Savage
When toddlers encounter their first rolling ball, they often crawl to where the ball was a second ago and then iterate the process by crawling to where it was a second later, and so on, until it finally stops, and they catch up. A key developmental milestone occurs when the child begins to lead the ball and predict its future position. Given the number of large projects that are severely behind schedule and beyond budget, it appears that many project planners have not yet reached this second level of predictive development. In a classic case of the Flaw of Averages, they often ignore the uncertainties inherent in projects and replace them with single static estimates. But help is on the way.[1]
Kahneman and Flyvbjerg
Observing this sorry state of prediction, the late Nobel Laureate Daniel Kahneman defined what he called reference-class forecasts. Instead of trying to calculate from scratch what it will cost to build a 100-Megawatt power plant, Kahneman would have recommended that you, for gosh sakes, at least see what it cost to build the last three power plants.
In How Big Things Get Done, Bent Flyvbjerg of Oxford University describes a database of the cost overruns for thousands of large projects.[2] For each class of project, he provides the mean percentage cost overrun, ranging from a whopping 238% over budget for nuclear waste projects and 158% for hosting the Olympic Games down to a mere 1% for solar power. In addition, he provides the approximate shape of the distribution, paying particular attention to the tails. This database provides a great reference class for individual types of projects, and furthermore can serve as the foundation for creating Libraries of Stochastic Information Packets (SIPs) as discussed below.
SIP Libraries for Construction
Stochastic Data can come in many forms and is as old as uncertainty. The Flyvbjerg database contains summary statistics, which, in general, cannot be used in stochastic calculations. That is, if you used standard cost estimating methods for constructing a nuclear waste site or hosting the Olympic Games, the database would provide meaningful summary statistics on your potential cost overruns. But you cannot combine summary statistics in a meaningful way to estimate, for example, the cost of hosting the Olympic Games at a nuclear waste construction site. SIP Libraries, based on the principles of probability management, can represent Coherent Stochastic Data; that is, they obey the laws of arithmetic while supporting statistical queries.
In a recent article in Phalanx Magazine, Murray and Savage [3] have shown how to create SIP Libraries from Flyvbjerg’s data and then apply them to a compound project composed of buildings, rail, and tunnels. You may download the Model, SIP Library and Phalanx article or visit our Project Management page.
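To make the idea of "obeying the laws of arithmetic" concrete, here is a minimal sketch in Python. It treats a SIP as a plain list of simulated trials and adds two SIPs trial by trial. Everything in it is an assumption made for illustration: the lognormal shape, the budgets, the spread parameters, and the names make_cost_sip, p90, project_a and project_b are hypothetical and are not taken from the Flyvbjerg database or from the Murray and Savage model. The two projects are also assumed to be independent.

```python
import random
import statistics

N_TRIALS = 10_000  # every SIP in a library shares the same number of trials


def make_cost_sip(rng, budget, sigma):
    """Hypothetical SIP of actual cost: budget times a lognormal multiplier.

    The lognormal shape and its parameters are illustrative assumptions,
    not values from any real reference class.
    """
    return [budget * rng.lognormvariate(0.0, sigma) for _ in range(N_TRIALS)]


def p90(sip):
    """Statistical query on a SIP: its 90th percentile."""
    return statistics.quantiles(sip, n=10)[8]


rng = random.Random(42)
project_a = make_cost_sip(rng, budget=100.0, sigma=0.6)  # wide uncertainty
project_b = make_cost_sip(rng, budget=200.0, sigma=0.3)  # narrower uncertainty

# SIP arithmetic: trial i of the total is trial i of A plus trial i of B.
combined = [a + b for a, b in zip(project_a, project_b)]

print("P90 of the combined SIP:", round(p90(combined), 1))
print("Sum of the separate P90s:", round(p90(project_a) + p90(project_b), 1))
```

The two printed numbers generally differ, which is the point: percentiles are summary statistics and do not add, whereas the trials themselves do, and the percentile of the total can then be queried directly from the combined SIP.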
Pace Murray is an Army captain with 8 years of service in the Infantry. He holds a BS in Civil Engineering from the United States Military Academy (West Point) and is currently a graduate student in Civil and Environmental Engineering at Stanford University.
Dr. Sam Savage is Executive Director of ProbabilityManagement.org, author of The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty, inventor of the Stochastic Information Packet (SIP), and Adjunct in Civil and Environmental Engineering at Stanford University.
A View from the Trenches
By Jimmy Chavez
Twenty years ago, I was given the responsibility of bidding and project managing multiple 8-figure contracts for my family's heavy civil construction business based in Southern California. Just a few years earlier, I had been in a Stanford classroom watching Dr. Sam Savage teach decision modeling, and I was struck by how simulations could be applied in the construction industry. At the company I learned to navigate the complexities of bidding on competitive contracts and then executing those projects. Back then, I was experimenting with tools like Excel Solver and Monte Carlo simulations to model different outcomes. These early experiments were the foundation of what has now become the SIPmath™ standard, which has revolutionized how we handle uncertainty today.
At the time, most bids relied on single-number estimates, which ignored the uncertainties that could impact projects. Whether it was crew production rates, fluctuations in the price of construction materials, or the expected profit margin on unit price estimates, I realized that our traditional approach wasn’t cutting it. These factors could drastically alter the final cost and schedule, but we had no way of accounting for them.
Monte Carlo simulations changed all of that. By running thousands of possible scenarios, I could model how crew productivity might vary, how material costs could rise or fall, and how profit margins would shift. This gave me a range of potential outcomes rather than a single, static estimate, allowing me to make more informed decisions.
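A minimal sketch of this kind of simulation, in Python, might look like the following. All quantities, rates, prices and distribution shapes are hypothetical placeholders chosen for illustration. They are not figures from any real bid, and the triangular distributions are simply an assumed stand-in for whatever an estimator believes about crew productivity and material prices.

```python
import random
import statistics

QUANTITY = 10_000.0          # units of work to install (hypothetical)
CREW_COST_PER_DAY = 8_000.0  # daily crew cost in dollars (hypothetical)


def job_cost(rate, price):
    """Cost of the job for a crew production rate (units/day) and unit material price."""
    days = QUANTITY / rate
    return days * CREW_COST_PER_DAY + QUANTITY * price


def simulate(n_trials=10_000, seed=7):
    """Run n_trials scenarios with uncertain production rate and material price."""
    rng = random.Random(seed)
    costs = []
    for _ in range(n_trials):
        rate = rng.triangular(150.0, 250.0, 200.0)  # low, high, most likely
        price = rng.triangular(40.0, 60.0, 50.0)    # low, high, most likely
        costs.append(job_cost(rate, price))
    return costs


costs = simulate()

# The single-number estimate plugs the most likely inputs into the same formula.
static_estimate = job_cost(200.0, 50.0)
chance_over = sum(c > static_estimate for c in costs) / len(costs)

print("Single-number estimate:", round(static_estimate))
print("Average simulated cost:", round(statistics.fmean(costs)))
print("Chance of exceeding the estimate:", round(chance_over, 2))
```

Because cost depends on one over the production rate, slow days hurt more than fast days help, so the average simulated cost tends to sit above the single-number estimate even though both inputs are symmetric around their most likely values. That gap is a small instance of the Flaw of Averages.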
Fast forward to today. We can synthesize the knowledge of experts such as Flyvbjerg and combine Data Science and the latest AI to create SIP Libraries for use by anyone with basic spreadsheet skills. We can now model uncertainties in the preconstruction phase so that our bids more accurately capture project realities and reduce the chance of cost and time overruns. Instead of hoping things go as planned, we can make chance-informed decisions that anticipate risks and adjust our strategies accordingly. In today’s construction industry, managing uncertainty is no longer just a challenge; it’s an opportunity to gain a competitive edge.
Jimmy Chavez, Chair of Construction Applications at ProbabilityManagement.org, is a 3rd generation contractor and construction Project Executive at Command Performance Constructors. He serves as VP of Operations and Division Lead for federal contracting and is experienced in construction project management and probabilistic estimating.