Mathematical Elegance Coupled to Computational Efficiency
by Sam Savage
After receiving his PhD in Decision Analysis from Stanford, Tom Keelin spent 40 years in analytical consulting, including an 18-year stint at the prestigious Strategic Decisions Group, where he was Worldwide Managing Director. Tom was struck by general management’s inability to calculate with uncertainties, and has developed a flexible family of continuous, data-driven probability distributions based on pragmatic consulting experience. The Metalog distributions, as he calls them, combine mathematical elegance with computational efficiency [i], [ii], [iii].
To put Metalogs in perspective, I remind the reader that the theory of probability and statistics is powerful and elegant. But so is the steam locomotive, and the two were developed around the same time. By the 1970s, computational approaches to statistics such as bootstrapping arose. These were based on the brute force of computer simulation instead of 19th-century calculus. Although Metalogs are also based on simple mathematical principles, they are intended to be fit to data sets, not adjusted by parameters such as a mean and standard deviation. And they output the ideal food for simulations: inverse cumulative functions. These functions are the most common way to generate random variates in simulations. The Excel function NORMINV(RAND(),Mean,Sigma), for example, will produce Normal random variables with the specified mean and standard deviation with every press of the Calculate key.
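For readers who work outside Excel, the same inverse cumulative idea can be sketched in Python. This is only an illustration using the standard library: the function name normal_variates and the example mean of 100 and sigma of 15 are my own assumptions, and NormalDist.inv_cdf plays the role of NORMINV.

```python
import random
from statistics import NormalDist, fmean, stdev


def normal_variates(mean, sigma, n, seed=0):
    """Draw n Normal variates by feeding uniform randoms to the inverse CDF.

    Mirrors the Excel formula NORMINV(RAND(), Mean, Sigma).
    The function name and the seed argument are illustrative assumptions.
    """
    rng = random.Random(seed)
    dist = NormalDist(mean, sigma)
    draws = []
    while len(draws) < n:
        u = rng.random()          # uniform on [0, 1)
        if u > 0.0:               # inv_cdf is only defined for 0 < u < 1
            draws.append(dist.inv_cdf(u))
    return draws


# Hypothetical example: mean 100, standard deviation 15.
samples = normal_variates(100.0, 15.0, 10_000)
print(round(fmean(samples), 1), round(stdev(samples), 1))
```

Any distribution that supplies an inverse cumulative function can be dropped into the same loop, which is why that output format is so convenient for simulation.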
The informative Metalog Distributions website contains extensive documentation and implementations in numerous environments, including Excel and R. We have already implemented some of the Metalogs in the SIPmath™ Modeler Tools as described below. You can also download any of the Excel templates from the Metalog website and use them with the tools. Just be sure to replace the “random” cells in the templates with either RAND() or HDR generators from the SIPmath tools.
It is still early innings for Metalogs. For example, last year Tom and I discovered how to generalize the concept to solve a vexing problem in simulation. Suppose you are modeling an uncertain number of risk events, such as transformer failures. Each failure will cause a fire with a skewed, lognormally distributed adverse consequence. On a given simulation trial you may get 3, 5, 8 or some other number of failures, and need to add up 3, 5, 8 or some other number of lognormal variates. But you don’t know in advance how many failures you will have, so you don’t know how many variates you need to generate. Until our approach with the Generalized Metalog, there was apparently no closed form solution for expressing sums of lognormals. We (mostly Tom) wrote this up for publication, and with his help we built sums of lognormal and triangular distributions into the Enterprise SIPmath™ Modeler Tools. Tom is now Chair of Data-Driven Distributions at ProbabilityManagement.org, and we will keep you apprised of future Metalog developments, several of which are in progress.
Using Metalogs in the SIPmath Tools
All the latest versions of the tools support the SPT (symmetric percentile triplet) Metalog, which can produce a wide range of distribution shapes as shown below [iv].
Furthermore, Tom has written a nice tutorial on their use in the SIPmath Tools.
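For readers curious about the arithmetic behind an SPT Metalog, here is a minimal Python sketch of the unbounded three-term case, following the quantile function in [ii]. It is not the code used in the SIPmath Tools. The function names and the example triplet (10th, 50th and 90th percentiles of 20, 30 and 50) are assumptions chosen for illustration. Not every triplet yields a valid distribution, so the sketch checks numerically that the fitted quantile function is increasing.

```python
import math
import random


def spt_metalog_coefficients(q_low, q_mid, q_high, alpha=0.1):
    """Fit a 3-term unbounded metalog to a symmetric percentile triplet.

    q_low, q_mid, q_high are the alpha, 0.5 and (1 - alpha) quantiles.
    """
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    if not q_low < q_mid < q_high:
        raise ValueError("quantiles must be strictly increasing")
    spread = math.log((1.0 - alpha) / alpha)   # ln(1/alpha - 1), positive
    a1 = q_mid
    a2 = 0.5 * (q_high - q_low) / spread
    a3 = (q_low + q_high - 2.0 * q_mid) / ((1.0 - 2.0 * alpha) * spread)
    return a1, a2, a3


def metalog3_quantile(y, coeffs):
    """Inverse cumulative function M(y) of the 3-term metalog, 0 < y < 1."""
    a1, a2, a3 = coeffs
    logit = math.log(y / (1.0 - y))
    return a1 + a2 * logit + a3 * (y - 0.5) * logit


def is_increasing(coeffs, steps=999):
    """Numerical feasibility check on an evenly spaced probability grid."""
    values = [metalog3_quantile(i / (steps + 1), coeffs)
              for i in range(1, steps + 1)]
    return all(b > a for a, b in zip(values, values[1:]))


# Hypothetical triplet: 10th, 50th and 90th percentiles.
coeffs = spt_metalog_coefficients(20.0, 30.0, 50.0, alpha=0.1)
assert is_increasing(coeffs)

# Simulate by feeding uniform randoms to the quantile function.
rng = random.Random(42)
uniforms = (rng.random() for _ in range(10_000))
draws = [metalog3_quantile(u, coeffs) for u in uniforms if u > 0.0]
print(coeffs, min(draws), max(draws))
```

Because the three-term quantile function is linear in its coefficients, fitting it to three percentiles is just a small linear solve, which the symmetric triplet reduces to the closed-form expressions above.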
Sums of identical triangular and lognormal distributions are implemented in the Enterprise version of the tools, as described below.
Sum of Lognormal Risk Consequences in the SIPmath Tools
Suppose your organization is subject to a risk characterized by an average of 5 adverse events per year, each with a consequence that is lognormally distributed with a 50th percentile of $1 million and a 90th percentile of $3 million.
The steps below show how to model this situation in the Enterprise SIPmath Tools.
1. Poisson number of events
After initializing the file, we model the number of events per year as a Poisson variable.
2. Creating a sum of IID lognormals based on the Poisson number of events
We then create a lognormal in cell E5, checking the Sum multiple IIDs box (IID stands for independent and identically distributed). The number of lognormals to sum will be the number of events generated in cell C5, which varies with each simulation trial.
3. Specifying Risk as Output
We now specify E5 as an output of the simulation named “Risk” (cell E4) and denote cells F4 through G7 for a sparkline histogram.
4. Querying Statistics
Once the output is specified, you may query any statistics, such as percentiles, as shown below.
Now if you change any of the inputs (C3, E3, F3), the model will instantly update. And like all models created with the SIPmath Modeler Tools, the file is pure Excel and uses no macros or add-ins, so you may share it with 1 billion of your closest friends.
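As a cross-check outside Excel, the same risk can be approximated by brute force in Python. To be clear, this sketch does not use the Generalized Metalog or any closed form. It simply draws a Poisson count on each trial and adds up that many lognormal variates. The function names, the seed and the 20,000 trials are assumptions for illustration, and the lognormal parameters are derived from the 50th and 90th percentiles stated above.

```python
import math
import random
from statistics import NormalDist, fmean, quantiles


def lognormal_params(p50, p90):
    """Convert 50th and 90th percentiles into the mu and sigma of ln(X)."""
    mu = math.log(p50)                         # median of X is exp(mu)
    z90 = NormalDist().inv_cdf(0.9)            # standard Normal 90th percentile
    sigma = math.log(p90 / p50) / z90
    return mu, sigma


def poisson(rng, lam):
    """Poisson variate via Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def annual_loss(rng, lam, mu, sigma):
    """One trial: a Poisson number of events, each with a lognormal loss."""
    n_events = poisson(rng, lam)
    return sum(rng.lognormvariate(mu, sigma) for _ in range(n_events))


# Inputs from the example: 5 events per year, P50 = $1 million, P90 = $3 million.
LAMBDA = 5.0
mu, sigma = lognormal_params(1_000_000.0, 3_000_000.0)

rng = random.Random(2024)
losses = [annual_loss(rng, LAMBDA, mu, sigma) for _ in range(20_000)]

cuts = quantiles(losses, n=100)                # 99 percentile cut points
print("mean:", round(fmean(losses)))
print("P50:", round(cuts[49]), "P90:", round(cuts[89]))
```

The simulated mean can be compared with the textbook value for a compound Poisson sum, which is the event rate times the mean of one lognormal, exp(mu + sigma²/2).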
[i] Keelin, T.W. and Powley, B.W., 2011. Quantile-parameterized distributions. Decision Analysis, 8(3), pp.206-219. https://pubsonline.informs.org/doi/abs/10.1287/deca.1110.0213
[ii] Keelin, T.W., 2016. The Metalog Distributions. Decision Analysis, 13(4), pp.243-277. https://pubsonline.informs.org/doi/10.1287/deca.2016.0338
[iii] MetalogDistributions.com
[iv] From http://metalogdistributions.com/images/TheMetalogDistributions.pdf