Image from Paramount at Hagerty
The Monte Carlo simulation (MCS, hereafter) is a method developed in the 1940s by scientists working on nuclear weapons research at Los Alamos. When used in the realm of investing, MCS is akin to a crystal ball for investors. Well, not quite, of course. It tries to come as close as possible to foretelling the future by running many simulations to estimate the risk exposure of assets, on the one hand, and to weigh that exposure against the risk tolerance of investors, on the other hand. As you might have guessed, its name derives from the luxurious casinos of Monte Carlo.
MCS can be used on any type of asset, but it tends to focus on assets that are volatile and uncertain. Predictions can include looking at various levels of risk commitment — from “going all in” to more risk-averse investments.
A key part of this method of simulation is taking a range of variables into account. This is often referred to as “multivariate modeling”. Trials are typically run many times, each with randomly drawn inputs, to determine the possible outcomes and how probable each one is.
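To make the idea of repeated trials concrete, here is a minimal sketch in Python. It is not the method of any particular firm or library. The starting value, the number of trading days, and the daily return and volatility figures are hypothetical, and modeling daily returns as normally distributed is a simplifying assumption.

```python
import random

def simulate_final_value(start, days, mean_daily_return, daily_volatility, rng):
    """Run one trial: compound a randomly drawn daily return for `days` days."""
    value = start
    for _ in range(days):
        value *= 1 + rng.gauss(mean_daily_return, daily_volatility)
    return value

def run_trials(n_trials, seed=42):
    """Repeat the trial many times and collect the final values."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Hypothetical inputs: start at 100, 252 trading days,
    # 0.05% average daily return, 2% daily volatility.
    return [
        simulate_final_value(100.0, 252, 0.0005, 0.02, rng)
        for _ in range(n_trials)
    ]

outcomes = run_trials(2_000)

# Share of trials that ended the year below the starting value
prob_loss = sum(v < 100.0 for v in outcomes) / len(outcomes)
print(f"Estimated probability of a loss: {prob_loss:.1%}")
```

Each trial is one possible future. The share of trials that end in a loss is the simulation's estimate of how likely a loss is under these assumed inputs, and changing the inputs changes the estimate.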
You may have seen the scene from the film WarGames (1983), where the supercomputer Joshua runs simulations of nuclear war.
Screenshot from YouTube
Because the number of variables tested can be enormous, and particularly because each variable has its own probability of taking a given value within any scenario, modeling can get complicated from the start and require thousands of trials. To keep the range of each variable realistic, normal distribution and standard deviation methods are used.
A normal distribution describes how the values of a variable cluster around the most probable value (the mean). For example, if we measure the heights of a large group of adults, most people land near the average, and fewer and fewer are found the further we move toward the very short or very tall extremes. The most probable occurrence is the mean, and with human height the mean sits in the middle between the shortest and tallest measurements. Normal distribution graphs therefore take the shape of a bell curve.
Graph from OERTX
Standard deviation builds on the mean revealed in a normal distribution. It measures the amount of variation or dispersion of the data in relation to the mean. So, in our example of height, it would tell us how far a typical height falls from the average height.
If there is a high amount of variation, then the bell curve will be flatter. The more dispersed or spread out the data is, the larger the standard deviation. In the image below, the top bell curve has a higher level of dispersion than the bottom one.
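As a small illustration of dispersion, the sketch below compares two made-up sets of heights (in inches) that share the same mean but are spread out differently. The numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Two hypothetical sets of heights in inches, both with a mean of 65
tight = [64, 65, 65, 66, 65]
spread = [55, 60, 65, 70, 75]

for name, heights in (("tight", tight), ("spread", spread)):
    mean = statistics.mean(heights)
    # pstdev treats the list as the whole population, not a sample
    sd = statistics.pstdev(heights)
    print(f"{name}: mean = {mean}, standard deviation = {sd:.2f}")
```

The second set has the larger standard deviation, which corresponds to the flatter, wider bell curve.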
Image from the National Library of Medicine
A genuine normal distribution obeys what is called the “68–95–99.7 rule”:
“Assuming a probability distribution is normally distributed, approximately 68% of the values will fall within one standard deviation of the mean, about 95% of the values will fall within two standard deviations, and about 99.7% will lie within three standard deviations of the mean.”
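The rule can be checked with a quick simulation. This is only a sketch: the mean of 64 inches and the standard deviation of 2.5 inches are hypothetical figures, and any other pair would give the same percentages.

```python
import random
from statistics import NormalDist

MEAN, SD = 64.0, 2.5  # hypothetical height distribution, in inches

rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [rng.gauss(MEAN, SD) for _ in range(100_000)]

def share_within(k):
    """Fraction of samples within k standard deviations of the mean."""
    return sum(abs(x - MEAN) <= k * SD for x in samples) / len(samples)

for k in (1, 2, 3):
    # Exact share from the standard normal cumulative distribution
    exact = NormalDist().cdf(k) - NormalDist().cdf(-k)
    print(f"within {k} SD: simulated {share_within(k):.3f}, exact {exact:.3f}")
```

With this many samples the simulated shares should land very close to the exact ones.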
Using the normal distribution and standard deviation together, analysts can place sensible constraints on the variables relevant to the test. So, for example, if we were gambling on whether or not a woman would grow to be only 4’10” tall, without any knowledge of her parents or genetic information, we would assign that outcome a low probability, because it sits far from the mean.
Why the Median and Not the Mean Matters
So why should you use the median and not the mean when running a Monte Carlo simulation?
* * * * *
The mean is the average of all values in a data set. The median is the middle value once the data set is sorted, so that half of the values lie below it and half above. Consider the following values:
2, 2, 2, 2, 4, 5, 9, 9, 9, 9
The mean is determined by adding all the values and dividing the sum by the number of values in the set:
2 + 2 + 2 + 2 + 4 + 5 + 9 + 9 + 9 + 9 = 53
53 / 10 = 5.3
The mean, or average, is 5.3.
* * * * *
Calculating the median takes a sort followed by one of two rules, depending on how many values there are.
Sort the values from smallest to largest.
If the number of values is odd, the median is the middle data point in the list.
If the number of values is even, the median is the average of the two middle data points in the list.
In our example, the number of values is even (i.e. 10). If we take the two middle values (i.e. 4, 5) and average them, we get 4.5.
4 + 5 = 9
9 / 2 = 4.5
The median, or middle point, is 4.5.
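The two calculations above can be written out as a short Python sketch. The function names are illustrative. Python's built-in `statistics` module offers `mean` and `median` that do the same job.

```python
def mean(values):
    """Add all the values and divide by how many there are."""
    return sum(values) / len(values)

def median(values):
    """Sort, then take the middle value (odd count) or the
    average of the two middle values (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [2, 2, 2, 2, 4, 5, 9, 9, 9, 9]
print("mean:", mean(data))
print("median:", median(data))
```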
What’s important to take away from the difference between the mean and median with respect to the Monte Carlo simulation is that the median is not pulled around by a few values at the extreme ends of the distribution. A small number of values at either extreme can skew the mean risk calculation, while the median stays close to the typical outcome.
For example, consider these values:
3, 5, 50, 50, 55, 56, 58
The mean calculation:
3 + 5 + 50 + 50 + 55 + 56 + 58 = 277
277 / 7 = 39.57
The median is the middle point, the fourth of the seven sorted values:
3, 5, 50, 50, 55, 56, 58
The median is 50.
39.57 versus 50 is a significant difference in what the values may represent. And this difference becomes more telling the more the majority of values reside within the 50 to 58 range. The values of 3 and 5 then really appear as outliers.
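The same gap shows up in simulated results. The sketch below first recomputes the mean and median of the seven values, then does the same for a batch of simulated outcomes. Drawing the outcomes from a lognormal distribution is an assumption made purely for illustration, because that distribution is lopsided. Real simulation output may be skewed in either direction, or not at all.

```python
import random
import statistics

# The seven values from the example above
values = [3, 5, 50, 50, 55, 56, 58]
print(f"mean = {statistics.mean(values):.2f}, median = {statistics.median(values)}")

# Hypothetical simulation output: 50,000 trial outcomes drawn from a
# lognormal distribution (an assumption, chosen because it is skewed)
rng = random.Random(7)  # fixed seed so the run is repeatable
outcomes = [rng.lognormvariate(0.0, 1.0) for _ in range(50_000)]

sim_mean = statistics.mean(outcomes)
sim_median = statistics.median(outcomes)
print(f"simulated mean = {sim_mean:.2f}, simulated median = {sim_median:.2f}")
```

In the seven values, two low outliers drag the mean below the median. In the simulated batch, a few very large outcomes push the mean above the median. Either way, the median stays closer to what a typical trial produced.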
This article originally appeared on Medium and is a part of the Crypto Industry Essentials educational program presented by 1.2 Labs (formerly The Art of the Bubble).
Though this article is credited to me, it contains some written material by Sebastian Purcell, PhD from his 1.2 Labs education series on cryptocurrencies.