- A histogram can be used to visually interrogate the distribution of a security’s returns.
- These returns typically resemble a bell-shaped curve when viewed as a histogram.
- A random variable that follows this type of bell-shaped distribution is said to follow a normal distribution.
- There are certain characteristics of a normal distribution that can be helpful when investigating the returns of stocks, given an understanding of the assumptions involved in such analysis.
Can’t Live Without It
Investing is all about taking on risk: in order to grow your capital, you must take on some sort of risk. Given this, it’s important to understand that risk is your friend, not your enemy. Nevertheless, this doesn’t mean you need to treat your hard-earned money as if you’re gambling in a casino.
With a firm understanding of risk, you can place your financial bets with greater confidence, taking on the types of risk that are most likely to lead towards favorable outcomes, while doing your best to avoid the types of risk that are most likely to go against you.
That’s what sound risk management is all about. Thus, the goal of this series is to provide a solid understanding of which types of risk can be controlled (and thus should be mitigated as much as possible), and which types of risk can’t be controlled (and thus should be avoided, if possible). There are many tools that we can use to accomplish all this, and in this post, we’ll examine one more tool for assessing the riskiness of an investment.
Intro to Histograms
Accordingly, another way to examine the dispersion of returns (beyond the standard deviation measure mentioned in Part 2 of this series) is to look at how the returns are distributed from their smallest value to their largest in a chart, known as a histogram. From Table 1 below, we see that the average return for Stock A (first introduced in Part 2 of this series) is 1.0%, with a standard deviation of 2.1%. This means we will most likely find the monthly return of Stock A to be around 1.0% +/- 2.1%, or between -1.1% and 3.1%. However, a histogram can provide us with even more granular insight into how these returns are distributed.
With a histogram chart, the x-axis shows the monthly returns sorted and grouped from smallest to largest, while the y-axis shows the frequency of occurrences within each group. As an example, in Figure 1 below, we see a histogram for Stock A.
Notice that each month’s return is bucketed into one of the following six groups: “less than -4%”, “-4% to -2%”, “-2% to 0%”, “0% to 2%”, “2% to 4%”, “greater than 4%”. Thus, April’s return of -2.0% ends up in the “-4% to -2%” group, while February’s return of 0.0% and May’s return of -0.1% both end up in the “-2% to 0%” group, and so on.
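The bucketing described above can be sketched in a few lines of Python. This is only an illustration: the function name `bucket` is made up for this example, and the rule that a return sitting exactly on an edge falls into the lower group is an assumption inferred from where the -2.0% and 0.0% returns land.

```python
from bisect import bisect_left
from collections import Counter

# Bin edges (in percent) and labels taken from the six groups described above.
EDGES = [-4.0, -2.0, 0.0, 2.0, 4.0]
LABELS = [
    "less than -4%",
    "-4% to -2%",
    "-2% to 0%",
    "0% to 2%",
    "2% to 4%",
    "greater than 4%",
]


def bucket(return_pct: float) -> str:
    """Return the histogram group for a monthly return given in percent.

    Assumption: a return exactly on an edge goes into the lower group,
    which matches -2.0% landing in "-4% to -2%" and 0.0% in "-2% to 0%".
    """
    return LABELS[bisect_left(EDGES, return_pct)]


# The three monthly returns quoted in the text (February, April, May).
quoted = {"February": 0.0, "April": -2.0, "May": -0.1}
groups = {month: bucket(r) for month, r in quoted.items()}

# Frequency of each group: the height of each histogram bar.
counts = Counter(groups.values())

for month, group in groups.items():
    print(f"{month}: {group}")
```

Counting how many returns land in each group is all a histogram does; the chart simply draws those counts as bars.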
Again, the purpose of this histogram chart is to better understand where the returns are concentrated. For example, based upon Figure 1, we can see the dispersion of returns more clearly. Most of the returns are grouped towards the center of the distribution, and there are no returns in the “less than -4%” group or the “greater than 4%” group. In addition, notice how the returns in the histogram from Figure 1 look like a bell (or a hill). This bell-shaped curve is something that occurs often when exploring randomly occurring situations, so let’s investigate this phenomenon in more detail.
Everything Is Normal
As another example, Figure 2 below shows the annual real returns of the S&P 500® Index as a histogram. For this chart, real returns means that the returns are adjusted for inflation to allow a more meaningful comparison across years. This was accomplished by backing out the annual change in the Consumer Price Index for Urban Consumers (CPI-U) from each year’s return in order to come up with an estimate for the S&P 500® Index’s inflation-adjusted return for each year.
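The text does not say exactly how inflation was backed out, so the sketch below shows two common approaches as an assumption: the exact relationship between nominal and real returns, and the simpler subtraction that approximates it. The input numbers are hypothetical and are not actual index or CPI-U data.

```python
def real_return(nominal: float, inflation: float) -> float:
    """Inflation-adjusted return using the exact (Fisher) relationship."""
    return (1.0 + nominal) / (1.0 + inflation) - 1.0


def real_return_approx(nominal: float, inflation: float) -> float:
    """Simple approximation: subtract inflation from the nominal return."""
    return nominal - inflation


# Hypothetical inputs for illustration only (not actual index or CPI-U data).
nominal = 0.10    # 10% nominal return for the year
inflation = 0.03  # 3% change in the price index over the same year

exact = real_return(nominal, inflation)
approx = real_return_approx(nominal, inflation)
print(f"exact: {exact:.2%}, approximate: {approx:.2%}")
```

For modest inflation the two methods give similar answers; the gap widens in years when inflation is high.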
Notice again the bell-shaped (or hill-shaped) distribution of this larger dataset of returns, where most of the data points are concentrated towards the center of the chart. As mentioned, this bell-shaped curve is noticeable with many other randomly occurring situations when plotted as a histogram. In these situations we tend to see most of the occurrences clustering around an average value, with fewer and fewer observations occurring the further away we get from the average, both above and below.
As an example, the average height of adult males in the Netherlands is just above 6 feet; but there are very few adult males below 5 feet or above 7 feet, and essentially none below 4 feet or above 8 feet. Other examples of this type of bell-shaped distribution include the shoe size of adult females in your hometown, the examination scores of students on a test, or even the outcomes of rolling a pair of fair dice (as depicted in Figure 3).
Once again, this histogram depicts a bell-shaped (or hill-shaped) chart. The way to interpret Figure 3 is that the most likely outcome is a roll of a 7, while the smallest outcome is a 2, and the largest is a 12 (as noted by the x-axis of the chart). The y-axis of the chart represents the probability of a given outcome; e.g., the chance of rolling a 7 is 6/36 or about 16.7%, while the chance of rolling a 2 is 1/36 (or about 2.8%) and the chance of rolling a 12 is also 1/36 (again about 2.8%).
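The dice probabilities quoted above can be checked by listing every possible roll. The following is a small sketch that enumerates all 36 equally likely outcomes of two fair six-sided dice and tallies the totals; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability of each total as an exact fraction of the 36 outcomes.
probabilities = {
    total: Fraction(count, 36) for total, count in sorted(totals.items())
}

for total, p in probabilities.items():
    # A row of '#' characters gives a rough text version of the histogram.
    print(f"{total:>2}  {float(p):6.1%}  {'#' * totals[total]}")
```

The row for 7 is the longest, and the rows shrink symmetrically towards 2 and 12, which is the hill shape the chart depicts.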
This bell-shaped curve occurs so often that statisticians have a name for it: a bell curve, or a normal distribution when we associate these types of histograms with truly random situations.
The normal distribution is the well-known bell-shaped curve depicted below (Figure 4). The bell-shaped curve comes from a statistical tendency for outcomes to cluster symmetrically around the mean (or average). —FinanceTrain
In statistics, a random phenomenon whose outcomes can be measured is called a random variable. When we say a stock’s return behaves randomly, we’re basically saying that we can’t use past information to tell us what the return for the next period will be. Given that the returns of a stock or stock index appear to be random in nature, we can draw some important insights by assuming that the movements of these market instruments behave like a random variable whose returns are normally distributed.
Thus, if we are to believe that the returns of a stock behave randomly, similar to the rolling of a pair of dice, then we can use our standard deviation measure along with a histogram to obtain a deeper insight into a stock’s returns. Accordingly, a more general form of a histogram for a random variable that follows a normal distribution is presented in Figure 4 above, where μ represents the average value of a random variable, and σ represents its standard deviation.
Of course, there are many other assumptions at play here, which we’ll need to consider. But rest assured, we’ll dig into these concerns in Part 4 of this series. For now, let’s take this randomness/normal distribution assumption at face value. An interesting property of normally distributed random variables is that 68% of outcomes lie within plus or minus one standard deviation of their average, while 95% occur within two standard deviations of the average, and 99.7% within three standard deviations.
The percentages 68%, 95%, and 99.7% come directly from a mathematical derivation in probability theory, the scope of which is beyond that of this series (but if you’re interested, there’s always Wikipedia for the curious). Nevertheless, this understanding is what leads to the distribution numbers presented in Table 2 below.
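Without going through the derivation, the three percentages can be reproduced numerically. For a normal distribution, the chance of landing within k standard deviations of the average equals erf(k/√2), where erf is the error function available in Python’s standard `math` module. The helper name `coverage` is made up for this sketch.

```python
import math


def coverage(k: float) -> float:
    """Probability that a normal variable falls within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage(k):.2%}")
```

This prints roughly 68.27%, 95.45%, and 99.73%, which round to the familiar 68%, 95%, and 99.7%.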
So with just six monthly return numbers we can get a pretty good understanding of what our potential upside and downside are for Stock A using statistical analysis. Accordingly, if you don’t want to be down more than -2.5% in a given month more than 95% of the time, Stock A is not for you. Simple, right?
Expectations Versus Reality
In Table 3 below, we see the statistics for SPY, which is a low-fee, passively managed exchange-traded fund (ETF) that tracks the S&P 500® Index. Over the past ten years, the standard deviation for SPY has been 13.18% per year, while its annual return has been 14.20%. This means that 68% of the time the annual return of SPY is estimated to be 14.20% +/- 13.18%, or between 1.02% and 27.38%. So most of the time, SPY is estimated to provide investors with a positive return. But in about one in three years, the returns for SPY are estimated to fall outside of this range.
Further, 95% of the time the annual return for SPY is estimated to be between -12.16% and 40.56%. And 99.7% of the time the annual return is estimated to be between -25.34% and 53.74%. By the way, 95% corresponds to 19 out of every 20 years and 99.7% corresponds to 369 out of every 370 years. So once every 370 years, we can estimate that the returns of SPY should be either less than -25.34% or greater than 53.74%.
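The ranges above are simple arithmetic on the average and standard deviation, and can be reproduced with a short sketch. The function names `sigma_range` and `one_in_n` are hypothetical helpers for this illustration, and the coverage figures passed in are the rounded values used in the text.

```python
def sigma_range(mean: float, sd: float, k: int) -> tuple[float, float]:
    """Range covering k standard deviations on either side of the mean."""
    return (mean - k * sd, mean + k * sd)


def one_in_n(coverage: float) -> float:
    """Number of periods per expected occurrence outside the range."""
    return 1.0 / (1.0 - coverage)


mean, sd = 14.20, 13.18  # SPY annual return and standard deviation, in percent

ranges = {k: sigma_range(mean, sd, k) for k in (1, 2, 3)}
for k, (low, high) in ranges.items():
    print(f"{k} sd: {low:.2f}% to {high:.2f}%")

# 95% is the rounded two-standard-deviation figure from the text; 99.73% is
# the more precise three-standard-deviation value behind the 370-year figure.
print(round(one_in_n(0.95)))    # about 20
print(round(one_in_n(0.9973)))  # about 370
```

Note that the more precise two-standard-deviation coverage is about 95.45%, which works out to closer to one year in 22 than one in 20; the rounded 95% is used here to match the text.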
However, you’ll notice from Figure 2 above that the S&P 500® Index was down more than 35% in three years since 1927: 1937, 1974, and 2008. Clearly our range analysis can only go so far in explaining reality, given that in the past one hundred years we experienced three events that should each occur less than once every three hundred years. So obviously, there are strong limitations to this type of risk analysis.
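To get a feel for the size of this mismatch, the sketch below asks how often a normal distribution with SPY’s quoted average and standard deviation would produce a year down 35% or more. It follows the article’s own comparison, so it mixes ten-year SPY statistics with inflation-adjusted index returns since 1927; treat it as a rough illustration under that assumption, not a precise estimate. The helper `normal_cdf` is written here for the example.

```python
import math


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Probability that a normal variable with the given mean and sd is below x."""
    # erfc keeps precision for values far out in the lower tail.
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2.0)))


mean, sd = 14.20, 13.18  # SPY figures quoted above, in percent
threshold = -35.0        # a year down 35%

z = (threshold - mean) / sd  # distance from the average, in standard deviations
p_below = normal_cdf(threshold, mean, sd)

print(f"z-score: {z:.2f}")
print(f"modelled chance per year: {p_below:.6f}")
print(f"roughly one year in {1.0 / p_below:,.0f}")
```

A loss of 35% sits about 3.7 standard deviations below the average, which the normal model treats as far rarer than three occurrences in a century would suggest.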
Given this, before we go any further into this topic, let’s tackle the assumptions that underlie normally distributed random variables to gain some understanding of just how well these assumptions line up with reality. This will be the topic of Part 4 of this series.