If you grasped the idea of “simple” percentages from my last post, probability, odds, and statistics will be more meaningful. As we look at these last three, we’ll see how the concept of Manipulating Numbers fits in. Manipulation can come from ignorance, but it is sometimes used to sell an idea or a product.
Chance
I’ll briefly talk about “chance” since we use it in everyday language. We may hear “What’s the chance of being struck by lightning – twice?”. Or “With the car problems I’ve had recently, what is the chance my car will break down on this road trip?”. “What are my chances of winning the lottery?”. Chance can be a vague term.
There are games that we think of as “games of chance”, such as roulette, dice games, and coin tosses. These are games where we often think that “luck” affects the outcome. From those games of chance, the mathematics of probability was developed. Early gamers wanted to know how to calculate the likelihood of winning. So, what is probability?
What is Probability?
Probability developed to answer questions about games of chance. We may refer to chance when we really mean probability. Probability has a rich body of literature. This article won’t explore all those depths but will only review a high-level version of it.
Dictionaries define probability, but those definitions cause some haziness when it comes to discussing probability in mathematical terms. Dictionaries, as well as articles written about events, mix what I’ll call true “probability” with other terms (mostly “chance” and “odds”). Probability is a branch of math that provides a way to calculate the likelihood that an event will occur or the likelihood that a hypothesis or theory is true. Though it’s sometimes discussed as a percentage, the probability of an event is expressed as a number between 0 and 1 (and includes 0 and 1). A probability of 0 says we are certain the event won’t occur. A probability of 1 says we are certain that the event will occur. Most probabilities are between 0 and 1. The closer the probability is to 0, the lower the likelihood that the event will occur. The closer the probability is to 1, the higher the likelihood.
Consider a common example used in elementary probability discussions – tossing a coin. A standard coin has a “head” and a “tail”. We probably don’t think about coins often, but for probability discussions we assume (ignoring two-headed coins or coins that are constructed to favor one side):
- A coin has two sides – a head and a tail,
- Coins are reasonably balanced so that the result of a coin toss is not influenced by one side being heavier than the other, and
- There’s no other reason to believe that the result of a coin toss is likely to favor one side over the other.
If we toss a coin, we will get a head or a tail. Probability is a way to answer the question – what is the likelihood that the toss will be either of those results? In this example, each toss has two possible results. So, we say the number of possibilities, or the “sample space”, for a coin toss is 2. Each toss will only yield one result. So each toss gives a theoretical probability of 1/2 or 0.5 (see more discussion below). We also sometimes translate this to mean that we expect to get heads half the time and expect to get tails half of the time.
We can try to check that this is true by tossing a coin. Try it for yourself. You may toss it 10 times and get 5 heads or get 3 heads. You may toss it 200 times and get 100 heads or get 114 heads. This doesn’t mean that the mathematical answer is wrong but shows that such events as tossing a coin are also affected by physical action and perhaps by some randomness. (Should we always start our tosses either heads up or tails up to get consistency? Can we apply the exact same force for each toss? If during one toss a fan blows air across the path of the coin while it’s rotating, will that affect the result?) The point in these questions is twofold: (1) we may be able to show that the probability of a head or tail is close to 0.5 by physically tossing a coin but the number of required tosses may be very high (10,000? 100,000?) and (2) we can’t always test events to prove the probability since most events to which probability is applied are more complex than a coin toss. That’s why we need the mathematics of probability to evaluate this and more challenging questions.
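If you would rather not toss a real coin 100,000 times, a short simulation can stand in for it. The sketch below is only an illustration: it assumes Python's built-in pseudo-random generator is a reasonable substitute for a fair coin, and the function name `heads_fraction` and the fixed seed are my own choices, not part of any standard method.

```python
import random

def heads_fraction(num_tosses, seed=0):
    """Simulate num_tosses tosses of a fair coin; return the fraction that land heads."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    heads = sum(1 for _ in range(num_tosses) if rng.random() < 0.5)
    return heads / num_tosses

# Small runs can land well away from 0.5; large runs tend to settle near it.
for n in (10, 200, 10_000, 100_000):
    print(n, heads_fraction(n))
```

The exact fractions depend on the seed, which is the point: a handful of tosses tells us little, while many tosses drift toward the theoretical 0.5.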
For those who care, here’s some of the basic math for the coin toss:
- The list of outcomes of a trial (i.e., a set of events such as one toss of a coin) is called the Sample Space. For a coin toss, there are only two outcomes – Heads and Tails. So the Sample Space, let’s label it “S”, contains two items.
- S = {Heads, Tails}
- If we make one toss of the coin, there are two possibilities for the toss but only one outcome. Since we have said we have a fair coin (and assume a “fair toss”), the likelihood of Heads or Tails is considered equal. The probability of either Heads or Tails is then:
1.a.: Probability of Heads/Tails = number of outcomes from one toss [1 – since only one Head or Tail is possible for one toss] / the size of the Sample Space [2]
1.b.: Probability = 1 / 2 = 0.5 [also 1/2 or 50%]
- If we switch to tossing a single, fair, six-sided die (only one of a pair of dice) and want to know the probability of rolling a 6:
2.a.: S = {1,2,3,4,5,6}
- S for this example has a size of 6.
2.b.: Probability of rolling a 6 = number of outcomes [1 since there is only one possible 6 on a single die roll] / the size of the Sample Space [6]
2.c.: Probability of a 6 = 1 / 6 ≈ 0.167 [or 1/6]
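The two calculations above follow the same recipe: count the outcomes we care about and divide by the size of the Sample Space. Here is a minimal sketch of that recipe, assuming every outcome is equally likely (a fair coin, a fair die). The helper name `probability` is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Probability of an event when every outcome in the sample space is equally likely."""
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

coin = {"Heads", "Tails"}
die = {1, 2, 3, 4, 5, 6}

p_heads = probability({"Heads"}, coin)  # one outcome out of two
p_six = probability({6}, die)           # one outcome out of six

print(p_heads, float(p_heads))
print(p_six, round(float(p_six), 3))
```

Using `Fraction` keeps the answers exact (1/2 and 1/6) and leaves rounding to the moment we print them.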
Increasing the Difficulty
We won’t spend much time on explaining results for more challenging probability considerations but will briefly review some examples. There are many possible variations of the coin toss and die roll:
- We could ask the probability of getting two Heads if we toss two fair coins. For this, the Sample Space is larger than for tossing a single coin: {HH, HT, TH, TT}. For simplicity, we will assume that HT and TH are different results. (We could further complicate it by saying they are the same result.) In this S, each toss of the two coins gives a result for each coin. Thus we have pairs of results in S.
- Our Sample Space has a size of 4. There is one possibility of getting two heads. So the probability of tossing two coins where both land on Heads = 1 / 4 = 0.25 [1/4]
- From the same S, what is the probability of getting at least one Head? (It’s important to notice the specific language. “At least one” means there is one, we don’t care about the order, and there can be more than one.) There are three results that have at least one Heads so that probability is 3 / 4 = 0.75.
- Similarly, if I roll 4 dice, what is the probability of getting a 3 as the sum of the die faces? (S = {1111, 1112, 1113, 1114, 1115, 1116, …}). (This introduces the question of how to determine the size of S. We can’t always list every outcome so we need a way to count. For this counting problem, probability calculations start to rely on rules from combinatorics to know how to select a number of elements from a larger set.) The probability of a sum of 3 with a roll of four dice is 0 since the lowest sum possible from 4 dice is 4 (1+1+1+1). Rolling one die gives us only 1 through 6 as possibilities. Rolling 4 dice increases the options for results and limits the results on the lower end (i.e., 4 is the smallest) and higher end (i.e. 24 = 6+6+6+6 is the largest) of possible sums.
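When listing the Sample Space by hand gets tedious, we can let a program list it for us. The sketch below enumerates the two-coin and four-dice examples, treating each ordered result as equally likely (so HT and TH are different results, as assumed above). The variable names are mine.

```python
from fractions import Fraction
from itertools import product

# Two coins: ordered pairs, so HT and TH count as different results.
two_coins = list(product("HT", repeat=2))
p_two_heads = Fraction(sum(1 for r in two_coins if r == ("H", "H")), len(two_coins))
p_at_least_one_head = Fraction(sum(1 for r in two_coins if "H" in r), len(two_coins))

# Four dice: every ordered result from (1, 1, 1, 1) to (6, 6, 6, 6).
four_dice = list(product(range(1, 7), repeat=4))
sums = [sum(r) for r in four_dice]
p_sum_3 = Fraction(sums.count(3), len(four_dice))

print(len(two_coins), p_two_heads, p_at_least_one_head)
print(len(four_dice), min(sums), max(sums), p_sum_3)
```

Enumeration works here because the four-dice Sample Space has only 6 × 6 × 6 × 6 = 1,296 entries. For much larger problems, counting rules from combinatorics take over from brute-force listing.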
Other problems are even more involved. We may have more events and more possibilities of outcome.
- Consider tossing 5 coins 10 times (S contains 32 possible results for tossing 5 coins) and wanting to know the probability of a Heads on the second coin in the third set of tosses. More conditions are placed on the question to be answered.
- As problems become larger, say with 100 or 10,000 possible outcomes, it becomes more difficult to list all possible outcomes.
Another common discussion for this more elementary probability calculation is selecting colored marbles from a jar. This exercise can represent a variety of probability questions.
- A jar contains 50 marbles and each is either red or green (25 of each color). We assume they are well mixed. We draw a marble without being able to see its color before we remove it. We draw 10 marbles from the jar. What is the probability of drawing a green marble on the 5th draw?
- If a jar contains 50 marbles (red and green) but we don’t know how many there are of each color, can we draw 10 marbles and estimate the mix of marble colors in the jar (how many are red and how many are green)?
So probability can be used both to measure results and to predict results.
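For the first marble question, the answer can be worked out exactly. The sketch below assumes the marbles are drawn without being put back and that every marble still in the jar is equally likely to be picked. It adds up, over every possible number of greens in the earlier draws, the chance of that history times the chance that the next marble is green. The function name `p_green_on_draw` is my own.

```python
from fractions import Fraction
from math import comb

def p_green_on_draw(n, green=25, red=25):
    """Exact probability that the n-th marble drawn (without replacement) is green."""
    total = green + red
    earlier = n - 1  # number of marbles already removed
    p = Fraction(0)
    for k in range(min(earlier, green) + 1):  # k = greens among the earlier draws
        if earlier - k > red:
            continue  # not enough red marbles for this history
        # Chance of exactly k greens in the earlier draws
        p_history = Fraction(comb(green, k) * comb(red, earlier - k), comb(total, earlier))
        # Chance the next marble is green, given that history
        p += p_history * Fraction(green - k, total - earlier)
    return p

print(p_green_on_draw(5))
```

With 25 marbles of each color the result is 1/2 for the 5th draw, the same as for the 1st: before any colors are seen, no position in the drawing order is favored.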
Finally, to add more flavor (pun intended?) to this consideration of probability calculations, consider selling gelato and deciding whether you have enough on hand. You know you have forty-eight 2-gallon containers of fourteen flavors. You need to know whether or not you are likely to sell all of your available gelato on a summer weekend (or to decide ahead of time if you should have more on hand).
- You hope it will be hot and sunny so that you can steadily sell each day.
- You don’t want to have too much on hand but certainly don’t want to exhaust your supply before Sunday night.
- Last summer at this time, weather was favorable for sales and you sold out of 40 containers by noon on Sunday.
- You estimate that if the weather is sunny, you have an 85% chance of selling all your stock (based on previous weekends).
- If it is cloudy, you believe your likelihood is lowered to 60%.
- If it rains most of one day, the probability lowers to only 20%. Less if it rains on both days.
- According to the weather forecast, the probability of sunshine is 50%, the probability of cloud is 50%, and the probability of rain is 20%.
- What is the overall probability that you will sell all of your current gelato stock (or should order more)?
Obviously, even “everyday” questions of probability can become more challenging to estimate.
Assumptions and Conditions
As the conditions on a probability calculation become more involved, it becomes important to define our assumptions. Assumptions and conditions impact results. Calculating the probability of an event given that certain conditions hold is referred to as “Conditional Probability”.
For the “fair coin toss”, we had two basic assumptions: (1) the coin is fair and (2) tosses are the same. For the first marble problem above, we assumed: (1) the marbles were evenly mixed, (2) we could not see what color we were drawing before the draw, and (3) there were 25 of each color when we started drawing. Though simple, these were our conditions.
When scientists want to calculate the probability that a certain chemical influences someone’s health, they may have many assumptions to list as explanation of the validity of their results. Even with drawing marbles, we may have more colors involved or may not know that they are mixed evenly. We may need to know probability if we draw a marble and then replace it in the jar or if we draw a marble and do not replace it.
Because of such issues, more involved probability calculations involve considerable effort in quantifying the assumptions and in defining conditions that may impact results. This leads to conditional probability calculations that involve more complicated evaluations than our simple formula of:
Probability = number of outcomes in the event / size of the sample space.
Instead of the probability of an event given the sample space, we refer to: the probability of an Event given these Conditions. We’ve referred to this in some of our examples above showing more challenging probability calculations.
Problems like the gelato example above involve multiple conditions and multiple combined probabilities to determine the overall likelihood of selling the existing supply or needing to order more. We won’t solve this here since it requires introducing several new concepts and rules. This is presented to further demonstrate that there can be multiple conditions and assumptions involved in one calculation of probability.
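A much smaller example than the gelato problem can show what “given these Conditions” does to a result. The article does not state the formula, so I am assuming the standard textbook definition here: the probability of an Event given a Condition is the probability that both happen divided by the probability of the Condition. The sketch reuses the two-coin Sample Space, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Two fair coins, ordered results: HH, HT, TH, TT.
sample_space = set(product("HT", repeat=2))

def prob(event):
    """Probability of an event when all results are equally likely."""
    return Fraction(len(event), len(sample_space))

def conditional(event, condition):
    """P(event given condition) = P(event and condition) / P(condition)."""
    return prob(event & condition) / prob(condition)

two_heads = {r for r in sample_space if r == ("H", "H")}
at_least_one_head = {r for r in sample_space if "H" in r}

print(prob(two_heads))                            # no condition: 1/4
print(conditional(two_heads, at_least_one_head))  # given at least one Head: 1/3
```

Knowing that at least one coin shows Heads rules out TT, so the same event (two Heads) moves from 1 in 4 to 1 in 3. Stating or hiding a condition changes the number.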
Odds
Chance – probability – odds. In everyday language, some people (in the link, all three are used in the title and the body of the article) use those terms for the same idea. Odds are often talked about in situations that involve betting (e.g., horse racing). You don’t have to know probability to grasp the concept of “odds”, but, though the term is not used consistently, odds are dependent on probability.
Odds are defined as: (Probability an Event will occur) / (Probability an Event will not occur) = Odds
While probability is a number between 0 and 1 (or sometimes we view it as a percentage), the odds may be greater than or less than 1.
3.a.: If the probability an Event will occur = 0.8, the probability that the Event will not occur = 0.2. So the Odds = 0.8 / 0.2 = 4
3.b.: If the probabilities are each 0.5, the Odds are 1 (0.5 / 0.5 = 1)
3.c.: If the probability of an Event = 0.05, the Odds become closer to the probability. Odds = 0.05 / 0.95 ≈ 0.053
When the probability an Event will occur is high, the odds are a higher number. When that probability is low, the odds are low.
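The conversion between probability and odds is simple enough to write as two small functions. This is a minimal sketch using the definition given above. The function names are my own, and the reverse conversion (odds back to probability) is just that definition rearranged.

```python
def odds_from_probability(p):
    """Odds = (probability the Event occurs) / (probability it does not occur)."""
    if not 0 <= p < 1:
        raise ValueError("p must be at least 0 and less than 1")
    return p / (1 - p)

def probability_from_odds(odds):
    """Rearranging the definition: probability = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)

for p in (0.8, 0.5, 0.05):
    print(p, round(odds_from_probability(p), 3))
```

A probability of exactly 1 is rejected because the definition would divide by zero: an event that is certain has no finite odds.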
So What?
Probability is a way to mathematically measure likelihood of events or situations. It can measure simple chance events such as rolling dice or a business owner’s need (or lack of need) for more supply. It’s used for weather forecasts and for betting on sports events. It can help calculate prices and budgets. If used incorrectly, it can have negative effects on a decision. If used correctly, it can make us smile.
To the question of manipulation, probability (or chance or odds) is a commonly used value that can misrepresent the truth. We’ve implied by the lengthy (yet brief) discussion above that probability can be skewed in at least these ways:
- Not understanding what probability means. Though some probabilities involve more complicated calculations than those we’ve reviewed, not understanding the basics can lead to wrong conclusions.
- Miscalculation of Sample Space size. Large Sample Spaces may rely on use of combinatorics for counting the size of S.
- Not considering all conditions or not revealing all assumed conditions. A false or misleading premise can impact interpretation of the results.
- Failing to properly calculate the conditional probability. We did not review the details of these calculations here but you can see examples in the link above.
- Mixing probability, chance, and odds in ways that skew the results. As we saw above, chance is a vague term but both probability and odds have mathematical representations.
Next, we will look at statistics. It will also be a “simplified” review but will lead us toward the concluding discussion of how manipulation of numbers can affect us every day.