To understand the underlying concept of inverse probability, let's start with a very simple example.
Suppose that you have two bags, one containing red and white balls, and the second containing white and red balls.
You play a game with your friend. The friend tosses a fair coin without telling you the outcome: if he gets a Head, he draws a ball from Bag-1, while if he gets a Tail, he draws a ball from Bag-2, with you looking away all the time. After doing this once, he has a red ball in his hand. Which bag do you think is the more likely one from which this ball was drawn? Intuition immediately tells us that it should be the bag with the larger number of red balls. What we need to do now is quantify the inverse probability of the ball having been drawn from Bag-1 and from Bag-2, the word ‘inverse’ being used because you are trying to find the probability of an event that has already taken place, using information from a subsequent event.
We do this intuitively all the time: “India’s scorecard against Pakistan in yesterday’s match had a century.” “Oh, it would most likely have been Tendulkar!” Here, the speaker is expressing his conviction that the century-scorer must have been Tendulkar, since he has in mind information about the various players, and he thinks Tendulkar is the best.
Coming back to our bags and balls, let us draw a tree diagram highlighting the various possible actions your friend can take (the brackets show the probabilities of the corresponding paths):
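Now comes the crucial part. Note that the total probability of selecting a red ball is the sum of the probabilities of the two darkened paths (one through Bag-1, one through Bag-2). Since the coin is fair, each bag is chosen with probability 1/2, so this is
$$P(\text{red}) = \tfrac{1}{2}\,P(\text{red} \mid \text{Bag-1}) + \tfrac{1}{2}\,P(\text{red} \mid \text{Bag-2}).$$
Now, the probability of selecting a red ball through Bag-1 corresponds to the upper path only, and it equals
$$P(\text{Bag-1 and red}) = \tfrac{1}{2}\,P(\text{red} \mid \text{Bag-1}).$$
Finally, it should now be intuitively obvious that
$$P(\text{Bag-1} \mid \text{red}) = \frac{\tfrac{1}{2}\,P(\text{red} \mid \text{Bag-1})}{\tfrac{1}{2}\,P(\text{red} \mid \text{Bag-1}) + \tfrac{1}{2}\,P(\text{red} \mid \text{Bag-2})}, \qquad P(\text{Bag-2} \mid \text{red}) = \frac{\tfrac{1}{2}\,P(\text{red} \mid \text{Bag-2})}{\tfrac{1}{2}\,P(\text{red} \mid \text{Bag-1}) + \tfrac{1}{2}\,P(\text{red} \mid \text{Bag-2})}.$$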
Note the huge difference between the two probabilities, which was expected. Also expected is the fact that the two probabilities sum to 1.
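The game can also be checked empirically by playing it many times and counting, among the rounds that produce a red ball, how often each bag was the source. The following is only a minimal sketch: the function name `simulate_game`, the bag contents (9 red and 1 white in Bag-1, 1 red and 9 white in Bag-2) and the Head-to-Bag-1 mapping are illustrative assumptions, not values taken from the example above.

```python
import random


def simulate_game(bag1, bag2, trials=100_000, seed=0):
    """Estimate P(Bag-1 | red) and P(Bag-2 | red) by repeated play.

    bag1 and bag2 are (red_count, white_count) tuples. The counts used
    in the usage example below are hypothetical.
    """
    rng = random.Random(seed)
    red_from = {1: 0, 2: 0}  # red draws, keyed by the bag they came from
    for _ in range(trials):
        # Fair coin: Head -> Bag-1, Tail -> Bag-2 (assumed mapping).
        bag_id = 1 if rng.random() < 0.5 else 2
        red, white = bag1 if bag_id == 1 else bag2
        # Draw one ball uniformly at random from the chosen bag.
        if rng.random() < red / (red + white):
            red_from[bag_id] += 1
    total_red = red_from[1] + red_from[2]
    # Keep only the rounds that ended with a red ball, then take the
    # fraction of those rounds contributed by each bag.
    return red_from[1] / total_red, red_from[2] / total_red


# Hypothetical contents: Bag-1 has 9 red, 1 white; Bag-2 has 1 red, 9 white.
p_bag1, p_bag2 = simulate_game((9, 1), (1, 9))
print(p_bag1, p_bag2)
```

For these assumed counts the exact answer is 0.45 / (0.45 + 0.05) = 0.9 for Bag-1, so the first estimate should come out near 0.9, and the two estimates always add up to 1 because every red ball came from one bag or the other.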
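This, then, is the essence of calculating inverse probabilities. We are given the information that an event has occurred. This event can occur through paths $p_1$, $p_2$, $\ldots$, $p_n$. We want to find the probability that it occurred through some particular path, say $p_i$, which is
$$\frac{P(p_i)}{P(p_1) + P(p_2) + \cdots + P(p_n)},$$
where $P(p_k)$ denotes the probability of path $p_k$.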
Let us write this in standard terminology, which will give us Bayes' theorem.
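Suppose that the sample space consists of $n$ mutually exclusive events $E_1, E_2, \ldots, E_n$. Now, an event $A$ occurs, which could have resulted from any of the events $E_i$. (For example, think of $A$ as obtaining a red ball in the previous example, while $E_1$ and $E_2$ are selecting Bag-1 and Bag-2 respectively.) We intend to find $P(E_i \mid A)$, i.e., the probability that $E_i$ occurred given that $A$ has occurred. There are now two ways to do the visualisation:
The left hand tree we have already explained. The right hand side shows that $A$ is an event that has occurred, which must have been a result of one of the $E_i$ occurring (i.e. one of the $E_i$'s must occur for $A$ to occur).
From the tree, evaluating $P(E_i \mid A)$ has already been explained:
$$P(E_i \mid A) = \frac{P(E_i)\,P(A \mid E_i)}{\sum_{j=1}^{n} P(E_j)\,P(A \mid E_j)}$$
The same relation follows from the second figure:
$$P(E_i \mid A) = \frac{P(A \cap E_i)}{P(A)} = \frac{P(A \cap E_i)}{\sum_{j=1}^{n} P(A \cap E_j)} = \frac{P(E_i)\,P(A \mid E_i)}{\sum_{j=1}^{n} P(E_j)\,P(A \mid E_j)}$$
Thus, the famous Bayes' theorem is
$$P(E_i \mid A) = \frac{P(E_i)\,P(A \mid E_i)}{\sum_{j=1}^{n} P(E_j)\,P(A \mid E_j)}$$
The name “inverse” stems from the fact that this relation gives us $P(E_i \mid A)$ in terms of $P(A \mid E_i)$. The theorem is also known as a theorem on the probability of causes: it tells us how likely each possible cause $E_i$ is, given that the effect $A$ has been observed.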
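The theorem translates almost line for line into code: multiply the probability of each candidate cause by the probability of the observed event under that cause, then divide by the sum of these products. The sketch below is one possible way to write this. The helper name `bayes_posteriors` and the numbers in the usage example are hypothetical and chosen only for illustration, and exact fractions are used so that no rounding hides the arithmetic.

```python
from fractions import Fraction


def bayes_posteriors(priors, likelihoods):
    """Return the probability of each cause given the observed event.

    priors[i]      : probability of the i-th mutually exclusive cause
    likelihoods[i] : probability of the observed event given that cause
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Probability of each path: cause i happens AND the event follows.
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Total probability of the observed event (sum over all paths).
    total = sum(joint)
    if total == 0:
        raise ValueError("the observed event has zero probability")
    # Each posterior is that path's share of the total.
    return [j / total for j in joint]


# Hypothetical numbers: two equally likely bags, with a red ball drawn
# with probability 9/10 from the first and 1/10 from the second.
priors = [Fraction(1, 2), Fraction(1, 2)]
likelihoods = [Fraction(9, 10), Fraction(1, 10)]
print(bayes_posteriors(priors, likelihoods))
# prints [Fraction(9, 10), Fraction(1, 10)]
```

The returned values always sum to 1, which mirrors the earlier observation that the inverse probabilities of the two bags add up to 1.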
In the examples following this, we’ll be using the tree diagram to calculate inverse probabilities. With sufficient practice, you’ll eventually no longer need to draw the tree diagram, because by then you’ll be quite comfortable using Bayes' theorem directly.