
## Thursday, 7 August 2014

### chapter-20 Terminology associated with Probability

It is in most cases very useful to talk about the mean of a random variable $X$. For example, in the experiment above of four tosses of a coin, someone might want to know the average number of heads obtained. Now the reader may wonder about what meaning to attach to the phrase “average number of heads”. After all, we are doing the experiment only once and we’ll obtain a particular value for the number of Heads, say $0$ or $1$ or $2$ or $3$ or $4$, but what is then this “average number of heads”?
By the average number of heads we mean this: repeat the experiment an indefinitely large number of times. Each time you’ll get a certain number of Heads. Take the average of the number of Heads obtained in each repetition of the experiment. For example, if in $5$ repetitions of this experiment, you obtain $2$, $2$, $3$, $1$, $1$ Heads respectively, the average number of Heads would be $(2 + 2 + 3 + 1 + 1) / 5 = 1.8$. This is not a natural number, which shouldn’t worry you since it is an average over the $5$ repetitions. To calculate the true average, you have to repeat the experiment an indefinitely large number of times.
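The idea of repeating the experiment many times can be tried out numerically. The following is only a rough sketch: the function name `average_heads`, the fixed seed and the chosen repetition counts are illustrative assumptions, and a pseudo-random generator stands in for a real fair coin.

```python
import random

def average_heads(repetitions, tosses=4, seed=0):
    """Average number of Heads per repetition of a `tosses`-toss experiment.

    Hypothetical helper for illustration; a fair coin is assumed.
    """
    rng = random.Random(seed)  # fixed seed so that a run can be reproduced
    total_heads = 0
    for _ in range(repetitions):
        # One repetition: toss the coin `tosses` times and count the Heads.
        total_heads += sum(rng.random() < 0.5 for _ in range(tosses))
    return total_heads / repetitions

# The average fluctuates for few repetitions and settles down as N grows.
for n in (10, 1_000, 100_000):
    print(n, average_heads(n))
```

With only a handful of repetitions the printed average can be far from its long-run value; with more repetitions it should move less and less.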
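An alert reader might have realized that the average value of a $RV$ is easily calculable through its $PD$. For example, let us calculate the true average number of heads in the experiment of Example – $18$. The $PD$ is reproduced below:
 $P(X = 0) = \dfrac{1}{16},\;\; P(X = 1) = \dfrac{1}{4},\;\; P(X = 2) = \dfrac{3}{8},\;\; P(X = 3) = \dfrac{1}{4},\;\; P(X = 4) = \dfrac{1}{16}$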
Thus, for example, $P(X = 1) = \dfrac{1}{4}$, which means that if the experiment is repeated an indefinitely large number of times, we’ll obtain Heads exactly once (about) ${\dfrac{1}{4}^{th}}$ of the time. Similarly, (about) ${\dfrac{3}{8}^{th}}$ of the time, Heads will be obtained exactly twice, and so on. Let us denote the number of repetitions of the experiment by $N$, where $N \to \infty$. Thus, the average number of Heads per repetition would be ($\left\langle \; \right\rangle$ denotes average)
 $\left\langle {{\rm{Heads}}} \right\rangle \; = \dfrac{{{\rm{Total\, no.\, of\, Heads\, in\, }}N{\rm{\, repetitions}}}}{N}$ $= \dfrac{{0 \times \dfrac{N}{{16}} + 1 \times \dfrac{N}{4} + 2 \times \dfrac{{3N}}{8} + 3 \times \dfrac{N}{4} + 4 \times \dfrac{N}{{16}}}}{N}$ $= 0 \times \dfrac{1}{{16}} + 1 \times \dfrac{1}{4} + 2 \times \dfrac{3}{8} + 3 \times \dfrac{1}{4} + 4 \times \dfrac{1}{{16}}$ $= \sum$ (Value of the $RV$) $\times$ (Corresponding Probability of this value) $= 2$
Thus, we see that if a $RV$ $X$ has possible values ${x_1},{x_2},\ldots ,{x_n}$ with respective probabilities ${p_1},{p_2},\ldots ,{p_n}$, the mean of $X$, denoted by $\left\langle X \right\rangle$, is simply given by
 $\left\langle X \right\rangle = \sum\limits_{i = 1}^n {{x_i}\,{p_i}}$ $\ldots(1)$
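Formula $(1)$ translates almost word for word into code. The sketch below is one possible way to write it; the function name `expected_value` and the dictionary layout `{value: probability}` are assumptions made for illustration, and exact fractions are used so that no rounding error enters.

```python
from fractions import Fraction

def expected_value(distribution):
    """Mean of a discrete RV given as {value: probability} (formula (1))."""
    # A valid PD must have probabilities that add up to exactly 1.
    assert sum(distribution.values()) == 1, "probabilities must sum to 1"
    return sum(x * p for x, p in distribution.items())

# PD of the number of Heads in four tosses of a fair coin.
heads_pd = {
    0: Fraction(1, 16),
    1: Fraction(1, 4),
    2: Fraction(3, 8),
    3: Fraction(1, 4),
    4: Fraction(1, 16),
}

print(expected_value(heads_pd))  # prints 2
```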
As another example, recall the experiment of rolling two dice where the $RV$ $X$ was the sum of the numbers on the two dice. The $PD$ of $X$ is given in the table on Page – $42$, and the average value of $X$ is
 $\left\langle X \right\rangle \; = 2 \times \dfrac{1}{{36}} + 3 \times \dfrac{1}{{18}} + 4 \times \dfrac{1}{{12}} + 5 \times \dfrac{1}{9} + 6 \times \dfrac{5}{{36}} + 7 \times \dfrac{1}{6}$ $+ 8 \times \dfrac{5}{{36}} + 9 \times \dfrac{1}{9} + 10 \times \dfrac{1}{{12}} + 11 \times \dfrac{1}{{18}} + 12 \times \dfrac{1}{{36}}$ $=7$
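The same result can be checked by brute force. This sketch assumes two fair six-sided dice, lists all $36$ equally likely outcomes, builds the $PD$ of the sum and applies the mean formula; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# PD of the sum: each outcome has probability 1/36.
dice_pd = {total: Fraction(n, 36) for total, n in counts.items()}

# Mean = sum of (value x probability) over all possible sums.
mean = sum(x * p for x, p in dice_pd.items())
print(mean)  # prints 7
```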
The average value is also called the expected value, which signifies that it is what we can expect to obtain by averaging the $RV$’s value over a large number of repetitions of the experiment. Note that the value itself may not be expected in the general sense – the “expected value” itself may be unlikely or even impossible. For example, in the rolling of a fair die, the expected value of the number that shows up is $3.5$ (verify), which in itself can never be a possible outcome. Thus, you must take care while interpreting the expected value – see it as an average of the $RV$’s values when the experiment is repeated indefinitely.
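The verification for the fair die is a direct application of $(1)$, since each of the six faces shows up with probability $\dfrac{1}{6}$:
 $\left\langle X \right\rangle = 1 \times \dfrac{1}{6} + 2 \times \dfrac{1}{6} + 3 \times \dfrac{1}{6} + 4 \times \dfrac{1}{6} + 5 \times \dfrac{1}{6} + 6 \times \dfrac{1}{6}$ $= \dfrac{{21}}{6}$ $= 3.5$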
Another quantity of great significance associated with any $RV$ $X$ is its variance, denoted by $Var(X)$. To understand this properly, consider two $RV$s ${X_1}$ and ${X_2}$ and their $PD$s shown in graphical form below.
Both the $RV$s have an expected value of $3$ (verify), but it is obvious that there is a significant difference between the two distributions. What is this difference? Can you put it into words? And more importantly, can you quantify it?
It turns out that we can, in a way very simple to understand. The ‘data’ or the $PD$ of ${X_1}$ is more widely spread than that of ${X_2}$. This is what is obvious visually, but we must now assign a numerical value to this spread. So what we’ll do is measure the spread of the $PD$ about the mean of the $RV$. For both ${X_1}$ and ${X_2}$, the mean is $3$, but the $PD$ of ${X_1}$ is spread more about $3$ than that of ${X_2}$. We now quantify the spread in ${X_1}$.
Observe that the various values of $X - \left\langle X \right\rangle$ tell us how far the corresponding values of $X$ are from the mean (which is fixed). One way that may come to your mind to measure the spread is to sum all these distances, i.e.
 ${\rm{Spread}}= \sum\limits_{\scriptstyle{\rm{For \,all}}\hfill\atop {\scriptstyle{\rm{values }}\,\hfill\atop \scriptstyle{\rm{of }}\,X\hfill}} {\left( {X{\rm{ - }}\left\langle X \right\rangle } \right)}$
However, a little thinking should make it clear that this measure is of no use. For a $PD$ that is symmetric about its mean, such as the two above, the right hand side is $0$, because the positive contributions to the sum from those $X$ values greater than $\left\langle X \right\rangle$ and the negative contributions from those $X$ values smaller than $\left\langle X \right\rangle$ exactly cancel out. More generally, once each distance is weighted by the probability of the corresponding value, the sum is $0$ for every $PD$. Work it out yourself.
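The cancellation can be checked numerically. The sketch below uses the $PD$ of ${X_1}$ (values $1$ to $5$ with probabilities $\dfrac{1}{10}, \dfrac{1}{5}, \dfrac{2}{5}, \dfrac{1}{5}, \dfrac{1}{10}$, as used in the variance calculation for ${X_1}$ later in this section). The second, lopsided distribution is a made-up example, included only to show that the probability-weighted sum of the distances vanishes even without symmetry. The function names are illustrative.

```python
from fractions import Fraction

def mean(pd):
    """Mean of a discrete RV given as {value: probability}."""
    return sum(x * p for x, p in pd.items())

def weighted_deviation_sum(pd):
    """Sum of (x - mean) * probability over all values of the RV."""
    m = mean(pd)
    return sum((x - m) * p for x, p in pd.items())

# PD of X1, symmetric about its mean of 3.
x1_pd = {1: Fraction(1, 10), 2: Fraction(1, 5), 3: Fraction(2, 5),
         4: Fraction(1, 5), 5: Fraction(1, 10)}

m1 = mean(x1_pd)
print(sum(x - m1 for x in x1_pd))     # plain sum of distances: prints 0
print(weighted_deviation_sum(x1_pd))  # weighted sum: prints 0

# Hypothetical lopsided PD (not from the text): weighted sum is still 0.
skewed_pd = {0: Fraction(9, 10), 1: Fraction(1, 10)}
print(weighted_deviation_sum(skewed_pd))  # prints 0
```

Since the signed distances cancel, they cannot measure spread, which is why the distances are squared next.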
So what we do is use the sum of the squares of these distances:
 ${\rm{spread}}= {\sum\limits_{\scriptstyle{\rm{For\, all}}\hfill\atop {\scriptstyle{\rm{values }}\hfill\atop \,\scriptstyle{\rm{of }}\,X\hfill}} {\left( {X{\rm{ - }}\left\langle X \right\rangle } \right)} ^2}$
However, there is still something missing. To understand what, consider the following $PD$:
Although the $PD$ seems visually widespread here, the probabilities of those $X$ values far from the mean are extremely low, which means that their contribution to the spread must take into account how probable they are. This is simply accomplished by multiplying the value of ${\left( {X - \left\langle X \right\rangle } \right)^2}$ with the probability of the corresponding value of $X$.
Thus, if $X$ can take the values ${x_1},{x_2},\ldots ,{x_n}$ with probabilities ${p_1},{p_2},\ldots ,{p_n}$, the spread in the $PD$ of $X$ can be appropriately represented by
 ‘Spread’ $= {\sum\limits_{i\; = \;1}^n {\left( {{x_i} - \left\langle X \right\rangle } \right)} ^2}{p_i}$
This definition of spread is termed the variance of $X$, and is denoted by $Var(X)$. Statisticians define another quantity for spread, called the standard deviation, denoted by $\sigma _X$, and related to the variance by
 $Var(X) = \sigma _X^2$
Note that the expected value of $X$ was
 $\left\langle X \right\rangle = \;\sum\limits_{i = 1}^n {{x_i}{p_i}}$
Similarly, variance is nothing but the expected value of ${(X - \left\langle X \right\rangle )^2}$
 $Var\left( X \right) = \left\langle {{{\left( {X - \left\langle X \right\rangle } \right)}^2}} \right\rangle \; = \;\sum\limits_{i = 1}^n {{{\left( {{x_i} - \left\langle X \right\rangle } \right)}^2}{p_i}}$
Coming back to Fig-$16$, the variance in ${X_1}$ is
 $Var({X_1}) = \;{(1 - 3)^2} \cdot \dfrac{1}{{10}} + {(2 - 3)^2} \cdot \dfrac{1}{5}$ $+ {(3 - 3)^2} \cdot \dfrac{2}{5} + {(4 - 3)^2} \cdot \dfrac{1}{5} + {(5 - 3)^2} \cdot \dfrac{1}{{10}}$ $= \dfrac{4}{{10}} + \dfrac{1}{5} + 0 + \dfrac{1}{5} + \dfrac{4}{{10}}$ $=1.2$
Similarly, the variance in ${X_2}$ is
 $Var({X_2}) = {(2 - 3)^2} \cdot \dfrac{1}{4} + {(3 - 3)^2} \cdot \dfrac{1}{2} + {(4 - 3)^2} \cdot \dfrac{1}{4}$ $= \dfrac{1}{4} + 0 + \dfrac{1}{4}$ $= 0.5$
which confirms our visual observation that the $PD$ of ${X_1}$ is more widely spread than that of ${X_2}$, because $Var({X_1}) > Var({X_2})$.
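These two calculations can be reproduced in a few lines. The sketch below takes the probabilities of ${X_1}$ and ${X_2}$ from the two variance calculations above; the function names `mean` and `variance` are illustrative, and the standard deviation is obtained as the square root of the variance.

```python
from fractions import Fraction
from math import sqrt

def mean(pd):
    """Expected value of a discrete RV given as {value: probability}."""
    return sum(x * p for x, p in pd.items())

def variance(pd):
    """Probability-weighted sum of squared distances from the mean."""
    m = mean(pd)
    return sum((x - m) ** 2 * p for x, p in pd.items())

x1_pd = {1: Fraction(1, 10), 2: Fraction(1, 5), 3: Fraction(2, 5),
         4: Fraction(1, 5), 5: Fraction(1, 10)}
x2_pd = {2: Fraction(1, 4), 3: Fraction(1, 2), 4: Fraction(1, 4)}

for name, pd in (("X1", x1_pd), ("X2", x2_pd)):
    var = variance(pd)
    # Columns: name, mean, variance (exact), standard deviation (float).
    print(name, mean(pd), var, sqrt(var))
```

Both means come out as $3$, while the variances are $\dfrac{6}{5} = 1.2$ and $\dfrac{1}{2} = 0.5$, in agreement with the hand calculation.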