Tuesday, October 19, 2010

Fundamental Ideas in Probability through Examples

Counting

The first fundamental idea in probability is that of counting.

An experiment is a procedure which, when conducted, results in an outcome - for example, tossing a coin, throwing a die or dice, or testing whether a disease exists in a member of a population.

The total set of all possible outcomes for the experiment is called the sample space. For example, for the experiment of tossing a coin, the set of all possible outcomes is the set { H, T }. For the experiment of two flips of a coin, the sample space is { HH, HT, TH, TT }. And so on. Enumerating these sets of outcomes is one aspect of counting in probability.

An event is a set of possible outcomes. For example, getting exactly one H in two flips of a coin is an event; getting at least one H in two flips is another event.

And the other aspect of counting is enumerating the outcomes for a given event. For example, the set of outcomes for the event of getting exactly one H in two flips of a coin is {HT, TH}.

The definition of probability is:

P(E) = n(E) / n(S)

which says that the probability of the occurrence of the event E is the ratio of the number of outcomes corresponding to E to the total number of outcomes possible (assuming all outcomes are equally likely).
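To make the counting concrete, here is a small sketch in Python (Python is an assumption here - the post itself uses no particular language). It enumerates the sample space for two flips of a coin and counts the outcomes for one event:

```python
from itertools import product

# Sample space for two flips of a coin: { HH, HT, TH, TT }
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

# Event E: getting exactly one H in two flips
event = [outcome for outcome in sample_space if outcome.count("H") == 1]

# P(E) = n(E) / n(S)
probability = len(event) / len(sample_space)
print(sample_space)  # ['HH', 'HT', 'TH', 'TT']
print(event)         # ['HT', 'TH']
print(probability)   # 0.5
```

The same pattern - enumerate the sample space, filter for the event, take the ratio - works for any of the small experiments in this post.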

Conditional Probability

If there are two events defined on a sample space, it is possible that the probability of one event is changed if the outcome of the other event is known.

Consider the experiment of two flips of a coin: the sample space is { HH, HT, TH, TT }. Consider two events: A is the event of getting at least one H in two flips and B is the event of getting a T in the first flip.

The question is, what is the probability of A when the event B has occurred? To answer this question we have to get back to counting: If B has occurred, then the possible outcomes are { TH, TT }. So, A can occur only in the case TH. Therefore, the probability of A occurring is 1/2.

Without the knowledge of B having occurred, the probability of A occurring would be 3/4. The knowledge that one T has already occurred has changed the probability for the event A. This is known as conditional probability.

P(A|B) is the notation for the conditional probability of A when event B has occurred.

The following relationship holds for conditional probability:
P(A|B) = P(A and B) / P(B)

In our example, the probability of A by itself is 3/4 and that for B by itself is 1/2.

P(A and B) is the probability of both events occurring together. The only outcome with at least one H and a T on the first flip is TH, so P(A and B) = 1/4.

The calculation becomes: P(A|B) = (1/4) / (1/2) = 1/2 (same as what we got through counting).
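The same check can be done programmatically. A small Python sketch (assuming, as above, that A is the event of at least one H, so that P(A) = 3/4):

```python
from itertools import product

sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

# A: at least one H in two flips; B: T on the first flip
A = {o for o in sample_space if "H" in o}
B = {o for o in sample_space if o[0] == "T"}

p_A = len(A) / len(sample_space)            # 3/4
p_B = len(B) / len(sample_space)            # 1/2
p_A_and_B = len(A & B) / len(sample_space)  # only TH qualifies: 1/4

# Conditional probability: P(A|B) = P(A and B) / P(B)
p_A_given_B = p_A_and_B / p_B
print(p_A_given_B)  # 0.5
```

Note that P(A and B) = 1/4 is not equal to P(A) x P(B) = 3/8 - these two events are not independent, which is exactly why knowing B changes the probability of A.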

Bayes Theorem

Certain relationships exist between simple and conditional probabilities that make it possible to infer additional probabilities. Bayes theorem provides these relationships.

As an example, consider the following situation:
A certain disease exists in 0.5% of a population (0.005).
A certain blood test is 99% accurate when the disease is present (0.99).
The blood test gives a positive result in 5% of cases when the disease does not exist (0.05).

The first statement above expresses the probability of the event (A) of the disease existing in a randomly chosen member of the population. The notation for this probability is P(A).

A second event B of getting a positive result of the blood test is also implied. However, the statements regarding this event are expressed as conditional probabilities:

P(B|A) expresses that the blood test will give a positive result when the disease is present. That is, when it is already known that the disease is present, what is the probability of getting a positive result? The answer is 0.99.

The third known probability is P(B|~A) which expresses that the blood test will give a positive result when it is known that the disease is not present. This value is 0.05.

What is not known is the probability of having the disease when the result of the test is positive.

If this number were, say, 0.8, it would imply that two people in ten who tested positive don't actually have the disease. That would make the test quite useless for prescribing medication with side effects, since two people out of ten would needlessly have to suffer those side effects. But if the number worked out to 0.95, it would make the test quite useful.

Using Bayes theorem, it is possible to answer this and other related questions. Bayes theorem itself and the related calculations are the subject of another blog post. Here, we are presenting the case for and context of Bayes theorem.
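While the derivation is left to that post, a small Python sketch gives a preview of how the three known probabilities combine (assuming the standard form of Bayes theorem, P(A|B) = P(B|A) P(A) / P(B), with P(B) expanded by the law of total probability):

```python
# The three known probabilities from the statements above
p_disease = 0.005           # P(A)
p_pos_given_disease = 0.99  # P(B|A)
p_pos_given_healthy = 0.05  # P(B|~A)

# Law of total probability: P(B) = P(B|A)P(A) + P(B|~A)P(~A)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # roughly 0.09
```

The answer turns out to be surprisingly small - the rarity of the disease means most positive results come from the healthy majority - which is exactly the kind of non-obvious conclusion that makes Bayes theorem worth a post of its own.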

Random Variables

Random variables are used to express probabilities of a group of related events. For example, an event could express getting a sum of 5 on the roll of two dice. But, related to this event are events where the sum is 2, 3, ... 12. Random variables are used to express the groups of related events and their respective probabilities.

A random variable is a function that maps outcomes from the sample space into numbers. This statement implies the definition of an experiment without which the set of outcomes is meaningless. Further, this statement also implies that the random variable expresses something about the outcomes. Let's look at an example.

For example, suppose the experiment is two flips of a coin. The sample space is { HH, HT, TH, TT }. One possibility for a random variable could be:

X: the number of heads

The values for X are { 0, 1, 2 }

So, X has taken the outcomes and mapped them into a set of numbers.

Now, each of the values of X has a probability associated with it:

P(X=0) = 1/4
P(X=1) = 1/2
P(X=2) = 1/4

The above can be expressed as a function f whose individual values f(x) give the probability P(X=x). This means:

f(0) = P(X=0) = 1/4
f(1) = P(X=1) = 1/2
f(2) = P(X=2) = 1/4

The function f is called the probability mass function of X.
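The probability mass function can be computed directly by counting outcomes. A small Python sketch (using collections.Counter to tally the values of X):

```python
from itertools import product
from collections import Counter

sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

# X maps each outcome to the number of heads
counts = Counter(outcome.count("H") for outcome in sample_space)

# f(x) = P(X = x): each count divided by the size of the sample space
f = {x: n / len(sample_space) for x, n in counts.items()}
for x in sorted(f):
    print(x, f[x])
# 0 0.25
# 1 0.5
# 2 0.25
```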

There is one more concept commonly associated with random variables - the Expected Value.

The Expected Value E(X) is the weighted sum of the values of X with their corresponding probabilities. For the above example, the expected value is:

E(X) = 0 x f(0) + 1 x f(1) + 2 x f(2)
= 0 + 1/2 + 2/4
= 1

What this implies is that if the experiment is carried out a large enough number of times and the values of X obtained from the experiments are averaged, then the answer will be close to 1.

This is reasonable because we are likely to get the value 0 for a quarter of the time, 1 for half the time, and 2 for the remaining quarter. Think of a sequence like 1, 2, 1, 0, 1, 0, 1, 2, ...: its average is going to be 1 (or close to it).

So the Expected Value of a random variable is the long-run average of the values obtained from a large number of experiments.
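A small Python sketch checks both views of the Expected Value - the weighted sum over the probability mass function, and the long-run average from a simulation (the 100,000-trial count is an arbitrary choice for illustration):

```python
import random

# Expected value from the pmf: E(X) = sum of x * f(x)
f = {0: 0.25, 1: 0.5, 2: 0.25}
e_x = sum(x * p for x, p in f.items())
print(e_x)  # 1.0

# Sanity check by simulation: average X over many two-flip experiments
random.seed(42)
trials = 100_000
total = sum(sum(random.choice("HT") == "H" for _ in range(2))
            for _ in range(trials))
print(total / trials)  # close to 1
```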
