Thursday, August 18, 2005

What is a Random Variable?

Random Variable

The outcome of an experiment need not be a number. For example, the outcome when a coin is tossed can be 'heads' or 'tails'. However, we often want to represent outcomes as numbers. A random variable is a function that assigns exactly one numerical value to every outcome of an experiment. The value of the random variable will vary from trial to trial as the experiment is repeated.

There are two main types of random variable: discrete and continuous.

A random variable has either an associated probability distribution (discrete random variable) or probability density function (continuous random variable).

Examples

  1. A coin is tossed ten times. The random variable X is the number of tails that are noted. X can only take the values 0, 1, ..., 10, so X is a discrete random variable.
  2. A light bulb is burned until it burns out. The random variable Y is its lifetime in hours. Y can take any positive real value, so Y is a continuous random variable.
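
As a rough illustration of the first example, the sketch below treats the random variable X as an ordinary function that maps an outcome (a sequence of ten tosses) to a number. The function names and the fixed seed are hypothetical choices made for this sketch, and the coin is assumed to be fair.

```python
import random


def number_of_tails(tosses):
    """Random variable X: map an outcome (a sequence of tosses) to a number."""
    return sum(1 for toss in tosses if toss == "tails")


def toss_coin_ten_times(rng):
    """One run of the experiment: ten tosses of a coin assumed to be fair."""
    return [rng.choice(["heads", "tails"]) for _ in range(10)]


rng = random.Random(2005)  # fixed seed (arbitrary) so reruns repeat the same trials
trials = [number_of_tails(toss_coin_ten_times(rng)) for _ in range(5)]
print(trials)  # five observed values of X, each an integer from 0 to 10
```

Each repetition of the experiment gives a new outcome, so the value of X changes from trial to trial even though the function itself is fixed.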

Probability Distribution

The probability distribution of a discrete random variable is a list of probabilities associated with each of its possible values. It is also sometimes called the probability function or the probability mass function.

More formally, the probability distribution of a discrete random variable X is a function which gives the probability p(xi) that the random variable equals xi, for each value xi: p(xi) = P(X=xi)

It satisfies the following conditions:

  1. 0 <= p(xi) <= 1
  2. sum of all p(xi) is 1
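
The two conditions can be checked mechanically. The sketch below does so for the number of tails in ten tosses, assuming a fair coin so that p(k) = C(10, k) / 2^10; the helper name `is_valid_pmf` is made up for this illustration. Exact fractions are used so that the sum is compared with 1 without rounding error.

```python
from fractions import Fraction
from math import comb

N_TOSSES = 10  # ten tosses of a coin assumed to be fair

# p(k) = P(X = k) for k = 0, 1, ..., 10
pmf = {k: Fraction(comb(N_TOSSES, k), 2 ** N_TOSSES) for k in range(N_TOSSES + 1)}


def is_valid_pmf(p):
    """Check both conditions: every p(xi) lies in [0, 1] and they sum to 1."""
    each_in_range = all(0 <= prob <= 1 for prob in p.values())
    sums_to_one = sum(p.values()) == 1
    return each_in_range and sums_to_one


print(is_valid_pmf(pmf))  # True
```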

Cumulative Distribution Function

All random variables (discrete and continuous) have a cumulative distribution function. It is a function giving the probability that the random variable X is less than or equal to x, for every value x.

Formally, the cumulative distribution function F(x) is defined to be: F(x) = P(X<=x)
for -infinity < x < infinity

For a discrete random variable, the cumulative distribution function is found by summing up the probabilities as in the example below.

For a continuous random variable with probability density function f, the cumulative distribution function is the integral of f from -infinity up to x.
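
To make the continuous case concrete, the sketch below integrates a density numerically and compares the result with a closed form. It assumes, purely for illustration, that a lifetime in hours follows an exponential density with a mean of 1000 hours; neither the distribution nor the rate comes from the text above, and the function names are hypothetical.

```python
import math

RATE = 1 / 1000.0  # hypothetical rate: mean lifetime of 1000 hours


def pdf(t):
    """Assumed exponential density; zero for negative lifetimes."""
    return RATE * math.exp(-RATE * t) if t >= 0 else 0.0


def cdf_numeric(x, steps=10_000):
    """Approximate F(x) by integrating the density from 0 to x (trapezoid rule).

    The density is zero below 0, so the integral from -infinity reduces to this.
    """
    if x <= 0:
        return 0.0
    h = x / steps
    total = 0.5 * (pdf(0.0) + pdf(x))
    for i in range(1, steps):
        total += pdf(i * h)
    return total * h


def cdf_exact(x):
    """Closed form of the same integral for the exponential density."""
    return 1.0 - math.exp(-RATE * x) if x > 0 else 0.0


print(cdf_numeric(1000.0), cdf_exact(1000.0))  # the two values should agree closely
```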

Example
Discrete case: Suppose a random variable X has the following probability distribution p(xi):

  xi      0      1      2      3      4      5
  p(xi)   1/32   5/32   10/32  10/32  5/32   1/32

This is actually a binomial distribution: Bi(5, 0.5) or B(5, 0.5). The cumulative distribution function F(x) is then:

  xi      0      1      2      3      4      5
  F(xi)   1/32   6/32   16/32  26/32  31/32  32/32

F(x) does not change between consecutive values xi: it stays at the value it took at the largest xi not exceeding x. For example:
F(1.3) = F(1) = 6/32
F(2.86) = F(2) = 16/32
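
The discrete example can be reproduced with a short sketch: the probabilities of B(5, 0.5) are summed cumulatively, and F(x) is looked up as a step function. The names `values`, `cumulative` and `cdf` are chosen for this illustration only.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate
from math import comb

values = [0, 1, 2, 3, 4, 5]
# B(5, 0.5): p(k) = C(5, k) / 2^5, matching the p(xi) row of the example
pmf = [Fraction(comb(5, k), 32) for k in values]

# Running sums of p(xi) give F(xi) at each possible value
cumulative = list(accumulate(pmf))


def cdf(x):
    """F(x) = P(X <= x): sum p(xi) over all xi not exceeding x."""
    count = bisect_right(values, x)  # how many possible values are <= x
    return cumulative[count - 1] if count > 0 else Fraction(0)


# Fraction reduces to lowest terms, so 6/32 prints as 3/16 and 16/32 as 1/2
print(cdf(1.3), cdf(2.86))
```

Below the smallest possible value the function returns 0, and from the largest value onward it returns 1.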
