Monday, October 17, 2005

Probability Distribution function

Random variables: real-valued functions defined on the sample space, i.e. they map each outcome of an experiment (which need not be a number) to a real number.

Example: when rolling a pair of dice we are often not interested in the individual outcomes but in a function of the outcomes, such as their sum.

A random variable that can take at most a countable number of possible values is said to be discrete. For a discrete random variable we define a probability mass function, which gives the probability of each possible value.
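As a small illustration of the definitions above, here is a sketch that builds the probability mass function of one particular discrete random variable. The choice of two fair dice and their sum is my own example, and the names `X` and `pmf` are just for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: maps an outcome (a, b) to the real number a + b."""
    a, b = outcome
    return a + b

# Probability mass function: P(X = k) for every value k that X can take.
counts = Counter(X(o) for o in outcomes)
pmf = {k: Fraction(c, len(outcomes)) for k, c in sorted(counts.items())}

for k, p in pmf.items():
    print(k, p)
```

The probabilities are kept as exact fractions so it is easy to check that they sum to 1.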

Picked from lecture notes :) for my reference ( thanks to whoever did it)

Cumulative Distribution Function
--------------------------------------
What is the probability that x is less than or equal to x0?
The probability that x ≤ x0 is the integral of the probability density, written p(x) here, from -∞ to x0:
$$P(x \le x_0) = \int_{-\infty}^{x_0} p(x)\,dx$$

This integral yields the area under the density curve between x = -∞ and x = x0
and is called the cumulative distribution function, or cdf, denoted by ‘g’.
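To make the "area under the curve" idea concrete, here is a rough sketch that approximates the cdf by numerical integration. The standard normal density is my own choice of example (the notes do not name a density), and `lower=-10.0` is an assumed stand-in for -∞.

```python
import math

def density(x):
    """Standard normal probability density (an assumed example density)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf_numeric(x0, lower=-10.0, steps=100_000):
    """Approximate the area under density() from `lower` to x0 (trapezoidal rule).

    `lower` stands in for -infinity; this density is negligible below -10.
    """
    h = (x0 - lower) / steps
    total = 0.5 * (density(lower) + density(x0))
    for i in range(1, steps):
        total += density(lower + i * h)
    return total * h

def cdf_exact(x0):
    """Closed form of the standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x0 / math.sqrt(2.0)))

for x0 in (-1.0, 0.0, 1.5):
    print(x0, cdf_numeric(x0), cdf_exact(x0))
```

The two columns should agree closely, which is a quick sanity check that the cdf really is the accumulated area of the density.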


Variance: a measure of the spread of points about their mean in one dimension, e.g. heights.
Covariance: a measure of how much two dimensions vary from their respective means with respect to each other.

Covariance is measured between 2 dimensions to see if there is a relationship between the 2 dimensions e.g. number of hours studied & marks obtained.

The covariance between one dimension and itself is the variance

Covariance Properties
---------------------------------

The exact value is not as important as its sign.
A positive value of covariance indicates both dimensions increase or decrease together e.g. as the number of hours studied increases, the marks in that subject increase.

A negative value indicates while one increases the other decreases, or vice-versa e.g. active social life at RIT vs performance in CS dept.

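If the covariance is zero, the two dimensions are uncorrelated, i.e. there is no linear relationship between them, e.g. heights of students vs the marks obtained in a subject. Independent dimensions always have zero covariance, but zero covariance alone does not guarantee independence.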

Covariance calculations are used to find relationships between dimensions in high dimensional data sets (usually greater than 3) where visualization is difficult
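On the zero-covariance case: a covariance of zero only rules out a linear relationship. The sketch below uses a made-up example of my own, where Y is completely determined by X (Y = X squared) and the covariance still comes out as exactly zero. It uses the population form of covariance, E[(X - E[X])(Y - E[Y])], since the probabilities are known.

```python
from fractions import Fraction

# X takes the values -1, 0, 1 with equal probability; Y is fully determined by X.
xs = [-1, 0, 1]
ys = [x * x for x in xs]
p = Fraction(1, 3)

mean_x = sum(p * x for x in xs)
mean_y = sum(p * y for y in ys)

# Population covariance: E[(X - E[X]) (Y - E[Y])]
cov_xy = sum(p * (x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

print("mean of X:", mean_x)
print("mean of Y:", mean_y)
print("covariance:", cov_xy)
```

Knowing X tells you Y exactly here, so the two are clearly dependent even though the covariance vanishes.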

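$$\mathrm{variance}(X) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})}{n - 1}$$
$$\mathrm{covariance}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
where $\bar{X}$ and $\bar{Y}$ are the means of X and Y.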
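The two formulas can be sketched directly in code. The `hours` and `marks` numbers below are made up purely for illustration (they are not real data), and the function names are my own. Variance is computed as the covariance of a dimension with itself, and checked against the standard library's `statistics.variance`.

```python
import statistics

def covariance(xs, ys):
    """Sample covariance with the (n - 1) denominator."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def variance(xs):
    """Sample variance: the covariance of a dimension with itself."""
    return covariance(xs, xs)

# Made-up illustrative data: hours studied and marks obtained.
hours = [2.0, 4.0, 6.0, 8.0, 10.0]
marks = [50.0, 60.0, 65.0, 80.0, 95.0]

print("variance(hours):", variance(hours))
print("covariance(hours, marks):", covariance(hours, marks))
print("statistics.variance(hours):", statistics.variance(hours))
```

The covariance is positive for this data, matching the "more hours, more marks" reading of the sign.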

Wire analogy for a probability density: the mass (probability) of a small section of wire is the mass per unit length (density) times the length of the section (bin width) under consideration.
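The wire analogy says that, for a narrow enough bin, probability is roughly density times bin width. Here is a quick check of that, again assuming a standard normal density as the example and a bin centred at x = 0.5 (both are my choices, not from the notes).

```python
import math

def density(x):
    """Standard normal probability density (an assumed example density)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

x = 0.5  # centre of the bin
for width in (0.5, 0.1, 0.01):
    exact = cdf(x + width / 2) - cdf(x - width / 2)  # true probability of the bin
    approx = density(x) * width                      # density times bin width
    print(width, exact, approx, abs(exact - approx))
```

The gap between the two shrinks quickly as the bin gets narrower, which is why the approximation is only meant for a small section.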



