STAT 3360 Notes
1 Random Variable
1.1 Introduction
- In the section Probability of Events, we represented the probability of an event \(A\) by \(P(A)\). For example, we used \(P\)(the student is tall) to represent the probability of the event "the student is tall".
- English statements are not as concise and efficient as mathematical ones. Instead of the English statement "one person plus two other persons are three persons in total", we prefer the mathematical statement "\(1+2=3\)". Here we use the number \(1\) to represent "one person", the number \(2\) to represent "two persons" and the number \(3\) to represent "three persons". In fact, whether the statement is "one person plus two other persons are three persons in total", or "one orange plus two other oranges are three oranges in total", or "one pencil plus two other pencils are three pencils in total", we use exactly the same mathematical statement "\(1+2=3\)". When calculating the total, we focus only on the mathematical representation "\(1+2\)" and ignore the real-world objects (persons, oranges, pencils) behind these numbers.
- Now we would like to do the same thing for probability statements. That is, we use numbers to represent the outcomes of experiments.
- For example, if the experiment is about the nominal variable Gender, then we may use
- the number \(1\) to represent the outcome "male",
- the number \(2\) to represent the outcome "female".
- For example, if the experiment is about the ordinal variable Height, then we may use
- the number \(1\) to represent the outcome "tall",
- the number \(2\) to represent the outcome "medium",
- the number \(3\) to represent the outcome "short".
- For example, if the experiment is about the integer-valued numerical variable Score, then we may use
- the number \(60\) to represent the outcome "\(60\) points",
- the number \(70\) to represent the outcome "\(70\) points",
- \(\cdots\)
- the number \(C\) to represent the outcome "\(C\) points",
- etc.
- For example, if the experiment is about the numerical variable Weight, then we may use
- the number \(131.5\) to represent the outcome "\(131.5\) pounds",
- the number \(91.23\) to represent the outcome "\(91.23\) pounds",
- \(\cdots\)
- the number \(C\) to represent the outcome "\(C\) pounds",
- etc.
1.2 Random Variable
1.2.1 Definition
- A function which realizes the correspondence between numbers and outcomes of an experiment is called a Random Variable.
1.2.2 Examples
- For example, suppose \(u\) is the outcome of the experiment about nominal variable Gender. Then the following function \(X(u)\) realizes the correspondence between the outcome \(u\) and numbers, and thus is a random variable. \[ X(u) = \left\{ \begin{array}{ll} 1, & \text{if } u = \text{“male”}\\ 2, & \text{if } u = \text{“female”} \end{array} \right.\]
- For example, suppose \(u\) is the outcome of the experiment about ordinal variable Height. Then the following function \(X(u)\) realizes the correspondence between the outcome \(u\) and numbers, and thus is a random variable. \[ X(u) = \left\{ \begin{array}{ll} 1, & \text{if } u = \text{“tall”}\\ 2, & \text{if } u = \text{“medium”}\\ 3, & \text{if } u = \text{“short”} \end{array} \right.\]
- For example, suppose \(u\) is the outcome of the experiment about integer-valued numerical variable Score. Then the following function \(X(u)\) realizes the correspondence between the outcome \(u\) and numbers, and thus is a random variable. \[ X(u) = U, \text{ if } u = \text{“} U \text{ points”} \]
- For example, suppose \(u\) is the outcome of the experiment about numerical variable Weight. Then the following function \(X(u)\) realizes the correspondence between the outcome \(u\) and numbers, and thus is a random variable. \[ X(u) = U, \text{ if } u = \text{“} U \text{ pounds”} \]
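The correspondences above can be sketched in code as ordinary functions from outcomes to numbers. The following Python sketch is illustrative only: the function names `gender_rv`, `height_rv` and `score_rv`, and the choice to encode outcomes as strings, are assumptions made here and are not part of the course notation.

```python
# Hypothetical encodings of the random variables X(u) described above.
GENDER_CODES = {"male": 1, "female": 2}
HEIGHT_CODES = {"tall": 1, "medium": 2, "short": 3}


def gender_rv(u: str) -> int:
    """X(u) for the nominal variable Gender."""
    return GENDER_CODES[u]


def height_rv(u: str) -> int:
    """X(u) for the ordinal variable Height."""
    return HEIGHT_CODES[u]


def score_rv(u: str) -> int:
    """X(u) = U when the outcome u is the string 'U points'."""
    value, unit = u.split()
    if unit not in ("point", "points"):
        raise ValueError(f"unexpected outcome: {u!r}")
    return int(value)


print(gender_rv("female"))    # 2
print(height_rv("short"))     # 3
print(score_rv("60 points"))  # 60
```

In each case the input is an outcome of the experiment and the output is purely a number, which is what makes the function a random variable.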
1.2.3 Variable and Random Variable
- Note that "variable" and "random variable" are two different concepts in this course.
- A variable is some characteristic of a population or sample.
- For example, the Gender is a characteristic of the population consisting of all students, so it is a variable.
- The values of a variable are realizations of the characteristic.
- For example, the values of the nominal variable Gender are "male" and "female".
- A variable and its values are used to represent an experiment and its outcomes, respectively.
- For example, the outcomes of the experiment about gender are represented by the values "male" and "female" of the variable Gender.
- A random variable is a function which realizes the correspondence between numbers and outcomes of an experiment.
- The values of a random variable are purely numbers.
- For example, the values of the above random variable \(X(u)\) for the variable Gender are \(1\) and \(2\), which are purely numbers.
Since
- a random variable realizes the correspondence between numbers and outcomes of an experiment,
- the outcomes of an experiment are represented by the values of a variable,
we can summarize the relationship between experiment, variable and random variable as follows.
- numbers \(\overset{\text{Random Variable}}{\iff}\) outcomes of experiment \(\overset{\text{Representation}}{\iff}\) values of variable
For example, the random variable for the experiment about nominal variable Gender can be illustrated by the following plot
number <==> outcome of experiment <==> value of variable
1 <==> a male student <==> the value "male" of variable GENDER
2 <==> a female student <==> the value "female" of variable GENDER
1.3 Simplified Notations
- As discussed above, by introducing the concept of a random variable, we can greatly simplify the notation.
- We use \(X\) to represent the RANDOM VARIABLE \(X(u)\).
- We use the statement "\(X=x\)" to represent the EVENT that the random variable \(X\) takes the value \(x\).
- For example, for the experiment about nominal variable Gender, we use the statement
- "\(X=1\)" to represent the event "the student is male",
- For example, for the experiment about ordinal variable Height, we use the statement
- "\(X=3\)" to represent the event "the student is short"
- For example, for the experiment about integer-valued numerical variable Score, we use the statement
- "\(X=60\)" to represent the event "the score is \(60\) points",
- For example, for the experiment about numerical variable Weight, we use the statement
- "\(X=131.5\)" to represent the event "the weight is \(131.5\) pounds".
- We use the notation \(P(x)\) to represent \(P(X=x)\), which is the PROBABILITY OF the EVENT \(X=x\).
- For example, for the experiment about nominal variable Gender, we use the notation
- "\(P(1)\)" to represent "\(P(X=1)\)", which in turn means \(P\)(the student is male).
- For example, for the experiment about ordinal variable Height, we use the notation
- "\(P(3)\)" to represent "\(P(X=3)\)", which in turn means \(P\)(the student is short).
- For example, for the experiment about integer-valued numerical variable Score, we use the notation
- "\(P(60)\)" to represent "\(P(X=60)\)", which in turn means \(P\)(the score is \(60\) points).
- For example, for the experiment about numerical variable Weight, we use the notation
- "\(P(131.5)\)" to represent "\(P(X=131.5)\)", which in turn means \(P\)(the weight is \(131.5\) pounds).
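To make the shorthand \(P(x)\) concrete, the sketch below stores the probabilities of a random variable in a lookup table. The probabilities \(0.45\) and \(0.55\) are made-up values used only for illustration, and the Python function name `P` is an assumption chosen to mirror the notation.

```python
# Hypothetical probabilities for the Gender random variable X
# (1 = "male", 2 = "female"); the numbers are invented for illustration.
GENDER_PROBABILITIES = {1: 0.45, 2: 0.55}


def P(x: int) -> float:
    """Return P(X = x); values that X never takes have probability 0."""
    return GENDER_PROBABILITIES.get(x, 0.0)


print(P(1))  # 0.45, i.e. P(X = 1) = P(the student is male)
print(P(3))  # 0.0, because X never takes the value 3
```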
1.4 Discrete Random Variable
1.4.1 Definition
- A random variable \(X\) is said to be Discrete if it takes on a finite or countable number of values.
- A set is finite or countable if we can assign a unique integer to each of its elements.
1.4.2 Examples
- For example, the random variable \(X\) for the nominal variable Gender is discrete because \(X\) takes on only 2 values (\(1\) means "male" and \(2\) means "female").
- For example, the random variable \(X\) for the ordinal variable Height is discrete because \(X\) takes on only 3 values (\(1\) means "tall", \(2\) means "medium" and \(3\) means "short").
- For example, the random variable \(X\) for the integer-valued numerical variable Score is discrete because \(X\) takes on only a finite number of values \(0, 1, 2, \dots, 100\) (\(0\) means "\(0\) points", \(1\) means "\(1\) point", \(\dots\), \(100\) means "\(100\) points").
- For example, if we keep flipping a coin until the first tail appears and let \(X\) be the random variable representing the total number of heads, then the value of \(X\) can be any non-negative integer (\(0, 1, 2, 3, \dots\)). Therefore, \(X\) takes on infinitely many values. Since the set of all non-negative integers is countable (we can assign a unique integer index to each of its elements), \(X\) is still discrete.
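One experiment whose count of heads can be any non-negative integer is flipping a coin until the first tail appears. The sketch below simulates that experiment. The stopping rule, the fair coin, the function name `heads_before_first_tail` and the seed are all assumptions made for illustration.

```python
import random


def heads_before_first_tail(rng: random.Random) -> int:
    """Flip a fair coin until the first tail; return the number of heads seen."""
    heads = 0
    while rng.random() < 0.5:  # treat a draw below 0.5 as "heads"
        heads += 1
    return heads


rng = random.Random(3360)  # fixed seed so the run is repeatable
draws = [heads_before_first_tail(rng) for _ in range(10)]
print(draws)  # ten non-negative integers
```

Every simulated value is a non-negative integer and there is no upper bound on how large it can be, so the set of possible values is infinite but still countable.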
1.5 Continuous Random Variable
1.5.1 Definition
- A random variable \(X\) is said to be continuous if \(X\) takes on uncountably many values.
- Roughly speaking, if there is some interval in which every number is a possible value of \(X\), then \(X\) is a continuous random variable.
1.5.2 Examples
- For example, the random variable \(X\) for the numerical variable Weight is continuous. All numbers in the interval \([50, 200]\) (e.g., \(53\), \(65.33\), \(78.9126\)) are possible values of \(X\) (of course, values outside this interval are also possible).
- For example, if \(X\) is the random variable representing the time (in seconds) we have to wait until the next rain in Dallas, then \(X\) is a continuous random variable. All numbers in the interval \([0, \infty)\) (e.g., \(10\), \(1352.68\), \(26498.245\)) are possible values of \(X\).
2 Probability Distribution
- A Probability Distribution describes the values of a random variable and the probabilities associated with these values.
- The probability distribution of a discrete random variable is called a discrete (probability) distribution.
- The probability distribution of a continuous random variable is called a continuous (probability) distribution.
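As a small illustration of a discrete distribution, the sketch below tabulates the values of a random variable together with their probabilities. The list `sample` is hypothetical data, and the sketch assumes the experiment is drawing one student at random from that list, so that each probability equals a relative frequency.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical Height outcomes, already coded as 1 = "tall", 2 = "medium", 3 = "short".
sample = [1, 2, 2, 3, 2, 1, 3, 2, 2, 3]

counts = Counter(sample)
distribution = {x: Fraction(n, len(sample)) for x, n in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P({x}) = {p}")  # prints P(1) = 1/5, P(2) = 1/2, P(3) = 3/10

# The probabilities of all values add up to 1.
assert sum(distribution.values()) == 1
```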
3 References
- Keller, Gerald. (2015). Statistics for Management and Economics, 10th Edition. Stamford: Cengage Learning.