6.3 Random Variable

With a basic understanding of theoretical probability rules, we can introduce the most important concept from probability for our uses in this class: a random variable.

A random variable (\(X\)) is a variable whose outcome (the value it takes) is governed by chance. In other words, it is a variable (something capable of taking different values) whose value is random. Examples include:

  • \(X =\) age of the next person to walk into the building
  • \(X =\) the number of dots on the side that lands face up on a balanced 6-sided die

In data analysis and modeling, the random variables we will consider are estimated regression coefficients, estimated odds ratios, etc. Why are these random variables? Because their values depend on the random sample that we happen to draw. To build our understanding, let’s start with a simple example.

You are going to flip a fair coin 3 times (the coin has 2 sides; we’ll call one side Heads and the other Tails).

  • Assume there are only 2 possible outcomes and \(P(\text{Heads}) = P(\text{Tails}) = 0.5\) (the coin can’t land on its side).

  • Below are three possible random variables based on the same random process (flipping a 2-sided coin 3 times):

  • Example 1: \(X =\) the number of heads in 3 coin flips

    • What are the possible values of \(X\)? 0, 1, 2, or 3.
  • Example 2: Say you get 3 dollars for each Head

    • \(Y =\) the amount of money won from 3 coin flips, \(Y = 3X\)
    • The possible values of \(Y\) are 0, 3, 6, or 9.
  • Example 3: \(Z =\) the number of heads on the last of the 3 coin flips

    • The possible values are 0 or 1.
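Because the random process here is so small (only \(2^3 = 8\) equally likely outcomes), we can enumerate every outcome and read off the possible values of \(X\), \(Y\), and \(Z\) directly. Here is one way to sketch that in Python (the variable names mirror the examples above; this is an illustration, not part of any assigned code):

```python
import itertools

# Enumerate all 2^3 = 8 equally likely outcomes of 3 fair coin flips.
outcomes = list(itertools.product(["H", "T"], repeat=3))

# X = the number of heads in the 3 flips
X = [flips.count("H") for flips in outcomes]

# Y = dollars won at 3 dollars per head, so Y = 3X
Y = [3 * x for x in X]

# Z = the number of heads on the last of the 3 flips (0 or 1)
Z = [1 if flips[-1] == "H" else 0 for flips in outcomes]

print(sorted(set(X)))  # [0, 1, 2, 3]
print(sorted(set(Y)))  # [0, 3, 6, 9]
print(sorted(set(Z)))  # [0, 1]
```

Note that all three random variables are built from the same underlying random process; they simply summarize the 8 outcomes in different ways.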

What might you want to know about these random variables? In general, we’d like to know the probability model (what values it takes and the associated chances), the expected value (long-run average), and the variance (a measure of how much the values vary). Let’s talk about each of these next.