MVJ
12 April, 2018
Now that the reports are done, you need to pick hypotheses for your second report.
These hypotheses must follow the sentence types on the Hypotheses page on the course website.
Example:
mean(datasubset$Variable) is [not equal to] / [larger than] / [smaller than] M, for some specific value M.
A specific hypothesis of this type could be: mean(iris$Sepal.Length) is larger than 6.
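To see the sample estimate such a hypothesis refers to, you can compute it directly (iris ships with R). This is only the point estimate; whether the data support the hypothesis is a separate inferential question.

```r
# Sample mean of the variable named in the example hypothesis
mean(iris$Sepal.Length)
## [1] 5.843333
```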
If you are estimating values from your data, you must use the same subset that you used for the first report.
 | Frequentist | Bayesian |
---|---|---|
Origin | Analyzing gambling | Analyzing evidence |
Interpretation | Expected long-term proportions after many repetitions | Quantified degree of belief |
Both approaches follow the same basic algebraic laws.
A phenomenon is random if individual outcomes are uncertain, but there is a regular distribution of outcomes in a large number of repetitions.
The probability of any outcome of a random phenomenon is the proportion of times that outcome would occur in a very long series of repetitions.
Note that the book uses a frequentist perspective.
Toss a fair coin. We expect heads & tails to come up approximately equally often.
Roll a fair 6-sided die. We expect each number to come up approximately equally often.
Both the coin toss and the die roll are examples of discrete random distributions:
To each of a finite (countable) number of outcomes is assigned a probability, with the sum of probabilities being equal to 1.
Coin toss | Probability | Die roll | Probability |
---|---|---|---|
Heads | 0.5 | 1 | 1/6 |
Tails | 0.5 | 2 | 1/6 |
 | | 3 | 1/6 |
 | | 4 | 1/6 |
 | | 5 | 1/6 |
 | | 6 | 1/6 |
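The "regular distribution of outcomes in a large number of repetitions" can be illustrated by simulation; a small sketch in R:

```r
set.seed(1)  # for a reproducible illustration
rolls <- sample(1:6, size = 10000, replace = TRUE)  # 10000 fair die rolls
table(rolls) / 10000  # empirical proportions, each close to 1/6
```

With 10000 rolls, every proportion lands near 1/6 ≈ 0.167, as the definition of probability above suggests.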
To completely specify a model for a random phenomenon, we need:
The sample space of a random phenomenon is the set of all possible outcomes.
An event is a set of possible outcomes.
Probability is a function that takes an event \(A\) and produces a number \(0\leq\mathbb{P}(A)\leq1\).
Discuss in pairs:
What is the sample space for…
Benford’s Law: in many “naturally occurring” collections of numbers (tax returns, payment records, expense account claims, …) the first digit follows a distinct probability distribution:
Two events are independent if knowing that one occurs does not change the probability of the other one occurring.
If \(A\) and \(B\) are independent events, then \[ \mathbb{P}(A\text{ and }B) = \mathbb{P}(A)\cdot\mathbb{P}(B) \]
Coins don’t have memory: so subsequent coin tosses can be considered independent.
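The multiplication rule for independent events can be checked by simulation; a minimal sketch in R for two coin tosses:

```r
# P(heads on toss 1 AND heads on toss 2) = 0.5 * 0.5 = 0.25
set.seed(1)
toss1 <- sample(c("H", "T"), 10000, replace = TRUE)
toss2 <- sample(c("H", "T"), 10000, replace = TRUE)
mean(toss1 == "H" & toss2 == "H")  # proportion of double heads, close to 0.25
```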
Once cards are removed from a deck, the proportions of the remaining cards change: the probability of the first card being red is \(26/52\), while the probability of the second card being red is \(25/51\) if the first card was red and \(26/51\) if it was black. The two draws are therefore not independent.
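The dependence between the two draws shows up in simulation as well; a sketch in R:

```r
# Draw two cards without replacement from a 26-red / 26-black deck, many times
set.seed(1)
draws <- replicate(10000, {
  deck <- rep(c("red", "black"), each = 26)
  sample(deck, 2)  # two cards without replacement
})
mean(draws[2, ] == "red")                     # unconditional: about 26/52 = 0.5
mean(draws[2, draws[1, ] == "red"] == "red")  # given first red: about 25/51
```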
Sample spaces are collections of possible outcomes. A numeric value assigned to each outcome produces a random variable.
Example: A craps roll has as its sample space the 36 possible pairs of dice outcomes.
The payout for a particular craps bet is a random variable.
A discrete random variable has a finite (countable) set of possible values.
A discrete random variable can be specified by giving a probability to each possible value.
A continuous random variable takes values in a continuous range of numbers, such as an interval.
A continuous random variable has probability 0 for any single specific value. Instead, for continuous variables, probabilities are assigned to ranges.
The probability of a range is the area under the density curve for that range.
We have already seen the normal distribution. This is determined by a mean \(\mu\) and a standard deviation \(\sigma\).
The uniform distribution has a constant density curve. This is determined by the range of the constant density.
The binomial distribution counts the number of successes in \(n\) repeated trials of constant success probability \(p\).
The Poisson distribution counts the number of events in a constant rate process with an average count of \(\lambda\) events per time unit.
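Each of these four distributions has a built-in density (or probability mass) function in R; the parameter values below are arbitrary illustrations:

```r
dnorm(0, mean = 0, sd = 1)        # normal density at 0
dunif(0.3, min = 0, max = 1)      # uniform density, constant 1 on [0, 1]
dbinom(3, size = 10, prob = 0.5)  # P(3 successes in 10 trials)
dpois(2, lambda = 1.5)            # P(2 events) at rate 1.5 per time unit
```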
Suppose you gamble with a consistent bet: every time you play, you have a 1% probability of winning $ 100 and a 99% probability of losing $ 5.
After playing 1000 times, you expect to have won 10 times and lost 990 times.
This produces an overall gain of $ 100 \(\times\) 10 = $ 1000 and a loss of $ 5 \(\times\) 990 = $ 4950, for a net loss of $ 3950.
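The same calculation, done per play and scaled up, is a direct application of the expected value defined below; in R:

```r
p <- c(0.01, 0.99)   # probabilities of winning and losing
x <- c(100, -5)      # corresponding payouts in dollars
sum(x * p)           # expected value per play: -3.95
sum(x * p) * 1000    # expected total after 1000 plays: -3950
```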
We define the expected value or mean of a discrete random variable to be \[ \mu_X = \mathbb{E}X = \sum_x x\cdot\mathbb{P}(x) \]
For a continuous random variable, the sum becomes an integral and the expected value is \[ \mathbb{E}X = \int x\cdot p(x)dx \]
 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|
Uniform | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 |
Benford | .301 | .176 | .125 | .097 | .079 | .067 | .058 | .051 | .046 |
Uniform mean is \[ 1/9 + 2/9 + 3/9 + 4/9 + 5/9 + 6/9 + 7/9 + 8/9 + 9/9 = 45/9 = 5 \]
Mean first digit in Benford’s law is \[\begin{multline*} 1\cdot.301+2\cdot.176+3\cdot.125+4\cdot.097+5\cdot.079+\\ 6\cdot.067+7\cdot.058+8\cdot.051+9\cdot.046 \approx 3.441 \end{multline*}\]
The Law of Large Numbers:
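The Benford probabilities in the table follow the formula \(\mathbb{P}(d)=\log_{10}(1+1/d)\), so the mean first digit can be computed directly; a sketch in R:

```r
d <- 1:9
benford <- log10(1 + 1/d)  # .301, .176, .125, ... as in the table
sum(d * benford)           # mean first digit, approximately 3.44
```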
As the sample size grows larger, the sample mean \(\overline x\) gets closer to the distribution mean \(\mu\).
This holds for any distribution (as long as the mean and standard deviation are finite) and we can calculate how many samples we need to reach the precision we want.
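A simulation makes the convergence visible; a sketch in R using fair die rolls, where \(\mu = 3.5\):

```r
set.seed(1)
rolls <- sample(1:6, size = 100000, replace = TRUE)
running_mean <- cumsum(rolls) / seq_along(rolls)  # sample mean after each roll
running_mean[c(10, 100, 10000, 100000)]  # drifts toward mu = 3.5 as n grows
```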
Just like the mean can be defined and used for random variables, the standard deviation and the variance can too.
The variance \(\sigma^2_X\) of a random variable \(X\) is the mean square deviation from the mean.
\[ \sigma^2_X = \mathbb{E}[(X-\mu_X)^2] \]
The standard deviation \(\sigma_X\) is the square root of the variance.
X | \(\mathbb{P}\) | \(X\cdot\mathbb{P}\) | \((X-\mu_X)^2\mathbb{P}\) |
---|---|---|---|
1 | 1/6 | 0.167 | 1.042 |
2 | 1/6 | 0.333 | 0.375 |
3 | 1/6 | 0.500 | 0.042 |
4 | 1/6 | 0.667 | 0.042 |
5 | 1/6 | 0.833 | 0.375 |
6 | 1/6 | 1.000 | 1.042 |
\(\mu_X = 3.5\)
\(\sigma^2_X \approx 2.917\)
\(\sigma_X \approx 1.708\)
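These die-roll values can be verified directly from the definitions; in R:

```r
x <- 1:6
p <- rep(1/6, 6)              # fair die: each face has probability 1/6
mu <- sum(x * p)              # mean: 3.5
sigma2 <- sum((x - mu)^2 * p) # variance: 35/12, approximately 2.917
sqrt(sigma2)                  # standard deviation: approximately 1.708
```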
The correlation between random variables controls how variances combine.