Monday, May 14, 2012

05-14-2012

Central Limit Theorem for Proportions (p-hat)


The Central Limit Theorem for proportions tells us three things about the sampling distribution of p-hat:
1) Shape: It is normal or approximately normal.
2) Center: μ_p-hat = P, meaning the sampling distribution of p-hat is centered at the population proportion P.
3) Spread: σ_p-hat = √[(P(1-P))/n]


Note: We can only use the CLT for proportions when nP ≥ 5 AND n(1-P) ≥ 5
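
Here's a quick sketch of those three facts plus the np/n(1-p) check in Python. The function name is just something I made up for illustration, and the values P = 0.40 and n = 25 are borrowed from the candy example further down in these notes.

```python
import math

def sampling_distribution(p, n):
    """Return (center, spread) for the sampling distribution of p-hat.

    Raises ValueError when the conditions np >= 5 and n(1-p) >= 5 fail,
    since the normal approximation should not be used in that case.
    """
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("CLT conditions not met: need np >= 5 and n(1-p) >= 5")
    center = p                            # mean of p-hat equals P
    spread = math.sqrt(p * (1 - p) / n)   # standard deviation of p-hat
    return center, spread

center, spread = sampling_distribution(p=0.40, n=25)
print(round(center, 4), round(spread, 4))
```

With P = 0.40 and n = 25 both conditions hold (np = 10, n(1-p) = 15), so the normal approximation is fair game.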

Confidence Intervals - How certain do you want to be that you caught the population value?
Table derived from table 8.1 on pg. 391, similar to table 8.7 on page 422.



Confidence Level (1 - α)100% | α | α/2 | Z α/2
100(1-0.15)% = 85% | 0.15 | 0.075 | 1.44
100(1-0.10)% = 90% | 0.10 | 0.05 | 1.645
100(1-0.05)% = 95% | 0.05 | 0.025 | 1.96
100(1-0.01)% = 99% | 0.01 | 0.005 | 2.576
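
If you don't have the table handy, the Z α/2 column can be recomputed. This is a minimal sketch using Python's standard-library NormalDist; the helper name z_critical is my own, not something from the textbook.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Z value that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.85, 0.90, 0.95, 0.99):
    print(f"{level:.0%}  z = {z_critical(level):.3f}")
```

The values should agree with the table once rounded to the same number of decimal places.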

We observed a bag in which the proportion of orange candies was 0.48. What we (as observers) don't know is that the factory targets a proportion of 0.40 orange candies. Suppose we want to be 95% confident that the interval around our observation of 0.48 captures the factory's value.

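To find our confidence interval we'll use the following formula, which is the proportion version of the one on p. 391 (p-hat takes the place of x-bar, and σ_p-hat takes the place of σ/√n):
Lower bound = p-hat - Z α/2(σ_p-hat)
Upper bound = p-hat + Z α/2(σ_p-hat)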

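So plugging in to find our bounds, with a standard deviation of √[(0.40(1-0.40))/25] = 0.0980 (P = 0.40 from the factory, n = 25 candies in the bag)...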
Lower bound = 0.48 - 1.96(0.0980) = 0.2879
Upper bound = 0.48 + 1.96(0.0980) = 0.6721

In conclusion, we're 95% confident that the true proportion of orange candies in Reese's Pieces falls between 0.2879 and 0.6721. The factory's 0.40 is inside that interval.

What if we wanted to be 99% confident?
Simply change the Z α/2 to reflect our desired confidence level:
Lower bound = 0.48 - 2.576(0.0980) = 0.2276
Upper bound = 0.48 + 2.576(0.0980) = 0.7324
In conclusion, we're 99% confident that the true proportion of orange candies in Reese's Pieces falls between 0.2276 and 0.7324. Notice that asking for more confidence gives a wider interval.
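
The two intervals above can be double-checked with a short script. This is only a sketch: the function name is hypothetical, the Z values are copied from the table, and the spread uses the factory's P = 0.40 with n = 25. Because the hand calculation rounds the spread to 0.0980 first, the script may differ from the notes in the fourth decimal place.

```python
import math

# Z values for each confidence level, copied from the table in these notes
Z_TABLE = {0.95: 1.96, 0.99: 2.576}

def interval_known_p(p_hat, p, n, z):
    """Interval around p_hat using the spread from a known population P."""
    spread = math.sqrt(p * (1 - p) / n)
    margin = z * spread
    return p_hat - margin, p_hat + margin

for level, z in Z_TABLE.items():
    low, high = interval_known_p(p_hat=0.48, p=0.40, n=25, z=z)
    print(f"{level:.0%}: ({low:.4f}, {high:.4f})")
```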



In reality we'll almost never know the true population proportion, so how do we adjust for this? We continue using the CLT for proportions but replace every instance of P (the population proportion) with p-hat (our sample proportion).

Given
n = 25
p-hat = 0.48
standard deviation = √[(P(1-P))/n]
But wait, we don't have P, whatever will we do? Use p-hat!
So plugging in: √[(0.48(1-0.48))/25] = 0.0999

Computing a 95% confidence interval for this data...
Lower bound = 0.48 - 1.96(0.0999) = 0.2842
Upper bound = 0.48 + 1.96(0.0999) = 0.6758




Margin of error = Z α/2(σ_p-hat), where σ_p-hat = √[(p-hat(1 - p-hat))/n]
Please note that the margin of error is exactly the same as the portion of the confidence interval formula that lies to the right of the operation symbol (±).
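
To see the margin of error and the p-hat interval side by side, here is a small sketch. The function name is my own invention; the numbers (p-hat = 0.48, n = 25, Z = 1.96) come straight from the example above.

```python
import math

def margin_of_error(p_hat, n, z):
    """z * sqrt(p_hat(1 - p_hat)/n), with p-hat standing in for the unknown P."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat, n = 0.48, 25
me = margin_of_error(p_hat, n, z=1.96)
print(f"margin of error = {me:.4f}")
print(f"95% interval = ({p_hat - me:.4f}, {p_hat + me:.4f})")
```

The interval is just p-hat minus and plus the margin of error, so it should land on the same bounds worked out by hand.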




Hypothesis Testing - Think of this as "looking for how much evidence we have against the null hypothesis" or "using statistics to answer a question"

Suppose that a recent poll reported that 49% of the United States is pro-choice, but you think the true proportion is higher.

You need to do a few things:
1) Form a null hypothesis (H0) which assumes the original value is true (status quo). H0: P = P0, and in this example P0 = 0.49.
2) Offer an alternate hypothesis (HA) in which you make your claim. HA: P ≠ P0, P > P0, or P < P0. In this example we are assuming P > P0.
3) Take a sample and find a test statistic (use the Z score formula): z = (p-hat - P0) / √[(P0(1-P0))/n]
4) Go to the chart and find the P-value for your Z score of interest.

  • P(Z > z) for P > P0
  • P(Z < z) for P < P0
  • 2P(Z > |z|) for P ≠ P0

5) Conclusion: compare the P-value to your significance level α. A small P-value means strong evidence against H0.
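
The five steps can be sketched in Python as follows. Only P0 = 0.49 comes from the poll example. The sample (n = 1000 with p-hat = 0.52) and the significance level α = 0.05 are made-up values for illustration, and the function name is hypothetical. The "chart" lookup is replaced by the standard-library NormalDist.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(p_hat, p0, n, alternative="greater"):
    """Return (z, p_value) for a one-proportion z test of H0: P = p0."""
    spread = math.sqrt(p0 * (1 - p0) / n)   # spread assuming H0 is true
    z = (p_hat - p0) / spread
    std_normal = NormalDist()
    if alternative == "greater":            # HA: P > P0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":             # HA: P < P0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":        # HA: P != P0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Hypothetical sample: 1000 people surveyed, 52% say they are pro-choice
z, p_value = one_proportion_z_test(p_hat=0.52, p0=0.49, n=1000)
alpha = 0.05  # assumed significance level
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Swapping the alternative argument lets the same function cover all three forms of HA listed in step 4.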
