# A motivating example

Effectiveness of a drug product in reducing anxiety levels of anxious patients

• Estimation: What is the mean decrease in anxiety ratings for the population?

• Testing: Is the mean drop in anxiety ratings greater than zero?

# Concepts to be covered

• Point estimate, confidence interval, etc

• Null hypothesis, statistical test, Type I/II errors, etc

• Bootstrap and large sample approximation

# The Normal distribution

• Notation: $$Y \sim \mathsf{N}(\mu,\sigma^2)$$ means that $$Y$$ follows a Normal distribution with mean $$\mu$$ and variance $$\sigma^2$$

• If $$Y \sim \mathsf{N}(\mu,\sigma^2)$$, then $$Z = \frac{Y -\mu}{\sigma}$$ follows $$\mathsf{N}(0,1)$$

• When $$Y \sim \mathsf{N}(\mu,\sigma^2)$$ and $$Y=y$$ is observed, $$z = \frac{y-\mu}{\sigma}$$ is referred to as the $$z$$-score of $$y$$

• Compute related probabilities: Table 1 in textbook, and lecture notes
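Besides the table, these probabilities can also be computed in R with `pnorm`. The following is a minimal sketch; the values `mu = 2`, `sigma = 1` and `y = 3.5` are hypothetical and chosen only for illustration.

```r
# Hypothetical values, for illustration only
mu <- 2; sigma <- 1; y <- 3.5

z <- (y - mu) / sigma                          # z-score of the observed value y
p_direct <- pnorm(y, mean = mu, sd = sigma)    # P(Y <= y) computed directly
p_std    <- pnorm(z)                           # same probability via N(0,1)
c(z = z, p_direct = p_direct, p_std = p_std)
```

Both probabilities agree, which is the standardization property stated above.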

# Distribution of sample mean

Assume $$x_1,x_2,\ldots, x_n$$ is a random sample of size $$n$$ from a distribution with mean $$\mu$$ and standard deviation $$\sigma$$. Set $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$

• $$\bar{x}$$ follows a distribution with mean $$\mu$$ and standard deviation $$\sigma/\sqrt{n}$$
• $$\bar{x} \sim \mathsf{N}(\mu,\sigma^2/n)$$ if the sample is from $$\mathsf{N}(\mu,\sigma^2)$$
• CLT: $$\sqrt{n} \left(\bar{x} - \mu\right) \stackrel{d}{\to} \mathsf{N}\left(0,\sigma^2\right)$$ in general
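A short simulation can make these properties concrete. This is only a sketch: the seed, the number of replications, and the values of `n`, `mu` and `sigma` are hypothetical choices.

```r
set.seed(1)                      # arbitrary seed, for reproducibility
n <- 25; mu <- 2; sigma <- 1     # hypothetical values

# 10,000 sample means, each from a random sample of size n from N(mu, sigma^2)
xbars <- replicate(10000, mean(rnorm(n, mean = mu, sd = sigma)))

mean(xbars)          # should be close to mu
sd(xbars)            # should be close to sigma / sqrt(n)
sigma / sqrt(n)      # theoretical standard deviation of the sample mean
```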

# Estimating population mean

• How to estimate the population mean $$\mu$$?
• Use sample mean?
• Use sample median?
• Use some ad hoc quantity?

# Sample mean as an estimate

• Assume $$x_1,x_2,\ldots, x_n$$ is a random sample of size $$n$$ from a distribution with population mean $$\mu$$
• Then the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$ is an estimator of $$\mu$$

# Sample mean as an estimate

• When will sample mean be a good estimate of population mean?

• When will it not be?

# Illustration: I

• Distribution $$\mathsf{N}(2, 1^2)$$ with population mean $$\mu=2$$.
• Random sample of size $$n=100$$
> set.seed(567)
> d1 = rnorm(100, mean=2, sd=1)
> xbar = mean(d1)
> xbar-2
[1] -0.09954917

Note the difference

# Illustration: II

• Distribution $$\mathsf{N}(2, 1^2)$$ with population mean $$\mu=2$$.
• Random sample of size $$n=100$$
> set.seed(123)
> d1 = rnorm(100, mean=2, sd=1)
> xbar = mean(d1)
> xbar-2
[1] 0.09040591

Note the difference

# Illustration: III

• Distribution $$\mathsf{N}(2, 1^2)$$ with population mean $$\mu=2$$.
• Random sample of size $$n=5,000$$
> set.seed(123)
> d1 = rnorm(5000, mean=2, sd=1)
> xbar = mean(d1)
> xbar-2
[1] -0.0005695937

Note the difference

# Illustration: IV

• Distribution with population mean $$\mu=0$$.
• Non-random sample of size $$n=5,000$$
> set.seed(123)
> z = rnorm(5000, mean=0, sd=1)
> y = rnorm(1, mean=0, sd=1)
> d1 = sqrt(0.7)*y + sqrt(0.3)*z
> xbar = mean(d1)
> xbar-0
[1] -0.4137675

Note the difference

# Question

What are two important properties you remember about the distribution of sample mean?

# Main messages

• Sample mean is a random variable and depends on which sample it is computed from
• Sample mean may not be a very good estimate of population mean if the sample is not a random sample
• Sample mean usually estimates population mean more accurately as the sample size of the random sample increases

# Confidence interval

• is an interval estimate of population mean $$\mu$$
• quantifies how likely population mean is contained in an interval formed by sample mean
• is usually symmetric around sample mean
• quantifies how variable sample mean is relative to population mean

# Interpretation

A 95% confidence interval for population mean $$\mu$$ means that

• 95% of the time in repeated sampling from the same population, the interval will contain $$\mu$$; see Table 5.1 and related contents in Text
• For any single random sample, the resulting interval may fail to contain $$\mu$$
• The confidence coefficient is 95%

# Example 5.1 in Text

• Claim: courier company claims that its mean delivery time is less than 3 hours
• Sample: 50 deliveries, whose mean delivery time is 2.8 hours and standard deviation 0.6 hours
• Is “2.8 hours” population mean? Is “0.6 hours” population standard deviation?

# Example (cont’d)

• Hypothesis: population mean delivery time $$\mu <3$$ hours
• Sample size $$n=50$$
• Sample mean $$\bar{x} = 2.8$$ hours; sample standard deviation $$s=0.6$$ hours
• Task 1: construct 95% confidence interval for $$\mu$$
• Task 2: check the hypothesis that $$\mu <3$$

# Example (cont’d)

• Sample size $$n=50$$; OK to use CLT
• Sample mean $$\bar{x}$$ approximately is $$\mathsf{N}(\mu, \sigma^2/n)$$
• $$\sigma$$ the population standard deviation, which is unknown

• $$\bar{z} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$ is approximately $$\mathsf{N}(0, 1)$$

• Since $$P(\bar{z} \in [-1.96,1.96]) \approx 0.95$$, we have $P(\mu \in [\bar{x}-1.96\sigma/\sqrt{n}, \bar{x}+1.96\sigma/\sqrt{n}]) \approx 0.95$
• Interval above is approximate 95% confidence interval

# Example (cont’d)

• Sample mean $$\bar{x} = 2.8$$; sample standard deviation $$s=0.6$$
• $$s$$ estimates population std dev $$\sigma$$
• Replace interval $[\bar{x}-1.96\sigma/\sqrt{n}, \bar{x}+1.96\sigma/\sqrt{n}]$ by interval $[2.8-1.96\times \frac{s}{\sqrt{50}}, 2.8+1.96\times \frac{s}{\sqrt{50}}],$ i.e., $[2.8-.166, 2.8+.166]=[2.634,2.966]$

# Example (cont’d)

• approximate 95% confidence interval for $$\mu$$ is $[2.8-.166, 2.8+.166]=[2.634,2.966],$ i.e., $P(\mu \in [2.634,2.966]) \approx 0.95$
• Confidence interval lies to the left of $$3$$
• Strong evidence that $$\mu <3$$
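The interval for the delivery-time example can be reproduced in R from the summary statistics alone. A minimal sketch, in which the variable names are illustrative:

```r
n <- 50; xbar <- 2.8; s <- 0.6        # summary statistics from the example
z <- qnorm(0.975)                     # about 1.96
half_width <- z * s / sqrt(n)         # about 0.166
ci <- c(lower = xbar - half_width, upper = xbar + half_width)
round(ci, 3)                          # approximately [2.634, 2.966]
```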

# Example (cont’d)

Would the previous confidence interval

• change if it is computed based on another 50 delivery times?
• be narrower if the sample size is much larger?
• change if the population standard dev is known?
• change if the confidence coefficient is changed into 99%?

# Example (cont’d)

• Now suppose $$\bar{x} = 2.8, s=0.6$$ but $$n=10000$$
• 95% confidence interval becomes $[2.8-1.96\times \frac{.6}{\sqrt{10000}}, 2.8+1.96\times \frac{.6}{\sqrt{10000}}],$ i.e., $[2.788, 2.812]$
• Compare this to $$[2.634,2.966]$$

# Confidence interval

• quantifies how likely population mean is contained in an interval formed by sample mean
• is usually symmetric around sample mean, and takes the form $[\bar{x}-z_{\alpha/2}\sigma/\sqrt{n}, \bar{x}+z_{\alpha/2}\sigma/\sqrt{n}],$ where $$z_{\alpha/2}$$ is associated with a reference distribution and $$1-\alpha$$ is the confidence coefficient

# Relationship to z-Values

• Table 5.2 on page 229 of Text
• E.g., z-value corresponding to confidence coefficient .95 is 1.96
• Table 5.2 is reproduced in pdf format of lecture notes

# General principle

Suppose we want a $$100(1-\alpha)\%$$ CI of width $$w$$ and of the form $$\bar{x} \pm \frac{w}{2}$$. We solve $\frac{w}{2} = z_{\alpha/2}\frac{\sigma}{\sqrt{n}},$ which gives the needed sample size $$n$$ as $n = \frac{(z_{\alpha/2})^2\sigma^2}{(w/2)^2}$

# General principle (cont’d)

• Population standard deviation $$\sigma$$ unknown
• Replace $$\sigma$$ by sample standard deviation $$s$$ in $n = \frac{(z_{\alpha/2})^2\sigma^2}{(w/2)^2}$ to obtain $n = \frac{(z_{\alpha/2})^2 s^2}{(w/2)^2}$

# Example

Estimate average annual textbook expenditure to be within \$25 of the mean expenditure for all undergrads at Univ A. How many students should we sample in order to be 95% confident that the estimate will satisfy the level of accuracy?

# Example (cont’d)

• Data collected show the annual textbook expenditure has a histogram that is Normal in shape with cost ranging from $$250$$ to $$750$$
• Estimate $$\sigma$$ as $$\hat{\sigma}=\frac{\textrm{range}}{4} = \frac{750-250}{4}=125$$
• Why estimate $$\sigma$$ this way?

# Example (cont’d)

• Level of accuracy \$25
• Width of CI \$50 (why?)
• Level of confidence 95%, implying $$z_{\alpha/2} = z_{.05/2}=z_{.025}=1.96$$ (why?)
• By formula $n = \frac{1.96^2 125^2}{25^2} = 96.04$
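The same calculation in R, as a small sketch. Rounding up to the next whole number of students is an added step here, since a sample size must be an integer; the variable names are illustrative.

```r
sigma_hat <- (750 - 250) / 4      # range / 4 = 125
half_w    <- 25                   # desired accuracy, i.e., w / 2
z         <- 1.96                 # z_{.025}
n_raw     <- (z * sigma_hat / half_w)^2   # 96.04
n_needed  <- ceiling(n_raw)               # round up to 97 students
c(n_raw = n_raw, n_needed = n_needed)
```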

# Warm-up

Generate $$1$$ observation $$x_1$$ randomly from $$\mathsf{N}(\mu,1^2)$$

• if $$x_1 = 0.05$$, how likely $$\mu=0$$?
• if $$x_1 = 2$$, how likely $$\mu=0$$?

Assume $$\mu=0$$, how likely is it for $$x_1 \ge 1.96$$?

# Warm-up (cont’d)

• Assume $$\mu=0$$, $$P(x_1 \ge 1.96) \approx 0.025$$

# Warm-up (cont’d)

• Assume $$\mu=1$$, $$P(x_1 \ge 1.96) \approx 0.169$$
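Both warm-up probabilities are upper-tail areas of a Normal distribution with standard deviation 1, so they can be checked with `pnorm`. A minimal sketch:

```r
# P(x1 >= 1.96) when mu = 0
p0 <- pnorm(1.96, mean = 0, sd = 1, lower.tail = FALSE)   # about 0.025
# P(x1 >= 1.96) when mu = 1
p1 <- pnorm(1.96, mean = 1, sd = 1, lower.tail = FALSE)   # about 0.169
c(mu0 = p0, mu1 = p1)
```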

# Warm-up (cont’d)

Generate $$n=10,000$$ observations $$x_i$$ randomly from $$\mathsf{N}(\mu,1^2)$$. If $$\bar{x} = 0.3$$ and $$s = 0.85$$,

• how likely $$\mu=0$$?
• how likely $$\mu=0.3$$?

# Warm-up (cont’d)

To check $$\mu=0$$ against $$\mu \ne 0$$, how many types of mistakes can be made? What are they? What is the best one can do?

# Warm-up (cont’d)

What can you say about inferring on $$\mu$$ based on $$\bar{x}$$, $$s$$ and $$n$$?

# Testing population mean

Recall the delivery time example:

• Population mean $$\mu <3$$ hours is claimed
• Another claim: $$\mu = 2.8$$ hours
• Another claim: $$\mu \ge 2.8$$ hours
• We only looked into confidence interval

# Example: soybean yield

To determine whether “the mean yield per acre (in bushels) for a particular variety of soybeans has increased during the current year over the mean yield in the previous 2 years when $$\mu$$ was 520”

• Hypothesis 1: the mean yield has increased
• Hypothesis 2: the mean yield has NOT increased
• Hypothesis 2 negates hypothesis 1

# Example (cont’d)

• How to obtain evidence against or for each of these hypotheses?
• How to make a decision based on the evidence?
• What if the mean yield has increased but I said it has not?
• What if the mean yield has NOT increased but I said it has?

# Example (cont’d)

Formal definitions:

• Null hypothesis $$H_0$$: the mean yield has NOT increased, i.e., $$\mu \le 520$$
• Alternative hypothesis $$H_a$$: one proposed by the researcher, i.e., $$\mu > 520$$
• $$H_a$$ negates $$H_0$$
• What else is needed to determine if $$\mu > 520$$?

# Example (cont’d)

Would it be sufficient to have …

• the standard deviation of the mean yield?
• the sample mean of the yields the current year?

How to quantify evidence for $$\mu > 520$$?

# Example (cont’d)

• obtain a sample of $$n=36$$ one-acre-plot yields
• sample mean $$\bar{x} = 573$$ and sample std deviation $$s=124$$
• $$\bar{x} = 573 > 520$$; likely the yield has increased (how likely?)
• a first step: define $T = \frac{\bar{x} - 520}{124/\sqrt{36}}=\frac{\bar{x} - \mu}{s/\sqrt{n}}$
• $$T$$ roughly quantifies how likely it is that $$\bar{x} = 573$$ arises by chance alone when the yield has not increased

# Example (cont’d)

• recall $$T = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
• if one knows the distribution of $$T$$, then the distribution of mean yield is available to quantify the evidence
• ideally $$G = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$ should be used but $$\sigma$$ is unknown
• if $$s \approx \sigma$$, $$T$$ is probably ok
• $$T$$ is called a “test statistic”

# Example (cont’d)

Suppose $$\bar{x} \sim \mathsf{N}(\mu,\sigma^2/n)$$ and $$s \approx \sigma$$

• $$T = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim \mathsf{N}(0,1)$$

• $$\sigma/\sqrt{n} \approx s/\sqrt{n} = 20.67$$

• $$T = \frac{573-520}{20.67} = 2.564103$$

• should we reject $$H_0: \mu \le 520$$ when $$T = 2.564103$$? How risky is it to do so, since we obtained $$\bar{x}$$ from a sample of size $$36$$?

# Example (cont’d)

• recall: if $$X \sim \mathsf{N}(0,1)$$, then $$P(X > 1.96) = 0.025,$$ i.e., $$1.96 = z_{.025}$$

• reject $$H_0: \mu \le 520$$ whenever $$T > 1.96$$ comes with a risk of $$\alpha= .025$$ of incorrectly rejecting $$H_0$$

• i.e., the probability of Type I error is $$\alpha= .025$$ when rejecting $$H_0$$ whenever $$T > z_{.025}$$

# Example (cont’d)

• the rejection region corresponding to $$\alpha = .025$$ is $$R = \{t: t \ge z_{.025}\}$$
• i.e., if $$T \in R$$, then reject $$H_0: \mu \le 520$$

• the above is a one-tailed test or one-sided test
• p-value of $$T=2.564103$$ is

> 1-pnorm(2.564103)
[1] 0.005172142

# Example (two-tailed test)

Soybean yield

• $$H_0: \mu = 520$$ versus $$H_a: \mu \ne 520$$
• probability of Type I error: $$\alpha = .05$$
• reject $$H_0: \mu = 520$$ if $$|T| \ge z_{.025}$$
• i.e., reject $$H_0: \mu = 520$$ if $$T \ge z_{.025}$$ or $$T \le -z_{.025}$$
• A two-tailed test

# Example 5.6

• Question: is the mean cholesterol level for Group 1 different from that for Group 2?
• Group 2: cholesterol level approximately Normal with mean $$\mu_0 = 190$$ mg/dL
• Sample from Group 1: $$n=100$$, $$\bar{x} = 178.2$$ mg/dL, $$s = 45.3$$ mg/dL
• $$H_a: \mu \ne 190$$ versus $$H_0: \mu = 190$$

# Example 5.6 (cont’d)

• $$n=100$$; OK to use CLT
• $$\bar{x}$$ is approximately Normal
• if $$H_0: \mu = 190$$ were true, the mean of $$\bar{x}$$ should be?
• if $$H_0: \mu = 190$$ were true, $$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$ should be?
• $$\frac{178.2 - 190}{45.3/\sqrt{100}}=-2.6$$
• for $$\alpha=.05$$, recall $$z_{\alpha/2} = z_{.025} = 1.96$$
• decision: …
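The steps of this example can be traced in R. This is a sketch only; as on the slide, the sample standard deviation `s` stands in for the unknown $$\sigma$$, and the variable names are illustrative.

```r
n <- 100; xbar <- 178.2; s <- 45.3; mu0 <- 190   # values from Example 5.6
alpha <- 0.05

z <- (xbar - mu0) / (s / sqrt(n))                 # about -2.6
z_crit <- qnorm(1 - alpha / 2)                    # about 1.96
reject <- abs(z) >= z_crit                        # two-tailed rule: |z| >= z_{alpha/2}
p_value <- 2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided p-value
list(z = z, reject = reject, p_value = p_value)
```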

# Summary

Hypotheses:

• case 1: $$H_0: \mu \le \mu_0$$ vs $$H_a: \mu > \mu_0$$
• case 2: $$H_0: \mu \ge \mu_0$$ vs $$H_a: \mu < \mu_0$$
• case 3: $$H_0: \mu = \mu_0$$ vs $$H_a: \mu \ne \mu_0$$

Test statistic: $$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

If $$z \sim \mathsf{N}(0,1)$$, then the rejection region for a prob of Type I error $$\alpha$$ is:

• case 1: reject $$H_0$$ if $$z \ge z_{\alpha}$$
• case 2: reject $$H_0$$ if $$z \le - z_{\alpha}$$
• case 3: reject $$H_0$$ if $$|z| \ge z_{\alpha/2}$$
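The three cases can be collected in one small function. This is a hypothetical helper written for illustration (`z_test` is not a function from base R or from the Text); it is applied here to the soybean numbers with $$\alpha = .025$$.

```r
# Hypothetical helper: z-test decision from summary statistics
z_test <- function(xbar, mu0, s, n, alpha = 0.05,
                   case = c("greater", "less", "two.sided")) {
  case <- match.arg(case)
  z <- (xbar - mu0) / (s / sqrt(n))
  reject <- switch(case,
    greater   = z >= qnorm(1 - alpha),          # case 1: Ha: mu > mu0
    less      = z <= -qnorm(1 - alpha),         # case 2: Ha: mu < mu0
    two.sided = abs(z) >= qnorm(1 - alpha / 2)  # case 3: Ha: mu != mu0
  )
  list(z = z, reject = reject)
}

# Soybean example: H0: mu <= 520 vs Ha: mu > 520
res <- z_test(xbar = 573, mu0 = 520, s = 124, n = 36,
              alpha = 0.025, case = "greater")
res
```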

# Recap (one-tailed test)

To check $$\mu \le 0$$ against $$\mu > 0$$

# Recap (two-tailed test)

To check $$\mu=0$$ against $$\mu \ne 0$$

# Some definitions

• Type II error: an error made when $$H_0$$ is accepted when $$H_0$$ is false
• Probability of a Type II error: denoted by $$\beta$$
• Power of a test: $$1-\beta$$

# Compute power (Example 5.7)

• previously: number of improperly issued tickets per officer $$Y \sim \mathsf{N}(\mu=380, \sigma^2= 35.2^2)$$
• current: changes in regulations affected how tickets are issued
• check $$H_0: \mu \le 380$$ vs $$H_a: \mu > 380$$
• $$\mu_0 = 380$$

# Compute power (Example 5.7)

• $$n=50$$, $$\bar{x} = 390$$, $$\sigma=35.2$$
• test statistic: $$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = 2.01$$
• rejection region: for $$\alpha=.01$$ and a right-tailed test, reject $$H_0$$ if $$z \ge z_{0.01} = 2.33$$

# Compute power (Example 5.7)

• $$H_0: \mu \le 380$$ vs $$H_a: \mu > 380$$
• what if $$n=50$$, $$\sigma=35.2$$ but $$\bar{x} = 385$$ and $$\alpha = .01$$?
• in this case, $$z = 1.004 < z_{0.01}$$, and $$H_0$$ is accepted, i.e., accept $$H_0$$ when $$z < z_{\alpha}$$

• if the actual mean number of improper tickets is $$\mu_a = 395$$ per officer, then a Type II error has been made
• what is the prob $$\beta$$ of making such an error in this case?

# Compute power (Example 5.7)

• $$H_0: \mu \le 380$$ vs $$H_a: \mu > 380$$
• $$\mu_0 = 380$$, $$\mu_a = 395$$
• $$\beta(395) = P\left( z < z_{.01} - \frac{|\mu_0 - \mu_a|}{\sigma/\sqrt{n}}\right)$$, implying $$\beta(395)= P(z < 2.33 - \frac{|380 - 395|}{35.2/\sqrt{50}})$$
• simplification: $$\beta(395)= P(z < -.68) \approx .2483$$
• PWR(395) = 1 - $$\beta(395)$$
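The Type II error probability and the power at $$\mu_a = 395$$ can be computed as sketched below. The result differs slightly from $$.2483$$ because `qnorm` is used instead of the rounded value $$2.33$$; the variable names are illustrative.

```r
mu0 <- 380; mua <- 395; sigma <- 35.2; n <- 50; alpha <- 0.01

delta <- abs(mu0 - mua) / (sigma / sqrt(n))   # about 3.01
beta  <- pnorm(qnorm(1 - alpha) - delta)      # Type II error probability, about 0.25
power <- 1 - beta                             # PWR(395), about 0.75
c(beta = beta, power = power)
```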

# Compute power (Extra)

• Recall: $$H_0$$ is accepted when $$z < z_{\alpha}$$, where $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

• Recall: $$\mu_a$$ is the actual mean, so $\bar{x} = n^{-1} \sum_{i=1}^n x_i \sim \mathsf{N}(\mu_a, \sigma^2/n)$

• So, $$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim \mathsf{N}\left(\delta, 1\right)$$, where $$\delta = \frac{\mu_a - \mu_0}{\sigma/\sqrt{n}}$$

# Compute power (Extra)

• Recall: $$H_0$$ is accepted when $$z < z_{\alpha}$$

• When $$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim \mathsf{N}\left(\delta, 1\right)$$ and $$\delta = \frac{\mu_a - \mu_0}{\sigma/\sqrt{n}}$$, we have $$\beta(\mu_a)= P(z < z_{\alpha}) = P(Z < z_{\alpha} -\delta),$$ where $$Z \sim \mathsf{N}(0,1)$$

# Compute power: summary

With Normal reference distribution

• One-tailed test: $\beta(\mu_a) = P\left( z < z_{\alpha} - \frac{|\mu_0 - \mu_a|}{\sigma/\sqrt{n}}\right)$

• two-tailed test: $\beta(\mu_a) \approx P\left( z < z_{\alpha/2} - \frac{|\mu_0 - \mu_a|}{\sigma/\sqrt{n}}\right)$

# Compute power

General principle: $\textrm{PWR}(\mu_a) = 1 - \beta(\mu_a),$ where $\beta(\mu_a) = P(\textrm{accept } H_0 \textrm{ when } H_a \textrm{ is true})$

In contrast: $\alpha = P(\textrm{reject } H_0 \textrm{ when } H_0 \textrm{ is true})$

# Power curve

Recall: $$\textrm{PWR}(\mu) = 1 - \beta(\mu).$$ Plotting $$\textrm{PWR}(\mu)$$ against $$\mu$$ gives the power curve
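As an illustration, a power curve for the one-tailed test of Example 5.7 could be drawn as follows. This is a sketch: the grid of $$\mu$$ values and the name `power_fn` are hypothetical choices.

```r
mu0 <- 380; sigma <- 35.2; n <- 50; alpha <- 0.01   # setup of Example 5.7

# Power of the right-tailed z-test at a true mean mu >= mu0
power_fn <- function(mu) {
  1 - pnorm(qnorm(1 - alpha) - (mu - mu0) / (sigma / sqrt(n)))
}

mu_grid <- seq(380, 410, by = 0.5)   # hypothetical grid of true means
pwr <- power_fn(mu_grid)

plot(mu_grid, pwr, type = "l", xlab = expression(mu), ylab = "Power")
abline(h = alpha, lty = 2)           # at mu = mu0 the power equals alpha
```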

# Power curve

• What are the main quantities in hypothesis testing that affect the Type I and II errors?
• Type II error focuses more on the null hypothesis?

# Calculate sample size and p-value

• purpose of sample size calculation: to maintain power at a specific level, subject to a Type I error level

• p-value: the smallest Type I error probability at which the null hypothesis would be rejected, given the observed value of the test statistic

# p-value: illustration

• test $$H_0: \mu =0$$ vs $$H_a: \mu \ne 0$$
• assume $$T \sim \mathsf{N}(0,1)$$ under $$H_0$$
• p-value for the observed statistic $$T=2$$ is $2 \times P(T > |2|) = 0.0455$

# p-value: illustration

• test $$H_0: \mu \le 0$$ vs $$H_a: \mu > 0$$
• assume $$T \sim \mathsf{N}(0,1)$$ when $$\mu = 0$$
• p-value for the observed statistic $$T=2$$ is $P(T > 2) = 0.0228$

# p-value: illustration

• p-value is computed by assuming $$H_0$$ is true
• common practice: if p-value $$p(T) < \alpha$$, reject $$H_0$$, where $$\alpha$$ is the probability of Type I error
• reporting only p-value is not enough; better to also report power
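For a statistic that is $$\mathsf{N}(0,1)$$ under $$H_0$$, the two-tailed and one-tailed p-values for an observed value of 2 can be computed as sketched below (variable names are illustrative).

```r
t_obs <- 2   # observed value of the test statistic

p_two_sided <- 2 * pnorm(abs(t_obs), lower.tail = FALSE)  # about 0.0455
p_one_sided <- pnorm(t_obs, lower.tail = FALSE)           # about 0.0228
c(two_sided = p_two_sided, one_sided = p_one_sided)
```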

# Questions

Scenario 1:

• Test A: p-value $$.01$$ and Type II error probability $$.2$$
• Test B: p-value $$.01$$ and Type II error probability $$.05$$
• Which test do you prefer?

Scenario 2:

• Test C: p-value $$.01$$ and Type II error probability $$.05$$
• Test D: p-value $$.05$$ and Type II error probability $$.05$$
• Which test do you prefer?

# Student’s t-test

• Recall the test statistic: $$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$ with a known std dev $$\sigma$$
• Consider the test statistic: $$T = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$ where $$s$$ is the sample std dev

• When $$\bar{x}$$ and $$s$$ are both based on a random sample of size $$n$$ from $$\mathsf{N}(\mu,\sigma^2)$$, $$T$$ follows the Student’s t distribution with degrees of freedom $$\textrm{df}=n-1$$ and mean $$0$$
• As $$n$$ increases, the distribution of $$T$$ approaches that of $$Z$$

# Student’s t-test (Extra)

$$x_1,\ldots,x_n$$ a random sample from $$\mathsf{N}(\mu,\sigma^2)$$

• sample mean $$\bar{x} \sim \mathsf{N}(\mu,\sigma^2/n)$$, i.e., $$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim \mathsf{N}(0,1)$$

• sample standard deviation $$s$$: $$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2 (n-1)$$

• further $$\bar{x}$$ and $$\frac{s^2}{\sigma^2}$$ are independent

• so $$T = \frac{Z}{\sqrt{s^2/\sigma^2}} = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$ follows a Student’s t distribution with $$\textrm{df} = n-1$$ and mean $$0$$

# Example 5.15

• food-borne illness associated with Salmonella enteritidis
• need to determine the average level of Salmonella enteritidis in ice cream.
• $$H_0: \mu \le .3$$ vs $$H_a: \mu > .3$$
• $$\mu_0 = .3$$

# Example 5.15 (cont’d)

> sel = c(.593,.142,.329,.691,.231,.793,.519,.392,.418)
> qqnorm(sel); qqline(sel, col = 2, lwd=2)

# Example 5.15 (cont’d)

Ok to use Student’s t test

> ks.test(sel, "pnorm",mean(sel),sd(sel))

One-sample Kolmogorov-Smirnov test

data:  sel
D = 0.12722, p-value = 0.9941
alternative hypothesis: two-sided

# Example 5.15 (cont’d)

• $$\bar{x} = .456$$, $$s = .2128$$, $$n=9$$
• Recall $$T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
• $$t = \frac{.456 - .3}{.2128/\sqrt{9}} = 2.20$$
• $$T$$ has $$8 = 9-1$$ degrees of freedom

• For $$\alpha=.01$$, reject $$H_0$$ if $$T > t_{.01}$$ with $$t_{.01}= 2.896$$
• Conclusion?

# Example 5.15 (cont’d)

• Recall $$H_0: \mu \le .3$$ vs $$H_a: \mu > .3$$
• p-value when $$t = 2.20$$ is $$P(T > 2.20)$$, where we take upper tail since we reject $$H_0$$ when $$T > t_{\alpha}$$

• p-value $$p(t) = .029 > \alpha = 0.01$$
• Conclusion?

> pt(2.20,df=8,ncp=0, lower.tail=FALSE)
[1] 0.02949695
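The same test can be run in one call with R's built-in `t.test`. A minimal sketch; the small difference from the hand calculation comes from rounding $$\bar{x}$$ and $$s$$ on the slides.

```r
sel <- c(.593, .142, .329, .691, .231, .793, .519, .392, .418)  # data from the example

# H0: mu <= .3 vs Ha: mu > .3
res <- t.test(sel, mu = 0.3, alternative = "greater")
res$statistic   # observed t, about 2.2
res$parameter   # degrees of freedom, 8
res$p.value     # upper-tail p-value, about 0.029
```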

# Example 5.15 (cont’d)

Confidence interval (CI) for $$\mu$$

• $$\alpha=.02$$, i.e., 98% CI is needed
• $$t_{\alpha/2} = t_{.01}= 2.896$$; $$s=.2128$$, $$n=9$$

• CI: $$.456 \pm 2.896 \times \frac{.2128}{\sqrt{9}}$$

Formula for $$100(1-\alpha)\%$$ CI: $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
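A sketch of the 98% interval in R, computed once from the formula and once with `t.test` as a cross-check (variable names are illustrative):

```r
sel <- c(.593, .142, .329, .691, .231, .793, .519, .392, .418)  # data from the example
n <- length(sel)
alpha <- 0.02

t_crit <- qt(1 - alpha / 2, df = n - 1)                   # t_{.01} with 8 df, about 2.896
ci <- mean(sel) + c(-1, 1) * t_crit * sd(sel) / sqrt(n)   # xbar -/+ t * s / sqrt(n)
ci

# Cross-check against the built-in interval
t.test(sel, conf.level = 1 - alpha)$conf.int
```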

# Power of Student’s t-test

Recall $\textrm{PWR}(\mu_a) = 1 - \beta(\mu_a),$ where $\beta(\mu_a) = P(\textrm{accept } H_0 \textrm{ when } H_a \textrm{ is true})$

# Power of Student’s t-test

Example 5.15

• recall $$H_0: \mu \le .3$$ vs $$H_a: \mu > .3$$

• $$T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$, $$\mu_0=.3$$, $$t = 2.20$$

• $$T$$ has 8 degrees of freedom
• if $$\mu_a = 0.45,$$ what is the power?

# Power of Student’s t-test (Extra)

Recall $$T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$. If $$\mu=\mu_a$$, the distribution of $$T$$ differs from that with $$\mu = \mu_0$$ by

$\delta = \frac{\mu_a - \mu_0}{\sigma/\sqrt{n}}$ in centrality; $$\delta$$ is called the non-centrality parameter

# Power of Student’s t-test

Example 5.16

• Recall $$H_0: \mu \le .3$$ vs $$H_a: \mu > .3$$
• $$n=9$$, $$\mu_0 = .3$$, $$\sigma = 0.25$$

• $$T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
• If $$\mu_a = 0.45$$, then $$\delta = \frac{\mu_a - \mu_0}{\sigma/\sqrt{n}} = 1.8$$

# Power of Student’s t-test

Example 5.16 (cont’d): recall $$t_{.01} \approx 2.896$$ and $\delta = \frac{\mu_a - \mu_0}{\sigma/\sqrt{n}} = 1.8$

So, at prob of Type I error $$\alpha = 0.01$$,

> # Type II error prob
> beta = pt(2.896, df=8, ncp = 1.8, lower.tail = TRUE)
> beta
[1] 0.7931609
> # power
> 1- beta
[1] 0.2068391

Power: $$0.21$$ (how to interpret power?)

Reference: The Power of Student’s t-Test, page 1

# Student’s t-test

• Read carefully Summary on page 252 of Text
• Less sensitive to outliers relative to Normal distribution
• Usually robust to violations of Normality

# Key issue in statistical inference

The distribution of the test statistic is unknown, e.g., data no longer Normal; data not necessarily a random sample

If you have $$10,000$$ different random samples each with sample size $$n=100$$

• how many realizations of the test statistic do you have?
• do you have much information about the distribution of the test statistic?

# Bootstrap

A resampling technique to approximate the distribution of the test statistic by

• resampling randomly with replacement from the original data as if each newly obtained sample is a different sample than the original data
• steps described in Section 5.8 of Text

Wait … what is going on “resample from the original data” ???

# Bootstrap

The workflow:

1. compute the actual value of test statistic from original data

2. sample from original data, compute the value of test statistic using this sample

3. repeat 2. many times and collect test statistic values based on bootstrap samples

4. use the distribution of bootstrap-based test statistic values as distribution of the test statistic to draw inference
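The four steps above can be sketched in R as follows. This is a minimal illustration under stated assumptions rather than the code from the lecture's source file: the statistic is taken to be the sample mean, the data are simulated from a standard Normal, and `B`, `boot_means` and the seed are hypothetical choices.

```r
set.seed(123)                    # arbitrary seed, for reproducibility
n   <- 50
obs <- rnorm(n)                  # simulated stand-in for the original data
B   <- 2000                      # number of bootstrap samples (assumed)

stat_obs <- mean(obs)            # step 1: statistic from the original data

# steps 2-3: resample with replacement, recompute the statistic, repeat B times
boot_means <- replicate(B, mean(sample(obs, n, replace = TRUE)))

# step 4: use the bootstrap values as an approximate sampling distribution
sd(boot_means)                   # bootstrap estimate of the standard error
sd(obs) / sqrt(n)                # usual standard error, for comparison
hist(boot_means, main = "", xlab = "Bootstrap sample means")
```

The two standard errors should be close, which is the sense in which the bootstrap distribution approximates the sampling distribution of the statistic.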

# Bootstrap: illustration

> set.seed(123)
> n = 50  # actual sample size
> obs = rnorm(n) # generate random sample from standard Normal
> # sample randomly and with replacement from original data
> bsSample1 = sample(obs, n, replace = TRUE, prob = rep(1/n,n))

# Bootstrap: illustration

> plot(obs, bsSample1, xlab="Original data", ylab="Bootstrap sample", main ="")
> abline(a=0,b=1,col="red",lwd=2)

# Bootstrap: illustration

> mean(obs) # sample mean of original data
[1] 0.03440355
> mean(bsSample1) # sample mean of a bootstrap sample
[1] -0.05722458

# Bootstrap

To approximate the distribution of sample mean $$\bar{x}$$

# Bootstrap

To approximate the distribution of test statistic $$T = \frac{\bar{x}-0}{s/\sqrt{50}}$$ when testing $$H_0: \mu \le 0$$ vs $$H_a: \mu >0$$

# Bootstrap

To build a 95% confidence interval (CI) for $$\mu$$

> # compute .025 percentile and .975 percentile
> tL = quantile(tbBootstamp, probs = .025)
> tL
2.5%
-1.772072
> tU = quantile(tbBootstamp, probs = .975)
> tU
97.5%
2.23689
> # confidence bounds
> # tL is negative, so it is added to the sample mean for the left end point
> bL = mean(obs) + tL*(sd(obs)/sqrt(n)) # left end point
> bU = mean(obs) + tU*(sd(obs)/sqrt(n)) # right end point

The CI is approximately [-0.198,0.327]

# Bootstrap

Theoretical 95% confidence interval for $$\mu$$ is [-0.222,0.291], obtained using Z-test

> # theoretical confidence bounds
> bLT = mean(obs) - qnorm(0.975)*(sd(obs)/sqrt(n))
> bLT # left end point
[1] -0.2222298
> bUT = mean(obs) + qnorm(0.975)*(sd(obs)/sqrt(n))
> bUT # right end point
[1] 0.2910369

# Bootstrap

Check: $$H_0: \mu \le 0$$ vs $$H_a: \mu >0$$

Bootstrap p-value: number of times the bootstrap-based test statistics are larger than or equal to the observed test statistic divided by the number of bootstrap samples

> pval = sum(tbBootstamp >= ts)/B
> pval
[1] 0.49

Reject $$H_0$$?

# Bootstrap

• Computationally intensive resampling technique
• The source file of the lecture notes contains example code for implementing bootstrap for Student’s t test
• Bootstrap techniques can be very useful, and you are encouraged to code them up

# Which to use?

Among Z-test, Student’s t test, and bootstrap based test for inference on population mean, consider scenarios:

• a random sample with large sample size
• a random sample that appears to be Normally distributed
• a random sample with moderate sample size but from an unknown distribution

Which test to use?

# Simple rules on homework

• If you use software to obtain answers, please attach the code
• Homework assignments will be announced via Blackboard

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 15063)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

other attached packages:
[1] knitr_1.17

loaded via a namespace (and not attached):
[1] backports_1.1.0  magrittr_1.5     rprojroot_1.2
[4] tools_3.3.0      htmltools_0.3.6  revealjs_0.9
[7] yaml_2.1.14      Rcpp_0.12.12     codetools_0.2-14
[10] stringi_1.1.5    rmarkdown_1.6    stringr_1.2.0
[13] digest_0.6.12    evaluate_0.10.1