# Estimating population mean

- How to estimate the population mean \(\mu\)?
- Use sample mean?
- Use sample median?
- Use some ad hoc quantity?

# Sample mean as an estimate

- Assume \(x_1,x_2,\ldots, x_n\) is a random sample of size \(n\) from a distribution with population mean \(\mu\)
- Then the sample mean \[\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i\] is an estimator of \(\mu\)

# Illustration: I

- Distribution \(\mathsf{N}(2, 1^2)\) with population mean \(\mu=2\).
- Random sample of size \(n=100\)

```
> set.seed(567)
> d1 = rnorm(100, mean=2, sd=1)
> xbar = mean(d1)
> xbar-2
[1] -0.09954917
```

Note the difference

# Illustration: II

- Distribution \(\mathsf{N}(2, 1^2)\) with population mean \(\mu=2\).
- Random sample of size \(n=100\)

```
> set.seed(123)
> d1 = rnorm(100, mean=2, sd=1)
> xbar = mean(d1)
> xbar-2
[1] 0.09040591
```

Note the difference

# Illustration: III

- Distribution \(\mathsf{N}(2, 1^2)\) with population mean \(\mu=2\).
- Random sample of size \(n=5,000\)

```
> set.seed(123)
> d1 = rnorm(5000, mean=2, sd=1)
> xbar = mean(d1)
> xbar-2
[1] -0.0005695937
```

Note the difference

# Illustration: IV

- Distribution with population mean \(\mu=0\).
- Non-random sample of size \(n=5,000\)

```
> set.seed(123)
> z = rnorm(5000, mean=0, sd=1)
> y = rnorm(1, mean=0, sd=1)       # a single shared draw
> d1 = sqrt(0.7)*y + sqrt(0.3)*z   # every observation shares y: not a random sample
> xbar = mean(d1)
> xbar-0
[1] -0.4137675
```

Note the difference

# Question

What are two important properties you remember about the distribution of sample mean?

# Main messages

- Sample mean is a random variable and depends on which sample it is computed from
- Sample mean may not be a very good estimate of population mean if the sample is not a random sample
- Sample mean usually estimates population mean more accurately as the sample size of the random sample increases
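The last message can be checked by simulation; this is a quick sketch (not from the text) comparing the spread of sample means at two sample sizes:

```r
> # spread of the sample mean shrinks as the sample size grows
> set.seed(123)
> xbar10 = replicate(2000, mean(rnorm(10, mean=2, sd=1)))
> xbar1000 = replicate(2000, mean(rnorm(1000, mean=2, sd=1)))
> sd(xbar10)    # close to 1/sqrt(10), about 0.32
> sd(xbar1000)  # close to 1/sqrt(1000), about 0.032
```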

# Warm-up

Generate \(1\) observation \(x_1\) randomly from \(\mathsf{N}(\mu,1^2)\)

- if \(x_1 = 0.05\), how likely \(\mu=0\)?
- if \(x_1 = 2\), how likely \(\mu=0\)?

Assume \(\mu=0\), how likely is it for \(x_1 \ge 1.96\)?

# Warm-up (cont’d)

- Assume \(\mu=0\), \(P(x_1 \ge 1.96) \approx 0.025\)

# Warm-up (cont’d)

- Assume \(\mu=1\), \(P(x_1 \ge 1.96) \approx 0.169\)
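Both probabilities can be verified directly in R (a quick check, not one of the text's examples):

```r
> 1 - pnorm(1.96, mean=0, sd=1) # P(x1 >= 1.96) when mu = 0
[1] 0.0249979
> 1 - pnorm(1.96, mean=1, sd=1) # P(x1 >= 1.96) when mu = 1
[1] 0.1685276
```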

# Warm-up (cont’d)

Generate \(n=10,000\) observations \(x_i\) randomly from \(\mathsf{N}(\mu,1^2)\). If \(\bar{x} = 0.3\) and \(s = 0.85\),

- how likely \(\mu=0\)?
- how likely \(\mu=0.3\)?

# Warm-up (cont’d)

To check \(\mu=0\) against \(\mu \ne 0\), how many types of mistakes can be made? What are they? What is the best one can do?

# Warm-up (cont’d)

What can you say about inferring on \(\mu\) based on \(\bar{x}\), \(s\) and \(n\)?

# Testing population mean

Recall the delivery time example:

- Population mean \(\mu <3\) hours is claimed
- Another claim: \(\mu = 2.8\) hours
- Another claim: \(\mu \ge 2.8\) hours
- We only looked into confidence interval

# Example: soybean yield

To determine whether “the mean yield per acre (in bushels) for a particular variety of soybeans has increased during the current year over the mean yield in the previous 2 years when \(\mu\) was 520”

- Hypothesis 1: the mean yield has increased
- Hypothesis 2: the mean yield has NOT increased
- Hypothesis 2 negates hypothesis 1

# Example (cont’d)

- How to obtain evidence against or for each of these hypotheses?
- How to make a decision based on the evidence?
- What if the mean yield has increased but I said it has not?
- What if the mean yield has NOT increased but I said it has?

# Example (cont’d)

Formal definitions:

- Null hypothesis \(H_0\): the mean yield has NOT increased, i.e., \(\mu \le 520\)
- Alternative hypothesis \(H_a\): one proposed by the researcher, i.e., \(\mu > 520\)
- \(H_a\) negates \(H_0\)
- What else is needed to determine if \(\mu > 520\)?

# Example (cont’d)

Would it be sufficient to have …

- the standard deviation of the mean yield?
- the sample mean of the yields the current year?

How to quantify evidence for \(\mu > 520\)?

# Example (cont’d)

- obtain a sample of \(n=36\) one-acre-plot yields
- sample mean \(\bar{x} = 573\) and sample std deviation \(s=124\)
- \(\bar{x} = 573 > 520\); likely the yield has increased (how likely?)
- a first step: define \[T = \frac{\bar{x} - 520}{124/\sqrt{36}}=\frac{\bar{x} - \mu_0}{s/\sqrt{n}}\]
- \(T\) quantifies how extreme \(\bar{x} = 573\) is if the yield has NOT increased; a large value of \(T\) makes "appeared by chance alone" an unlikely explanation

# Example (cont’d)

- recall \(T = \frac{\bar{x} - \mu}{s/\sqrt{n}}\)
- if one knows the distribution of \(T\), then the distribution of mean yield is available to quantify the evidence
- ideally \(G = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\) should be used but \(\sigma\) is unknown
- if \(s \approx \sigma\), \(T\) is probably ok
- \(T\) is called a “test statistic”

# Example (cont’d)

Suppose \(\bar{x} \sim \mathsf{N}(\mu,\sigma^2/n)\) and \(s \approx \sigma\)

\(T = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim \mathsf{N}(0,1)\)

\(\sigma/\sqrt{n} \approx s/\sqrt{n} = 20.67\)

\(T = \frac{573-520}{20.67} = 2.564103\)

Should we reject \(H_0: \mu \le 520\) in favor of \(H_a: \mu > 520\) when \(T = 2.564103\)? How risky is it to do so, given that \(\bar{x}\) was computed from a sample of size \(36\)?
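These numbers can be reproduced in R; note the slide's value \(2.564103\) comes from first rounding the standard error to \(20.67\):

```r
> se = 124 / sqrt(36)   # about 20.67
> (573 - 520) / se      # test statistic T, about 2.56
> qnorm(0.975)          # z_{.025}
[1] 1.959964
```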

# Example (cont’d)

recall: if \(X \sim \mathsf{N}(0,1)\), then \(P(X > 1.96) = 0.025,\) i.e., \(1.96 = z_{.025}\)

reject \(H_0: \mu \le 520\) whenever \(T > 1.96\) comes with a risk of \(\alpha= .025\) of incorrectly rejecting \(H_0\)

i.e., the probability of Type I error is \(\alpha= .025\) when rejecting \(H_0\) whenever \(T > z_{.025}\)

# Example (cont’d)

- the rejection region corresponding to \(\alpha = .025\) is \(R = \{t: t \ge z_{.025}\}\), i.e., if \(T \in R\), then reject \(H_0: \mu \le 520\)

- the above is a one-tailed (one-sided) test
- the p-value of \(T=2.564103\) is

```
> 1-pnorm(2.564103)
[1] 0.005172142
```

# Example (two-tailed test)

Soybean yield

- \(H_0: \mu = 520\) versus \(H_a: \mu \ne 520\)
- probability of Type I error: \(\alpha = .05\)
- reject \(H_0: \mu = 520\) if \(|T| \ge z_{.025}\)
- i.e., reject \(H_0: \mu = 520\) if \(T \ge z_{.025}\) or \(T \le -z_{.025}\)
- A two-tailed test
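For the two-tailed test both tails count, so the p-value doubles relative to the one-tailed test (a quick check using the one-tailed p-value computed earlier):

```r
> 2 * (1 - pnorm(2.564103)) # two-sided p-value for T = 2.564103
[1] 0.01034428
```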


# Example 5.6

- Question: is the mean cholesterol level for Group 1 different from that for Group 2?
- Group 2: cholesterol level approximately Normal with mean \(\mu_0 = 190\) mg/dL
- Sample from Group 1: \(n=100\), \(\bar{x} = 178.2\) mg/dL, \(s = 45.3\) mg/dL
- \(H_0: \mu = 190\) versus \(H_a: \mu \ne 190\)

# Example 5.6 (cont’d)

- \(n=100\); OK to use CLT
- \(\bar{x}\) is approximately Normal
- if \(H_0: \mu = 190\) were true, the mean of \(\bar{x}\) should be?
- if \(H_0: \mu = 190\) were true, \(z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\) should be?
- \(\frac{178.2 - 190}{45.3/\sqrt{100}}=-2.6\)
- for \(\alpha=.05\), recall \(z_{\alpha/2} = z_{.025} = 1.96\)
- decision: …
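The decision step can be written out in R using the summary numbers above (a sketch, not from the text):

```r
> z = (178.2 - 190) / (45.3 / sqrt(100)) # about -2.6
> abs(z) >= qnorm(0.975)                 # reject H0 at alpha = .05?
[1] TRUE
```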

# Summary

Hypotheses:

- case 1: \(H_0: \mu \le \mu_0\) vs \(H_a: \mu > \mu_0\)
- case 2: \(H_0: \mu \ge \mu_0\) vs \(H_a: \mu < \mu_0\)
- case 3: \(H_0: \mu = \mu_0\) vs \(H_a: \mu \ne \mu_0\)

Test statistic: \(z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}\)

If \(z \sim \mathsf{N}(0,1)\), then the rejection region for a probability \(\alpha\) of Type I error:

- case 1: reject \(H_0\) if \(z \ge z_{\alpha}\)
- case 2: reject \(H_0\) if \(z \le - z_{\alpha}\)
- case 3: reject \(H_0\) if \(|z| \ge z_{\alpha/2}\)

# Student’s t-test

- Recall the test statistic \(Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\) with a known std dev \(\sigma\)
- Consider the test statistic \(T = \frac{\bar{x} - \mu}{s/\sqrt{n}}\), where \(s\) is the sample std dev

- When \(\bar{x}\) and \(s\) are both based on a random sample of size \(n\) from \(\mathsf{N}(\mu,\sigma^2)\), \(T\) follows the Student’s t distribution with degrees of freedom \(\textrm{df}=n-1\) and mean \(0\)
- As \(n\) increases, the distribution of \(T\) approaches that of \(Z\)
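The convergence can be seen from critical values (a quick check, not in the text): the upper .025 critical value of \(T\) shrinks toward \(z_{.025}\) as the degrees of freedom grow.

```r
> qt(0.975, df = c(5, 30, 1000)) # t critical values, decreasing toward 1.96
> qnorm(0.975)                   # z_{.025}
[1] 1.959964
```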

# Example 5.15

- food-borne illness associated with Salmonella enteritidis
- need to determine the average level of Salmonella enteritidis in ice cream.
- \(H_0: \mu \le .3\) vs \(H_a: \mu > .3\)
- \(\mu_0 = .3\)

# Example 5.15 (cont’d)

```
> sel = c(.593,.142,.329,.691,.231,.793,.519,.392,.418)
> qqnorm(sel); qqline(sel, col = 2, lwd=2)
```

# Example 5.15 (cont’d)

Ok to use Student’s t test

```
> ks.test(sel, "pnorm",mean(sel),sd(sel))
One-sample Kolmogorov-Smirnov test
data: sel
D = 0.12722, p-value = 0.9941
alternative hypothesis: two-sided
```

# Example 5.15 (cont’d)

```
> pt(2.20,df=8,ncp=0, lower.tail=FALSE)
[1] 0.02949695
```
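The same test can be run in one call with R's built-in `t.test` (a check on the hand computation; it reports a t value close to the 2.20 used above and a one-sided p-value close to .03):

```r
> sel = c(.593,.142,.329,.691,.231,.793,.519,.392,.418)
> t.test(sel, mu = 0.3, alternative = "greater")
```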

# Example 5.15 (cont’d)

Confidence interval (CI) for \(\mu\) (recall \(\mu_0 = .3\)):

- \(\alpha=.02\), i.e., a 98% CI is needed
- \(t_{\alpha/2} = t_{.01}= 2.896\); \(s=.2128\), \(n=9\)

CI: \(.456 \pm 2.896 \times \frac{.2128}{\sqrt{9}}\)

Formula for the \(100(1-\alpha)\%\) CI: \[\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}\]
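The CI formula translates directly into R (a quick check; here \(t_{.01} = \) `qt(0.99, df = 8)`):

```r
> sel = c(.593,.142,.329,.691,.231,.793,.519,.392,.418)
> mean(sel) + c(-1, 1) * qt(0.99, df = 8) * sd(sel) / sqrt(9)
```

This gives approximately \([0.251, 0.662]\).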

# Power of Student’s t-test

Recall \[\textrm{PWR}(\mu_a) = 1 - \beta(\mu_a),\] where \[\beta(\mu_a) = P(\textrm{accept } H_0 \textrm{ when the true mean is } \mu_a)\]

# Power of Student’s t-test

Example 5.15

recall \(H_0: \mu \le .3\) vs \(H_a: \mu > .3\)

\(T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}\), \(\mu_0=.3\), \(t = 2.20\)

- \(T\) has 8 degrees of freedom
- if \(\mu_a = 0.45\), what is the power?

# Power of Student’s t-test

Example 5.16

- Recall \(H_0: \mu \le .3\) vs \(H_a: \mu > .3\)
- \(n=9\), \(\mu_0 = .3\), \(\sigma = 0.25\)

- \(T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}\)
- If \(\mu_a = 0.45\), then the noncentrality parameter is \(\delta = \frac{\mu_a - \mu_0}{\sigma/\sqrt{n}} = 1.8\)

# Power of Student’s t-test

Example 5.16 (cont’d): recall \(t_{.01} \approx 2.896\) and \[\delta = \frac{\mu_a - \mu_0}{\sigma/\sqrt{n}} = 1.8\]

So, at prob of Type I error \(\alpha = 0.01\),

```
> # Type II error prob
> beta = pt(2.896, df=8, ncp = 1.8, lower.tail = TRUE)
> beta
[1] 0.7931609
> # power
> 1- beta
[1] 0.2068391
```

Power: \(0.21\) (how to interpret power?)

Reference: The Power of Student’s t-Test, page 1

# Student’s t-test

- Read carefully Summary on page 252 of Text
- Less sensitive to outliers than tests based on the Normal distribution
- Usually robust to violations of Normality

# Inference with non-Normal data

# Key issue in statistical inference

The distribution of the test statistic is unknown, e.g., the data are no longer Normal, or the data are not necessarily a random sample

If you had \(10,000\) different random samples, each of sample size \(n=100\),

- how many realizations of the test statistic would you have?
- would you have much information about the distribution of the test statistic?

# Bootstrap

A resampling technique to approximate the distribution of the test statistic by

- resampling randomly with replacement from the original data, treating each resample as if it were a fresh sample distinct from the original data
- steps described in Section 5.8 of Text

Wait … what does “resample from the original data” actually mean?

# Bootstrap

The workflow:

1. compute the actual value of the test statistic from the original data

2. sample with replacement from the original data and compute the value of the test statistic using this bootstrap sample

3. repeat step 2 many times and collect the test statistic values based on the bootstrap samples

4. use the distribution of the bootstrap-based test statistic values as the distribution of the test statistic to draw inference
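The four steps can be sketched in R as follows (the names `ts` and `tsBootstrap` are illustrative; the lecture's source file may use different ones):

```r
> set.seed(123)
> n = 50; B = 1000
> obs = rnorm(n)                            # original data
> ts = (mean(obs) - 0) / (sd(obs)/sqrt(n))  # step 1: observed test statistic
> tsBootstrap = replicate(B, {              # steps 2-3: B bootstrap statistics
+   bs = sample(obs, n, replace = TRUE)
+   (mean(bs) - mean(obs)) / (sd(bs)/sqrt(n))
+ })
> # step 4: use the empirical distribution of tsBootstrap for inference
```

Centering each bootstrap statistic at `mean(obs)` is the bootstrap-t convention: under resampling, the original sample mean plays the role of the population mean.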

# Bootstrap: illustration

```
> set.seed(123)
> n = 50 # actual sample size
> obs = rnorm(n) # generate random sample from standard Normal
> # sample randomly and with replacement from original data
> bsSample1 = sample(obs, n, replace = TRUE, prob = rep(1/n,n))
```

# Bootstrap: illustration

```
> plot(obs, bsSample1, xlab="Original data", ylab="Bootstrap sample", main ="")
> abline(a=0,b=1,col="red",lwd=2)
```

# Bootstrap: illustration

```
> mean(obs) # sample mean of original data
[1] 0.03440355
> mean(bsSample1) # sample mean of a bootstrap sample
[1] -0.05722458
```

# Bootstrap

To approximate the distribution of sample mean \(\bar{x}\)

# Bootstrap

To approximate the distribution of test statistic \(T = \frac{\bar{x}-0}{s/\sqrt{50}}\) when testing \(H_0: \mu \le 0\) vs \(H_a: \mu >0\)

# Bootstrap

To build a 95% confidence interval (CI) for \(\mu\)

```
> # compute .025 percentile and .975 percentile
> tL = quantile(tsBootstrap, probs = .025)
> tL
2.5%
-1.772072
> tU = quantile(tsBootstrap, probs = .975)
> tU
97.5%
2.23689
> # bootstrap-t confidence bounds
> bL = mean(obs) - tU*(sd(obs)/sqrt(n)) # left end point
> bU = mean(obs) - tL*(sd(obs)/sqrt(n)) # right end point
```

The CI is approximately \([-0.26, 0.27]\)

# Bootstrap

Theoretical 95% confidence interval for \(\mu\) is [-0.222,0.291], obtained using Z-test

```
> # theoretical confidence bounds
> bLT = mean(obs) - qnorm(0.975)*(sd(obs)/sqrt(n))
> bLT # left end point
[1] -0.2222298
> bUT = mean(obs) + qnorm(0.975)*(sd(obs)/sqrt(n))
> bUT # right end point
[1] 0.2910369
```

# Bootstrap

Check: \(H_0: \mu \le 0\) vs \(H_a: \mu >0\)

Bootstrap p-value: the number of bootstrap-based test statistic values that are larger than or equal to the observed test statistic, divided by the number of bootstrap samples \(B\)

```
> pval = sum(tsBootstrap >= ts)/B
> pval
[1] 0.49
```

Reject \(H_0\)?

# Bootstrap

- A computationally intensive resampling technique
- The source file of the lecture notes contains example code implementing the bootstrap for Student’s t test
- Bootstrap techniques can be very useful, and you are encouraged to code them up

# Which to use?

Among the Z-test, Student’s t test, and the bootstrap-based test for inference on the population mean, consider the following scenarios:

- a random sample with large sample size
- a random sample that appears to be Normally distributed
- a random sample with moderate sample size but from an unknown distribution

Which test to use?