# Motivations

• Variation of potency for drug
• Risks in portfolios
• Assess equality of variances in two-sample test

# Sample variance

• Random sample $$y_1,\ldots,y_n$$ with mean $$\mu$$ and variance $$\sigma^2$$

• Sample mean $$\bar{y}=\frac{1}{n}\sum_{i=1}^n y_i$$

• Sample variance $$s^2 = \frac{1}{n-1}\sum_{i=1}^n (y_i - \bar{y})^2$$
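A quick sketch: the formula above can be checked against R's built-in `var()`, which uses the same $$n-1$$ divisor; the readings below are made-up illustration values.

```r
# made-up illustration values
y <- c(10.2, 10.5, 10.3, 10.8, 9.8)
n <- length(y)
# sample variance by the definition above, dividing by n - 1
s2_manual <- sum((y - mean(y))^2) / (n - 1)
# var() uses the same n - 1 divisor
s2_builtin <- var(y)
```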

# Statistic

• Random sample 1: $$y_{1i}, i=1,\ldots,n_1$$ follow $$\mathsf{N}(\mu_1, \sigma_1^2)$$
• Random sample 2: $$y_{2i}, i=1,\ldots,n_2$$ follow $$\mathsf{N}(\mu_2, \sigma_2^2)$$
• Sample variance $$s_1^2$$ for random sample 1; sample variance $$s_2^2$$ for random sample 2

Then $\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$ follows an F distribution with $$df_1 = n_1 - 1$$ and $$df_2 = n_2 - 1$$

# Density of an F distribution

Density: $$df_1=3$$ and $$df_2=5$$
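A sketch of how this density can be drawn in R with `df()`; the plotting range is an incidental choice.

```r
# F density with df1 = 3 and df2 = 5
curve(df(x, df1 = 3, df2 = 5), from = 0, to = 5,
      xlab = "x", ylab = "density", main = "F density, df1 = 3, df2 = 5")
```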

# F distribution

• not symmetrical
• with non-negative values
• with $$df_1$$ (for $$s_1^2$$) and $$df_2$$ (for $$s_2^2$$)

# Hypothesis testing

Test statistic $$F= \frac{s_1^2}{s_2^2}$$

• For $$H_0: \sigma_1^2 \le \sigma_2^2$$ vs $$H_a: \sigma_1^2 > \sigma_2^2$$, reject $$H_0$$ if $$F \ge F_{\alpha,df_1,df_2}$$

• For $$H_0: \sigma_1^2= \sigma_2^2$$ vs $$H_a: \sigma_1^2 \ne \sigma_2^2$$, reject $$H_0$$ if $$F \ge F_{\alpha/2,df_1,df_2}$$ or $$F \le F_{1-\alpha/2,df_1,df_2}$$

• F table; $$F_{1-\alpha,df_1,df_2}= \frac{1}{F_{\alpha,df_2,df_1}}$$
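The reciprocal identity above can be checked numerically with `qf()`. Note that `qf()` takes a lower-tail probability, so the upper-tail value $$F_{\alpha,df_1,df_2}$$ is `qf(1 - alpha, df1, df2)`; the $$\alpha$$ and degrees of freedom below are arbitrary.

```r
alpha <- 0.05
# F_{1-alpha, 3, 5}: upper-tail area 1 - alpha means lower-tail area alpha
lhs <- qf(alpha, 3, 5)
# 1 / F_{alpha, 5, 3}: note the swapped degrees of freedom
rhs <- 1 / qf(1 - alpha, 5, 3)
c(lhs, rhs)
```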

# Confidence interval

$$100(1-\alpha)\%$$ confidence interval for $$\sigma_1^2/\sigma_2^2$$ is constructed as follows:

• obtain $$s_1^2$$, $$s_2^2$$ and $$s_1^2/s_2^2$$
• obtain $$df_1 = n_1 -1$$ and $$df_2 = n_2 -1$$
• obtain $$F_U = F_{\alpha/2,df_2,df_1}$$ and $$F_L = 1/F_{\alpha/2,df_1,df_2}$$

• confidence interval: $$\left[\frac{s_1^2}{s_2^2}F_L,\frac{s_1^2}{s_2^2}F_U\right]$$
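A minimal sketch of the recipe above with `qf()`; the $$\alpha$$, sample variances, and degrees of freedom here are made-up illustration values.

```r
# made-up illustration values
alpha <- 0.05
s1sq <- 0.10; s2sq <- 0.06
df1 <- 9; df2 <- 9
FU <- qf(1 - alpha / 2, df2, df1)      # F_{alpha/2, df2, df1} as an upper-tail value
FL <- 1 / qf(1 - alpha / 2, df1, df2)  # 1 / F_{alpha/2, df1, df2}
c(lower = (s1sq / s2sq) * FL, upper = (s1sq / s2sq) * FU)
```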

# Example 6.1 (revisited)

Data:

> # potency readings for Fresh drug
> cFresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10,
+     10.6)
>
> # potency readings for Stored drug
> cStored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)

Normality test

# Example 6.1: testing variances

At Type I error probability $$\alpha = 0.05$$, test $$H_0: \sigma_1^2= \sigma_2^2$$ vs $$H_a: \sigma_1^2 \ne \sigma_2^2$$

• Sample 1: Fresh; Sample 2: Stored
• $$n_1 = 10$$, $$n_2 = 10$$; $$s_1^2 = 0.10$$, $$s_2^2= 0.06$$
• $$F = \frac{s_1^2}{s_2^2} = 1.67$$
• $$F_{0.025,9,9}=4.03$$; $$F_{0.975,9,9} =1/4.03= 0.25$$; F table

Reject $$H_0$$ if $$F \ge F_{\alpha/2,df_1,df_2}$$ or $$F \le F_{1-\alpha/2,df_1,df_2}$$. Conclusion?
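The same two-sided test can be run with `var.test()`, reusing the potency data from the Data slide; the F statistic it reports is $$s_1^2/s_2^2$$ computed without the slide's rounding.

```r
# potency readings from the Data slide
cFresh <- c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10, 10.6)
cStored <- c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
res <- var.test(cFresh, cStored, alternative = "two.sided", conf.level = 0.95)
res$statistic  # s1^2 / s2^2 without rounding
res$p.value    # compare with alpha = 0.05
```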

# Example 6.1: CI for variances

$$95\%$$ confidence interval for $$\sigma_1^2/\sigma_2^2$$

• $$n_1 = 10$$, $$n_2 = 10$$
• $$s_1^2 = 0.10$$, $$s_2^2= 0.06$$ and $$\frac{s_1^2}{s_2^2} = 1.67$$
• $$F_U = F_{0.025,9,9}=4.03$$; $$F_L = F_{0.975,9,9} = 0.25$$
• CI: $$[1.67\times 0.25, 1.67\times 4.03]$$

F table

# Exercise 1 (Example 7.5)

• Sample 1: $$\bar{y}_1 = 38.48$$, $$s_1 = 16.37$$, $$n_1 = 40$$

• Sample 2: $$\bar{y}_2 = 26.93$$, $$s_2 = 9.88$$, $$n_2 = 40$$

• Test $$H_0: \sigma_1^2 \le \sigma_2^2$$ vs $$H_a: \sigma_1^2 > \sigma_2^2$$

• $$F = s_1^2/s_2^2$$

• $$F_{0.025,39,39} \approx 1.88$$
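A sketch of the computation with the summary statistics given above; the critical value mirrors the slide's $$F_{0.025,39,39}$$.

```r
# summary statistics from the slide
s1 <- 16.37; s2 <- 9.88
Fstat <- s1^2 / s2^2        # about 2.75
Fcrit <- qf(0.975, 39, 39)  # F_{0.025, 39, 39}, about 1.88
Fstat >= Fcrit              # TRUE: reject H0
```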

# Exercise 2 (Example 7.5)

• Sample 1: $$\bar{y}_1 = 38.48$$, $$s_1 = 16.37$$, $$n_1 = 40$$

• Sample 2: $$\bar{y}_2 = 26.93$$, $$s_2 = 9.88$$, $$n_2 = 40$$

• 95% confidence interval

• $$\left[\frac{s_1^2}{s_2^2}F_L,\frac{s_1^2}{s_2^2}F_U\right]$$

# Exercise 7.23

Quality of motor tires via tread thickness

> d723 = read.table("http://math.wsu.edu/faculty/xchen/stat412/data/stat412ASCII-tab/CH07/ex7-23.TXT",
+     sep = "\t", header = T)
>
> # Brand I
> BrandI = d723$Brand.I
> BrandI
 [1] 38.9 39.7 42.3 39.5 39.6 35.6 36.0 39.2 37.6 39.5
>
> # Brand II
> BrandII = d723$Brand.II
> BrandII
[1] 44.6 46.9 48.7 41.5 37.5 33.1 43.4 36.5 32.5 42.0

# Exercise 7.23

Normality ok for both samples?

# Exercise 7.23

Normality ok for both samples?

> shapiro.test(BrandI)

Shapiro-Wilk normality test

data:  BrandI
W = 0.90448, p-value = 0.2452
>
> shapiro.test(BrandII)

Shapiro-Wilk normality test

data:  BrandII
W = 0.95013, p-value = 0.67

# Exercise 7.23

Test $$H_0: \sigma_1^2 = \sigma_2^2$$ vs $$H_a: \sigma_1^2 \ne \sigma_2^2$$

> var.test(x = BrandI, y = BrandII, ratio = 1, alternative = "two.sided",
+     conf.level = 0.95)

F test to compare two variances

data:  BrandI and BrandII
F = 0.12268, num df = 9, denom df = 9, p-value = 0.004505
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.03047313 0.49392768
sample estimates:
ratio of variances
0.1226846 

Conclusion?

# Exercise 7.23

Test $$H_0: \mu_1 = \mu_2$$ vs $$H_a: \mu_1 \ne \mu_2$$

> t.test(x = BrandI, y = BrandII, alternative = "two.sided", mu = 0,
+     paired = FALSE, var.equal = T, conf.level = 0.95)

Two Sample t-test

data:  BrandI and BrandII
t = -1.0057, df = 18, p-value = 0.3279
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-5.807407  2.047407
sample estimates:
mean of x mean of y
38.79     40.67 

Conclusion?

# Quick illustration

Data from SAS website

> dextra = read.table("http://math.wsu.edu/faculty/xchen/stat412/data/upsit.txt",
+     sep = "\t", header = T)
> # change agegroup from numeric into factor
> dextra$agegroup = factor(dextra$agegroup)
> summary(dextra)
 agegroup     smell      
 1:38     Min.   :0.502  
 2:36     1st Qu.:1.162  
 3:21     Median :1.275  
 4:43     Mean   :1.234  
 5:42     3rd Qu.:1.381  
          Max.   :1.492  

# Quick illustration

> library(ggplot2)
> ggplot(dextra, aes(x = agegroup, y = smell)) + geom_boxplot() +
+     theme_bw()

# Quick illustration

Test homogeneity of variances for 5 groups

> # agegroup is the factor; Bartlett test requires Normality
> bartlett.test(smell ~ agegroup, data = dextra)

Bartlett test of homogeneity of variances

data:  smell by agegroup
Bartlett's K-squared = 47.424, df = 4, p-value = 1.244e-09

Conclusion?

# Quick illustration

Test homogeneity of variances for 5 groups

> library(car)
> # agegroup is the factor; Levene Test more robust
> leveneTest(smell ~ agegroup, data = dextra)
Levene's Test for Homogeneity of Variance (center = median)
        Df F value   Pr(>F)    
group    4  5.4713 0.000357 ***
       175                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion?

# Summary

• $$F$$ test for comparing two population variances based on two independent Normal random samples
• With Normality, Hartley $$F_{max}$$ test can be used to compare more than two variances
• Without Normality, Brown-Forsythe-Levene test can be used to compare more than two variances
• Homogeneity of Variance will be assessed in Analysis of Variance

# Motivation

Need to compare more than two population means

• hourly wages for 3 classifications of farm laborers
• sense of smell for 5 age groups
• price of an item at 10 grocery stores

For several population means

• are they all the same?
• if they are not the same, how many of them are different?

For $$20$$ non-identical population means, there are $$20 \times 19/2 = 190$$ possible pairs of means to compare! The risk of making Type I errors, i.e., false positives, can be high!

# Notations

• $$t$$ population means: $$\mu_1, \ldots, \mu_t$$

• $$t$$ sample sizes: $$n_1,\ldots, n_t$$

• $$t$$ random samples: $$y_{ij}, i=1,\ldots,t; j = 1,\ldots,n_i$$

• $$t$$ sample means: $$\bar{y}_{i.}, i=1,\ldots,t$$

• $$t$$ sample variances: $$s_i^2, i=1,\ldots,t$$

• the grand mean: $$\bar{y}_{..} = \frac{1}{\sum_{i=1}^t n_i} \sum_{i=1}^t \sum_{j=1}^{n_i} y_{ij}$$

# Sums of squares

• Within-sample sum of squares: the sum of variability of each sample about its own mean, i.e., $SSW = \sum_{i,j}(y_{ij} - \bar{y}_{i.})^2 = \sum_{i=1}^t (n_i -1)s_i^2$
• Between-sample sum of squares: the sum of variability of the sample means about the grand mean, i.e., $SSB = \sum_{i=1}^t n_i (\bar{y}_{i.}-\bar{y}_{..})^2$
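As a sketch with simulated data, $$SSW$$ and $$SSB$$ computed from these formulas partition the total sum of squares about the grand mean; the sample sizes and means below are arbitrary.

```r
set.seed(1)
# three hypothetical samples with different sizes and means
samples <- list(rnorm(5, 0), rnorm(7, 1), rnorm(6, 2))
ns <- sapply(samples, length)
grand <- sum(unlist(samples)) / sum(ns)                      # grand mean
SSW <- sum(sapply(samples, function(s) sum((s - mean(s))^2)))
SSB <- sum(ns * (sapply(samples, mean) - grand)^2)
TSS <- sum((unlist(samples) - grand)^2)                      # total sum of squares
c(SSW = SSW, SSB = SSB, total = TSS)                         # SSW + SSB equals TSS
```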

# Intuition

Data from $$\textsf{N}(0,1)$$: small within-sample variation and small between-sample variation

|                | Sample 1 | Sample 2 | Sample 3 | Sample 4 |
|----------------|---------:|---------:|---------:|---------:|
|                |    -0.63 |     0.49 |    -0.62 |     0.82 |
|                |     0.18 |     0.74 |    -2.21 |     0.59 |
|                |    -0.84 |     0.58 |     1.12 |     0.92 |
|                |     1.60 |    -0.31 |    -0.04 |     0.78 |
|                |     0.33 |     1.51 |    -0.02 |     0.07 |
|                |    -0.82 |     0.39 |     0.94 |    -1.99 |
| Sample mean    |    -0.03 |     0.57 |    -0.14 |     0.20 |
| Sample std dev |     0.94 |     0.59 |     1.21 |     1.11 |

# Intuition

Data from $$\textsf{N}(i^2,1), 1 \le i \le 4$$: small within-sample variation and large between-sample variation

|                | Sample 1 | Sample 2 | Sample 3 | Sample 4 |
|----------------|---------:|---------:|---------:|---------:|
|                |     0.37 |     4.49 |      8.4 |     16.8 |
|                |     1.18 |     4.74 |      6.8 |     16.6 |
|                |     0.16 |     4.58 |     10.1 |     16.9 |
|                |     2.60 |     3.69 |      9.0 |     16.8 |
|                |     1.33 |     5.51 |      9.0 |     16.1 |
|                |     0.18 |     4.39 |      9.9 |     14.0 |
| Sample mean    |     0.97 |     4.57 |      8.9 |     16.2 |
| Sample std dev |     0.94 |     0.59 |      1.2 |      1.1 |

# Intuition

Data from $$\textsf{N}(i^2,i^2), 1 \le i \le 4$$: large within-sample variation and large between-sample variation

|                | Sample 1 | Sample 2 | Sample 3 | Sample 4 |
|----------------|---------:|---------:|---------:|---------:|
|                |     0.37 |      6.0 |      3.4 |       29 |
|                |     1.18 |      7.0 |    -10.9 |       26 |
|                |     0.16 |      6.3 |     19.1 |       31 |
|                |     2.60 |      2.8 |      8.6 |       29 |
|                |     1.33 |     10.1 |      8.8 |       17 |
|                |     0.18 |      5.6 |     17.5 |      -16 |
| Sample mean    |     0.97 |      6.3 |      7.8 |       19 |
| Sample std dev |     0.94 |      2.4 |     10.9 |       18 |

# Intuition

Data from $$\textsf{N}(1,i^2), 1 \le i \le 4$$: large within-sample variation and small between-sample variation

|                | Sample 1 | Sample 2 | Sample 3 | Sample 4 |
|----------------|---------:|---------:|---------:|---------:|
|                |     0.37 |     2.95 |    -4.59 |     14.1 |
|                |     1.18 |     3.95 |   -18.93 |     10.5 |
|                |     0.16 |     3.30 |    11.12 |     15.7 |
|                |     2.60 |    -0.22 |     0.60 |     13.5 |
|                |     1.33 |     7.05 |     0.85 |      2.2 |
|                |     0.18 |     2.56 |     9.49 |    -30.8 |
| Sample mean    |     0.97 |     3.27 |    -0.24 |      4.2 |
| Sample std dev |     0.94 |     2.35 |    10.90 |     17.8 |

# Method

• $$t$$ population means: $$\mu_1, \ldots, \mu_t$$

• $$t$$ sample sizes: $$n_1,\ldots, n_t$$

• $$t$$ random samples: $$y_{ij}, i=1,\ldots,t; j = 1,\ldots,n_i$$

• $$t$$ sample means: $$\bar{y}_{i.}, i=1,\ldots,t$$

• $$t$$ sample variances: $$s_i^2, i=1,\ldots,t$$

• the grand mean: $$\bar{y}_{..} = \frac{1}{\sum_{i=1}^t n_i} \sum_{i=1}^t \sum_{j=1}^{n_i} y_{ij}$$

# Method

• Within-sample sum of squares: the sum of variability of each sample about its own mean, i.e., $SSW = \sum_{i,j}(y_{ij} - \bar{y}_{i.})^2 = \sum_{i=1}^t (n_i -1)s_i^2$
• Between-sample sum of squares: the sum of variability of the sample means about the grand mean, i.e., $SSB = \sum_{i=1}^t n_i (\bar{y}_{i.}-\bar{y}_{..})^2$

# Method

• Assumption: independent Normal random samples with identical variance
• Test $$H_0: \mu_1 = \ldots = \mu_t$$ vs $$H_a$$: not all $$\mu_i$$’s are equal

• $$SSW$$: $$df = n_T-t$$ where $$n_T$$ is total sample size
• $$SSB$$: $$df=t-1$$

# Method

• Test statistic $F = \frac{SSB/(t-1)}{SSW/(n_T -t)}$ follows an F distribution with $$df_1=t-1$$ and $$df_2 = n_T -t$$

• At Type I error probability $$\alpha$$, reject $$H_0$$ if $$F > F_{\alpha,df_1, df_2}$$; otherwise, conclude $$H_0$$

F table
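A sketch checking, on simulated data, that this F statistic matches the one reported by `aov()`; the group labels and sample sizes below are arbitrary.

```r
set.seed(2)
t <- 3; n <- 10; nT <- t * n
obs <- c(rnorm(n, 0), rnorm(n, 1), rnorm(n, 3))
grps <- factor(rep(c("g1", "g2", "g3"), each = n))
grand <- mean(obs)
# SSW and SSB from the definitions above
SSW <- sum(tapply(obs, grps, function(s) sum((s - mean(s))^2)))
SSB <- sum(tapply(obs, grps, function(s) length(s) * (mean(s) - grand)^2))
Fstat <- (SSB / (t - 1)) / (SSW / (nT - t))
aovF <- summary(aov(obs ~ grps))[[1]][1, "F value"]  # same statistic from aov()
```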

# Example 1

All observations from $$\textsf{N}(0,1)$$ but divided into 3 groups

> library(dplyr)
> options(digits = 2)
> set.seed(1)
> n = 20
> obs = c(rnorm(n, 0, 1), rnorm(n, 0, 1), rnorm(n, 0, 1))
> grps = rep(c("Sample 1", "Sample 2", "Sample 3"), each = n)
> data1 = data.frame(obs, grps)
> group_by(data1, grps) %>% summarise(count = n(), mean = mean(obs,
+     na.rm = TRUE), sd = sd(obs, na.rm = TRUE))
# A tibble: 3 x 4
grps count    mean    sd
<fctr> <int>   <dbl> <dbl>
1 Sample 1    20  0.1905  0.91
2 Sample 2    20 -0.0065  0.87
3 Sample 3    20  0.1388  0.81

# Example 1

> library(ggplot2)
> ggplot(data1, aes(x = grps, y = obs)) + geom_boxplot() + theme_bw() +
+     xlab("")

# Example 1

Data from $$\textsf{N}(0,1)$$ and $$H_0: \mu_1 = \mu_2 = \mu_3$$

> # Normality and equal variances needed
> reseg1 = aov(obs ~ grps, data = data1)
> summary(reseg1)
Df Sum Sq Mean Sq F value Pr(>F)
grps         2    0.4   0.209    0.28   0.76
Residuals   57   42.7   0.750               

# Example 2

Data from $$\textsf{N}(i^2,1), 1 \le i \le 3$$ and $$H_0: \mu_1 = \mu_2 = \mu_3$$

> library(dplyr)
> options(digits = 2)
> set.seed(1)
> n = 20
> obs = c(rnorm(n, 1, 1), rnorm(n, 4, 1), rnorm(n, 9, 1))
> grps = rep(c("Sample 1", "Sample 2", "Sample 3"), each = n)
> data2 = data.frame(obs, grps)
> group_by(data2, grps) %>% summarise(count = n(), mean = mean(obs,
+     na.rm = TRUE), sd = sd(obs, na.rm = TRUE))
# A tibble: 3 x 4
grps count  mean    sd
<fctr> <int> <dbl> <dbl>
1 Sample 1    20   1.2  0.91
2 Sample 2    20   4.0  0.87
3 Sample 3    20   9.1  0.81

# Example 2

> library(ggplot2)
> ggplot(data2, aes(x = grps, y = obs)) + geom_boxplot() + theme_bw() +
+     xlab("")

# Example 2

Data from $$\textsf{N}(i^2,1), 1 \le i \le 3$$ and $$H_0: \mu_1 = \mu_2 = \mu_3$$

> # Normality and equal variances needed
> reseg2 = aov(obs ~ grps, data = data2)
> summary(reseg2)
Df Sum Sq Mean Sq F value Pr(>F)
grps         2    650     325     434 <2e-16 ***
Residuals   57     43       1
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Example 3

Data from $$\textsf{N}(i^2,i^2), 1 \le i \le 3$$ and $$H_0: \mu_1 = \mu_2 = \mu_3$$

> library(dplyr)
> options(digits = 2)
> set.seed(1)
> n = 20
> obs = c(rnorm(n, 1, 1), rnorm(n, 4, 4), rnorm(n, 9, 9))
> grps = rep(c("Sample 1", "Sample 2", "Sample 3"), each = n)
> data3 = data.frame(obs, grps)
> group_by(data3, grps) %>% summarise(count = n(), mean = mean(obs,
+     na.rm = TRUE), sd = sd(obs, na.rm = TRUE))
# A tibble: 3 x 4
grps count  mean    sd
<fctr> <int> <dbl> <dbl>
1 Sample 1    20   1.2  0.91
2 Sample 2    20   4.0  3.49
3 Sample 3    20  10.2  7.29

# Example 3

> library(ggplot2)
> ggplot(data3, aes(x = grps, y = obs)) + geom_boxplot() + theme_bw() +
+     xlab("")

# Example 3

Data from $$\textsf{N}(i^2,i^2), 1 \le i \le 3$$ and $$H_0: \mu_1 = \mu_2 = \mu_3$$

> # Normality and equal variances needed
> reseg3 = aov(obs ~ grps, data = data3)
> summary(reseg3)
Df Sum Sq Mean Sq F value  Pr(>F)
grps         2    861     431    19.6 3.4e-07 ***
Residuals   57   1256      22
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Example 3

Data from $$\textsf{N}(i^2,i^2), 1 \le i \le 3$$ and $$H_0: \mu_1 = \mu_2 = \mu_3$$

> # Kruskal-Wallis Test; distributions should be similar shapes
> kruskal.test(obs ~ grps, data = data3)

Kruskal-Wallis rank sum test

data:  obs by grps
Kruskal-Wallis chi-squared = 30, df = 2, p-value = 2e-06

# Example 4

Data from $$\textsf{N}(1,i^2), 1 \le i \le 3$$ and $$H_0: \mu_1 = \mu_2 = \mu_3$$

> library(dplyr)
> options(digits = 2)
> set.seed(1)
> n = 20
> obs = c(rnorm(n, 1, 1), rnorm(n, 1, 4), rnorm(n, 1, 9))
> grps = rep(c("Sample 1", "Sample 2", "Sample 3"), each = n)
> data4 = data.frame(obs, grps)
> group_by(data4, grps) %>% summarise(count = n(), mean = mean(obs,
+     na.rm = TRUE), sd = sd(obs, na.rm = TRUE))
# A tibble: 3 x 4
grps count  mean    sd
<fctr> <int> <dbl> <dbl>
1 Sample 1    20  1.19  0.91
2 Sample 2    20  0.97  3.49
3 Sample 3    20  2.25  7.29

# Example 4

Data from $$\textsf{N}(1,i^2), 1 \le i \le 3$$ and $$H_0: \mu_1 = \mu_2 = \mu_3$$

> # Normality and equal variances needed
> reseg4 = aov(obs ~ grps, data = data4)
> summary(reseg4)
Df Sum Sq Mean Sq F value Pr(>F)
grps         2     19    9.31    0.42   0.66
Residuals   57   1256   22.03               

# Example 4

> library(ggplot2)
> ggplot(data4, aes(x = grps, y = obs)) + geom_boxplot() + theme_bw() +
+     xlab("")

# Example 4

Data from $$\textsf{N}(1,i^2), 1 \le i \le 3$$ and $$H_0: \mu_1 = \mu_2 = \mu_3$$

> # Kruskal-Wallis Test; distributions should be similar shapes
> kruskal.test(obs ~ grps, data = data4)

Kruskal-Wallis rank sum test

data:  obs by grps
Kruskal-Wallis chi-squared = 0.1, df = 2, p-value = 0.9

# Comparing more than two means

• F test: independent samples, Normality, equal variances
• Kruskal-Wallis test: independent samples, distributions have similar shapes

# Comparing more than two variances

• Bartlett’s test: independent samples, Normality
• Levene’s test: independent samples

# Normality test

• KS test: no ties between observations
• Shapiro-Wilk test: less sensitive to ties among observations

# Recap

• $$t$$ population means: $$\mu_1, \ldots, \mu_t$$

• $$t$$ sample sizes: $$n_1,\ldots, n_t$$

• $$t$$ random samples: $$y_{ij}, i=1,\ldots,t; j = 1,\ldots,n_i$$

• $$t$$ sample means: $$\bar{y}_{i.}, i=1,\ldots,t$$

• $$t$$ sample variances: $$s_i^2, i=1,\ldots,t$$

• the grand mean: $$\bar{y}_{..} = \frac{1}{\sum_{i=1}^t n_i} \sum_{i=1}^t \sum_{j=1}^{n_i} y_{ij}$$

# Recap

• Within-sample sum of squares: the sum of variability of each sample about its own mean, i.e., $SSW = \sum_{i,j}(y_{ij} - \bar{y}_{i.})^2$
• Between-sample sum of squares: the sum of variability of the sample means about the grand mean, i.e., $SSB = \sum_{i=1}^t n_i (\bar{y}_{i.}-\bar{y}_{..})^2$

# Recap

• Assumption: independent Normal random samples with identical variance
• Test $$H_0: \mu_1 = \ldots = \mu_t$$ vs $$H_a$$: not all $$\mu_i$$’s are equal

• $$SSW$$: $$df = n_T-t$$ where $$n_T=\sum_{i=1}^t n_i$$ is total sample size
• $$SSB$$: $$df=t-1$$

# Recap

• Test $$H_0: \mu_1 = \ldots = \mu_t$$ vs $$H_a$$: not all $$\mu_i$$’s are equal
• Test statistic $F = \frac{SSB/(t-1)}{SSW/(n_T -t)}$ follows an F distribution with $$df_1=t-1$$ and $$df_2 = n_T -t$$

• At Type I error probability $$\alpha$$, reject $$H_0$$ if $$F > F_{\alpha,df_1, df_2}$$; otherwise, conclude $$H_0$$

F table

# Example 2 (Recap)

Null hypothesis $$H_0: \mu_1 = \mu_2 = \mu_3$$

• $$n_1 = n_2 = n_3 = 20$$
• $$y_{1i}, i=1,...,n_1$$ Normal with $$\mu=1$$ and $$\sigma^2=1$$
• $$y_{2i}, i=1,...,n_2$$ Normal with $$\mu=4$$ and $$\sigma^2=1$$
• $$y_{3i}, i=1,...,n_3$$ Normal with $$\mu=9$$ and $$\sigma^2=1$$

# Example 2 via Regression

Model:

• $$y_{1i} = \mu_1 + \varepsilon_{1i}, i=1,...,n_1$$
• $$y_{2i} = \mu_2 + \varepsilon_{2i}, i=1,...,n_2$$
• $$y_{3i} = \mu_3 + \varepsilon_{3i}, i=1,...,n_3$$

• $$\varepsilon_{ij}$$ independent and Normally distributed

# Example 2 via Regression

Reformulate the means:

• $$\mu_1 = \mu + \tau_1$$
• $$\mu_2 = \mu + \tau_2$$
• $$\mu_3 = \mu + \tau_3$$
• $$\mu$$ corresponds to the grand population mean (recall the overall mean obtained from all samples)

• $$\mu_1 = \mu_2 = \mu_3$$ if and only if $$\tau_1 = \tau_2 = \tau_3 = 0$$. So, $$H_0: \mu_1 = \mu_2 = \mu_3$$ is equivalent to $$H_0: \tau_1 = \tau_2 = \tau_3 = 0$$
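One caveat when connecting this with `lm()` output: R's default treatment coding does not use the grand-mean parameterization above; the intercept estimates the first group's mean and the other coefficients estimate differences from it. A small sketch of the design matrix (group labels here are arbitrary):

```r
grps <- factor(rep(c("Sample 1", "Sample 2", "Sample 3"), each = 2))
# first column is the intercept; the other two are 0/1 group indicators
model.matrix(~ grps)
```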

# Example 2 via Regression

Residuals:

• Sample mean $$\bar{y}_{1 \cdot} = n_1^{-1} \sum_{j=1}^{n_1}y_{1j}$$ estimates $$\mu_1$$, giving residuals $e_{1j} = y_{1j} - \bar{y}_{1 \cdot}$

• Sample mean $$\bar{y}_{2 \cdot} = n_2^{-1} \sum_{j=1}^{n_2}y_{2j}$$ estimates $$\mu_2$$, giving residuals $e_{2j} = y_{2j} - \bar{y}_{2 \cdot}$

• Sample mean $$\bar{y}_{3 \cdot} = n_3^{-1} \sum_{j=1}^{n_3}y_{3j}$$ estimates $$\mu_3$$, giving residuals $e_{3j} = y_{3j} - \bar{y}_{3 \cdot}$

# Example 2 via Regression

Use of residuals: under the assumption that all samples are independent and Normally distributed with equal variances

• If the sample sizes are equal, all residuals $$e_{ij}$$ should follow the SAME Normal distribution

• The variances of each individual set of residuals $$\left\{e_{ij}, j =1, \ldots, n_i\right\}$$ for $$i=1,\ldots,t$$ should be equal

# Example 2 via Regression

Commands to fit a regression model:

> # fit linear model with the grand population mean
> Reg2 = lm(obs ~ grps, data = data2)
> # display analysis results
> summary(Reg2)

# Example 2 via Regression


Call:
lm(formula = obs ~ grps, data = data2)

Residuals:
Min      1Q  Median      3Q     Max
-2.4052 -0.4777  0.0371  0.5752  1.8416

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)     1.191      0.194    6.15  8.2e-08 ***
grpsSample 2    2.803      0.274   10.24  1.6e-14 ***
grpsSample 3    7.948      0.274   29.03  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.87 on 57 degrees of freedom
Multiple R-squared:  0.938, Adjusted R-squared:  0.936
F-statistic:  434 on 2 and 57 DF,  p-value: <2e-16

# Example 2 via ANOVA

Recall the following:

> # Normality and equal variances needed
> reseg2 = aov(obs ~ grps, data = data2)
> summary(reseg2)
Df Sum Sq Mean Sq F value Pr(>F)
grps         2    650     325     434 <2e-16 ***
Residuals   57     43       1
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Example 2 via Regression

Commands to obtain diagnostic plots:

> # divide plotting window into 4 parts
> par(mfrow = c(2, 2))
> # obtain diagnostic plots
> plot(Reg2)

Diagnostic plots

# ANOVA: summary

A workflow

• Exploratory analysis
• Check assumptions
• Remedial measures if needed (later chapters)
• Statistical analysis

# The workflow

• Exploratory analysis
• Check assumptions
• Remedial measures if needed (later chapters)
• Statistical analysis

# Exercise 8.6

A large laboratory has four types of devices used to determine the pH of soil samples. The laboratory wants to determine whether there are differences in the average readings given by these devices. The lab uses 24 soil samples having known pH in the study, and randomly assigns six of the samples to each device. The soil samples are tested and the response recorded is the difference between the pH reading of the device and the known pH of the soil.

# Exercise 8.6

1. Based on your intuition, is there evidence to indicate any difference among the mean differences in pH readings for the four devices?

2. Run an analysis of variance to confirm or reject your conclusion of part (1). Use $$\alpha = 0.05$$.

3. Compute the p-value of the F test in part (2).

4. What conditions must be satisfied for your analysis in parts (2) and (3) to be valid?

# Ex 8.6: check object types

load data and check if data have been stored into a data frame; check if data frame has names

> soil = read.table("http://math.wsu.edu/faculty/xchen/stat412/data/stat412ASCII-tab/CH08/ex8-6.TXT",
+     sep = "\t", header = T)
> # check object type of 'soil'
> class(soil)
[1] "data.frame"
> # check if columns (or rows) have column (or row) names
> names(soil)
[1] "Devise"   "Sample"   "Response"

# Ex 8.6: check object types

check object type for each object in data frame; show levels of factor(s) in data frame

> # display the first few rows of data
> head(soil)
  Devise Sample Response
1      A      1   -0.307
2      A      2   -0.294
3      A      3    0.079
4      A      4    0.019
5      A      5   -0.136
6      A      6   -0.324
> # 4 devices coded by a factor 'Devise'
> class(soil$Devise)
[1] "factor"
> # show all levels of the factor 'Devise'
> levels(soil$Devise)
[1] "A" "B" "C" "D"

# Ex 8.6: exploratory analysis

obtain summary statistics for data

> # check object type for soil$Response
> class(soil$Response)
[1] "numeric"
> # get a summary of data
> summary(soil)
 Devise     Sample       Response    
 A:6    Min.   :1.0   Min.   :-0.32  
 B:6    1st Qu.:2.0   1st Qu.:-0.05  
 C:6    Median :3.5   Median : 0.09  
 D:6    Mean   :3.5   Mean   : 0.08  
        3rd Qu.:5.0   3rd Qu.: 0.21  
        Max.   :6.0   Max.   : 0.69  

# Ex 8.6: exploratory analysis

> library(ggplot2)
> ggplot(soil, aes(x = Devise, y = Response)) + geom_boxplot() +
+     theme_bw() + xlab("Device")

# Ex 8.6: exploratory analysis

> # load library data processing
> library(dplyr)
> # pipe obserations into summary statistics
> group_by(soil, Devise) %>% summarise(count = n(), mean = mean(Response,
+     na.rm = TRUE), sd = sd(Response, na.rm = TRUE))
# A tibble: 4 x 4
Devise count   mean    sd
<fctr> <int>  <dbl> <dbl>
1      A     6 -0.160  0.18
2      B     6  0.095  0.21
3      C     6  0.123  0.15
4      D     6  0.273  0.25

# Ex 8.6: check assumptions

Normality

> # Kolmogorov-Smirnov test on Normality
> ks.test(soil$Response, pnorm, mean(soil$Response), sd(soil$Response))

One-sample Kolmogorov-Smirnov test

data:  soil$Response
D = 0.09, p-value = 1
alternative hypothesis: two-sided

# Ex 8.6: check assumptions

Normality

> # Shapiro test on Normality: less sensitive to ties
> shapiro.test(soil$Response)

Shapiro-Wilk normality test

data:  soil$Response
W = 1, p-value = 0.7

# Ex 8.6: check assumptions

Homogeneity of Variances

> # load libary in order to use leveneTest
> library(car)
> # Levene's test: robust to violation of Normality
> leveneTest(Response ~ Devise, data = soil)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   3    0.15   0.93
       20               

# Ex 8.6: check assumptions

Homogeneity of Variances

> # Bartlett's test: Normality and independence needed
> bartlett.test(Response ~ Devise, data = soil)

Bartlett test of homogeneity of variances

data:  Response by Devise
Bartlett's K-squared = 1, df = 3, p-value = 0.7

# Ex 8.6: ANOVA

> soilAOV = aov(Response ~ Devise, data = soil)
> summary(soilAOV)
Df Sum Sq Mean Sq F value Pr(>F)
Devise       3  0.584  0.1946    4.85  0.011 *
Residuals   20  0.803  0.0401
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Ex 8.6: Regression

> soilReg = lm(Response ~ Devise, data = soil)
> summary(soilReg)

Call:
lm(formula = Response ~ Devise, data = soil)

Residuals:
Min      1Q  Median      3Q     Max
-0.3155 -0.1367 -0.0082  0.1214  0.4165

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.1605     0.0818   -1.96   0.0638 .
DeviseB       0.2552     0.1157    2.21   0.0393 *
DeviseC       0.2832     0.1157    2.45   0.0237 *
DeviseD       0.4340     0.1157    3.75   0.0013 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2 on 20 degrees of freedom
Multiple R-squared:  0.421, Adjusted R-squared:  0.334
F-statistic: 4.85 on 3 and 20 DF,  p-value: 0.0108

# Ex 8.6: diagnostic plots

> par(mfrow = c(2, 2))
> plot(soilReg)

# Ex 8.6: nonparametric test

> # Kruskal-Wallis Test: distributions should have similar
> # shapes
> kruskal.test(Response ~ Devise, data = soil)

Kruskal-Wallis rank sum test

data:  Response by Devise
Kruskal-Wallis chi-squared = 10, df = 3, p-value = 0.02

# Ex 8.6

1. Suppose the 24 soil samples have widely different pH values. What problems may occur by simply randomly assigning the soil samples to the different devices?

# Multiple comparison

If not all population means are equal, then we may ask “which population means are different?”

• Experimentwise Type I error: the probability of falsely rejecting at least one null hypothesis
• Other error rates are available

# Fisher’s LSD procedure

Fisher’s Least Significant Difference (LSD) procedure:

• for each pairwise comparison of population means, the probability of a Type I error is fixed at a specified value $$\alpha$$
• needs equality of variances for involved populations
• needs Normality for each population
• needs independence among random samples

# Fisher’s LSD procedure

Fisher’s LSD procedure:

$$LSD_{ij} = t_{\alpha/2,\, n_T - t} \sqrt{s_W^2 \left(\frac{1}{n_i}+\frac{1}{n_j}\right)}$$

for comparing $$\mu_i$$ and $$\mu_j$$, where $$s_W^2$$ is the pooled estimator of $$\sigma^2$$ from all $$t$$ samples.

If $$|\bar{y}_{i \cdot} - \bar{y}_{j \cdot}| \ge LSD_{ij}$$, conclude that $$\mu_i \ne \mu_j$$
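A sketch of the LSD cutoff in R; the pooled variance, group sizes, and error degrees of freedom below are made-up illustration values.

```r
alpha <- 0.05
sW2 <- 22.5            # pooled variance s_W^2 (illustration value)
ni <- 3; nj <- 3       # group sizes (illustration values)
df_err <- 8            # error df associated with the pooled variance
LSD <- qt(1 - alpha / 2, df_err) * sqrt(sW2 * (1 / ni + 1 / nj))
LSD  # about 8.93
```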

# Fisher’s LSD procedure

Source website

> # install this package in order to use LSD.test
> library(agricolae)
> # extract dataset from this package
> data(sweetpotato)
> # take a look at data
> summary(sweetpotato)
 virus      yield   
 cc:3   Min.   :11  
 fc:3   1st Qu.:20  
 ff:3   Median :28  
 oo:3   Mean   :28  
        3rd Qu.:38  
        Max.   :42  

# Fisher’s LSD procedure

> # step 1: use aov to get sums of squares
> aovEg4 = aov(yield ~ virus, data = sweetpotato)
> summary(aovEg4)
Df Sum Sq Mean Sq F value  Pr(>F)
virus        3   1170     390    17.3 0.00073 ***
Residuals    8    180      22
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Fisher’s LSD procedure

> library(agricolae)
> # step 2: apply lsd based on anova outputs
> Flsd = LSD.test(aovEg4, "virus", p.adj = "none")
> # output pairwise comparison
> Flsd$groups
   yield groups
oo    37      a
ff    36      a
cc    24      b
fc    13      c

Means sharing the same letter are not significantly different

# Fisher’s LSD procedure

> library(agricolae)
> # visualization
> plot(Flsd)

# Fisher’s LSD procedure

Another way to implement:

> pairwise.t.test(sweetpotato$yield, sweetpotato$virus, pool.sd = TRUE,
+     alternative = "two.sided", p.adjust.method = "none")

Pairwise comparisons using t tests with pooled SD

data:  sweetpotato$yield and sweetpotato$virus

   cc    fc    ff
fc 0.02  -     -
ff 0.02  3e-04 -
oo 0.01  3e-04 0.89

P value adjustment method: none

# Fisher’s LSD procedure

Manual computation:

• critical value: $$t_{0.025,8}=2.306004$$
• pooled variance: $$s_W^2 = 22.5$$
• LSD: $$2.306004 \times \sqrt{2 \times 22.5/3} = 8.931116$$
• if $$|\bar{y}_{i \cdot} - \bar{y}_{j \cdot}| \ge LSD$$, reject $$H_0: \mu_i = \mu_j$$

# Tukey’s W procedure

Tukey’s W procedure

• controls experimentwise error rate
• needs equality of variances for involved populations
• needs Normality for each population
• needs independence among random samples
• needs equality of sample sizes

# Tukey’s W procedure

Tukey’s W procedure: If $$|\bar{y}_{i \cdot} - \bar{y}_{j \cdot}| \ge W$$, conclude $$\mu_i \ne \mu_j$$, where

$$W=q_{\alpha}(t,v)\sqrt{\frac{s_W^2}{n}}$$

and $$q_{\alpha}(t,v)$$ relates to the Studentized range

$$\frac{\bar{y}_{largest} - \bar{y}_{smallest}}{\sqrt{s_W^2/n}}$$

# Tukey’s W procedure: implementation

Source website

> # need this package to get the data
> library(agricolae)
> data("PlantGrowth")
> summary(PlantGrowth)
     weight       group   
 Min.   :3.6   ctrl:10  
 1st Qu.:4.5   trt1:10  
 Median :5.2   trt2:10  
 Mean   :5.1            
 3rd Qu.:5.5            
 Max.   :6.3  

# Tukey’s W procedure

> # need this package to run Tukey's procedure
> library(agricolae)
> # step 1: run anova
> aovEg5 = aov(weight ~ group, data = PlantGrowth)
> # step 2: run Tukey's test; the test is built into R
> twp = TukeyHSD(aovEg5)
> twp
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = weight ~ group, data = PlantGrowth)

$group
          diff   lwr  upr p adj
trt1-ctrl -0.37 -1.06 0.32  0.39
trt2-ctrl  0.49 -0.20 1.19  0.20
trt2-trt1  0.86  0.17 1.56  0.01

# Tukey’s W procedure

> # visualize differences and CI's
> plot(twp)

# Extras

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 15063)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] agricolae_1.2-8 dplyr_0.7.4     car_2.1-5       ggplot2_2.2.1
[5] knitr_1.17

loaded via a namespace (and not attached):
[1] gtools_3.5.0       revealjs_0.9       splines_3.3.0
[4] lattice_0.20-33    expm_0.999-2       colorspace_1.3-2
[7] htmltools_0.3.6    yaml_2.1.14        mgcv_1.8-12
[10] AlgDesign_1.1-7.3  rlang_0.1.2        nloptr_1.0.4
[13] glue_1.1.1         sp_1.2-5           bindrcpp_0.2
[16] plyr_1.8.4         bindr_0.1          stringr_1.2.0
[19] MatrixModels_0.4-1 munsell_0.4.3      combinat_0.0-8
[22] gtable_0.2.0       coda_0.19-1        evaluate_0.10.1
[25] labeling_0.3       SparseM_1.77       quantreg_5.33
[28] pbkrtest_0.4-7     parallel_3.3.0     spdep_0.6-15
[31] highr_0.6          Rcpp_0.12.12       scales_0.5.0
[34] backports_1.1.0    formatR_1.5        gdata_2.18.0
[37] deldir_0.1-14      lme4_1.1-14        klaR_0.6-12
[40] digest_0.6.12      stringi_1.1.5      gmodels_2.16.2
[43] grid_3.3.0         rprojroot_1.2      tools_3.3.0
[46] LearnBayes_2.15    magrittr_1.5       lazyeval_0.2.0
[49] tibble_1.3.4       cluster_2.0.4      pkgconfig_2.0.1
[52] MASS_7.3-45        Matrix_1.2-6       assertthat_0.2.0
[55] minqa_1.2.4        rmarkdown_1.6      R6_2.2.2
[58] boot_1.3-18        nnet_7.3-12        nlme_3.1-127