# Example 6.1

Comparing the potency of a particular drug: Fresh versus Stored

Data:

> # potency readings for Fresh drug
> cFresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10,
+     10.6)
>
> # potency readings for Stored drug
> cStored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
>
> # combine them into data frame
> Length = as.data.frame(cbind(cFresh, cStored))
> colnames(Length) = c("Fresh", "Stored")

# Example 6.1 (cont’d)

How can we assess whether the mean potency of the Fresh drug differs from that of the Stored drug?

• how to obtain information on the two means?
• how to summarize information on the two means?
• what is a test statistic for this task?

# Example 6.1 (cont’d)

Standard deviations for each sample

> # std dev for potency of Fresh drug
> sd(cFresh)
[1] 0.3233505
>
> # std dev for potency of Stored drug
> sd(cStored)
[1] 0.2406011

Question: are the population standard deviations the same?

# Example 6.1 (cont’d)

Strategy to assess the difference between two means:

• obtain sample means from the two samples
• take the difference between the two sample means
• normalize the difference (difficult!)
• assess the normalized difference statistically

# Example 6.1 (cont’d)

> mean(cFresh)
[1] 10.37
> mean(cStored)
[1] 9.83
> mean(cFresh) - mean(cStored)
[1] 0.54
>
> # sample standard deviation of differences
> sd(cFresh - cStored)
[1] 0.4325634

# Theorem 6.1 in Text (Extra)

If $$y_1 \sim \mathsf{N}(\mu_1,\sigma_1^2)$$, and $$y_2 \sim \mathsf{N}(\mu_2,\sigma_2^2)$$ and they are independent, then the difference $$y_1 - y_2$$ follows

$\mathsf{N}(\mu_1 - \mu_2, \sigma_1^2 + \sigma_2^2)$

Similarly, the sum $$y_1 + y_2$$ follows

$\mathsf{N}(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$

Note the variance term: the variances add for both the difference and the sum.
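The variance claim in Theorem 6.1 can be checked by simulation. The sketch below is only an illustration: the means, standard deviations, seed and number of draws are arbitrary assumptions, not values from the text.

```r
# Simulation check of Theorem 6.1 (all parameter values are illustrative)
set.seed(1)
n_sim = 100000
mu1 = 10;  sigma1 = 0.3
mu2 = 9.8; sigma2 = 0.4

y1 = rnorm(n_sim, mean = mu1, sd = sigma1)
y2 = rnorm(n_sim, mean = mu2, sd = sigma2)  # drawn independently of y1

# variance predicted by the theorem for both y1 - y2 and y1 + y2
theory_var = sigma1^2 + sigma2^2

sim_diff_mean = mean(y1 - y2)  # should be close to mu1 - mu2
sim_diff_var = var(y1 - y2)    # should be close to theory_var
sim_sum_var = var(y1 + y2)     # should also be close to theory_var

c(theory_var, sim_diff_var, sim_sum_var)
```

The simulated variances of the difference and of the sum should both land near $$\sigma_1^2 + \sigma_2^2$$, not near $$\sigma_1^2 - \sigma_2^2$$.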

# Example 6.1: (cont’d) (Extra)

• Sample for Fresh drug follows $$\mathsf{N}(\mu_1,\sigma_1^2)$$
• Sample for Stored drug follows $$\mathsf{N}(\mu_2,\sigma_2^2)$$

• Sample mean for Fresh drug: $\bar{y}_1 \sim \mathsf{N}(\mu_1,\sigma_1^2/n_1)$

• Sample mean for Stored drug: $\bar{y}_2 \sim \mathsf{N}(\mu_2,\sigma_2^2/n_2)$

# Example 6.1: (cont’d) (Extra)

• Further $$d = \bar{y}_1 - \bar{y}_2$$ follows $\mathsf{N}\left(\mu_1 - \mu_2,\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}\right)$

• Assume $$\sigma_1 = \sigma_2$$ and set $s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2 -1)s_2^2}{n_1 + n_2 -2}}$ then $$T = \frac{d - (\mu_1 - \mu_2)}{s_{p}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$$ follows a Student’s t distribution with degrees of freedom (df) $$n_1 + n_2 -2$$

# Example 6.1 (cont’d)

Assumptions:

• Random sample for Fresh drug follows $$\mathsf{N}(\mu_1,\sigma_1^2)$$
• Random sample for Stored drug follows $$\mathsf{N}(\mu_2,\sigma_2^2)$$
• Two samples are independent
• Assume $$\sigma_1 = \sigma_2$$

# Example 6.1 (cont’d)

The $$100(1- \alpha)\%$$ confidence interval (CI) for the difference $$\mu_1 - \mu_2$$ is constructed as follows:

• Difference between sample means $$d = \bar{y}_1 - \bar{y}_2$$

• Set $$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2 -1)s_2^2}{n_1 + n_2 -2}}$$
• CI: $(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2} s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ where $$t_{\alpha/2}$$ is the $$100(1-\alpha/2)\%$$ quantile (the upper $$\alpha/2$$ critical value) of a Student’s t distribution with $$df = n_1+n_2-2$$

# Example 6.1 (cont’d)

Illustration of CI: the $$95\%$$ CI is $$[0.2722297,0.8077703]$$

> n1 = length(cFresh)
> n2 = length(cStored)
> d = mean(cFresh) - mean(cStored)
> sp = sqrt(((n1 - 1) * (sd(cFresh))^2 + (n2 - 1) * (sd(cStored))^2)/(n1 +
+     n2 - 2))
> cval = qt(0.05/2, df = n1 + n2 - 2, ncp = 0, lower.tail = FALSE)
> cval
[1] 2.100922
> nr = sqrt(1/n1 + 1/n2)
> CI_left = d - cval * sp * nr
> CI_left
[1] 0.2722297
> CI_right = d + cval * sp * nr
> CI_right
[1] 0.8077703

# Example 6.1 (cont’d)

Illustration of CI: the $$95\%$$ CI is $$[0.2722297,0.8077703]$$

> tTest = t.test(x = cFresh, y = cStored, alternative = "two.sided",
+     mu = 0, paired = FALSE, var.equal = TRUE, conf.level = 0.95)
> tTest$conf.int
[1] 0.2722297 0.8077703
attr(,"conf.level")
[1] 0.95

# Example 6.1 (cont’d)

Hypothesis testing when $$D_0$$ is a postulated value related to $$\mu_1 - \mu_2$$:

• Recall $$d = \bar{y}_1 - \bar{y}_2$$ and $s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2 -1)s_2^2}{n_1 + n_2 -2}}$

• Test statistic: $$T = \frac{d - D_0}{s_{p} \sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$$ follows a Student’s t distribution with $$df= n_1 + n_2 - 2$$

# Example 6.1 (cont’d)

Illustration: $$H_0: \mu_1 - \mu_2 = 0$$ vs $$H_a: \mu_1 - \mu_2 \ne 0$$

• $$D_0 = 0$$ and $$t_{0.05/2,18} = 2.100922$$
• Reject $$H_0$$ if $$|T| \ge t_{\alpha/2,n_1 + n_2 -2}$$

> t = (d - 0)/(sp * nr)
> t  #value of test stat
[1] 4.236833
> cval  #critical value
[1] 2.100922
> t < cval
[1] FALSE
> pval = 2 * pt(t, df = n1 + n2 - 2, ncp = 0, lower.tail = FALSE)
> pval
[1] 0.0004959478

# Example 6.1 (cont’d)

Test $$H_0: \mu_1 - \mu_2 = 0$$ vs $$H_a: \mu_1 - \mu_2 \ne 0$$

• $$D_0 = 0$$ and $$t_{0.05/2,18} = 2.100922$$
• Reject $$H_0$$ if $$|T| \ge t_{\alpha/2,n_1 + n_2 -2}$$

> t.test(x = cFresh, y = cStored, alternative = "two.sided", mu = 0,
+     paired = FALSE, var.equal = TRUE, conf.level = 0.98)

Two Sample t-test

data:  cFresh and cStored
t = 4.2368, df = 18, p-value = 0.0004959
alternative hypothesis: true difference in means is not equal to 0
98 percent confidence interval:
0.2146898 0.8653102
sample estimates:
mean of x mean of y
10.37      9.83

# Example 6.1 (cont’d)

Test $$H_0: \mu_1 - \mu_2 \le 0.1$$ vs $$H_a: \mu_1 - \mu_2 > 0.1$$

• $$D_0 = 0.1$$ and $$t_{0.05,18} = 1.734064$$
• Reject $$H_0$$ if $$T \ge t_{\alpha,n_1 + n_2 -2}$$

> t.test(x = cFresh, y = cStored, alternative = "greater", mu = 0.1,
+     paired = FALSE, var.equal = TRUE, conf.level = 0.99)

Two Sample t-test

data:  cFresh and cStored
t = 3.4522, df = 18, p-value = 0.001421
alternative hypothesis: true difference in means is greater than 0.1
99 percent confidence interval:
0.2146898       Inf
sample estimates:
mean of x mean of y
10.37      9.83

# Example 6.1 (cont’d)

Test $$H_0: \mu_1 - \mu_2 \ge 0.5$$ vs $$H_a: \mu_1 - \mu_2 < 0.5$$

• $$D_0 = 0.5$$ and $$- t_{0.05,18} = -1.734064$$
• Reject $$H_0$$ if $$T \le - t_{\alpha,n_1 + n_2 -2}$$

> t.test(x = cFresh, y = cStored, alternative = "less", mu = 0.5,
+     paired = FALSE, var.equal = TRUE, conf.level = 0.96)

Two Sample t-test

data:  cFresh and cStored
t = 0.31384, df = 18, p-value = 0.6214
alternative hypothesis: true difference in means is less than 0.5
96 percent confidence interval:
-Inf 0.7764699
sample estimates:
mean of x mean of y
10.37      9.83

# Example 6.1 (Recap)

# Independent samples: unequal variances

# Example 6.1

Recall the following:

• Fresh drug: $$n_1 = 10$$, $$\bar{y}_1=10.37$$, $$s_1 = 0.3233$$
• Stored drug: $$n_2 = 10$$, $$\bar{y}_2=9.83$$, $$s_2 = 0.2406$$
• Assume unequal variances, i.e., $$\sigma_1^2 \ne \sigma_2^2$$

# Confidence Interval

The $$100(1- \alpha)\%$$ confidence interval (CI) for the difference $$\mu_1 - \mu_2$$ is constructed as follows:

• Recall $$d = \bar{y}_1 - \bar{y}_2$$ and set $$\tilde{s}_p = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

• Test statistic: $$\tilde{T} = \frac{d - (\mu_1 - \mu_2)}{\tilde{s}_p}$$ is approximated by a Student’s t distribution with df given earlier

• CI: $(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\textrm{df}}\tilde{s}_p$

# Example 6.1 (cont’d)

Illustration on CI

• $$d = \bar{y}_1 - \bar{y}_2 = 10.37 - 9.83 = 0.54$$
• $$\tilde{s}_p = 0.1274$$
• $$df = 16.62774$$
• $$t_{0.025,17} = 2.11$$; T table

CI: $$0.54 \pm 2.11 \times 0.1274$$, i.e., $$[0.271, 0.809]$$

# Hypothesis testing

Testing when $$D_0$$ is a postulated value related to $$\mu_1 - \mu_2$$:

• Recall $$d = \bar{y}_1 - \bar{y}_2$$ and set $$\tilde{s}_p = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

• Test statistic: $$\tilde{T} = \frac{d - D_0}{\tilde{s}_p}$$ is approximated by a Student’s t distribution with $df= \frac{(n_1-1)(n_2-1)}{(1-c)^2(n_1-1)+c^2(n_2-1)}$ and $$c = \frac{s_1^2/n_1}{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}$$

• Note: round df to the nearest integer

# Example 6.1: unequal variances

At Type I error probability $$\alpha=0.05$$

• Test $$H_0: \mu_1 - \mu_2 \le 0.2$$ vs $$H_a: \mu_1 - \mu_2 > 0.2$$
• $$D_0 = 0.2$$
• Reject $$H_0$$ if $$\tilde{T} \ge t_{\alpha,df}$$

# Example 6.1: unequal variances

Obtain value of test statistic

• $$\tilde{s}_p = 0.1274$$
• $$df = 16.62774$$
• $$t = \frac{10.37 - 9.83 - 0.2}{0.1274} = 2.668$$
• $$t_{0.05,17} = 1.740$$; T table

Compare $$t$$ with $$1.740$$

# Independent samples with unequal variances: practice

# Example 6.1: unequal variances

At Type I error probability $$\alpha=0.05$$

• Test $$H_0: \mu_1 - \mu_2 = D_0$$ vs $$H_a: \mu_1 - \mu_2 \ne D_0$$
• $$D_0 = 0$$ and $$t_{0.025,17} = 2.11$$
• Reject $$H_0$$ if $$|\tilde{T}| \ge t_{\alpha/2,df}$$

# Example 6.1: unequal variances

At Type I error probability $$\alpha=0.05$$

• Test $$H_0: \mu_1 - \mu_2 \ge D_0$$ vs $$H_a: \mu_1 - \mu_2 < D_0$$
• $$D_0 = 0.3$$ and $$t_{0.05,17} = 1.740$$
• Reject $$H_0$$ if $$\tilde{T} \le - t_{\alpha,df}$$

# Independent sample with unequal variances: lab

# Example 6.1: unequal variances

> potency = read.table("http://math.wsu.edu/faculty/xchen/stat412/data/t_uev.txt",
+     sep = "\t", header = TRUE)
> class(potency)
[1] "data.frame"
> cFresh = potency$Fresh
> cFresh
[1] 10.2 10.5 10.3 10.8  9.8 10.6 10.7 10.2 10.0 10.6
> cStored = potency$Stored
> cStored
[1]  9.8  9.6 10.1 10.2 10.1  9.7  9.5  9.6  9.8  9.9

# Example 6.1 (Recap)

# Example 6.1: unequal variances

At Type I error probability $$\alpha=0.05$$

• Test $$H_0: \mu_1 - \mu_2 = 0$$ vs $$H_a: \mu_1 - \mu_2 \ne 0$$

> # compute critical value
> qt(0.05/2, df = 17, ncp = 0, lower.tail = FALSE)
[1] 2.109816
> # perform hypothesis testing
> t.test(x = cFresh, y = cStored, alternative = "two.sided", mu = 0,
+     paired = FALSE, var.equal = FALSE, conf.level = 0.95)

Welch Two Sample t-test

data:  cFresh and cStored
t = 4.2368, df = 16.628, p-value = 0.000581
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.2706369 0.8093631
sample estimates:
mean of x mean of y
10.37      9.83

# Example 6.1 (cont’d)

At Type I error probability $$\alpha=0.02$$

• Test $$H_0: \mu_1 - \mu_2 \le 0.2$$ vs $$H_a: \mu_1 - \mu_2 > 0.2$$

> # compute critical value
> qt(0.02, df = 17, ncp = 0, lower.tail = FALSE)
[1] 2.223845
> # perform hypothesis testing
> t.test(x = cFresh, y = cStored, alternative = "greater", mu = 0.2,
+     paired = FALSE, var.equal = FALSE, conf.level = 0.99)

Welch Two Sample t-test

data:  cFresh and cStored
t = 2.6676, df = 16.628, p-value = 0.008227
alternative hypothesis: true difference in means is greater than 0.2
99 percent confidence interval:
0.2120818       Inf
sample estimates:
mean of x mean of y
10.37      9.83

# Paired sample: exploration

# Example 6.7

Are the average repair cost estimates from Garage I different from those from Garage II?

> # estimates of cost from Garage I
> GarageI = c(17.6, 20.2, 19.5, 11.3, 13, 16.3, 15.3, 16.2, 12.2,
+     14.8, 21.3, 22.1, 16.9, 17.6, 18.4)
>
> # estimates of cost from Garage II
> GarageII = c(17.3, 19.1, 18.4, 11.5, 12.7, 15.8, 14.9, 15.3,
+     12, 14.2, 21, 21, 16.1, 16.7, 17.5)

# Example 6.7

Boxplot of difference between estimates

# Example 6.7

Test $$H_0: \mu_1 - \mu_2 = 0$$ vs $$H_a: \mu_1 - \mu_2 \ne 0$$

• Try t test based on independent samples
• test statistic value: 0.54616
• degrees of freedom: 27.797
• p-value of test: 0.5893

Conclusion …

# Example 6.7

Histogram for differences between the estimates

# Example 6.7

What is wrong with applying t test based on independent samples to this data set?

• Are the two samples independent?
• For each car, are the two estimates for it independent?
• Normality violated?

# Paired sample: inference

# Construct test statistic

On observations:

• Sample 1: $$y_{1i}, i=1,\ldots,n$$
• Sample 2: $$y_{2i}, i=1,\ldots,n$$
• Differences: $$d_{i}= y_{1i} - y_{2i}, i=1,\ldots,n$$

Requirements:

• Sampling distribution of $$d_{i}$$’s is Normal
• The $$d_{i}$$’s are independent

# Construct test statistic

• Obtain $$\bar{d}$$ and $$s_d$$, the sample mean and standard deviation of $$d_{i}$$’s
• $$D_0$$: a specified value on $$\mu_d = \mu_1 - \mu_2$$
• Test statistic: $$T = \frac{\bar{d} - D_0}{s_d/\sqrt{n}}$$ follows a t distribution with $$\textrm{df}=n-1$$

# Example 6.7

Check Normality on differences

p-value of KS test: 0.8008

# Example 6.7: hypothesis testing

• Recall $$\mu_d = \mu_1 - \mu_2$$
• Assess $$H_0: \mu_d \le 0$$ vs $$H_a: \mu_d >0$$
• Observed T.S. value: $$t = \frac{0.613-0}{0.394/\sqrt{15}} = 6.026$$
• Critical value at Type I error probability $$\alpha = 0.05$$ is $$t_{0.05,14}=1.761$$; T table
• Reject $$H_0$$ if $$T \ge t_{\alpha,n-1}$$

# Example 6.7: confidence interval

The $$100(1-\alpha)\%$$ CI for $$\mu_d = \mu_1 - \mu_2$$ is

$\bar{d} \pm t_{\alpha/2,n-1}\frac{s_d}{\sqrt{n}}$

Computing CI:

• $$\bar{d} = 0.613$$, $$s_d = 0.394$$, $$t_{0.025,14}=2.14$$
• 95% CI is: $$0.613 \pm 2.14\times \frac{0.394}{\sqrt{15}}$$, i.e., $$[0.395,0.831]$$

# Paired sample: practice

# Exercise 1

The $$100(1-\alpha)\%$$ CI for $$\mu_d = \mu_1 - \mu_2$$ is

$\bar{d} \pm t_{\alpha/2,n-1}\frac{s_d}{\sqrt{n}}$

Construct 99% CI:

• $$n = 12$$, $$\bar{d} = 0.5$$, $$s_d = 0.49$$
• $$t_{0.005,11}$$; T table

# Exercise 2

• Test statistic: $$T = \frac{\bar{d} - D_0}{s_d/\sqrt{n}}$$ follows a t distribution with $$\textrm{df}=n-1$$
• Recall $$\mu_d = \mu_1 - \mu_2$$
• At Type I error prob $$\alpha = 0.05$$, test $$H_0: \mu_d= 0.5$$ vs $$H_a: \mu_d \ne 0.5$$
• $$n = 16$$, $$\bar{d} = 0.7$$, $$s_d = 1$$
• $$D_0 =0.5$$ and $$t_{0.025,15}$$; T table
• Reject $$H_0$$ when $$|T| > t_{\alpha/2,df}$$

# Paired sample: lab

# Example 6.7

> RepairCost = read.table("http://math.wsu.edu/faculty/xchen/stat412/data/pairedT.txt",
+     sep = "\t", header = T)
> RepairCost[1:3, ]
  GarageI GarageII
1    17.6     17.3
2    20.2     19.1
3    19.5     18.4
> GarageI = RepairCost$GarageI
> GarageII = RepairCost$GarageII
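Before calling `t.test`, the paired statistic $$T = \frac{\bar{d} - D_0}{s_d/\sqrt{n}}$$ can be computed by hand from the differences. The sketch below types in the Example 6.7 estimates directly instead of reading the file, so it does not depend on the download; `D0 = 0` and the two-sided p-value are illustrative choices.

```r
# Example 6.7 estimates, typed in from the slides
GarageI = c(17.6, 20.2, 19.5, 11.3, 13, 16.3, 15.3, 16.2, 12.2,
    14.8, 21.3, 22.1, 16.9, 17.6, 18.4)
GarageII = c(17.3, 19.1, 18.4, 11.5, 12.7, 15.8, 14.9, 15.3,
    12, 14.2, 21, 21, 16.1, 16.7, 17.5)

d = GarageI - GarageII   # pairwise differences, one per car
n = length(d)
d_bar = mean(d)          # sample mean of the differences
s_d = sd(d)              # sample standard deviation of the differences

D0 = 0                   # postulated value of mu_d (illustrative choice)
t_obs = (d_bar - D0)/(s_d/sqrt(n))

# two-sided p-value from a t distribution with n - 1 degrees of freedom
p_two_sided = 2 * pt(abs(t_obs), df = n - 1, lower.tail = FALSE)

c(d_bar = d_bar, s_d = s_d, t_obs = t_obs, p_two_sided = p_two_sided)
```

These quantities can be compared against the `t.test(..., paired = TRUE)` output.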

# Example 6.7: CI

Construct confidence interval

> t.test(GarageI, GarageII, alternative = "two.sided", mu = 0,
+     paired = TRUE, var.equal = FALSE, conf.level = 0.95)

Paired t-test

data:  GarageI and GarageII
t = 6.0234, df = 14, p-value = 3.126e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3949412 0.8317254
sample estimates:
mean of the differences
0.6133333 

# Example 6.7: hypothesis testing

Test on difference $$\mu_d = \mu_1 - \mu_2$$

• Test $$H_0: \mu_d \le 0$$ vs $$H_a: \mu_d >0$$
• Reject $$H_0$$ if $$T \ge t_{\alpha,df}$$
> t.test(GarageI, GarageII, alternative = "greater", mu = 0, paired = TRUE,
+     var.equal = FALSE, conf.level = 0.95)

Paired t-test

data:  GarageI and GarageII
t = 6.0234, df = 14, p-value = 1.563e-05
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
0.4339886       Inf
sample estimates:
mean of the differences
0.6133333 

Conclusion?

# Example 6.7: hypothesis testing

Test on difference $$\mu_d = \mu_1 - \mu_2$$

• Test $$H_0: \mu_d = 0.4$$ vs $$H_a: \mu_d \ne 0.4$$
• Reject $$H_0$$ if $$|T| \ge t_{\alpha/2,df}$$
> t.test(GarageI, GarageII, alternative = "two.sided", mu = 0.4,
+     paired = TRUE, var.equal = FALSE, conf.level = 0.95)

Paired t-test

data:  GarageI and GarageII
t = 2.0951, df = 14, p-value = 0.05483
alternative hypothesis: true difference in means is not equal to 0.4
95 percent confidence interval:
0.3949412 0.8317254
sample estimates:
mean of the differences
0.6133333 

Conclusion?

# Example 6.7: hypothesis testing

Test on difference $$\mu_d = \mu_1 - \mu_2$$

• Test $$H_0: \mu_d \ge 0.3$$ vs $$H_a: \mu_d < 0.3$$
• Reject $$H_0$$ if $$T \le - t_{\alpha,df}$$
> t.test(GarageI, GarageII, alternative = "less", mu = 0.3, paired = TRUE,
+     var.equal = FALSE, conf.level = 0.95)

Paired t-test

data:  GarageI and GarageII
t = 3.0772, df = 14, p-value = 0.9959
alternative hypothesis: true difference in means is less than 0.3
95 percent confidence interval:
-Inf 0.7926781
sample estimates:
mean of the differences
0.6133333 

Conclusion?

# Summary 1

• The rejection regions are similar to those for t test based on independent samples
• Each pair of observations is usually obtained from the same individual or item (i.e., from the same experimental unit)
• For each experimental unit, the two observations in the pair are dependent
• Different pairs of observations are independent of one another

# Summary 2

• Test is applied to differences obtained from the pairs
• Test is essentially a one-sample t test where sample is the pairwise differences
• Rejection regions are very similar to those of one-sample t test
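The equivalence between the paired t test and a one-sample t test on the differences can be checked directly. A minimal sketch, using the Example 6.7 estimates typed in from the slides and assuming the default 95% confidence level:

```r
GarageI = c(17.6, 20.2, 19.5, 11.3, 13, 16.3, 15.3, 16.2, 12.2,
    14.8, 21.3, 22.1, 16.9, 17.6, 18.4)
GarageII = c(17.3, 19.1, 18.4, 11.5, 12.7, 15.8, 14.9, 15.3,
    12, 14.2, 21, 21, 16.1, 16.7, 17.5)

# paired t test on the two samples
paired = t.test(GarageI, GarageII, alternative = "two.sided", mu = 0,
    paired = TRUE)

# one-sample t test on the pairwise differences
d = GarageI - GarageII
one_sample = t.test(d, alternative = "two.sided", mu = 0)

# the statistic, df, p-value and CI should agree
same_stat = isTRUE(all.equal(unname(paired$statistic), unname(one_sample$statistic)))
same_df = isTRUE(all.equal(unname(paired$parameter), unname(one_sample$parameter)))
same_pval = isTRUE(all.equal(paired$p.value, one_sample$p.value))
same_ci = isTRUE(all.equal(as.numeric(paired$conf.int), as.numeric(one_sample$conf.int)))
c(same_stat, same_df, same_pval, same_ci)
```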

# Paired sample: other examples

• Measurements on the same experimental unit obtained before and after receiving a treatment, e.g., in assessing drug effect
• Measurements obtained on two experimental units with similar features, e.g., in assessing improvement of new teaching methods

# Motivations

• Variation of potency for drug
• Risks in portfolios
• Assess equality of variances in two-sample test

# Sample variance

• Random sample $$y_1,\ldots,y_n$$ with mean $$\mu$$ and variance $$\sigma^2$$

• Sample mean $$\bar{y}=\frac{1}{n}\sum_{i=1}^n y_i$$

• Sample variance $$s^2 = \frac{1}{n-1}\sum_{i=1}^n (y_i - \bar{y})^2$$
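As a quick check of the formula, the sample variance can be computed term by term and compared with R's built-in `var()`. The sketch uses the Fresh potency readings from Example 6.1 as the sample.

```r
# Fresh potency readings from Example 6.1
y = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10, 10.6)

n = length(y)
y_bar = sum(y)/n                  # sample mean
s2 = sum((y - y_bar)^2)/(n - 1)   # sample variance with divisor n - 1

# var() uses the same n - 1 divisor, so the two should agree
agrees = isTRUE(all.equal(s2, var(y)))
c(y_bar = y_bar, s2 = s2, s = sqrt(s2))
agrees
```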

# Statistic

• Random sample 1: $$y_{1i}, i=1,\ldots,n_1$$ follow $$\mathsf{N}(\mu_1, \sigma_1^2)$$
• Random sample 2: $$y_{2i}, i=1,\ldots,n_2$$ follow $$\mathsf{N}(\mu_2, \sigma_2^2)$$
• Sample variance $$s_1^2$$ for random sample 1; sample variance $$s_2^2$$ for random sample 2

Then, if the two samples are independent, $\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$ follows an F distribution with $$df_1 = n_1 - 1$$ and $$df_2 = n_2 - 1$$

# Density of an F distribution

Density: $$df_1=3$$ and $$df_2=5$$

# F distribution

• not symmetrical
• with non-negative values
• with $$df_1$$ (for $$s_1^2$$) and $$df_2$$ (for $$s_2^2$$)

# Hypothesis testing

Test statistic $$F= \frac{s_1^2}{s_2^2}$$

• For $$H_0: \sigma_1^2 \le \sigma_2^2$$ vs $$H_a: \sigma_1^2 > \sigma_2^2$$, reject $$H_0$$ if $$F \ge F_{\alpha,df_1,df_2}$$

• For $$H_0: \sigma_1^2= \sigma_2^2$$ vs $$H_a: \sigma_1^2 \ne \sigma_2^2$$, reject $$H_0$$ if $$F \ge F_{\alpha/2,df_1,df_2}$$ or $$F \le F_{1-\alpha/2,df_1,df_2}$$

• F table; $$F_{1-\alpha,df_1,df_2}= \frac{1}{F_{\alpha,df_2,df_1}}$$
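In place of an F table, the critical values can be obtained with `qf()`. The sketch below reads $$F_{\alpha,df_1,df_2}$$ as the value with upper-tail area $$\alpha$$ and checks the reciprocal identity numerically; the choices $$\alpha = 0.05$$, $$df_1 = 3$$ and $$df_2 = 5$$ are illustrative only.

```r
alpha = 0.05
df1 = 3   # illustrative numerator df
df2 = 5   # illustrative denominator df

# F_{alpha/2, df1, df2}: upper-tail area alpha/2
F_upper = qf(alpha/2, df1, df2, lower.tail = FALSE)

# F_{1 - alpha/2, df1, df2}: upper-tail area 1 - alpha/2
F_lower = qf(1 - alpha/2, df1, df2, lower.tail = FALSE)

# the same lower critical value via the table identity,
# with the degrees of freedom swapped
F_lower_via_identity = 1/qf(alpha/2, df2, df1, lower.tail = FALSE)

identity_holds = isTRUE(all.equal(F_lower, F_lower_via_identity))
c(F_lower = F_lower, F_upper = F_upper)
identity_holds
```

For the two-sided test, $$H_0$$ is rejected when the observed $$F$$ falls at or below `F_lower` or at or above `F_upper`.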

# Confidence interval

$$100(1-\alpha)\%$$ confidence interval for $$\sigma_1^2/\sigma_2^2$$ is constructed as follows:

• obtain $$s_1^2$$, $$s_2^2$$ and $$s_1^2/s_2^2$$
• obtain $$df_1 = n_1 -1$$ and $$df_2 = n_2 -1$$
• obtain $$F_U = F_{\alpha/2,df_2,df_1}$$ and $$F_L = 1/F_{\alpha/2,df_1,df_2}$$

• confidence interval: $$\left[\frac{s_1^2}{s_2^2}F_L,\frac{s_1^2}{s_2^2}F_U\right]$$

# Example 6.1 (revisit)

Data:

> # potency readings for Fresh drug
> cFresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10,
+     10.6)
>
> # potency readings for Stored drug
> cStored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)

Normality test
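The normality check used on this slide is not shown in the text. As one possible way to carry it out, the sketch below applies the Shapiro-Wilk test (`shapiro.test`) to each sample. This is a substitute chosen for illustration and not necessarily the test used in the original slide; a large p-value gives no evidence against Normality.

```r
cFresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10, 10.6)
cStored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)

# Shapiro-Wilk test of Normality for each sample (illustrative choice of test)
swFresh = shapiro.test(cFresh)
swStored = shapiro.test(cStored)

c(Fresh = swFresh$p.value, Stored = swStored$p.value)
```

With only ten readings per sample such tests have limited power, so a normal Q-Q plot (`qqnorm`, `qqline`) is a useful companion check.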

# Example 6.1: testing variances

At Type I error probability $$\alpha = 0.05$$, test $$H_0: \sigma_1^2= \sigma_2^2$$ vs $$H_a: \sigma_1^2 \ne \sigma_2^2$$

• Sample 1: Fresh; Sample 2: Stored
• $$n_1 = 10$$, $$n_2 = 10$$, so $$df_1 = df_2 = 9$$; $$s_1^2 = 0.1046$$, $$s_2^2= 0.0579$$
• $$F = \frac{s_1^2}{s_2^2} = 1.81$$
• $$F_{0.025,9,9} = 4.03$$; $$F_{0.975,9,9} =1/4.03= 0.25$$; F table

Reject $$H_0$$ if $$F \ge F_{\alpha/2,df_1,df_2}$$ or $$F \le F_{1-\alpha/2,df_1,df_2}$$. Conclusion?
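The decision rule can be sketched in R by computing the F statistic and both critical values with `qf()`, taking $$df_1 = n_1 - 1$$ and $$df_2 = n_2 - 1$$, and cross-checking against the built-in `var.test()`. The variable names are illustrative.

```r
cFresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10, 10.6)
cStored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)

alpha = 0.05
df1 = length(cFresh) - 1
df2 = length(cStored) - 1

F_obs = var(cFresh)/var(cStored)   # about 1.81 with unrounded variances

crit_upper = qf(alpha/2, df1, df2, lower.tail = FALSE)      # F_{alpha/2, df1, df2}
crit_lower = qf(1 - alpha/2, df1, df2, lower.tail = FALSE)  # F_{1-alpha/2, df1, df2}

# two-sided rejection rule from the slide
reject = (F_obs >= crit_upper) || (F_obs <= crit_lower)

# built-in F test for equality of two variances
vt = var.test(cFresh, cStored, ratio = 1, alternative = "two.sided",
    conf.level = 1 - alpha)

c(F_obs = F_obs, crit_lower = crit_lower, crit_upper = crit_upper)
reject
vt$p.value
```

Since `F_obs` lies between the two critical values, `reject` is `FALSE`.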

# Example 6.1: CI for variances

$$95\%$$ confidence interval for $$\sigma_1^2/\sigma_2^2$$

• $$n_1 = 10$$, $$n_2 = 10$$
• $$s_1^2 = 0.1046$$, $$s_2^2= 0.0579$$ and $$\frac{s_1^2}{s_2^2} = 1.81$$, with $$df_1 = df_2 = 9$$
• $$F_U = F_{0.025,9,9} = 4.03$$; $$F_L = 1/F_{0.025,9,9} = 0.25$$
• CI: $$[1.81\times 0.25, 1.81\times 4.03]$$
• CI: $$[1.67\times 0.27, 1.67\times 3.72]$$

F table
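The interval can also be computed without the F table. This sketch follows the recipe $$\left[\frac{s_1^2}{s_2^2}F_L,\frac{s_1^2}{s_2^2}F_U\right]$$ with $$df_1 = n_1 - 1$$ and $$df_2 = n_2 - 1$$, and compares the result with the interval reported by `var.test()`. The variable names are illustrative.

```r
cFresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10, 10.6)
cStored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)

alpha = 0.05
df1 = length(cFresh) - 1
df2 = length(cStored) - 1
ratio = var(cFresh)/var(cStored)   # s_1^2 / s_2^2

F_L = 1/qf(alpha/2, df1, df2, lower.tail = FALSE)  # 1 / F_{alpha/2, df1, df2}
F_U = qf(alpha/2, df2, df1, lower.tail = FALSE)    # F_{alpha/2, df2, df1}

ci_manual = c(ratio * F_L, ratio * F_U)

# interval reported by the built-in F test
ci_vartest = as.numeric(var.test(cFresh, cStored, conf.level = 1 - alpha)$conf.int)

ci_match = isTRUE(all.equal(ci_manual, ci_vartest))
ci_manual
ci_match
```

The interval contains 1, which agrees with not rejecting $$H_0: \sigma_1^2 = \sigma_2^2$$ at $$\alpha = 0.05$$.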

# Simple rules on homework

• If you use software to obtain answers, please attach the code
• Please put the code close to its associated answers
• Homework assignments will be announced via Blackboard

# License and session Information

License

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 15063)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] knitr_1.17

loaded via a namespace (and not attached):
[1] backports_1.1.0 magrittr_1.5    rprojroot_1.2   formatR_1.5
[5] tools_3.3.0     htmltools_0.3.6 revealjs_0.9    yaml_2.1.14
[9] Rcpp_0.12.12    stringi_1.1.5   rmarkdown_1.6   stringr_1.2.0
[13] digest_0.6.12   evaluate_0.10.1