Two-Way independent samples & Mixed design ANOVA with R Example

Data: WOW_data

1.獨立樣本二因子變異數分析（Two-way ANOVA）

  研究者想了解閱讀成績(reading_3)在不同族群間(ethnic)及性別(gender)上是否有差異?

用R讀取剛剛的CSV檔，並將此資料命名為aov2

aov2        <- read.csv("D:/104/ML_R/WOW_data.csv", header=TRUE, sep=",")

進行獨立樣本二因子變異數分析

aov2$gender<-as.factor(aov2$gender)
aov2.aov     <- aov(reading_3 ~ ethnic*gender ,data=aov2)
summary(aov2.aov)

##                Df Sum Sq Mean Sq F value  Pr(>F)    
## ethnic          2  10758    5379  17.475 1.1e-07 ***
## gender          1    892     892   2.898  0.0904 .  
## ethnic:gender   2    781     391   1.269  0.2835    
## Residuals     187  57560     308                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

輸出各種平均數

利用print(model.tables(data,”means”),digits=X) 輸出各種平均數，digits為小數點後第幾位四捨五入。

print(model.tables(aov2.aov,"means"),digits=3)

## Tables of means
## Grand mean
##          
## 499.3264 
## 
##  ethnic 
##     Human  Orc  Undead 
##        506  486     500
## rep     87   42      64
## 
##  gender 
##       0   1
##     502 497
## rep  89 104
## 
##  ethnic:gender 
##          gender
## ethnic    0   1  
##   Human   506 505
##   rep      38  49
##   Orc     493 481
##   rep      19  23
##   Undead  501 498
##   rep      32  32

載入套件，繪製平均數圖

library(car)
library(sandwich)
library(RcmdrMisc)

with(aov2, plotMeans(reading_3,ethnic,gender,error.bars="se"))

with(aov2, plotMeans(reading_3,gender,ethnic,error.bars="none"))

結論

由上報表顯示，僅有一主要效果達顯著水準。表示閱讀成績在性別上並未達到顯著差異，(F (1,187)=2.898,   p=0.09)；另外族群方面達到統計顯著，意即不同族群間的閱讀成績均值確實不同 (F (2,187)=17.475, p<0.001)；而兩因子交互作用並未顯著水準(F(2,187)=1.26,p=0.284)，顯示兩因子在不同組合下，對於依變項並無不同的影響。不過，為了教學完整度，以下分析還是會呈現交互作用顯著時的調整方法。

單純主要效果檢驗

在執行檢驗之前，我們必須將資料分割，使得被檢驗的自變項的主要效果是限定於另一自變項的特定水準下檢驗(邱皓政，2010)。

用split(資料,資料$欲分割之水準)，再分別作單純主要效果檢驗可使用 view() 去看看分割後的資料長甚麼樣子。

fixA <-split(aov2,aov2$gender)
fixB <-split(aov2,aov2$ethnic)

性別

fixAF.aov   <- aov(reading_3~ ethnic ,data=fixA$`0`)
fixAM.aov   <- aov(reading_3 ~ ethnic ,data=fixA$`1`)

性別結果

summary(fixAF.aov)

##             Df Sum Sq Mean Sq F value Pr(>F)  
## ethnic       2   2423  1211.3   3.645 0.0302 *
## Residuals   86  28577   332.3                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(fixAM.aov)

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## ethnic        2   9183    4591      16 9.19e-07 ***
## Residuals   101  28982     287                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

種族

fixB_eth1.aov   <- aov(reading_3 ~ gender ,data=fixB$`Orc `)
fixB_eth2.aov   <- aov(reading_3 ~ gender ,data=fixB$`Undead`)
fixB_eth3.aov   <- aov(reading_3 ~ gender ,data=fixB$`Human `)

種族結果

summary(fixB_eth1.aov)

##             Df Sum Sq Mean Sq F value Pr(>F)  
## gender       1   1456  1456.3    3.57 0.0661 .
## Residuals   40  16318   407.9                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(fixB_eth2.aov)

##             Df Sum Sq Mean Sq F value Pr(>F)
## gender       1    172   172.3   0.612  0.437
## Residuals   62  17447   281.4

summary(fixB_eth3.aov)

##             Df Sum Sq Mean Sq F value Pr(>F)
## gender       1     45   44.64   0.159  0.691
## Residuals   85  23795  279.94

 利用pf( F值, df1=?, df2=?, lower.tail=FALSE)來計算F值的顯著性，先整理一下，
 性別的單純主要效果F值及其自由度：
    Female=3.93, df1=2, df2=86
    Male  =14.9, df1=2, df2=101
 種族的單純主要效果F值及其自由度：
    Orc    種族=4.72, df1=1, df2=40
    Undead 種族=0.56, df1=1, df2=62
    Human  種族=0.14, df1=1, df2=85

pf(3.93, df1=2, df2=86,lower.tail=FALSE)

## [1] 0.02326857

pf(14.9, df1=2, df2=101,lower.tail=FALSE)

## [1] 2.13569e-06

pf(4.72, df1=1, df2=40,lower.tail=FALSE)

## [1] 0.03579687

pf(0.56, df1=1, df2=62,lower.tail=FALSE)

## [1] 0.4570896

pf(0.14, df1=1, df2=85,lower.tail=FALSE)

## [1] 0.7092128

結論

  在固定男女生性別的情況下，顯示無論是男或女，種族之間的閱讀成績均值皆達到顯著差異(女生F(2,86)=3.93, p<.05；男生F(2,101)=14.9, p<.001)，且由均值剖面圖可以得知種族Human的表現最好，其次是種族Undead，最後才是種族Orc。
  在固定三種種族的情況下，對於三種種族來說，種族Orc的女生在閱讀成績方面都較男生高。(F(1,40)=4.72 , p<.05)，但對於其他種族來說，則未達到顯著差異。(種族Undead：F(1,62)=.056 , p=.457)；(種族Human：F(1,85)=.14 , p=.709)。

2.混合設計二因子變異數分析

研究者想了解閱讀成績在不同種族或是時間點上是否存在差異。如下圖所示:

跟RM-ANOVA一樣，我們必須先將資料重新排列成R可以進行分析的形式，如下圖所示：

用R讀取CSV檔，並將此資料命名為aov_mix_raw，選取所要分析的變項(reading_1、reading_2、reading_3) 並另存為aov_mix

aov_mix_raw <- read.csv("D:/104/ML_R/WOW_data.csv", header=TRUE, sep=",")
aov_mix_1<-aov_mix_raw[c(12,17,22)]

資料整理

aov_mix<-stack(aov_mix_1)

aov_mix$Mix_id = rep(aov_mix_raw$ID, 3)
aov_mix$Mix_ethnic = rep(aov_mix_raw$ethnic, 3)

colnames(aov_mix) = c("Mix_read_score", "Mix_time", "Mix_id","Mix_ethnic")
head(aov_mix)

##   Mix_read_score  Mix_time  Mix_id Mix_ethnic
## 1            492 reading_3 1041002     Human 
## 2            494 reading_3 1041007     Human 
## 3            523 reading_3 1041008     Human 
## 4            476 reading_3 1041012     Human 
## 5            515 reading_3 1041020     Human 
## 6            450 reading_3 1041021     Human

進行相依樣本的球型檢定請先載入 ez套件，並使用ezANOVA( )，別忘了將id轉換成factor。

aov_mix$Mix_id <-as.factor(aov_mix$Mix_id)
aov_mix$Mix_ethnic <-as.factor(aov_mix$Mix_ethnic)
library(ez)
ezANOVA(data=aov_mix, dv=.(Mix_read_score), wid=.(Mix_id), within=.(Mix_time),between=.(Mix_ethnic), detailed=TRUE)

## Warning: Data is unbalanced (unequal N per group). Make sure you specified
## a well-considered value for the type argument to ezANOVA().

## $ANOVA
##                Effect DFn DFd          SSn       SSd            F
## 1         (Intercept)   1 190 1.381598e+08 181893.70 1.443170e+05
## 2          Mix_ethnic   2 190 2.875359e+04 181893.70 1.501751e+01
## 3            Mix_time   2 380 4.701217e+04  17832.36 5.009047e+02
## 4 Mix_ethnic:Mix_time   4 380 1.728066e+02  17832.36 9.206089e-01
##               p p<.05          ges
## 1 1.133493e-275     * 0.9985564701
## 2  8.807846e-07     * 0.1258474799
## 3 2.976686e-107     * 0.1905345967
## 4  4.518704e-01       0.0008644701
## 
## $`Mauchly's Test for Sphericity`
##                Effect         W           p p<.05
## 3            Mix_time 0.9312618 0.001194766     *
## 4 Mix_ethnic:Mix_time 0.9312618 0.001194766     *
## 
## $`Sphericity Corrections`
##                Effect       GGe         p[GG] p[GG]<.05       HFe
## 3            Mix_time 0.9356828 1.484898e-100         * 0.9446484
## 4 Mix_ethnic:Mix_time 0.9356828  4.471951e-01           0.9446484
##           p[HF] p[HF]<.05
## 3 1.729743e-101         *
## 4  4.478689e-01

(由上述報表發現，球型檢定p<.05，因此本資料違反球型假定。)

進行混合設計多因子變異數分析

aov_mix_re  <- aov(Mix_read_score ~ (Mix_ethnic*Mix_time) +  Error(Mix_id/Mix_time) ,data=aov_mix)
summary(aov_mix_re , multivariate=FALSE)

## 
## Error: Mix_id
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Mix_ethnic   2  28754   14377   15.02 8.81e-07 ***
## Residuals  190 181894     957                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: Mix_id:Mix_time
##                      Df Sum Sq Mean Sq F value Pr(>F)    
## Mix_time              2  47012   23506 500.905 <2e-16 ***
## Mix_ethnic:Mix_time   4    173      43   0.921  0.452    
## Residuals           380  17832      47                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

輸出各變項平均數、並繪製平均數比較圖

print(model.tables(aov_mix_re,"means"),digits=3)

## Tables of means
## Grand mean
##          
## 488.4853 
## 
##  Mix_ethnic 
##     Human  Orc  Undead 
##        494  476     489
## rep    261  126     192
## 
##  Mix_time 
##     reading_1 reading_2 reading_3
##           477       489       499
## rep       193       193       193
## 
##  Mix_ethnic:Mix_time 
##           Mix_time
## Mix_ethnic reading_1 reading_2 reading_3
##    Human   483       494       506      
##    rep      87        87        87      
##    Orc     465       476       486      
##    rep      42        42        42      
##    Undead  478       491       500      
##    rep      64        64        64

with(aov_mix, plotMeans(Mix_read_score, Mix_ethnic,Mix_time,error.bars="se"))

with(aov_mix, plotMeans(Mix_read_score, Mix_time,Mix_ethnic,error.bars="se"))

事後比較(Mix_time)

library(multcomp)
library(nlme)

pHoc_time<-lme(Mix_read_score~Mix_time,random = ~1|Mix_id/Mix_time, data = aov_mix)
summary(glht(pHoc_time, linfct=mcp(Mix_time="Tukey")))

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: lme.formula(fixed = Mix_read_score ~ Mix_time, data = aov_mix, 
##     random = ~1 | Mix_id/Mix_time)
## 
## Linear Hypotheses:
##                            Estimate Std. Error z value Pr(>|z|)    
## reading_2 - reading_1 == 0  11.6010     0.6971   16.64   <2e-16 ***
## reading_3 - reading_1 == 0  22.0622     0.6971   31.65   <2e-16 ***
## reading_3 - reading_2 == 0  10.4611     0.6971   15.01   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

事後比較(Mix_ethnic)

pHoc_ethnic<-lme(Mix_read_score~Mix_ethnic,random = ~1|Mix_id/Mix_ethnic, data = aov_mix)
summary(glht(pHoc_ethnic, linfct=mcp(Mix_ethnic="Tukey")))

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: lme.formula(fixed = Mix_read_score ~ Mix_ethnic, data = aov_mix, 
##     random = ~1 | Mix_id/Mix_ethnic)
## 
## Linear Hypotheses:
##                       Estimate Std. Error z value Pr(>|z|)    
## Orc  - Human  == 0     -18.339      3.356  -5.464   <0.001 ***
## Undead  - Human  == 0   -4.803      2.942  -1.633     0.23    
## Undead  - Orc  == 0     13.535      3.547   3.816   <0.001 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

結論

 由此混和設計多因子變異數分析報表得知: 相依樣本之球型檢定結果拒絕虛無假設(Mauchly^' s W=0.931,p<.05)，因此需要進行矯正。由矯正過後的報表可以得知，因子間交互效果並未顯著，且剖面圖顯示並無明顯之交叉或非平行直線。由混合設計二因子變異數分析摘要表可知種族因子（獨立樣本）達顯著水準（F（2,190）=15.02,p<.001），顯示三種種族在閱讀成績均值上存在著顯著差異。且受測者內的三個重複量測時間點之平均數達顯著差異（F（2,380）=500.9,p<.001），顯示在不同的量測時間點上，學生們的閱讀成績確有不同。由事後比較表可得知，除了種族Undead和種族Human並無顯著差異外（均值差異=4.8,p=0.23，則其他水準均值相互比較均達顯著差異，以三個時間點的閱讀成績來說，reading_3> reading_2> reading_1，也就是隨著年齡增加，其閱讀成績表現也會跟著成長；種族則是Human>Undead且Undead> Orc。
 （雖然上例之交互效果不顯著，但為教學說明，仍以此例示範單純主要效果檢驗流程。）

單純主要效果檢驗（當因子間交互效果顯著時進行）

在執行檢驗之前，我們必須將資料分割，使得被檢驗的自變項的主要效果是限定於另一自變項的特定水準下檢驗（邱皓政，2010)。在混合設計多因子分析中： 1. 執行相依因子之單純主要效果檢驗，我們需對獨立因子作資料切割。 2. 執行獨立因子之單純主要效果檢驗，我們需對相依因子作資料切割。

fixtime   <- split(aov_mix,aov_mix$Mix_time)
fixmale <- split(aov_mix,aov_mix$Mix_ethnic)

進行相依因子單純主要效果檢驗

fix_eth1.aov    <- aov(Mix_read_score ~ Mix_time + Error(Mix_id/Mix_time) ,data=fixmale$ `Orc `)
summary(fix_eth1.aov)

## 
## Error: Mix_id
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 41  54440    1328               
## 
## Error: Mix_id:Mix_time
##           Df Sum Sq Mean Sq F value Pr(>F)    
## Mix_time   2   9269    4635   154.2 <2e-16 ***
## Residuals 82   2465      30                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

fix_eth2.aov    <- aov(Mix_read_score ~ Mix_time + Error(Mix_id/Mix_time) ,data=fixmale$ `Undead `)
summary(fix_eth2.aov)

## 
## Error: Mix_id
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 63  56859   902.5               
## 
## Error: Mix_id:Mix_time
##            Df Sum Sq Mean Sq F value Pr(>F)    
## Mix_time    2  15602    7801    99.7 <2e-16 ***
## Residuals 126   9858      78                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

fix_eth3.aov    <- aov(Mix_read_score ~ Mix_time + Error(Mix_id/Mix_time) ,data=fixmale$ `Human `)
summary(fix_eth3.aov)

## 
## Error: Mix_id
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 86  70594   820.9               
## 
## Error: Mix_id:Mix_time
##            Df Sum Sq Mean Sq F value Pr(>F)    
## Mix_time    2  22314   11157   348.4 <2e-16 ***
## Residuals 172   5509      32                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

進行獨立因子單純主要效果檢驗

fixT1.aov   <- aov(Mix_read_score ~ Mix_ethnic ,data=fixtime$reading_1)
summary(fixT1.aov)

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## Mix_ethnic    2   9026    4513   10.78 3.67e-05 ***
## Residuals   190  79523     419                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

fixT2.aov   <- aov(Mix_read_score ~ Mix_ethnic ,data=fixtime$reading_2)
summary(fixT2.aov)

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## Mix_ethnic    2   9143    4571   14.25 1.72e-06 ***
## Residuals   190  60970     321                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

fixT3.aov   <- aov(Mix_read_score ~ Mix_ethnic ,data=fixtime$ reading_3)
summary(fixT3.aov)

##              Df Sum Sq Mean Sq F value  Pr(>F)    
## Mix_ethnic    2  10758    5379   17.25 1.3e-07 ***
## Residuals   190  59233     312                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 利用pf( F值, df1=?, df2=?, lower.tail=FALSE)來計算F值的顯著性先整理一下
 相依因子的單純主要效果F值及其自由度：
      種族Orc    =98.61,   df1=2 df2=82
      種族Undead =165.97,  df1=2 df2=126
      種族Human  =237.38,  df1=2 df2=172
 獨立因子的單純主要效果F值及其自由度：
      reading_1 =25.75,    df1=2 df2=190
      reading_2 =26.09,    df1=2 df2=190
      reading_3 =30.7,     df1=2 df2=190

pf(98.61,   df1=2, df2=82,lower.tail=FALSE)

## [1] 1.522511e-22

pf(165.97,  df1=2, df2=126,lower.tail=FALSE)

## [1] 4.924973e-36

pf(237.38,  df1=2, df2=172,lower.tail=FALSE)

## [1] 3.400442e-50

pf(25.75,   df1=2, df2=190,lower.tail=FALSE)

## [1] 1.271933e-10

pf(26.09,   df1=2, df2=190,lower.tail=FALSE)

## [1] 9.737691e-11

pf(30.7,    df1=2, df2=190,lower.tail=FALSE)

## [1] 2.79828e-12

結論

 由上表可知，相依因子（不同時間下的閱讀成績），在獨立因子的三個水準下，都呈現統計顯著的結果：整體檢驗的相依因子結果顯示受測者內的三個重複量測反應時間之平均數達顯著差異（F（2,380）=500.905,p<0.001），顯示在不同的量測時間點上，學生們的平均閱讀成績確有不同；而單純主要效果顯示出無論種族為何，三個時段都呈現出均值不全相等的結果（種族Orc水準下：F（2,82）=98.61, p<0.001; 種族Undead水準下：F（2,126）=165.97, p<0.001, 種族Human水準下：F（2,172）=237.38, p<0.001）。至於獨立因子（種族），由整體檢驗可知種族因子（獨立樣本）也達顯著水準: F (2,190) =15.02, p<.001，顯示種族與閱讀成績間存在著顯著關係；而單純主要效果顯示出無論時段為何，種族與閱讀成績具有顯著關係（reading_1水準下：F（2,190）=25.75, p<0.001; reading_2水準下：F（2,190）=26.09, p<0.001, reading_3水準下：F（2,190）=30.7, p<0.001）。