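R package MatchIt: incorrect summary output

Asked: 2014-01-17 18:15:08

Tags: r matching summary

I am running coarsened exact matching (CEM) through the MatchIt package as a preprocessing step, and I want to use the matched data in further analysis. When looking at summary statistics of the matched data, I noticed that the means extracted from the matched dataset differ from those in the MatchIt summary output. For example, using the lalonde dataset: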

library(MatchIt)
library(doBy)
data(lalonde)

m.out <- matchit(treat ~ age + educ + black + hispan + married + nodegree + re74 + re75, data = lalonde, method = "cem")
summary(m.out)   #Means from MatchIt summary output:

Summary of balance for matched data: 

             Means Treated   Means Control 
 age         21.5441         21.1781 
 educ        10.2941         10.3827 
 black       0.8676          0.8676 
 hispan      0.0588          0.0588 
 married     0.0441          0.0441 
 nodegree    0.6176          0.6176 
 re74        456.1345        622.8740 
 re75        350.6728        520.7135 

m.dat<-match.data(m.out)
ExtractedMeans<-summaryBy(age+educ+black+hispan+married+nodegree+re74+re75 ~ treat, data = m.dat, FUN=function(x) { c(Mean=mean(x)) } )
ExtractedMeans   #Means extracted manually from matched data:

treat         1          0 
age.Mean      21.544    19.628 
educ.Mean     10.294     9.7179 
black.Mean    0.8676    0.60256 
hispan.Mean   0.0588    0.10256 
married.Mean  0.0441    0.07692 
nodegree.Mean 0.6176    0.75641 
re74.Mean     456.13    609.61 
re75.Mean     350.67    464.22 

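The control-group means extracted manually from the matched data do not agree with the MatchIt summary output. Does anyone know what is going on here? I posted this question to the MatchIt gmane mailing list last week but have not received a reply. Thanks for your help.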

1 Answer:

Answer 0: (score: 2)

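The 'summaryBy' call does not use the weights that match.data() returns in the weights column. If you multiply the variable you want to average by the weights, you get the same means the package reports. (This shortcut works here because the weights within each group average to 1; in general, divide the weighted sum by the sum of the weights.) For example, taking your code: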

> tapply(m.dat$age, m.dat$treat, mean)
       0        1 
19.62821 21.54412

> tapply(m.dat$age*m.dat$weights, m.dat$treat, mean)
       0        1 
21.17811 21.54412

So they match the MatchIt results.
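
As a hedged sketch of the same idea, base R's weighted.mean() divides by the sum of the weights, so it does not depend on how the weights happen to be scaled. The data frame m.toy below is a made-up stand-in for the output of match.data() (the values are hypothetical, not taken from lalonde); only the column names treat, age, and weights follow the question.

```r
# Toy stand-in for match.data() output (hypothetical values, not lalonde).
m.toy <- data.frame(
  treat   = c(1, 1, 0, 0, 0),
  age     = c(20, 30, 18, 22, 40),
  weights = c(1, 1, 1.5, 1.0, 0.5)  # control weights sum to 3, the number of controls
)

# Weighted mean of age within each treatment group.
groups <- split(m.toy, m.toy$treat)
weighted_means <- sapply(groups, function(g) weighted.mean(g$age, g$weights))
weighted_means  # control ("0") = 23, treated ("1") = 25

# The multiply-then-average shortcut agrees only while the weights average to 1.
ctrl <- groups[["0"]]
shortcut <- mean(ctrl$age * ctrl$weights)              # 23, same as above
rescaled <- mean(ctrl$age * (2 * ctrl$weights))        # 46, no longer a mean of age
stable   <- weighted.mean(ctrl$age, 2 * ctrl$weights)  # still 23
```

On the question's data, the equivalent call for the control group would be weighted.mean(m.dat$age[m.dat$treat == 0], m.dat$weights[m.dat$treat == 0]), assuming m.dat comes from match.data(m.out) as above.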