How to merge data frames with the same column names

Time: 2014-03-21 09:47:45

Tags: r merge

I have a data frame that looks like this:

structure(list(Variables = structure(list(ADA = "ADA", LEAD = "LEAD", 
    BIG4 = "BIG4", LOGMKT = "LOGMKT", LEV = "LEV", ROA = "ROA", 
    ROAL = "ROAL", LOSS = "LOSS", CFO = "CFO", BTM = "BTM", ABSACCRL = "ABSACCRL", 
    GROWTH = "GROWTH", ALTMAN = "ALTMAN", STDEARN = "STDEARN", 
    TENURE = "TENURE", LOGASSETS = "LOGASSETS"), .Names = c("ADA", 
"LEAD", "BIG4", "LOGMKT", "LEV", "ROA", "ROAL", "LOSS", "CFO", 
"BTM", "ABSACCRL", "GROWTH", "ALTMAN", "STDEARN", "TENURE", "LOGASSETS"
)), Mean = structure(list(ADA = 0.061, LEAD = 0.348, BIG4 = 0.7, 
    LOGMKT = 4.893, LEV = 0.512, ROA = -0.061, ROAL = -0.058, 
    LOSS = 0.41, CFO = 0.026, BTM = 0.96, ABSACCRL = 0.107, GROWTH = 0.14, 
    ALTMAN = 2.031, STDEARN = 45.116, TENURE = 0.841, LOGASSETS = 5.433), .Names = c("ADA", 
"LEAD", "BIG4", "LOGMKT", "LEV", "ROA", "ROAL", "LOSS", "CFO", 
"BTM", "ABSACCRL", "GROWTH", "ALTMAN", "STDEARN", "TENURE", "LOGASSETS"
)), SD = structure(list(ADA = 0.062, LEAD = 0.476, BIG4 = 0.458, 
    LOGMKT = 2.245, LEV = 0.307, ROA = 0.251, ROAL = 0.254, LOSS = 0.492, 
    CFO = 0.192, BTM = 1.553, ABSACCRL = 0.124, GROWTH = 0.431, 
    ALTMAN = 5.155, STDEARN = 100.082, TENURE = 0.365, LOGASSETS = 2.049), .Names = c("ADA", 
"LEAD", "BIG4", "LOGMKT", "LEV", "ROA", "ROAL", "LOSS", "CFO", 
"BTM", "ABSACCRL", "GROWTH", "ALTMAN", "STDEARN", "TENURE", "LOGASSETS"
)), Median = structure(list(ADA = 0.042, LEAD = 0, BIG4 = 1, 
    LOGMKT = 4.986, LEV = 0.476, ROA = 0.021, ROAL = 0.022, LOSS = 0, 
    CFO = 0.069, BTM = 0.754, ABSACCRL = 0.07, GROWTH = 0.073, 
    ALTMAN = 2.404, STDEARN = 11.078, TENURE = 1, LOGASSETS = 5.448), .Names = c("ADA", 
"LEAD", "BIG4", "LOGMKT", "LEV", "ROA", "ROAL", "LOSS", "CFO", 
"BTM", "ABSACCRL", "GROWTH", "ALTMAN", "STDEARN", "TENURE", "LOGASSETS"
))), .Names = c("Variables", "Mean", "SD", "Median"), row.names = c(NA, 
-16L), class = "data.frame")

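I have another table just like this one. For the purposes of experimenting, you can assume it is the same data frame. I want to merge these two data frames by Variables (they have the same column names, of course).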

merge(a_cada,a_cada, by = c("Variables"))

Error in sort.list(bx[m$xi]) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

When I merge the two data frames, I get the error shown above. Can anyone tell me how to solve this?

1 answer:

Answer 0: (score: 2)

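The error occurs because your data.frame is not a regular data.frame: each column is a list rather than an atomic vector, so it is effectively a list of lists.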

str(a_cada, max.level = 1)
## 'data.frame':    16 obs. of  4 variables:
##  $ Variables:List of 16
##  $ Mean     :List of 16
##  $ SD       :List of 16
##  $ Median   :List of 16
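
As a quick check for this situation, the following is a minimal sketch that flags which columns of a data frame are lists. The object `small` is a hypothetical three-row stand-in for `a_cada`, built the same way as the `structure()` call in the question; it is not part of the original post.

```r
# Hypothetical 3-row stand-in for a_cada: every column is a list,
# not an atomic vector (assumption made for illustration only).
small <- structure(
  list(
    Variables = list(ADA = "ADA", LEAD = "LEAD", BIG4 = "BIG4"),
    Mean      = list(ADA = 0.061, LEAD = 0.348, BIG4 = 0.7)
  ),
  row.names = c(NA, -3L),
  class = "data.frame"
)

# TRUE marks a column stored as a list; such a key column is what
# makes the sort step inside merge() complain about non-atomic input.
list_cols <- vapply(small, is.list, logical(1))
print(list_cols)
```

Any column reported as TRUE needs to be flattened to an atomic vector before it can serve as a merge key.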

Try the following to convert it to a regular data.frame, and then call merge:

DF <- as.data.frame(lapply(a_cada, function(X) unname(unlist(X))))
merge(DF, DF, by = c("Variables"))
##    Variables Mean.x    SD.x Median.x Mean.y    SD.y Median.y
## 1   ABSACCRL  0.107   0.124    0.070  0.107   0.124    0.070
## 2        ADA  0.061   0.062    0.042  0.061   0.062    0.042
## 3     ALTMAN  2.031   5.155    2.404  2.031   5.155    2.404
## 4       BIG4  0.700   0.458    1.000  0.700   0.458    1.000
## 5        BTM  0.960   1.553    0.754  0.960   1.553    0.754
## 6        CFO  0.026   0.192    0.069  0.026   0.192    0.069
## 7     GROWTH  0.140   0.431    0.073  0.140   0.431    0.073
## 8       LEAD  0.348   0.476    0.000  0.348   0.476    0.000
## 9        LEV  0.512   0.307    0.476  0.512   0.307    0.476
## 10 LOGASSETS  5.433   2.049    5.448  5.433   2.049    5.448
## 11    LOGMKT  4.893   2.245    4.986  4.893   2.245    4.986
## 12      LOSS  0.410   0.492    0.000  0.410   0.492    0.000
## 13       ROA -0.061   0.251    0.021 -0.061   0.251    0.021
## 14      ROAL -0.058   0.254    0.022 -0.058   0.254    0.022
## 15   STDEARN 45.116 100.082   11.078 45.116 100.082   11.078
## 16    TENURE  0.841   0.365    1.000  0.841   0.365    1.000

The above works because DF is now a proper data.frame:

str(DF)
## 'data.frame':    16 obs. of  4 variables:
##  $ Variables: Factor w/ 16 levels "ABSACCRL","ADA",..: 2 8 4 11 9 13 14 12 6 5 ...
##  $ Mean     : num  0.061 0.348 0.7 4.893 0.512 ...
##  $ SD       : num  0.062 0.476 0.458 2.245 0.307 ...
##  $ Median   : num  0.042 0 1 4.986 0.476 ...
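
The `str()` output above shows Variables as a Factor, which matches R versions before 4.0.0, where `as.data.frame()` converted strings to factors by default. From R 4.0.0 on, the default is `stringsAsFactors = FALSE`, so the same call yields a character column; the merge works either way. The sketch below repeats the conversion on a hypothetical three-row stand-in (`small`, not from the original post), sets `stringsAsFactors` explicitly, and uses the `suffixes` argument of `merge()` to choose clearer names for the duplicated columns. The suffix strings are arbitrary choices for illustration.

```r
# Hypothetical 3-row stand-in for a_cada with list columns
# (assumption made for illustration only).
small <- structure(
  list(
    Variables = list(ADA = "ADA", LEAD = "LEAD", BIG4 = "BIG4"),
    Mean      = list(ADA = 0.061, LEAD = 0.348, BIG4 = 0.7)
  ),
  row.names = c(NA, -3L),
  class = "data.frame"
)

# Flatten each list column to an unnamed atomic vector, as in the answer,
# keeping strings as character regardless of the R version's default.
DF <- as.data.frame(
  lapply(small, function(col) unname(unlist(col))),
  stringsAsFactors = FALSE
)

# Merge on the key column; suffixes replace the default ".x" / ".y".
merged <- merge(DF, DF, by = "Variables", suffixes = c(".left", ".right"))
print(merged)
```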