
How can I merge multiple data.frames with Reduce and get ordered output?

  • jay.sf  ·  7 years ago

    There is a classic method for how to simultaneously merge multiple data.frames in a list.

    However, the row order of the output is somewhat scrambled.

    Example

    > L
    [[1]]
      a b c  d e
    1 5 2 4 10 1
    
    [[2]]
      a b c d e
    1 6 7 4 6 1
    
    [[3]]
      a b c d
    1 7 3 5 5
    
    [[4]]
      a b c d
    1 5 2 6 5
    
    [[5]]
      a b c d
    1 4 4 2 8
    

    The rows of the Reduce(.) output appear in the order 5, 1, 4, 2, 3, which might suggest that the reduction somehow works from the outside in.

    > Reduce(function(...) merge(..., all=TRUE), L)
    > Reduce(function(x, y) merge(x, y, all=TRUE, by=intersect(names(x), names(y))), L)  # same
      a b c  d  e
    1 4 4 2  8 NA
    2 5 2 4 10  1
    3 5 2 6  5 NA
    4 6 7 4  6  1
    5 7 3 5  5 NA
    

    In any case, is there a way to change the code slightly so that it produces ordered output like the following?

    #   a b c  d  e
    # 1 5 2 4 10  1
    # 2 6 7 4  6  1
    # 3 7 3 5  5 NA
    # 4 5 2 6  5 NA
    # 5 4 4 2  8 NA
    

    Data

    L <- list(structure(list(a = 5L, b = 2L, c = 4L, d = 10L, e = 1L), class = "data.frame", row.names = c(NA, 
    -1L)), structure(list(a = 6L, b = 7L, c = 4L, d = 6L, e = 1L), class = "data.frame", row.names = c(NA, 
    -1L)), structure(list(a = 7L, b = 3L, c = 5L, d = 5L), class = "data.frame", row.names = c(NA, 
    -1L)), structure(list(a = 5L, b = 2L, c = 6L, d = 5L), class = "data.frame", row.names = c(NA, 
    -1L)), structure(list(a = 4L, b = 4L, c = 2L, d = 8L), class = "data.frame", row.names = c(NA, 
    -1L)))
    
    2 Answers

  • Julius Vainora  ·  7 years ago

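    This is because of the sort argument of merge, which defaults to TRUE:

    sort: logical. Should the result be sorted on the by columns?

    So the rows are not reduced from the outside in; they are simply sorted on the shared columns. To keep the input order, you can use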

    Reduce(function(...) merge(..., all = TRUE, sort = FALSE), L)
    #   a b c  d  e
    # 1 5 2 4 10  1
    # 2 6 7 4  6  1
    # 3 7 3 5  5 NA
    # 4 5 2 6  5 NA
    # 5 4 4 2  8 NA
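
    To make the effect of sort visible, here is a minimal sketch that rebuilds the question's list compactly with data.frame() (an assumption, equivalent to the structure() calls in the question) and compares both settings. The names sorted_res, unsorted_res and resorted are only illustrative. Note that the merge documentation describes the row order under sort = FALSE as unspecified, so keeping list order is what happens for this data rather than a documented guarantee.

```r
# The five one-row data frames from the question, written compactly.
L <- list(
  data.frame(a = 5L, b = 2L, c = 4L, d = 10L, e = 1L),
  data.frame(a = 6L, b = 7L, c = 4L, d = 6L,  e = 1L),
  data.frame(a = 7L, b = 3L, c = 5L, d = 5L),
  data.frame(a = 5L, b = 2L, c = 6L, d = 5L),
  data.frame(a = 4L, b = 4L, c = 2L, d = 8L)
)

# Default (sort = TRUE): rows come back sorted on the shared "by" columns.
sorted_res <- Reduce(function(x, y) merge(x, y, all = TRUE), L)

# sort = FALSE: for this data the rows stay in list order.
unsorted_res <- Reduce(
  function(x, y) merge(x, y, all = TRUE, sort = FALSE),
  L
)

# Sorting the unsorted result on a, b, c, d reproduces the default ordering,
# which explains the 5, 1, 4, 2, 3 pattern seen in the question.
resorted <- unsorted_res[with(unsorted_res, order(a, b, c, d)), ]

print(sorted_res$a)    # a column under the default
print(unsorted_res$a)  # a column with sort = FALSE
```

    If a specific order matters beyond this example, carrying an explicit index column through the merge and ordering on it afterwards would be a more defensive choice.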
    
  • Dan  ·  7 years ago

    Here I use bind_rows from the dplyr package instead of merge.

    L <- list(structure(list(a = 5L, b = 2L, c = 4L, d = 10L, e = 1L), class = "data.frame", row.names = c(NA, 
              -1L)), structure(list(a = 6L, b = 7L, c = 4L, d = 6L, e = 1L), class = "data.frame", row.names = c(NA, 
              -1L)), structure(list(a = 7L, b = 3L, c = 5L, d = 5L), class = "data.frame", row.names = c(NA, 
              -1L)), structure(list(a = 5L, b = 2L, c = 6L, d = 5L), class = "data.frame", row.names = c(NA, 
              -1L)), structure(list(a = 4L, b = 4L, c = 2L, d = 8L), class = "data.frame", row.names = c(NA, 
              -1L)))
    
    library(dplyr)
    
    Reduce(bind_rows, L) 
    #>   a b c  d  e
    #> 1 5 2 4 10  1
    #> 2 6 7 4  6  1
    #> 3 7 3 5  5 NA
    #> 4 5 2 6  5 NA
    #> 5 4 4 2  8 NA
    

    Created on 2019-02-09 by the reprex package (v0.2.1.9000)
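
    For readers without dplyr, the same row-stacking idea can be sketched in base R. This is not part of either answer: fill_missing is a hypothetical helper introduced only for illustration, and the list is again rebuilt with data.frame() for brevity. The last lines also show one way stacking differs from merge: rows that agree on every shared column are collapsed by merge(all = TRUE) but kept by row-binding.

```r
# The five one-row data frames from the question, written compactly.
L <- list(
  data.frame(a = 5L, b = 2L, c = 4L, d = 10L, e = 1L),
  data.frame(a = 6L, b = 7L, c = 4L, d = 6L,  e = 1L),
  data.frame(a = 7L, b = 3L, c = 5L, d = 5L),
  data.frame(a = 5L, b = 2L, c = 6L, d = 5L),
  data.frame(a = 4L, b = 4L, c = 2L, d = 8L)
)

# Union of all column names, in order of first appearance.
all_cols <- unique(unlist(lapply(L, names)))

# Hypothetical helper: add any missing columns as NA, then align column order.
fill_missing <- function(df, cols) {
  for (nm in setdiff(cols, names(df))) df[[nm]] <- NA
  df[cols]
}

# Stack the aligned data frames; rows keep the order of the list.
stacked <- do.call(rbind, lapply(L, fill_missing, cols = all_cols))
print(stacked)

# Stacking and merging are not interchangeable when rows repeat.
dup <- data.frame(a = 1L, b = 2L)
n_merged <- nrow(merge(dup, dup, all = TRUE))  # identical rows are matched
n_bound  <- nrow(rbind(dup, dup))              # identical rows are both kept
```

    Which behaviour is wanted depends on whether repeated rows should count as the same observation.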
