
Reshaping k columns into 2 columns representing consecutive pairs of the values of the k variables

  •  9
  • Raul Torres  · 8 years ago

    id|y1|y2|y3|y4
    --+--+--+--+--
    a |12|13|14|
    b |12|18|  |
    c |13|  |  |
    d |13|14|15|16
    

    id|from|to
    --+----+--
    a |12  |13
    a |13  |14
    a |14  |
    b |12  |18
    b |18  |
    c |13  |
    d |13  |14
    d |14  |15
    d |15  |16
    

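    The second table should hold, for each id, every consecutive pair of values from the first table.
    Does anyone know a simple way to do this? I tried using reshape2, and I also looked at "Combine Multiple Columns Into Tidy Data".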

    3 answers  |  as of 8 years ago
        1
  •  5
  •   HubertL    8 years ago

    You can use lapply to loop over the pairs of adjacent columns and rbind the results:

    do.call(rbind,
            lapply(2:(length(df)-1), 
                   function(x) setNames(df[!is.na(df[,x]),c(1,x,x+1)], 
                                        c("id", "from", "to"))))
    #    id from to
    # 1   a   12 13
    # 2   b   12 18
    # 3   c   13 NA
    # 4   d   13 14
    # 11  a   13 14
    # 21  b   18 NA
    # 41  d   14 15
    # 12  a   14 NA
    # 42  d   15 16
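
    The snippet above assumes a data frame named df already exists, and its result is ordered by column pair rather than by id. A minimal self-contained sketch of the same idea follows, assuming df holds the question's data (built here the same way as the dt example in the next answer). The names pairs and res are illustrative only.

```r
# Example data from the question; df is the name the answer above uses
df <- data.frame(id = c("a", "b", "c", "d"),
                 y1 = c(12, 12, 13, 13),
                 y2 = c(13, 18, NA, 14),
                 y3 = c(14, NA, NA, 15),
                 y4 = c(NA, NA, NA, 16),
                 stringsAsFactors = FALSE)

# For every value column except the last, pair it with the column to its right
pairs <- lapply(2:(ncol(df) - 1), function(x) {
  keep <- !is.na(df[[x]])   # skip rows that have no "from" value
  setNames(df[keep, c(1, x, x + 1)], c("id", "from", "to"))
})

res <- do.call(rbind, pairs)
res <- res[order(res$id), ]   # order() keeps ties in their original order
rownames(res) <- NULL         # discard the row names generated by rbind
res
```

    Sorting by id afterwards should give the nine rows in the order requested in the question, from a 12 13 through d 15 16.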
    
        2
  •  5
  •   www    8 years ago

    A solution using dplyr and tidyr. dt2 is the final output.

    # Create example data frame
    dt <- data.frame(id = c("a", "b", "c", "d"),
                     y1 = c(12, 12, 13, 13),
                     y2 = c(13, 18, NA, 14),
                     y3 = c(14, NA, NA, 15),
                     y4 = c(NA, NA, NA, 16),
                     stringsAsFactors = FALSE)
    
    # Load packages
    library(dplyr)
    library(tidyr)
    
    # Process the data
    dt2 <- dt %>%
      gather(STEP, from, -id) %>%
      drop_na(from) %>%
      arrange(id, STEP) %>%
      group_by(id) %>%
      mutate(to = lead(from)) %>%
      select(-STEP)
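
    In current tidyr, gather() is superseded by pivot_longer(). The following is a hedged sketch of the same pipeline written with pivot_longer(), assuming tidyr 1.0.0 or later is installed. It is an alternative spelling, not the answer's own code, and the ungroup() call is an addition so that the result is not left grouped.

```r
library(dplyr)
library(tidyr)

# Same example data as above
dt <- data.frame(id = c("a", "b", "c", "d"),
                 y1 = c(12, 12, 13, 13),
                 y2 = c(13, 18, NA, 14),
                 y3 = c(14, NA, NA, 15),
                 y4 = c(NA, NA, NA, 16),
                 stringsAsFactors = FALSE)

dt2 <- dt %>%
  # Long format, one row per non-missing value, in row-then-column order
  pivot_longer(cols = -id, names_to = "STEP", values_to = "from",
               values_drop_na = TRUE) %>%
  group_by(id) %>%
  mutate(to = lead(from)) %>%   # next value within the same id
  ungroup() %>%
  select(-STEP)

dt2
```

    Because pivot_longer() emits rows in the original row and column order, no arrange() step is needed, which also avoids sorting a column name such as y10 ahead of y2. Like the pipeline above, this keeps a final row for each id's last value, so d ends with a 16, NA row.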
    
        3
  •  4
  •   thelatemail    8 years ago

    In base R, stack the value columns of dt and then take the next value within each id:

    tmp <- na.omit(cbind(dt[1], stack(dt[-1])[-2]))
    names(tmp)[2] <- "from"
    tmp$to <- with(tmp, ave(from, id, FUN=function(x) c(tail(x,-1),NA) ))
    tmp[order(tmp$id),]
    
    #   id from to
    #1   a   12 13
    #5   a   13 14
    #9   a   14 NA
    #2   b   12 18
    #6   b   18 NA
    #3   c   13 NA
    #4   d   13 14
    #8   d   14 15
    #12  d   15 16
    #16  d   16 NA
    

    In the data.table world, melt and then shift with by=:

    library(data.table)
    dt <- as.data.table(dt)
    
    melt(dt, id.vars="id", value.name="from")[
      !is.na(from),-"variable"][, to := shift(from,1,type="lead"), by=id
    ][order(id)]
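
    Both variants in this answer keep a row for each id's last non-missing value even when it sits in the final column, which is why d 16 NA appears in the printed result while the output requested in the question stops at d 15 16. If the requested shape is needed, one possible adjustment, sketched here in base R, is to keep the source column label that stack() returns and drop the rows that come from the last column. It assumes, as in the example, that values have no gaps within a row. The names long, last_col and res are illustrative only.

```r
# Example data, same values as the dt defined in the dplyr answer
dt <- data.frame(id = c("a", "b", "c", "d"),
                 y1 = c(12, 12, 13, 13),
                 y2 = c(13, 18, NA, 14),
                 y3 = c(14, NA, NA, 15),
                 y4 = c(NA, NA, NA, 16),
                 stringsAsFactors = FALSE)

# stack() returns "values" and "ind" (the source column name);
# id is recycled across the stacked columns
long <- cbind(dt[1], stack(dt[-1]))
long <- long[!is.na(long$values), ]
names(long)[names(long) == "values"] <- "from"

# Within each id, "to" is the following value (NA after the last one)
long$to <- ave(long$from, long$id, FUN = function(x) c(x[-1], NA))

# A "from" taken from the final column has no partner column, so drop it
last_col <- names(dt)[ncol(dt)]
res <- long[as.character(long$ind) != last_col, c("id", "from", "to")]
res <- res[order(res$id), ]
rownames(res) <- NULL
res
```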