
How to do conditional selection with dplyr in R?

  •  1
  • Patrick Balada  · Tech Community  · 6 years ago

    I have the following situation. Given the table

    df <- data.frame(ID = c(1, 2, 2, 3, 3, 4),
                 type = c("MC", "MC", "MK", "MC", "MK", "MC"),
                 value1 = c(512, 261, 4523, 1004, 1221, 2556),
                 value2 = c(726, 4000, 280, 998, 113, 6789))
    

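    I am trying to find a way to implement the following logic: if both types (MC and MK) occur for an ID, use value1 from the MK row and value2 from the MC row. Otherwise (only type MC occurs), use both values from the MC row.

    So the final result should be: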

    data.frame(ID = c(1, 2, 3, 4),
                 type = c("MC", "MC", "MC", "MC"),
                 value1 = c(512, 4523, 1221, 2556),
                 value2 = c(726, 4000, 998, 6789))
    

    3 replies  |  as of 6 years ago
        1
  •  1
  •   nghauran    6 years ago

    For efficiency I would certainly prefer @Andre Elrico's answer, but here is a dplyr option. Try:

    df <- data.frame(ID = c(1, 2, 2, 3, 3, 4),
                     type = c("MC", "MC", "MK", "MC", "MK", "MC"),
                     value1 = c(512, 261, 4523, 1004, 1221, 2556),
                     value2 = c(726, 4000, 280, 998, 113, 6789)) 
    library(dplyr)
    df %>%
      reshape(., idvar = "ID", timevar = "type", direction = "wide") %>%
      group_by(ID) %>%
      mutate(value1 = ifelse(is.na(value1.MK), value1.MC, value1.MK),
             value2 = ifelse(is.na(value2.MC), value2.MK, value2.MC),
             type = "MC") %>%
      select(ID, type, value1, value2)
    # output
    # A tibble: 4 x 4
    # Groups:   ID [4]
    #      ID  type value1 value2
    #   <dbl> <chr>  <dbl>  <dbl>
    # 1     1    MC    512    726
    # 2     2    MC   4523   4000
    # 3     3    MC   1221    998
    # 4     4    MC   2556   6789
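
    The same wide-then-pick idea can also be sketched with tidyr instead of base reshape(). This is only an illustration, not part of the original answer: it assumes tidyr (version 1.0.0 or later, for pivot_wider()) is installed and that each ID has at most one row per type. The name res_wide is made up for the example.

```r
library(dplyr)
library(tidyr)

df <- data.frame(ID = c(1, 2, 2, 3, 3, 4),
                 type = c("MC", "MC", "MK", "MC", "MK", "MC"),
                 value1 = c(512, 261, 4523, 1004, 1221, 2556),
                 value2 = c(726, 4000, 280, 998, 113, 6789))

res_wide <- df %>%
  # one row per ID; columns become value1_MC, value1_MK, value2_MC, value2_MK
  pivot_wider(names_from = type, values_from = c(value1, value2)) %>%
  # coalesce() takes the first non-missing value, element by element
  mutate(type = "MC",
         value1 = coalesce(value1_MK, value1_MC),
         value2 = coalesce(value2_MC, value2_MK)) %>%
  select(ID, type, value1, value2)

res_wide
```

    Here coalesce() plays the role of the ifelse(is.na(...), ...) calls above: it prefers the MK column for value1 and the MC column for value2, and falls back to the other column when the preferred one is missing.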
    
        2
  •  2
  •   Ronak Shah    6 years ago

    dplyr

    library(dplyr)
    
    df %>%
      group_by(ID) %>%
      mutate(value1 = ifelse(any(type == "MK"), value1[type=="MK"],value1[type=="MC"]), 
             value2 = value2[type == "MC"]) %>%
      filter(type == "MC")
    
    #     ID type  value1 value2
    #  <dbl> <fct>  <dbl>  <dbl>
    #1     1 MC       512    726
    #2     2 MC      4523   4000
    #3     3 MC      1221    998
    #4     4 MC      2556   6789
    

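    For value1, we check whether an "MK" value is present and take it if it exists, otherwise we take the corresponding "MC" value. For value2, we take the "MC" value by default, and we keep only the rows with type "MC". This assumes that every group (ID) has an "MC" type row.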
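
    The logic described above can also be written with summarise() and a plain if/else, which returns one row per ID directly. This is a sketch under the same assumption (every ID has an "MC" row, and at most one "MK" row), not the answer's original code. The name res_sum is made up for the example.

```r
library(dplyr)

df <- data.frame(ID = c(1, 2, 2, 3, 3, 4),
                 type = c("MC", "MC", "MK", "MC", "MK", "MC"),
                 value1 = c(512, 261, 4523, 1004, 1221, 2556),
                 value2 = c(726, 4000, 280, 998, 113, 6789))

res_sum <- df %>%
  group_by(ID) %>%
  summarise(
    # take value1 from the MK row if the group has one, else from the MC row
    value1 = if (any(type == "MK")) value1[type == "MK"][1] else value1[type == "MC"][1],
    # value2 always comes from the MC row
    value2 = value2[type == "MC"][1],
    # set type last, because the expressions above still need the original type column
    type = "MC"
  ) %>%
  select(ID, type, value1, value2)

res_sum
```

    Because the condition any(type == "MK") is a single TRUE or FALSE per group, if/else states the branching directly, whereas ifelse() is vectorised and returns a result shaped like its test.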

        3
  •  1
  •   Andre Elrico    6 years ago

    data.table solution

    library(data.table)  # df is the data frame from the question
    setDT(df)[,{x=.SD;if(all(c("MC","MK") %in% type)){x$value1[] = last(value1)};first(x)},by=ID]
    

    Result:

    #  ID type value1 value2
    #1  1   MC    512    726
    #2  2   MC   4523   4000
    #3  3   MC   1221    998
    #4  4   MC   2556   6789
    

    dplyr

    library(dplyr)  # df is the data frame from the question
    df %>% group_by(ID) %>% do(.,(function(x){if(all(c("MC","MK") %in% x$type)){x$value1[] = x$value1[x$type=="MK"]};x[1,]})(.))
    
    # A tibble: 4 x 4
    # Groups:   ID [4]
    #     ID type  value1 value2
    #  <dbl> <fct>  <dbl>  <dbl>
    #1     1 MC       512    726
    #2     2 MC      4523   4000
    #3     3 MC      1221    998
    #4     4 MC      2556   6789