
Merging data frames with merge() and blanks: formatting issue

  • LLL · Tech Community · 6 years ago

    I want to add columns from one data frame to another, from dfadd to dfmaster, while keeping the row order of dfmaster.

    I tried merge(), but it changes the row order of dfmaster. The order is essential. Is there a data.table or tidyverse way to handle this?

    Thanks!

    # Data
    dfmaster <- data.frame(variable_name=c("Blood_sugar","","","Blood_pressure","","Pulse",""),variable_level=c("high","medium","low","high","low","high","low"),variable_defin=c("baseline, lab","","","baseline, measured","","baseline, measured",""))
    dfadd <- data.frame(variable_name=c("Blood_sugar","Blood_pressure","Pulse","Breakfast","Rest"),centre_names1=c("ST","FD","","QW",""),centre_names2=c("","HF","","",""),centre_names3=c("","LD","","",""),one_or_more=c("one","more","","one",""))
    
    
    # Goal 
    dfgoal <- data.frame(variable_name=c("Blood_sugar","","","Blood_pressure","","Pulse",""),variable_level=c("high","medium","low","high","low","high","low"),variable_defin=c("baseline, lab","","","baseline, measured","","baseline, measured",""),centre_names1=c("ST","","","FD","","",""),centre_names2=c("","","","FD","","",""),centre_names3=c("","","","LD","","",""),one_or_more=c("more","","","more","","",""))
    
    
    # Attempt 
    dfmaster <- merge(dfmaster, dfadd, by = "variable_name", all.x = TRUE)
    
    1 answer  |  as of 6 years ago
  • Wimpel · 6 years ago

    It looks like your dfgoal ... for Blood_sugar, one_or_more should be "one" (according to dfadd), not "more".

    Please check whether the code below answers your question.

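    library(dplyr)
    dfmaster %>% 
      # perform a left join, which keeps the row order of dfmaster
      left_join(dfadd, by = "variable_name") %>% 
      # turn empty factor levels into NA (remove if not desired);
      # on R >= 4.0 data.frame() creates character columns by default,
      # so this step only applies if the columns are factors
      mutate_if(is.factor, ~ factor(replace(., . == "", NA)))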
    
    #    variable_name variable_level     variable_defin centre_names1 centre_names2 centre_names3 one_or_more
    # 1    Blood_sugar           high      baseline, lab            ST          <NA>          <NA>         one
    # 2                        medium               <NA>          <NA>          <NA>          <NA>        <NA>
    # 3                           low               <NA>          <NA>          <NA>          <NA>        <NA>
    # 4 Blood_pressure           high baseline, measured            FD            HF            LD        more
    # 5                           low               <NA>          <NA>          <NA>          <NA>        <NA>
    # 6          Pulse           high baseline, measured          <NA>          <NA>          <NA>        <NA>
    # 7                           low               <NA>          <NA>          <NA>          <NA>        <NA>
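
For readers who would rather stay with base R's merge(), one possible workaround is to record the original row position before merging and sort on it afterwards. This is only a sketch under a few assumptions: `row_id` and `merged` are hypothetical names introduced here, the columns are plain character vectors (`stringsAsFactors = FALSE`), and unmatched cells are filled with "" to mirror the blanks in dfgoal.

```r
# Example data copied from the question (as character columns).
dfmaster <- data.frame(
  variable_name  = c("Blood_sugar", "", "", "Blood_pressure", "", "Pulse", ""),
  variable_level = c("high", "medium", "low", "high", "low", "high", "low"),
  variable_defin = c("baseline, lab", "", "", "baseline, measured", "",
                     "baseline, measured", ""),
  stringsAsFactors = FALSE
)
dfadd <- data.frame(
  variable_name = c("Blood_sugar", "Blood_pressure", "Pulse", "Breakfast", "Rest"),
  centre_names1 = c("ST", "FD", "", "QW", ""),
  centre_names2 = c("", "HF", "", "", ""),
  centre_names3 = c("", "LD", "", "", ""),
  one_or_more   = c("one", "more", "", "one", ""),
  stringsAsFactors = FALSE
)

# Record the original row position (row_id is a hypothetical helper column).
dfmaster$row_id <- seq_len(nrow(dfmaster))

# merge() sorts on the key by default, so the row order changes here.
merged <- merge(dfmaster, dfadd, by = "variable_name", all.x = TRUE)

# Restore the original order, then drop the helper column.
merged <- merged[order(merged$row_id), ]
merged$row_id <- NULL
rownames(merged) <- NULL

# Rows without a match in dfadd hold NA; use "" to mirror dfgoal's blanks.
merged[is.na(merged)] <- ""
```

The helper column is what makes this safe: after the merge, sorting on it restores the order dfmaster had, regardless of how merge() arranged the rows.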