
How to rearrange rows alphabetically by distinct groups even though values are shared between groups?

  •  1
  • Chris Ruehlemann  ·  5 years ago

    I have a data frame:

    df
       speaker                                                             actionx phase
    31    <NA>                                                      are only four=  <NA>
    33   ID1-P                       ((m: r hand holds up three fingers ifo face))     B     # group 1
    35   ID1-G                       ((m: r hand holds up three fingers ifo face))     A     # group 1
    37   ID1-P                       ((m: r hand holds up three fingers ifo face))     D     # group 1
    39    <NA>                                                             (0.215)  <NA>
    41   ID2-A                                                               =mhm,  <NA>
    43    <NA>                                                             (0.270)  <NA>
    45   ID1-A So:: if you take a leave of absence we are going going to be three=  <NA>
    47   ID1-P                       ((m: r hand holds up three fingers ifo face))     E     # group 1
    49    <NA>                                                             (0.282)  <NA>
    74   ID2-A                                                 <no: yeah: it 's:>=  <NA>
    76   ID1-G                       ((m: r hand holds up three fingers ifo face))     A     # group 2
    78   ID1-P                       ((m: r hand holds up three fingers ifo face))     B     # group 2
    80   ID1-A                                                      =we are !four!  <NA>
    82   ID1-P                       ((m: r hand holds up three fingers ifo face))     C     # group 2
    84   ID1-P                       ((m: r hand holds up three fingers ifo face))     E     # group 2
    86    <NA>                                                             (0.031)  <NA>
    

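    I would like to reorder the rows so that the groups in column phase are sorted alphabetically and their rows sit next to each other. The groups can be identified by (i) the letters A through E and (ii) the fact that the values in column actionx are all the same.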

    Thanks to suggestions from SO members, I know how to do this when all groups have distinct actionx values between them, namely:

    df <- df[order(match(df$actionx, unique(df$actionx)), df$phase), ]
    # or:
    library(dplyr)
    df <- df %>% arrange(match(actionx, unique(actionx)), phase)
    

    Sometimes, however, groups share the same actionx value between them; for example, in df, group 1 and group 2 share the value ((m: r hand holds up three fingers ifo face)) in column actionx.
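
    To make the problem concrete, here is a minimal base-R sketch on a small toy data frame (the toy object and its values are made up for illustration and are not part of df). It suggests why the match(actionx, unique(actionx)) key cannot separate two groups that share a value: every row with that value receives the same key.

```r
# Toy data (hypothetical, not the df above): two groups share the value "x"
toy <- data.frame(
  actionx = c("x", "x", "pause", "x", "x"),
  phase   = c("B", "A", NA, "B", "A"),
  stringsAsFactors = FALSE
)

# Every "x" row gets the same key, whichever group it belongs to
key <- match(toy$actionx, unique(toy$actionx))
key
# 1 1 2 1 1

# Ordering by that key and then by phase merges the two groups:
# both "A" rows come first, then both "B" rows, then the pause
ord <- order(key, toy$phase)
toy[ord, ]
```

    The resulting row order is 2, 5, 1, 4, 3, so the second group is pulled up into the first instead of staying after the pause.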

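    How can the rows be rearranged alphabetically by group, despite the identical values in actionx, so as to obtain this expected result? (Note that each of A through E must not occur more than once per group.)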

    df[c(1,3,2,4,9,5:8,10:13,15:16,14,17),]
       speaker                                                             actionx phase
    31    <NA>                                                      are only four=  <NA>
    35   ID1-G                       ((m: r hand holds up three fingers ifo face))     A
    33   ID1-P                       ((m: r hand holds up three fingers ifo face))     B
    37   ID1-P                       ((m: r hand holds up three fingers ifo face))     D
    47   ID1-P                       ((m: r hand holds up three fingers ifo face))     E
    39    <NA>                                                             (0.215)  <NA>
    41   ID2-A                                                               =mhm,  <NA>
    43    <NA>                                                             (0.270)  <NA>
    45   ID1-A So:: if you take a leave of absence we are going going to be three=  <NA>
    49    <NA>                                                             (0.282)  <NA>
    74   ID2-A                                                 <no: yeah: it 's:>=  <NA>
    76   ID1-G                       ((m: r hand holds up three fingers ifo face))     A
    78   ID1-P                       ((m: r hand holds up three fingers ifo face))     B
    82   ID1-P                       ((m: r hand holds up three fingers ifo face))     C
    84   ID1-P                       ((m: r hand holds up three fingers ifo face))     E
    80   ID1-A                                                      =we are !four!  <NA>
    86    <NA>                                                             (0.031)  <NA>
    

    Reproducible data:

    # output of dput(t[c(17:26, 39:45), c(2,7,6)]), assigned to df
    df <- structure(list(speaker = c(NA, "ID1-P", "ID1-G", "ID1-P", NA, 
    "ID2-A", NA, "ID1-A", "ID1-P", NA, "ID2-A", "ID1-G", "ID1-P", 
    "ID1-A", "ID1-P", "ID1-P", NA), actionx = c("are only four=", 
    "((m: r hand holds up three fingers ifo face))", "((m: r hand holds up three fingers ifo face))", 
    "((m: r hand holds up three fingers ifo face))", "(0.215)", "=mhm,", 
    "(0.270)", "So:: if you take a leave of absence we are going going to be three=", 
    "((m: r hand holds up three fingers ifo face))", "(0.282)", "<no: yeah: it 's:>=", 
    "((m: r hand holds up three fingers ifo face))", "((m: r hand holds up three fingers ifo face))", 
    "=we are !four!", "((m: r hand holds up three fingers ifo face))", 
    "((m: r hand holds up three fingers ifo face))", "(0.031)"), 
        phase = c(NA, "B", "A", "D", NA, NA, NA, NA, "E", NA, NA, 
        "A", "B", NA, "C", "E", NA)), row.names = c(31L, 33L, 35L, 
    37L, 39L, 41L, 43L, 45L, 47L, 49L, 74L, 76L, 78L, 80L, 82L, 84L, 
    86L), class = "data.frame")
    
  •  1
  •   Darren Tsai    5 years ago

    You need two arrange() steps:

    library(dplyr)
    
    df %>%
      arrange(data.table::rleid(!is.na(phase)),
              phase) %>%
      arrange(cumsum(coalesce(phase == "A", FALSE)), ## if phase = "A", jump to the next group
              match(actionx, unique(actionx)))
    
    #    speaker                                                             actionx phase
    # 1     <NA>                                                      are only four=  <NA>
    # 2    ID1-G                       ((m: r hand holds up three fingers ifo face))     A
    # 3    ID1-P                       ((m: r hand holds up three fingers ifo face))     B
    # 4    ID1-P                       ((m: r hand holds up three fingers ifo face))     D
    # 5    ID1-P                       ((m: r hand holds up three fingers ifo face))     E
    # 6     <NA>                                                             (0.215)  <NA>
    # 7    ID2-A                                                               =mhm,  <NA>
    # 8     <NA>                                                             (0.270)  <NA>
    # 9    ID1-A So:: if you take a leave of absence we are going going to be three=  <NA>
    # 10    <NA>                                                             (0.282)  <NA>
    # 11   ID2-A                                                 <no: yeah: it 's:>=  <NA>
    # 12   ID1-G                       ((m: r hand holds up three fingers ifo face))     A
    # 13   ID1-P                       ((m: r hand holds up three fingers ifo face))     B
    # 14   ID1-P                       ((m: r hand holds up three fingers ifo face))     C
    # 15   ID1-P                       ((m: r hand holds up three fingers ifo face))     E
    # 16   ID1-A                                                      =we are !four!  <NA>
    # 17    <NA>                                                             (0.031)  <NA>
    

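    Note: Do not merge the two arrange() calls into one, because the second arrange() relies on the new order of phase produced by the first.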
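
    The note above can be illustrated with a minimal base-R sketch on a toy phase vector (hypothetical values, not taken from df). As assumptions for illustration, rle() stands in for data.table::rleid() and order() stands in for arrange(); group_id is a helper name introduced here, not part of the answer's code.

```r
# Toy phase column (hypothetical): "B" precedes "A" inside each run
phase <- c(NA, "B", "A", NA, "B", "A")

# Group counter that increments at every "A"; NA entries count as FALSE
group_id <- function(p) cumsum(!is.na(p) & p == "A")

# Computed on the original order, the first "B" is assigned to group 0
before <- group_id(phase)
before
# 0 0 1 1 1 2

# Base-R stand-in for data.table::rleid(!is.na(phase)): number consecutive runs
r <- rle(!is.na(phase))
run <- rep(seq_along(r$lengths), r$lengths)

# First step: sort by phase within each run, so "A" leads its run
sorted <- phase[order(run, phase)]
sorted
# NA "A" "B" NA "A" "B"

# Computed on the sorted order, each "B" falls in the same group as its "A"
after <- group_id(sorted)
after
# 0 1 1 1 2 2
```

    If both sort keys were passed to a single arrange(), the group counter would be evaluated on the rows as they arrive (the first case), so a "B" that precedes its "A" would be counted with the previous group.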