
Insert blank rows between groups while keeping the original order

Anonymous coward · Tech community · 6 years ago

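I am basically doing what this question describes, but I am struggling to keep the month column in its original order. A roundabout fix would be to add a leading zero to the single-digit values. I would like to find a way that avoids doing that.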

Current code:

library(dplyr)
df <- structure(list(parameters = c("temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp"), month = c("2", "2", "2", "5", "5", "5", "8", "8", "8", "11", "11", "11", "annual", "annual", "annual")), class = "data.frame", row.names = c(NA, -15L))
do.call(bind_rows, by(df, df[ ,c("month", "parameters")], rbind, ""))

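The by function appears to coerce the indices you define to factors, and converting month to a factor puts the levels in this order: 11, 2, 5, 8, annual. If the column were numeric it would sort correctly, but because annual is included, the column has to be character.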
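The ordering described above can be reproduced with a plain factor() call, which is roughly what happens to the index columns. This is only a minimal sketch: `month_chr` is an illustrative name that does not appear in the question, and the exact sort order of character levels can depend on the locale.

```r
# Illustrative vector mirroring the distinct values of the month column
# (the name month_chr is hypothetical)
month_chr <- c("2", "5", "8", "11", "annual")

# Default behaviour: levels are the unique values sorted as character strings,
# so "11" is placed before "2"
levels(factor(month_chr))

# Supplying the levels explicitly keeps the order of first appearance
levels(factor(month_chr, levels = unique(month_chr)))
```

Setting the levels explicitly controls the order, but it also restricts which values the column may hold afterwards.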

If I first convert it to a factor and order the levels, my code inserts NAs.

df$month <- ordered(df$month, levels = c("2", "5", "8", "11", "annual"))
do.call(bind_rows, by(df, df[ ,c("month", "parameters")], rbind, ""))

Current output:

   parameters  month
1        temp     11
2        temp     11
3        temp     11
4                   
5        temp      2
6        temp      2
7        temp      2
8                   
9        temp      5
10       temp      5
11       temp      5
12                  
13       temp      8
14       temp      8
15       temp      8
16                  
17       temp annual
18       temp annual
19       temp annual
20

Desired output:

   parameters month
1        temp     2
2        temp     2
3        temp     2
4                  
5        temp     5
6        temp     5
7        temp     5
8                  
9        temp     8
10       temp     8
11       temp     8
12                 
13       temp    11
14       temp    11
15       temp    11
16                 
17       temp annual
18       temp annual
19       temp annual
20

2 answers | most recent 6 years ago

akrun · 6 years ago

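The problem is that when the "month" column is converted to an ordered factor, "" is not specified as one of the levels. So, naturally, any value that is not a level is treated as a missing value, and hence we get NA. This can be corrected in the preceding step by including "" as one of the levels: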

df$month <- ordered(df$month, levels = c("2", "5", "8", "11", "annual", ""))

Note: the order of "" is not clear, so it is specified as the last level.
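The effect of the missing level can be seen in isolation with a short vector. This is a sketch rather than the code from the answer; `f_missing` and `f_ok` are illustrative names.

```r
# Levels without "": the blank string is not a valid value, so it becomes NA
f_missing <- ordered(c("2", "11", ""), levels = c("2", "11"))
is.na(f_missing)

# Levels including "" (placed last, as in the answer): no NA is produced
f_ok <- ordered(c("2", "11", ""), levels = c("2", "11", ""))
is.na(f_ok)
levels(f_ok)
```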

Uwe · 6 years ago

There is an alternative approach that uses data.table's incarnation of the rbind() function to append an empty row after each group:

library(data.table)
setDT(df)[, rbind(.SD, data.table(parameters = "")), by = month]

     month parameters
 1:      2       temp
 2:      2       temp
 3:      2       temp
 4:      2           
 5:      5       temp
 6:      5       temp
 7:      5       temp
 8:      5           
 9:      8       temp
10:      8       temp
11:      8       temp
12:      8           
13:     11       temp
14:     11       temp
15:     11       temp
16:     11           
17: annual       temp
18: annual       temp
19: annual       temp
20: annual

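The order of the groups is maintained. The grouping variable month appears at the front of every row, including the blank ones. If required, this approach can also be used to intersperse an arbitrary number of blank rows: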

n_blank <- 2L
setDT(df)[, rbind(.SD, data.table(parameters = rep("", n_blank))), by = month]

     month parameters
 1:      2       temp
 2:      2       temp
 3:      2       temp
 4:      2           
 5:      2           
 6:      5       temp
 7:      5       temp
 8:      5       temp
 9:      5           
10:      5           
11:      8       temp
12:      8       temp
13:      8       temp
14:      8           
15:      8           
16:     11       temp
17:     11       temp
18:     11       temp
19:     11           
20:     11           
21: annual       temp
22: annual       temp
23: annual       temp
24: annual           
25: annual           
     month parameters
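For comparison, the same idea can be sketched in base R so that month is also blank in the separator rows, as in the desired output of the question. This is not taken from either answer: `df_small`, `blank`, `groups`, `padded` and `out` are illustrative names, and the sketch assumes every column is character.

```r
# Small illustrative data frame (hypothetical; all columns are character)
df_small <- data.frame(
  parameters = rep("temp", 6),
  month = c("2", "2", "11", "11", "annual", "annual"),
  stringsAsFactors = FALSE
)
n_blank <- 2L

# Split by month, with levels in order of first appearance so the groups
# are not re-sorted alphabetically
groups <- split(df_small, factor(df_small$month, levels = unique(df_small$month)))

# A block of n_blank rows in which every column holds an empty string
blank <- as.data.frame(
  lapply(df_small, function(col) rep("", n_blank)),
  stringsAsFactors = FALSE
)

# Append the blank block to each group, then stack the groups in order
padded <- lapply(groups, function(g) rbind(g, blank))
out <- do.call(rbind, unname(padded))
rownames(out) <- NULL
out
```

Because the levels are taken from unique(), the groups come out in the order in which they first appear in the data, so no manual list of levels is needed.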