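In a data.table, I want to flag N randomly selected rows per group (C1) in a column C3. Several similar questions have already been asked here, here, and here. But based on the answers, I still cannot work out a solution for my task.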
library(data.table)
set.seed(1)
dt = data.table(C1 = c("A","A","A","B","C","C","C","D","D","D"),
                C2 = c(2,1,3,1,2,3,4,5,4,5))
dt
C1 C2
1: A 2
2: A 1
3: A 3
4: B 1
5: C 2
6: C 3
7: C 4
8: D 5
9: D 4
10: D 5
The following gives the row indices of two randomly selected rows per group of C1 (it does not work for group B):
dt[, sample(.I, min(.N, 2)), by = C1]$V1
[1] 1 3 3 7 5 10 9
Note: for B, only one row should be selected, because group B consists of a single row.
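A likely reason group B misbehaves, as far as I can tell: when the first argument of sample() is a single number x >= 1, base R samples from 1:x rather than from the one-element vector. Group B's only row index is 4, so the draw comes from 1:4. The small sketch below illustrates this; the names idx, draws and safe are only illustrative.

```r
# sample(x, size) treats a single number x >= 1 as 1:x, so a one-row group
# whose only index is 4 is sampled from 1:4 instead of from c(4).
set.seed(1)
idx <- 4L
draws <- replicate(100, sample(idx, 1))
range(draws)   # values lie somewhere within 1..4, not only 4

# Sampling positions instead avoids this: sample.int(n, size) always
# draws from 1..n, and the result is then used to index idx.
safe <- replicate(100, idx[sample.int(length(idx), 1)])
unique(safe)   # always 4
```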
The following is a solution for one randomly selected row per group, which in general does not work for group B:
dt[, C3 := .I == sample(.I, 1), by = C1]
dt
C1 C2 C3
1: A 2 FALSE
2: A 1 TRUE
3: A 3 FALSE
4: B 1 FALSE
5: C 2 TRUE
6: C 3 FALSE
7: C 4 FALSE
8: D 5 TRUE
9: D 4 FALSE
10: D 5 FALSE
What I actually want is to extend this to N rows. I tried (for two rows):
dt[, C3 := .I==sample(.I, min(.N, 2)), by = C1]
This, of course, does not work.
Any help is much appreciated!
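One possible approach, offered as a minimal sketch rather than a definitive answer: sample within-group positions with sample.int() and test membership with %in% instead of ==. The == comparison recycles the shorter vector element by element, so it does not test membership. The variable n (rows to flag per group) is a name introduced here for illustration.

```r
library(data.table)

set.seed(1)
dt <- data.table(C1 = c("A","A","A","B","C","C","C","D","D","D"),
                 C2 = c(2,1,3,1,2,3,4,5,4,5))

n <- 2L  # hypothetical: number of rows to flag per group

# seq_len(.N) gives the within-group positions 1..N. sample.int() draws
# min(.N, n) of them without replacement, and %in% turns the draw into
# a logical vector of length .N, one value per row of the group.
dt[, C3 := seq_len(.N) %in% sample.int(.N, min(.N, n)), by = C1]

# Count flagged rows per group: 2 for A, C and D, and 1 for B.
dt[, .(flagged = sum(C3)), by = C1]
```

Because sample.int() always draws from 1..N, a group with a single row such as B gets exactly that row flagged.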