Why does the conventional data.table column-aggregation syntax dt[, sum(x), by = "y"] not work when the same column is used in both j and by, i.e. dt[, sum(x), by = "x"]?
library(data.table)
set.seed(1)
# 20 rows: x is drawn from 1:10 with replacement, y from the letters a-d
dt <- data.table(x = sample(1:10, 20, TRUE), y = sample(letters[1:4], 20, TRUE))
setorderv(dt, "y")
I want to sum x by x, but the following does not work; it just replicates the x column:
> dt[, sum(x, na.rm = T), by = "x"]
x V1
1: 4 4
2: 10 10
3: 3 3
4: 9 9
5: 7 7
6: 1 1
7: 8 8
8: 6 6
9: 2 2
10: 5 5
If I do this instead:
> dt[, .(res = lapply(.SD, sum, na.rm = T)), by = 'x', .SDcols = "x"]
x res
1: 4 12
2: 10 30
3: 3 9
4: 9 9
5: 7 21
6: 1 1
7: 8 24
8: 6 6
9: 2 2
10: 5 5
That works.
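On the other hand, the conventional syntax works as expected when the by argument is a different column from the one being aggregated in j: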
> dt[, sum(x, na.rm = T), by = "y"]
y V1
1: a 38
2: b 38
3: c 17
4: d 26
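One way to probe the difference between the two cases is to look at what x holds inside j for each grouping. The sketch below is only an illustration. It uses a small hypothetical table, dt2, which is not the data above, and it assumes (as the data.table FAQ describes, to my understanding) that a grouping column is seen as a single value per group inside j, while .SD restricted with .SDcols still carries every row of the group. The names len_by_x, len_by_y, sum_n and sum_sd are made up for this example.

```r
library(data.table)

# Hypothetical toy data, small enough to check by hand
dt2 <- data.table(x = c(1, 1, 2, 2, 2), y = c("a", "b", "a", "b", "b"))

# Grouping by x itself: report how many values of x are visible in j, and their sum
len_by_x <- dt2[, .(len = length(x), total = sum(x)), by = "x"]

# Grouping by a different column: x should hold every row of the group
len_by_y <- dt2[, .(len = length(x), total = sum(x)), by = "y"]

# Possible workaround 1: multiply the group value by the group size .N
# (this is not equivalent to sum(x, na.rm = TRUE) if x contains NA)
sum_n <- dt2[, .(total = x * .N), by = "x"]

# Possible workaround 2: take the column from .SD, which keeps all rows of the group
sum_sd <- dt2[, .(total = sum(.SD[["x"]])), by = "x", .SDcols = "x"]

print(len_by_x)
print(len_by_y)
print(sum_n)
print(sum_sd)
```

If the assumption holds, len is 1 for every group in len_by_x, which would explain why sum(x) simply returns x there, whereas len_by_y shows the full group sizes.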