I have this sample data:
> a
Ship duration.minutes event Location
1 a NA enter Skagen
2 a 1616 trip <NA>
3 a 4308 stop Copenhagen
4 b 1646 trip <NA>
5 b 5751 stop Gdynia
6 b 75 trip <NA>
7 b 45666 stop Gdansk
8 c 2531 trip <NA>
9 c 5360 stop Szczecin
10 d 287 trip <NA>
I want to add a new column named "Destination" and fill its cells with the name of the destination.
The result would be:
> output
Ship duration.minutes event Location Destination
1 a NA enter Skagen NA
2 a 1616 trip <NA> Copenhagen
3 a 4308 stop Copenhagen <NA>
4 b 1646 trip <NA> Gdynia
5 b 5751 stop Gdynia <NA>
6 b 75 trip <NA> Gdansk
7 b 45666 stop Gdansk <NA>
8 c 2531 trip <NA> Szczecin
9 c 5360 stop Szczecin <NA>
10 d 287 trip <NA> <NA>
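This means it works per ship: each trip row should show only that ship's destination, i.e. the next location the ship stops at after the trip.
I tried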
moves <- setDT(a)[, .(from = Location[-.N], to = Location[-1L]) , Ship]
but it does not keep the duration.minutes column.
This is what I get:
> dput(moves)
structure(list(Ship = c("a", "a", "b", "b", "b", "c"), from = structure(c(4L,
NA, NA, 3L, NA, NA), .Label = c("Copenhagen", "Gdansk", "Gdynia",
"Skagen", "Szczecin"), class = "factor"), to = structure(c(NA,
1L, 3L, NA, 2L, 5L), .Label = c("Copenhagen", "Gdansk", "Gdynia",
"Skagen", "Szczecin"), class = "factor")), row.names = c(NA,
-6L), class = c("data.table", "data.frame"), .Names = c("Ship",
"from", "to"), .internal.selfref = <pointer: 0x00000000003e0788>)
It looks like this:
> moves
Ship from to
1: a Skagen <NA>
2: a <NA> Copenhagen
3: b <NA> Gdynia
4: b Gdynia <NA>
5: b <NA> Gdansk
6: c <NA> Szczecin
The sample data, named a, is:
> dput(data)
structure(list(Ship = c("a", "a", "a", "b", "b", "b", "b", "c",
"c", "d"), duration.minutes = c(NA, 1616L, 4308L, 1646L, 5751L,
75L, 45666L, 2531L, 5360L, 287L), event = structure(c(1L, 3L,
2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L), .Label = c("enter", "stop",
"trip"), class = "factor"), Location = structure(c(4L, NA, 1L,
NA, 3L, NA, 2L, NA, 5L, NA), .Label = c("Copenhagen", "Gdansk",
"Gdynia", "Skagen", "Szczecin"), class = "factor")), .Names = c("Ship",
"duration.minutes", "event", "Location"), row.names = c(NA, -10L
), class = c("data.table", "data.frame"))
I'm afraid it is hard to work with setDT. Is there a way to keep the duration.minutes column?
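One direction that might work, as a rough sketch only: instead of building a separate moves table, add the column by reference with := so that all existing columns (including duration.minutes) stay in place. The sketch assumes that, within each Ship, every trip row is immediately followed by the stop row it leads to, and it rebuilds the sample with character columns rather than factors to keep the example short.

```r
library(data.table)

# Sample data from the question, rebuilt with character columns (an assumption
# made for brevity; the original uses factors for event and Location).
a <- data.table(
  Ship = c("a", "a", "a", "b", "b", "b", "b", "c", "c", "d"),
  duration.minutes = c(NA, 1616L, 4308L, 1646L, 5751L, 75L, 45666L, 2531L, 5360L, 287L),
  event = c("enter", "trip", "stop", "trip", "stop", "trip", "stop", "trip", "stop", "trip"),
  Location = c("Skagen", NA, "Copenhagen", NA, "Gdynia", NA, "Gdansk", NA, "Szczecin", NA)
)

# For each ship, take the Location of the following row as the destination.
# := adds the column in place, so no existing column is dropped.
a[, Destination := shift(Location, type = "lead"), by = Ship]

# Only trip rows should carry a destination; blank out everything else.
a[event != "trip", Destination := NA_character_]

print(a)
```

The last row (ship d) has no following row within its group, so its Destination stays NA, as in the desired output.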