I'm not entirely sure what causes this, so here is the smallest working example I could find:
library(data.table) # Tested on v 1.9.7
dt <- data.table( origin = c("A", "A", "A", "A", "A", "A", "B", "B", "A", "A", "C", "C", "B", "B", "B", "B", "B", "C", "C", "B", "A", "C", "C", "C", "C", "C", "A", "A", "C", "C", "B", "B"),
destination = c("A", "A", "A", "A", "B", "B", "A", "A", "C", "C", "A", "A", "B", "B", "B", "C", "C", "B", "B", "A", "B", "C", "C", "C", "A", "A", "C", "C", "B", "B", "C", "C"),
points_in_dest = c(5, 5, 5, 5, 4, 4, 5, 5, 3, 3, 5, 5, 4, 4, 4, 3, 3, 4, 4, 5, 4, 3, 3, 3, 5, 5, 3, 3, 4, 4, 3, 3),
depart_time = c(7, 8, 16, 18, 7, 8, 16, 18, 7, 8, 16, 18, 7, 8, 16, 7, 8, 16, 18, 8, 16, 7, 8, 18, 7, 8, 16, 18, 7, 8, 16, 18),
travel_time = c(0, 0, 0, 0, 70, 10, 70, 10, 10, 10, 70, 70, 0, 0, 0, 70, 10, 10, 70, 70, 10, 0, 0, 0, 10, 70, 10, 70, 10, 70, 70, 10) )
dt[ depart_time<=8 & travel_time < 60, condition1 := TRUE]
dt[ depart_time>=16 & travel_time < 60, condition2 := TRUE]
setkey(dt, origin, destination)
res <- unique(dt[(condition1)])[unique(dt[(condition2)]),
on = c(destination = "origin", origin = "destination"),
nomatch = 0L]
res[, .(points = sum(points_in_dest)), keyby = origin]
# origin points
#1: A 5
#2: A 4
#3: B 4
#4: B 3
#5: C 5
#6: C 4
#7: C 3
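As you can see, the by grouping did not work as intended and all rows were returned instead of one row per origin. The following fixes it, so this is clearly a keying issue: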
setattr(res, "sorted", NULL)
res[, .(points = sum(points_in_dest)), keyby = origin]
# origin points
#1: A 9
#2: B 7
#3: C 12
Alternatively, convert origin to a factor beforehand:
res[, .(points = sum(points_in_dest)), keyby = factor(origin)]
# factor points
#1: A 9
#2: B 7
#3: C 12
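Both workarounds above avoid relying on the key stored on res. As a rough diagnostic sketch (not part of data.table, and assuming the problem is a "sorted" attribute that no longer matches the actual row order), a small helper can check whether a table's claimed key really holds. The name claimed_key_holds is hypothetical.

```r
library(data.table)

# Hypothetical helper: TRUE if the rows of `x` really are ordered by the
# columns named in its key (the "sorted" attribute), or if it has no key.
claimed_key_holds <- function(x) {
  k <- key(x)
  if (is.null(k)) return(TRUE)
  cols <- unname(as.list(x)[k])
  # Radix ordering sorts strings in the C locale; NAs are placed first.
  ord <- do.call(order, c(cols, list(na.last = FALSE, method = "radix")))
  # A stable ordering of already-sorted rows is the identity permutation.
  !is.unsorted(ord)
}

good <- data.table(origin = c("A", "A", "B"), v = 1:3, key = "origin")
bad  <- data.table(origin = c("A", "B", "A"), v = 1:3)
setattr(bad, "sorted", "origin")  # deliberately claim a key that does not hold

claimed_key_holds(good)  # TRUE
claimed_key_holds(bad)   # FALSE
```

If claimed_key_holds(res) returns FALSE for the join result above, that would be consistent with the stale-key explanation.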
This was taken from this SO question: http://stackoverflow.com/questions/37239649/aggregate-data-table-based-on-condition-in-another-row
Very nice example. Will fix. Thanks.
I have to say, that's a creative way to spell "feature"!
Fixed....