I was exploring alternative ways to do row operations in data.table,
and I think I've found a bug.
These three lines of code should return the same result. However, `by = .I` seems to return a wrong result.
dt[, sdd := sd(.SD[, 2:4, with=FALSE]), by = 1:NROW(dt) ]
dt[, rowpos := .I][ , sdd := sd(.SD[, -1, with=FALSE]), by = rowpos ]
dt[ , sdd := sd(.SD[, -1, with=FALSE]), by = .I ]
Sample data:
dt <- data.table(V0 = LETTERS[c(1, 1, 2, 2, 3)],
                 V1 = 1:5,
                 V2 = 3:7,
                 V3 = 5:1)
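For comparison, here is a minimal sketch of a row-wise standard deviation that does not rely on `.I` inside `by` at all. It groups by an explicit row index and uses `.SDcols` to pick the numeric columns. The column name `sdd` follows the post above; `cols` and `expected` are names introduced only for this illustration.

```r
library(data.table)

dt <- data.table(V0 = LETTERS[c(1, 1, 2, 2, 3)],
                 V1 = 1:5,
                 V2 = 3:7,
                 V3 = 5:1)

# Columns to include in each row-wise calculation (illustrative name).
cols <- c("V1", "V2", "V3")

# Group by an explicit row index so that every group is exactly one row.
# unlist() turns the one-row .SD into a plain numeric vector for sd().
dt[, sdd := sd(unlist(.SD)), by = seq_len(nrow(dt)), .SDcols = cols]

# Cross-check against a base R row-wise computation.
expected <- unname(apply(as.matrix(dt[, ..cols]), 1, sd))
print(isTRUE(all.equal(dt$sdd, expected)))
```

Passing the columns through `.SDcols` rather than positional indexing such as `-1` keeps the result independent of any helper columns added to the table earlier.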
Alternatively, `by = .I` should give an error, though it would be nice to have it work with an i-expression present.
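Until something like that exists, one possible workaround when an i-expression is present is to materialise the row number as a real column first, as in the `rowpos` variant from the original post, and then group by that column. This is only a sketch; the filter `V1 > 2` and the name `cols` are made up for the illustration.

```r
library(data.table)

dt <- data.table(V0 = LETTERS[c(1, 1, 2, 2, 3)],
                 V1 = 1:5,
                 V2 = 3:7,
                 V3 = 5:1)

cols <- c("V1", "V2", "V3")

# Store the row number in an ordinary column so it survives subsetting by i.
dt[, rowpos := .I]

# i keeps only rows with V1 > 2; by = rowpos still yields one group per
# remaining row. Rows excluded by i are left as NA in sdd.
dt[V1 > 2, sdd := sd(unlist(.SD)), by = rowpos, .SDcols = cols]

print(dt)
```

Grouping by a stored column avoids having to build a `by` vector whose length matches the subset selected by `i`.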
There is a similar issue with using `.N` in `by`, again something one might naively try: `dt[, ..., by = 1:.N]`. Although this particular expression gives an error, it's not really the "right" error.
Why not just add a "rowwise" feature via `by = .I`, which sounds intuitive?
Hi @leoluyi,
the behaviour of `by = .I` is equivalent to `by = NULL`. Have a look at this SO discussion: https://stackoverflow.com/questions/37667335/row-operations-in-data-table-using-by-i
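To make the practical difference concrete, here is a small sketch contrasting no grouping (what `by = NULL` amounts to) with one group per row. It does not call `by = .I` itself, since that behaviour may differ between data.table releases. The column names `sd_all` and `sd_row` are invented for this illustration.

```r
library(data.table)

dt <- data.table(V0 = LETTERS[c(1, 1, 2, 2, 3)],
                 V1 = 1:5,
                 V2 = 3:7,
                 V3 = 5:1)

cols <- c("V1", "V2", "V3")

# No grouping: .SD is the whole table, so a single value is computed from
# all 15 numbers and recycled to every row.
dt[, sd_all := sd(unlist(.SD)), .SDcols = cols]

# One group per row: .SD is a single row, so each row gets its own value.
dt[, sd_row := sd(unlist(.SD)), by = seq_len(nrow(dt)), .SDcols = cols]

print(dt[, .(V0, sd_all, sd_row)])
```

If a supposedly row-wise result is the same in every row, as `sd_all` is here, that is a sign the grouping was silently dropped.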