Data.table: Row operations in data.table using `by = .I`

Created on 7 Jun 2016  ·  3 Comments  ·  Source: Rdatatable/data.table

I was exploring alternative ways to do row operations in data.table and I think I've found a bug.

These three lines of code should return the same result. However, by = .I seems to return a wrong result.

dt[, sdd := sd(.SD[, 2:4, with=FALSE]), by = 1:NROW(dt) ]
dt[, rowpos := .I][ , sdd := sd(.SD[, -1, with=FALSE]), by = rowpos ]
dt[ , sdd := sd(.SD[, -1, with=FALSE]), by = .I ]

sample data:
dt <- data.table(V0 = LETTERS[c(1,1,2,2,3)], V1 = 1:5, V2 = 3:7, V3 = 5:1)
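
For reference, here is a minimal sketch of a per-row standard deviation on the sample data that does not rely on by = .I. It groups by an explicit row index and restricts .SD to the numeric columns. The names num_cols, sdd and sdd_check are only illustrative, and the apply() line is a cross-check rather than part of the original report.

```r
library(data.table)

# Sample data from the report above.
dt <- data.table(V0 = LETTERS[c(1, 1, 2, 2, 3)], V1 = 1:5, V2 = 3:7, V3 = 5:1)

# Illustrative helper: the numeric columns to summarise per row.
num_cols <- c("V1", "V2", "V3")

# Group by an explicit row index so that every group is exactly one row;
# unlist(.SD) turns that single row into a plain vector for sd().
dt[, sdd := sd(unlist(.SD)), by = seq_len(nrow(dt)), .SDcols = num_cols]

# Cross-check with a plain matrix-based row-wise computation.
dt[, sdd_check := apply(as.matrix(.SD), 1, sd), .SDcols = num_cols]

print(dt)
```

Both columns should agree; for the first row (1, 3, 5) the standard deviation is 2.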

Low bug

All 3 comments

Alternatively, by = .I should give an error, though it would be nice to have it work with an i-expression present.

There is a similar issue with using .N in by (again, something one might naively try: dt[, ..., by = 1:.N]). Although this particular expression gives an error, it's not really the "right" error.

Why not just add a "rowwise" feature via by = .I? That sounds intuitive.

#1063

Hi @leoluyi,

The behaviour of by = .I is equivalent to by = NULL. Have a look at this SO discussion: https://stackoverflow.com/questions/37667335/row-operations-in-data-table-using-by-i
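
To make the difference concrete, here is a small sketch contrasting an ungrouped call (how I read "by = NULL" in the comment above) with an explicit one-group-per-row call. It uses sum() rather than sd() to keep the arithmetic easy to follow. The column names total_all and total_row are made up for illustration, and this does not show what by = .I itself does in any particular data.table version.

```r
library(data.table)

# Sample data from the original report.
dt <- data.table(V0 = LETTERS[c(1, 1, 2, 2, 3)], V1 = 1:5, V2 = 3:7, V3 = 5:1)
num_cols <- c("V1", "V2", "V3")  # illustrative helper name

# No `by` supplied: .SD holds all five rows at once, so a single
# grand total is computed and recycled to every row.
dt[, total_all := sum(unlist(.SD)), .SDcols = num_cols]

# One group per row: .SD holds a single row each time,
# so every row gets its own total.
dt[, total_row := sum(unlist(.SD)), by = seq_len(nrow(dt)), .SDcols = num_cols]

print(dt[, .(V1, V2, V3, total_all, total_row)])
```

Here total_all is 55 on every row, while total_row is 9, 10, 11, 12, 13. A result that is identical across all rows is therefore a hint that the expression was evaluated without grouping.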
