Data.table: Symbol .I consistency when not grouping

Created on 30 Jan 2018  ·  5 Comments  ·  Source: Rdatatable/data.table

This has come up before, I'm sure, but I can't find the issue or S.O. post. Anyone remember or have the links please? I seem to remember replying to someone with something like ".I is intended for use in grouping as per the documentation, but it would be good to extend it to non-grouping too". The man page still contains the words "while grouping" for .I.

Current behaviour in both v1.10.4-3 and dev :

> X = data.table(c("a","a","b","c","c"), 10:14)
> setkey(X,V1)
>  X["b"]
   V1 V2
1:  b 12       # ok
> X["b", .I]
[1] 1          # expected x's row number 3  (*1)
> X["b", .I, by=.EACHI]
   V1 I
1:  b 3        # ok
> X["b", .(.I,V2)]
   I V2
1: 1 12      # expected x's row number 3 not 1  (*2)
> X["b", .(.I,V2), by=.EACHI]
   V1 I V2
1:  b 3 12     # ok
> 

Now, which=TRUE was intended for, and works in, the first case (*1):

> X["b", which=TRUE]
[1] 3

However, including x's row numbers inside j (*2) isn't currently possible unless you first add x's row numbers explicitly as a column. It would be nice for .I to do what which=TRUE does in the simple case (*1), and maybe even to slowly deprecate the which=TRUE argument, since my guess is that people reach for .I first.
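
For anyone landing here, a minimal sketch of the two workarounds available today, using the same X as above. The column name `rn` and the variables `w`, `res1` and `res2` are arbitrary choices for this illustration, not anything data.table defines.

```r
library(data.table)

X = data.table(c("a","a","b","c","c"), 10:14)
setkey(X, V1)

# Option 1: materialise x's row numbers as a column before subsetting.
# `rn` is a hypothetical column name chosen for this sketch.
X[, rn := .I]
res1 = X["b", .(rn, V2)]
print(res1)

# Option 2: look the row numbers up with which=TRUE, then reuse them
# in both i and j.
w = X["b", which=TRUE]
res2 = X[w, .(rn = w, V2)]
print(res2)
```

Both calls should report row 3 for "b". Note that option 1 adds the column to X by reference, so it changes X itself, while option 2 leaves X untouched at the cost of a second lookup.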

bug consistency

All 5 comments

Anyone remember or have the links please?

Could we use .I for the global row number, which would work without grouping, and .i for the local one, which would only be available inside grouping?

@dracodoc As mentioned by Frank, this is #1206.

Sorry, I didn't realize there was a whole thread on this idea! That means this is a good idea, right?

example from https://github.com/Rdatatable/data.table/issues/539

> dt <- data.table(a=sample(letters, 100, TRUE), b=rnorm(100))
> dt[a=="c", list(.N, .I)]
   N .I
1: 4  1
2: 4  2
3: 4  3
4: 4  4

> dt[a=="c", list(.N, .I), by=a]
   a N .I
1: c 4 54
2: c 4 67
3: c 4 71
4: c 4 86