Data.table: FR: assign names to CJ() like data.table() does

Created on 17 Mar 2016 · 3 Comments · Source: Rdatatable/data.table

I frequently do

DT[CJ(colA = colA, colB = colB, unique=TRUE), on=c("colA","colB")]
# to complete missing levels

# or 
DT[, CJ(colA = colA, colB = colB, unique=TRUE)][!DT, on=c("colA","colB")]
# to identify missing levels
# http://stackoverflow.com/a/36065607/1191259

It would be nice if I could get away with writing colA and colB fewer times. The FR here is for

CJ(colA, colB, unique=TRUE, names=TRUE)

to infer the names colA and colB, perhaps using whatever method is used by data.frame() and data.table() (make.names?).
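Until something like this exists, the requested behavior can be approximated with a small wrapper. The sketch below is only an illustration: `CJ_named` is a hypothetical helper, not part of data.table. It assumes that reading the unevaluated arguments with `substitute()` and renaming the result with `setnames()` is an acceptable stand-in for whatever naming logic `data.table()` uses internally. Only bare symbols are picked up; expressions such as `x + 1` keep whatever name `CJ()` assigns.

```r
library(data.table)

# Hypothetical helper (not part of data.table): call CJ(), then name any
# column that was passed as a bare symbol without an explicit name.
CJ_named <- function(..., sorted = TRUE, unique = FALSE) {
  exprs <- as.list(substitute(list(...)))[-1L]  # unevaluated arguments
  nms <- names(exprs)
  if (is.null(nms)) nms <- rep("", length(exprs))
  for (i in seq_along(exprs)) {
    # Explicit names win; otherwise fall back to the symbol itself
    if (!nzchar(nms[i]) && is.symbol(exprs[[i]])) {
      nms[i] <- as.character(exprs[[i]])
    }
  }
  out <- CJ(..., sorted = sorted, unique = unique)
  known <- nzchar(nms)
  if (any(known)) setnames(out, which(known), nms[known])
  out
}

# Example data with one missing combination of levels (b, 2)
DT <- data.table(colA = c("a", "a", "b"), colB = c(1L, 2L, 1L), v = 1:3)

# Identify missing levels without repeating each name inside CJ()
DT[, CJ_named(colA, colB, unique = TRUE)][!DT, on = c("colA", "colB")]
```

The names still have to be repeated in `on=`, so this only removes the duplication inside the `CJ()` call itself.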

(The name repetition could be reduced further if on=.Icols were a thing, I suppose, but I'll leave that for a separate FR.)

SO posts to update...

https://stackoverflow.com/a/46243102/

enhancement

franknarf1

👍4

All 3 comments

CJ takes ... as its first argument, and the function is going to become a generic method, so AFAIK we will need to change it to CJ(x, ...). Those changes can be made together with #1090.

jangorecki on 18 Mar 2016

+1 and I don't see the need for the names argument - this should be the only behavior. With the join syntax change to using "on" instead of setkey this has become a big sticking point for me.

I'd also like to see unique = TRUE be the default - I can't think of _ever_ not needing to unique the arguments to CJ.

eantonya on 18 May 2016

👍3

@jangorecki I didn't touch the #1090 / #814 stuff yet. Better as a self-contained change, I think, unless I'm missing something.