Data.table: FR: assign names to CJ() like data.table() does

Created on 17 Mar 2016  ·  3 Comments  ·  Source: Rdatatable/data.table

I frequently do

DT[CJ(colA = colA, colB = colB, unique=TRUE), on=c("colA","colB")]
# to complete missing levels

# or 
DT[, CJ(colA = colA, colB = colB, unique=TRUE)][!DT, on=c("colA","colB")]
# to identify missing levels
# http://stackoverflow.com/a/36065607/1191259
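
To make the two idioms above concrete, here is a minimal sketch on a toy table. The data and the names `grid`, `completed` and `missing_levels` are invented for illustration; the grid is built once in `j` and then reused, which should be equivalent to the one-liners above.

```r
library(data.table)

# Hypothetical toy data: the combination (colA = "b", colB = 2) is absent
DT <- data.table(colA = c("a", "a", "b"),
                 colB = c(1, 2, 1),
                 val  = c(10, 20, 30))

# Full grid of observed levels; the names must be repeated by hand
grid <- DT[, CJ(colA = colA, colB = colB, unique = TRUE)]

# Complete missing levels: one row per combination, NA where DT has no match
completed <- DT[grid, on = c("colA", "colB")]

# Identify missing levels: anti-join the grid against DT
missing_levels <- grid[!DT, on = c("colA", "colB")]

print(completed)
print(missing_levels)
```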

It would be nice if I could get away with writing colA and colB fewer times. The FR here is for

CJ(colA, colB, unique=TRUE, names=TRUE) 

to infer the names colA and colB, perhaps using whatever method is used by data.frame() and data.table() (make.names?).
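
As a rough illustration of the requested behavior, the wrapper below infers names from the argument expressions. `CJ_named` is a hypothetical helper, not part of data.table, and using `substitute()` is only one possible approach; it is not necessarily how data.frame() or data.table() derive names.

```r
library(data.table)

# Hypothetical wrapper (not a data.table function): name each unnamed
# argument after the symbol that was passed, then delegate to CJ().
CJ_named <- function(..., unique = FALSE) {
  args  <- list(...)
  exprs <- as.list(substitute(list(...)))[-1L]  # unevaluated argument expressions

  given <- names(args)
  if (is.null(given)) given <- rep("", length(args))

  # Only plain symbols yield an inferred name; anything else stays unnamed
  # and is left to CJ()'s own default naming.
  inferred <- vapply(exprs,
                     function(e) if (is.name(e)) as.character(e) else "",
                     character(1))

  names(args) <- ifelse(nzchar(given), given, inferred)
  do.call(CJ, c(args, list(unique = unique)))
}

colA <- c("b", "a", "a")
colB <- c(2L, 1L)
res <- CJ_named(colA, colB, unique = TRUE)
print(names(res))
```

With such a helper, the first idiom could be written as DT[, CJ_named(colA, colB, unique = TRUE)], so each column name appears once inside the CJ call.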

(The name repetition could be reduced further if on=.Icols were a thing, I suppose, but I'll leave that for a separate FR.)

SO posts to update...

enhancement

All 3 comments

CJ takes ... as its first argument, and the function is going to become a generic method, so AFAIK we will need to change its signature to CJ(x, ...). Those changes can be made together with #1090.

+1 and I don't see the need for the names argument - this should be the only behavior. With the join syntax change to using "on" instead of setkey this has become a big sticking point for me.

I'd also like to see unique = TRUE be the default - I can't think of _ever_ not needing to unique the arguments to CJ.
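
For context on the unique = TRUE point, a small sketch of the difference, using made-up vectors `a` and `b`:

```r
library(data.table)

a <- c(1, 1, 2)
b <- c("x", "x")

# Without unique, every element is crossed, so duplicated inputs
# produce repeated combinations: 3 * 2 rows
n_all <- nrow(CJ(a = a, b = b))

# With unique = TRUE, each input is de-duplicated first: 2 * 1 rows
n_unique <- nrow(CJ(a = a, b = b, unique = TRUE))

print(c(n_all, n_unique))
```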

@jangorecki I haven't touched the #1090 / #814 stuff yet. This is better kept self-contained, I think, unless I'm missing something.
