I didn't realize DT[TRUE] was a way to achieve a shallow copy. Shallow copy was only intended for internal use. Thanks to @renkun-ken for highlighting this in #3214, and related #2254.
In v1.11.8 we see this:
DT = data.table(id = 1:5, key="id")
DT1 = DT[TRUE]
key(DT1)
# [1] "id"
DT1[3, id:=6L]
key(DT1)
# NULL # correct
DT$id
# [1] 1 2 6 4 5 # should be 1:5
key(DT)
# [1] "id" # invalid key
It only occurs after DT[TRUE], iiuc, which hopefully folk have not discovered or relied on too much?! I hope the usage out there is as @renkun-ken described: adding new columns to the shallow copy, not changing existing columns!
New test 1542.08 was added in PR #2313 ready for when this is fixed.
Yes, setkey and changes to existing columns should not be used on the shallow copy, since the columns themselves are not copied.
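For contrast, here is a minimal sketch of the safe pattern when an existing column does need to change: take a deep copy with copy() first. The name DT2 is only illustrative, and this is the general data.table idiom rather than a fix for the bug itself.

```r
library(data.table)

DT = data.table(id = 1:5, key = "id")

# copy() duplicates the columns, so DT2 shares nothing with DT
DT2 = copy(DT)

# Modifying the key column by reference drops the key on DT2 only
DT2[3, id := 6L]

key(DT2)   # NULL
DT$id      # still 1:5
key(DT)    # still "id", and still valid
```

The cost is a full copy of every column, which is exactly what a shallow copy is meant to avoid, so this only makes sense when existing columns really have to be modified.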
If we stop allowing a shallow copy to be made with dt[TRUE], this issue will be resolved automatically.
Eventually. But in the meantime, we can't break @renkun-ken's workflow.
More detail here: https://github.com/Rdatatable/data.table/issues/3214#issuecomment-462490046
The following code could be added to the tests to check copy behaviour:
DT = data.table(a=c(1,2), b=c("b","a"))
address(DT)
address(DT[])
address(DT[, .SD])
address(DT[TRUE])
sapply(DT, address)
sapply(DT[], address)
sapply(DT[, .SD], address)
sapply(DT[TRUE], address)
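To turn those address printouts into something a test can assert on, one option is a small helper that compares column addresses between two tables. This is only a sketch: shares_columns is a hypothetical name, not part of data.table, and it deliberately makes no claim about what DT[TRUE] returns in any given version.

```r
library(data.table)

# Hypothetical helper: for each column name common to x and y, report
# whether both tables point at the same underlying vector in memory.
shares_columns = function(x, y) {
  common = intersect(names(x), names(y))
  vapply(common,
         function(col) address(x[[col]]) == address(y[[col]]),
         logical(1))
}

DT = data.table(a = c(1, 2), b = c("b", "a"))

shares_columns(DT, DT)         # all TRUE: same object
shares_columns(DT, copy(DT))   # all FALSE: deep copy duplicates the columns

# The forms under discussion can be checked the same way, e.g.
# shares_columns(DT, DT[TRUE]) or shares_columns(DT, DT[, .SD])
```

An all-TRUE result for a derived table would indicate a shallow copy, all-FALSE a deep one.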