I know how to avoid this error, but I don't understand why it occurs when the code is run this way.
library(data.table)

datain <- data.frame(
  chrom = c("chr17", "chr4", "chr5", "chr13"),
  map = c(81061047, 106061533, 40102442, 73791553),
  rs = c("rs75954926", "rs7679673", "rs7708610", "rs78341008"),
  start = c(79061048, 104061534, 38102443, 71791554),
  end = c(83061048, 108061534, 42102443, 75791554)
)
datain
# copy the column the data.frame way, then convert and key by reference
datain$chr <- datain$chrom
setDT(datain)
setkey(datain, chr, start, end)
datain
Thanks!
μλ μ€
I can reproduce this with data.table 1.12.0 on both Windows 10 and macOS 10.13.6. After upgrading to data.table 1.12.2 on my macOS system, I can confirm that all the examples below give the same results.
A simpler example:
> set.seed(1)
> d <- data.frame(x = paste0("chr",sample(17)[3:6]), y = 1:4)
> d$x2 <- d$x
> d
x y x2
1 chr9 1 chr9
2 chr13 2 chr13
3 chr3 3 chr3
4 chr11 4 chr11
> setDT(d)
> setkey(d,x2,y)
> d
x y x2
1: chr9 4 chr9
2: chr13 2 chr13
3: chr3 3 chr3
4: chr11 1 chr11
This also happens when the key is assigned directly in setDT (which is not surprising, since setDT uses setkeyv to set the key).
> set.seed(1)
> d <- data.frame(x = paste0("chr",sample(17)[3:6]), y = 1:4)
> d$x2 <- d$x
> d
x y x2
1 chr9 1 chr9
2 chr13 2 chr13
3 chr3 3 chr3
4 chr11 4 chr11
> setDT(d, key = c("x2","y"))
> d
x y x2
1: chr9 4 chr9
2: chr13 2 chr13
3: chr3 3 chr3
4: chr11 1 chr11
As you can see, the order of the y column is changed, but the order of the other columns is not. Apparently this only happens when you copy a column the data.frame way, then use setDT, and then use setkey. Consider the following 5 cases, all of which give the intended result.
1) Not copying a column in the data frame
> set.seed(1)
> d <- data.frame(x = paste0("chr",sample(17)[3:6]), y = 1:4)
> d
x y
1 chr9 1
2 chr13 2
3 chr3 3
4 chr11 4
> setDT(d)
> setkey(d,x)
> d
x y
1: chr11 4
2: chr13 2
3: chr3 3
4: chr9 1
2) Creating a new column in the data frame from scratch
> set.seed(1)
> d <- data.frame(x = paste0("chr",sample(17)[3:6]), y = 1:4)
> set.seed(1)
> d$x2 <- paste0("chr",sample(17)[1:4])
> d
x y x2
1 chr9 1 chr5
2 chr13 2 chr6
3 chr3 3 chr9
4 chr11 4 chr13
> setDT(d)
> setkey(d,x2,y)
> d
x y x2
1: chr11 4 chr13
2: chr9 1 chr5
3: chr13 2 chr6
4: chr3 3 chr9
3) Creating a new column by sampling an existing column and pasting on some extra characters
> set.seed(1)
> d <- data.frame(x = paste0("chr",sample(17)[3:6]), y = 1:4)
> set.seed(1)
> d$x2 <- paste0("new_",sample(d$x,4))
> d
x y x2
1 chr9 1 new_chr13
2 chr13 2 new_chr11
3 chr3 3 new_chr3
4 chr11 4 new_chr9
> setDT(d)
> setkey(d,x2,y)
> d
x y x2
1: chr13 2 new_chr11
2: chr9 1 new_chr13
3: chr3 3 new_chr3
4: chr11 4 new_chr9
4) Creating a new column by sampling an existing column
> set.seed(1)
> d <- data.frame(x = paste0("chr",sample(17)[3:6]), y = 1:4)
> set.seed(1)
> d$x2 <- sample(d$x,4)
> d
x y x2
1 chr9 1 chr13
2 chr13 2 chr11
3 chr3 3 chr3
4 chr11 4 chr9
> setDT(d)
> setkey(d,x2,y)
> d
x y x2
1: chr13 2 chr11
2: chr9 1 chr13
3: chr3 3 chr3
4: chr11 4 chr9
5) Copying the column the data.table way
> set.seed(1)
> d <- data.frame(x = paste0("chr",sample(17)[3:6]), y = 1:4)
> d
x y
1 chr9 1
2 chr13 2
3 chr3 3
4 chr11 4
> setDT(d)
> d[, x2 := x][]
x y x2
1: chr9 1 chr9
2: chr13 2 chr13
3: chr3 3 chr3
4: chr11 4 chr11
> setkey(d,x2,y)
> d
x y x2
1: chr11 4 chr11
2: chr13 2 chr13
3: chr3 3 chr3
4: chr9 1 chr9
This bug seems to occur only in a (very) specific use case, but because I use setkey in several production systems, it could be important for me. I will check whether this specific case occurs in my production code.
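Confirming @jaapwalhout's observations (including with data.table_1.12.3).
Also, (i) I thought it might be related to factors, but it is not (it also happens with character or numeric columns), and (ii) it happens regardless of which column is the key (x or y).
Below is a simpler reproducible example.
The warning emitted along the way may be related, but I don't understand it.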
options(datatable.verbose = TRUE)
## KO
d <- data.frame(x = c(9, 1), y = c(9, 1))
d$x2 <- d$x
d
# x y x2
# 1 9 9 9
# 2 1 1 1
setDT(d, key = "x")[]
# forder took 0 sec
# reorder took 0 sec
# x y x2
# 1: 9 1 9
# 2: 1 9 1
## KO
d <- data.frame(x = c("9", "1"), y = c(9, 1), stringsAsFactors = FALSE)
d$x2 <- d$x
d
# x y x2
# 1 9 9 9
# 2 1 1 1
setDT(d, key = "x")[]
# forder took 0 sec
# reorder took 0 sec
# x y x2
# 1: 9 1 9
# 2: 1 9 1
## OK (with warning)
d <- data.frame(x = c("9", "1"), y = c(9, 1))
setDT(d)
d$x2 <- d$x
# Assigning to all 2 rows
# RHS for item 1 has been duplicated because NAMED is 2, but then is being plonked. length(values)==2; length(cols)==1)
setkey(d, y, verbose = TRUE)[]
# forder took 0 sec
# reorder took 0 sec
# x y x2
# 1: 1 1 1
# 2: 9 9 9
iiuc, when a column or list element is duplicated without being modified, both copies have the same memory address, and setting the key then "messes up" the data. It works fine when copy() is used.
# KO
l <- list(x = c(9, 1), y = c(9, 1))
l[["z"]] <- l[["x"]]
l
setDT(l, key = "x")[]
address(l$x) == address(l$z)
# TRUE
# OK
l <- list(x = c(9, 1), y = c(9, 1))
l[["z"]] <- copy(l[["x"]])
l
setDT(l, key = "x")[]
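
Building on the observation above, here is a minimal sketch of a possible workaround for data frames: deep-copy any column whose memory address duplicates that of an earlier column before calling setDT with a key. The helper name unshare_columns is hypothetical (it is not part of data.table); only address() and copy() from data.table are used, and the sketch assumes the shared address is the only cause of the problem.

```r
library(data.table)

# Hypothetical helper (not part of data.table): give every column its own
# memory by deep-copying columns whose address was already seen.
unshare_columns <- function(df) {
  seen <- character(0)
  for (nm in names(df)) {
    a <- address(df[[nm]])
    if (a %in% seen) {
      df[[nm]] <- copy(df[[nm]])  # force a distinct vector
    } else {
      seen <- c(seen, a)
    }
  }
  df
}

d <- data.frame(x = c(9, 1), y = c(9, 1))
d$x2 <- d$x               # the data.frame-style copy from the KO examples
d <- unshare_columns(d)
setDT(d, key = "x")
d
```

With distinct vectors behind x and x2, each column should be reordered exactly once, so the rows stay consistent. Copying with d[, x2 := x] after setDT, as in case 5 above, is the other route the thread shows working.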