Currently, to do the equivalent (or something similar) of the SQL `having` clause, one first has to write a `[.data.table` call using `by` and then feed its result to the `i` parameter of a second `[.data.table` call, like this:
dt <- data.table(id = rep(1:2, each = 2),
                 var = c(0.2, 0.5, 1.5, 1.3))
# keep the ids whose group mean exceeds 1, then subset on them
dt[id %in% dt[, mean(var) > 1, by = id][(V1), id]]
   id var
1:  2 1.5
2:  2 1.3
Another option is to use a conditional statement in `j`. This is very powerful: I use it all the time, and so far there is nothing it has not let me do with the current syntax. Still, I think a `having` parameter would allow clearer, more readable code. For example, the above can be written as:
dt[, if(mean(var) > 1) .SD, by = id]
What I am proposing is something like this:
dt[, .SD, by = id, having = mean(var) > 1]
The idea is to supply an expression that always evaluates to a logical of length 1 and indicates whether `j` should be evaluated for the current group.
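The proposed semantics can be approximated today with a small wrapper. The sketch below is only an illustration: `having_rows()` is a hypothetical helper, not part of data.table, and it assumes `by` is given as a character vector of column names. It builds the `.I` call with `substitute()` and returns the row numbers of the groups whose condition is a single `TRUE`.

```r
library(data.table)

# Hypothetical helper (not part of data.table): return the row numbers of
# every group for which `cond` evaluates to a single TRUE.
having_rows <- function(dt, cond, by) {
  stopifnot(is.data.table(dt), is.character(by))
  cond <- substitute(cond)
  # Build dt[, .I[isTRUE(<cond>)], by = <by>] and evaluate it here, where
  # `dt` is visible. `.I[TRUE]` keeps the whole group, `.I[FALSE]` drops it.
  call <- substitute(
    dt[, .I[isTRUE(COND)], by = BY],
    list(COND = cond, BY = by)
  )
  eval(call)$V1
}

dt <- data.table(id = rep(1:2, each = 2),
                 var = c(0.2, 0.5, 1.5, 1.3))

kept <- dt[having_rows(dt, mean(var) > 1, by = "id")]
print(kept)
```

Wrapping the condition in `isTRUE()` enforces the "logical of length 1" rule from the proposal: anything other than a single `TRUE` drops the group.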
Thanks, Michele.
Great FR; I have been thinking about this use case for quite a long time. It can be done without an additional argument, like this:
dt[, .SD[mean(var)>1], by=id]
(However, `.SD[.]` needs to be optimised internally to make this fast: #735.)
Using `.I` instead is most likely the fastest option in this case:
dt[dt[, .I[mean(var) > 1], by=id]$V1]
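As a quick check, the three idioms mentioned so far (a conditional in `j`, a logical subset of `.SD`, and row numbers via `.I`) can be run side by side on the example table. This is only a sketch to show that they select the same rows; the variable names are illustrative.

```r
library(data.table)

dt <- data.table(id = rep(1:2, each = 2),
                 var = c(0.2, 0.5, 1.5, 1.3))

# 1. conditional in j: return .SD only for groups that pass
via_if <- dt[, if (mean(var) > 1) .SD, by = id]

# 2. logical subset of .SD: .SD[TRUE] keeps the group, .SD[FALSE] empties it
via_sd <- dt[, .SD[mean(var) > 1], by = id]

# 3. collect row numbers with .I, then subset once
via_i <- dt[dt[, .I[mean(var) > 1], by = id]$V1]

print(via_if)
```

All three keep only the rows of `id == 2`; they differ in how much work is done per group, which is what the speed remarks above are about.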
And it would be great to get this directly (even better if it can be achieved without `having`). But what about the case where the `j` expression evaluates to a 1-column logical vector? Just thinking out loud.
Hi Arun, thanks for the answer. Once the `.SD` optimisation is available, it becomes a matter of taste which of the following reads more clearly:
dt[, .SD[mean(var)>1], by=id]
and
dt[, .SD, by = id, having = mean(var) > 1]
The second may be more appealing to people coming from other languages (especially SQL). Then again, that may just be my opinion; maybe I have used SQL too much in the past (lol).
As for the matter of taste: if something can be done with simple, standard syntax (i.e. the first option above), I really dislike adding a parameter.
Odd. I was sure you would be the one most likely to appreciate this :-) (considering how by-without-by was removed, mainly to improve readability, if I remember correctly). Anyway, I know the two are completely different scenarios; I just wanted to share my view.
Having 15 parameters in `[.data.table` instead of the current 14 would not really hurt. It would be an expression that may skip the execution of `j`.
The reason I did not like having silent by-without-by is actually the same: it was an extra parameter and extra odd behaviour, and I do not like having to remember extra things.
I would argue that the first expression you wrote is much easier to read, because you do not have to read on along the line, discover that a new parameter has been specified, and then go back to the start of the statement to rebuild your mental model of what is going on.
What do you think about turning it into a `having()` function that works in `i`, the way `order()` does, instead of adding a `having` argument to `[`?
dt[ having(var > 1), .(var = mean(var)), by = id ]
# would perform below without additional copy:
dt[, .(var = mean(var)), by = id ][ var > 1 ]
`having` would be a function that evaluates its argument in the frame of `dt` and supplies the filtering to `i`.
I think this FR is closely related to https://github.com/Rdatatable/data.table/issues/1269 (returning only the groups). We often want to fetch the groups that have some attribute and store them in a vector, as with `my_teams` in this SO post. The relevant line is:
my_teams <- FantasyTeams[, max(table(Team)) <= 3, by=team_no][(V1)]$team_no
# or
my_teams <- FantasyTeams[, if ( max(table(Team)) <= 3 ) 1, by=team_no]$team_no
With `having` and the return-only-groups FR, this would become:
my_teams <- FantasyTeams[, .(), by = team_no, having = { max(table(Team)) <= 3 }]$team_no
The code is about the same length, but I prefer it because I do not need to read `j` carefully to understand its purpose.
Another example from SO. The goal is to overwrite the `Value` column with `3L` if a per-group condition is met.
DT = setDT(structure(list(Ind = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L), ID = c("A",
"A", "A", "A", "B", "B", "B", "B"), RegionStart = c(1L, 101L,
1L, 101L, 1L, 101L, 1L, 101L), RegionEnd = c(100L, 200L, 100L,
200L, 100L, 200L, 100L, 200L), Value = c(3L, 2L, 3L, 2L, 3L,
2L, 5L, 5L), TN = c("N", "N", "T", "T", "N", "N", "T", "T")), .Names = c("Ind",
"ID", "RegionStart", "RegionEnd", "Value", "TN"), row.names = c(NA,
-8L), class = "data.frame"))
# current syntax
DT[, Value := {
fixit = ( Value[TN=="N"] != 3L ) & ( uniqueN(Value) == 1L )
if (fixit) 3L else Value
}, by=.(ID, RegionStart)]
# with "having"
DT[,
Value := 3L
, by=.(ID, RegionStart)
, having={ ( Value[TN=="N"] != 3L ) & ( uniqueN(Value) == 1L ) }]
Besides the (probably) nicer syntax, I think the `having=` way would be more efficient, because only a subset of the groups needs to be modified. The most efficient way without `having=` is probably something like...
myeyes = DT[, .I[ ( Value[TN=="N"] != 3L ) & ( uniqueN(Value) == 1L )], by=.(ID, RegionStart)]$V1
DT[ myeyes, Value := 3L]
# or
mygs = DT[, ( Value[TN=="N"] != 3L ) & ( uniqueN(Value) == 1L ), by=.(ID, RegionStart)][(V1)][, V1 := NULL]
DT[ mygs, Value := 3L, on=names(mygs)]
Fairly convoluted.
Edit: And another example, which I will update if/when this feature becomes available: http://stackoverflow.com/q/36292702
(2016/4/26 :) http://stackoverflow.com/q/36869784
(2016/06/16 :) http://stackoverflow.com/q/37855013/
Another example from SO. It could be used to select strictly unique rows (related to #1163).
DT = setDT(structure(list(id = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1,
2, 3, 4), dsp = c(5, 6, 7, 8, 6, 6, 7, 8, 5, 6, 9, 8, 5, 6, 7,
NA), status = c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE,
TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE)), .Names = c("id",
"dsp", "status"), row.names = c(NA, -16L), class = "data.frame"))
# my current way to select "strictly unique" rows
DT[, .N, by=names(DT)][N == 1][, N := NULL][]
# could be...
DT[, .SD, by=names(DT), having ={.N == 1L}]
Note that `DT[, if (.N == 1L) .SD, by=names(DT)]` does not work here, because `.SD` is empty when every column is used in `by`. Maybe #1269 would help with this.
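Since `.SD` is empty when grouping by every column, one way around it with the current syntax is to collect row numbers with `.I` instead. A minimal sketch on a small made-up table (the data and the name `strictly_unique` are illustrative, not from the SO question):

```r
library(data.table)

DT <- data.table(id  = c(1, 1, 2, 3),
                 dsp = c(5, 5, 6, 7))

# Group by every column; .N == 1L marks combinations that occur exactly once,
# and .I returns the row numbers of those groups.
strictly_unique <- DT[DT[, .I[.N == 1L], by = names(DT)]$V1]
print(strictly_unique)
```

Only the rows `(2, 6)` and `(3, 7)` survive, because `(1, 5)` occurs twice.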
And another from SO: http://stackoverflow.com/q/38272608/ They want to select groups based on something in the last row, so `having =` with a test of the condition column at `[.N]` against `"not healthy"` would do it.
Another simple case (filtering by group size): http://stackoverflow.com/q/39085450/
And another, "with an anti-join":
ID <- c("A","A","A","B","B","C","D")
Value <- c(0,1,2,0,2,0,0)
df <- data.frame(ID,Value)
library(data.table)
setDT(df)
# use j = max() to get GForce speedup
df[ !df[, max(Value), by=ID][V1 > 0, .(ID, Value = 0)], on=.(ID, Value)]
# do the more standard thing, if j = if (...) x
df[ !df[, if (max(Value) > 0) .(Value = 0), by=ID], on=.(Value, ID) ]
# desired syntax
df[ !df[, .(Value = 0), by=ID, having = max(Value) > 0], on=.(Value, ID) ]
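Not such a good example, though.
And another, with an answer like `dt[, if(uniqueN(time)==1L) .SD, by=name, .SDcols="time"]`
And another: http://stackoverflow.com/q/43354165/
And another: http://stackoverflow.com/q/43613087/
Another (which may be deleted): http://stackoverflow.com/q/43635968/
Another http://stackoverflow.com/a/43765352/
Another http://chat.stackoverflow.com/transcript/message/37148860#37148860
Another https://stackoverflow.com/questions/45464333/assign-a-binary-vector-based-on-blocks-of-data-within-another-vector/
Another https://stackoverflow.com/questions/32259620/how-to-remove-unique-entry-and-keep-duplicates-in-r/32259758#32259758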
Un autre https://stackoverflow.com/q/45557011/
Haiyou https://stackoverflow.com/questions/45598397/filter-data-frame-matching-all-values-of-a-vector
Um mais https://stackoverflow.com/a/45721286/
lingwai yige https://stackoverflow.com/a/45820567/
And https://stackoverflow.com/q/46251221/
uno mas https://stackoverflow.com/questions/46307315/show-sequences-that-include-a-variable-in-r
tambem https://stackoverflow.com/q/46638058/
And one more: subsetting a data.table (myDT) to the entries that are not in a reference table (idDT):
library(data.table)
idDT = data.table(id = 1:3, v = c("A","B","C"))
myDT = data.table(id = 3:4, z = c("gah","egad"))
# my attempt
idDT[myDT, on=.(id), .SD[.N == 0L], by=.EACHI]
# Empty data.table (0 rows) of 2 cols: id,v
# workaround
myDT[, .SD[idDT[.SD, on=.(id), .N == 0, by=.EACHI]$V1]]
# desired notation (with having=)
myDT[, .SD, by = id, having = idDT[.BY, on=.(id), .N]==0L]
This is inefficient, though, because my desired notation requires a separate join to idDT for each `by=` value. In that sense, it may not be the best example.
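For this particular task, data.table's existing not-join (`!` in `i` together with `on=`) already does the filtering in a single join, which is part of why it is a weak motivating case for `having=`. A minimal sketch with the same two tables (the name `missing_ids` is illustrative):

```r
library(data.table)

idDT <- data.table(id = 1:3, v = c("A", "B", "C"))
myDT <- data.table(id = 3:4, z = c("gah", "egad"))

# Not-join: rows of myDT whose id has no match in idDT.
missing_ids <- myDT[!idDT, on = .(id)]
print(missing_ids)
```

Only the row with `id == 4` is returned, matching the workaround above.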
mais um https://stackoverflow.com/questions/47765283/r-data-table-group-by-where/47765308?noredirect=1#comment82524998_47765308 can be done with `DT[, if (any(status == "A") && !any(status == "B")) .SD, by=id]` or, using the parameter, `DT[, .SD, by=id, having = any(status == "A") && !any(status == "B")]`
Next, https://stackoverflow.com/a/48669032/ `m[, if(isTRUE(any(passed))) .SD, by=id]` would become `m[by = id, having = isTRUE(any(passed))]`
mais um exemplo https://stackoverflow.com/q/49072250/
ein anderer https://stackoverflow.com/a/49211292/ stock_profile[, sum(Value), by=Pcode, having=any(Location=="A" & NoSales == "Y")][, sum(V1)]
mais um https://stackoverflow.com/a/49366998/
autre https://stackoverflow.com/a/49919015/
y https://stackoverflow.com/questions/50257643/deleting-rows-in-r-with-value-less-than-x
One more https://stackoverflow.com/q/54582048
e https://stackoverflow.com/q/56283005
Keep groups where .N == k (the dupe target has many more): https://stackoverflow.com/questions/56794306/only-get-data-table-groups-with-a-given-number-of-rows
Keep groups where (diff(sorted_col)) <= some value https://stackoverflow.com/q/57512417
Keep if max(x) < some value https://stackoverflow.com/a/57698641
@eantonya IMHO, adding a `having` parameter would actually make the code easier to read. Excessive terseness can be hard to read. Besides, making `data.table` look like SQL is not a bad idea.
`data.table` FAQ:
2.16 I have heard that data.table syntax is analogous to SQL.
Yes: ...
@ywhuofu data.table already accepts the `order` function in its `i` argument, which is what base R users expect. _HAVING_ could be handled the same way that SQL _ORDER_ is translated to `i = order(...)`. It fits well, because `i` in a data.frame is used for subsetting (_having_ is just subsetting delayed until after aggregation) or for reordering.
Would this be the API?
dt <- data.table(id = rep(1:2, each = 2),
var = c(0.2, 0.5, 1.5, 1.3))
dt[having.i(mean(var) > 1, by = id)]
id var
1 2 1.5
2 2 1.3
I have implemented this version, but with the restriction that it only supports `gforce`-optimized functions and some functions that do not depend on grouping (`+`, `|`, `&`, etc.). I am not sure whether to support `Cdogroups`.
One additional note: fitting `dt[having(var > 3), .(var = mean(x)), by = .(grp)]` into the current `'[.data.table'` code looks difficult. Several checks would be needed to make sure the syntax is correct.
```
n = 1e6
grps = 1e5
head_n = 2L
dt = data.table::data.table(x = sample(grps, n, TRUE), y = runif(n))

  expression                                                             min   median
1 dt[having.i(.N < 2L | sum(y) > 11 | median(y) < 0.7, by = x)]     114.13ms 124.98ms
2 dt[dt[, .I[.N < 2L | sum(y) > 11 | median(y) < 0.7], by = x]$V1]    4000ms   4000ms

  expression                            min  median itr/sec mem_alloc gc/sec n_itr
1 dt[having.i(.N < 2L, by = x)]      30.2ms  35.3ms    27.9    8.02MB   3.99    14
2 dt[dt[, .I[.N < 2L], by = x]$V1]  106.1ms 110.4ms    8.81    6.13MB   10.6     5
```
I would like this as an additional parameter named `having=` or `group_filter=` (or something else that tells you at a glance what it does without relying on knowledge of SQL).
For example, I think combining a row filter in `i` with a group-level filter in `i` would be confusing.
Would `having =` work on a subset of the data, or could only one of the `i` and `having` arguments be used at a time? Also, I assume `having` would happen before `j` is evaluated. How would `.BY`, `.GRP`, and the rarer `.NGRP` work with `having =`?
There are not many syntactic choices: add a new argument such as `having`, or make use of `i`, `j`, or `by`.
When both a row filter and a group filter are needed, `dt[row_selector & group_selector, ...]` does not look right, so in such use cases the row filter and the group filter should not go into the same argument. That rules out `i`.
That leaves few syntactic choices.
Making use of `by` could cause confusion. For example,
dt[, .SD, by = having(.(id), mean(var) > 1)]
dt[, .SD, by = id ~ mean(var) > 1]
Adding a special function to `j` looks bad:
dt[, having(mean(var) > 1, .SD), by = id]
At this point, the code I find most readable is still the original version:
dt[, if (mean(var) > 1) .SD, by = id]
dt[, if (mean(var) > 1) .(x = sum(x), y = sum(y)), by = id]
What I really want is for optimization to keep working after group filtering. Could we detect the `if` expression in `j` and optimize further, for example by keeping GForce working inside the `if` statement?
@renkun-ken Or overload another infix operator?
dt[, mean(var) > 1 ? .SD, by=id]
One advantage of a special symbol over `if` is that the user cannot follow it with a matching `else`.
@franknarf1 When detecting `if` in `j`, it looks like we can check whether the `if` has an `else if` or `else`. If it is a bare `if`, we optimize it and leave `if-else` unoptimized; the `if-else` case could be handled later. Personally, I would rather optimize the existing code than override or repurpose existing operators.
@franknarf1 That is cool C syntax. I am not sure it would not get too complicated here, but `var > 1 ? d : e` could work as well.
`var > 1 ? d : e` looks concise, but when `d` and `e` are something like `{...}`, operator precedence can get confusing, so it only works for simple inline cases. Is it only meant to let `.SD` do pure group filtering, or to run any expression in `j` there?
Adding syntax has the problem that users need to be aware that it is handled specially and may not work in general inside `j`. For example, users might expect
dt[, mean(var) > 1 ? 0 : (sd(var) < 1 ? 1 : 0), by = id]
and likewise
dt[, mean(var) > 1 ? 0 : 1]
dt[, mean(var) > 1 ? 0 : (sd(var) < 1 ? 1 : 0)]
to work in general.
I am a bit confused here. What advantage does `dt[, .SD, by = id, having = mean(var) > 1]` have over `dt[, if(mean(var) > 1) .SD, by = id]`, given that `mean(var) > 1` seems to be evaluated for every group anyway? Would it only work as syntactic sugar, or are we trying to optimize it somehow to improve performance?
@jangorecki Yes, it is cool, but as @renkun-ken pointed out, operator precedence can get in the way without `{}` (`ex = quote(x & y ? a+b : v+w); str(rapply(as.list(ex), as.list, how="replace"))`).
I think I have preferred `having=` so far because I imagine it is easier to read and maintain than adding more syntactic magic to `j`. On the other hand, I think I might prefer `j` syntax magic instead: I am already used to `if () ...`; I like the `?` way, if feasible; and if it is integrated into `j`, there are no additional questions to answer about its behavior (e.g. `DT[, x := if (cond) y, by=id]` creates NAs when the condition is met in some groups but not in others, and that behavior would not need to be re-explained for `having=`). Regarding optimization, there are many examples where the having condition itself could benefit from some version of GForce, since it is usually an expression like `max(x) > 0` or `max(x) == 0`. In my own use, besides optimization, I think it would mostly help with the return-only-groups case mentioned above, https://github.com/Rdatatable/data.table/issues/1269:
dt[, if (mean(var) > 1) .(), by=id]
# instead of ...
dt[, mean(var) > 1, by=id][V1 == TRUE, !"V1"]
   id
1:  2
Good point, Frank. It adds to the huge mass of use cases you have built up (thanks again for that, by the way!).
Actually, it may be easier to do GForce on the `having=` version: rather than trying to do GForce logic on the `if` inside `j`, we would just apply it to something similar to `j`, with NSE to achieve the same thing.
However, this may interact with Jan's WIP to move much of the `j` code to C.
Thoughts, Jan?
(Reply sent by email to the comment of Saturday, February 15, 2020, 1:40 PM.)
The `j` code being moved to C is only the code responsible for column selection, where the `with` argument is inferred. It does not interfere here.
The FR is about adding a 'having' parameter, so the word `having` should appear somewhere in the solution. Optimizing a ternary operator looks like a separate issue.
My preference for `having.i()` comes from the data.table mantra: subset/order in i, select in j, group in by. `having` is a special case of subsetting.
Anyway, if there is a new `having` argument, will the API support the `i` argument as well? For most use cases that does not seem to be required.
What should the ordering behavior be? Most of the current approaches reorder the result automatically:
library(data.table)
dt = data.table(grp = c(1L, 2L, 1L, 2L), x = letters[1:4])
dt
#> grp x
#> <int> <char>
#> 1: 1 a
#> 2: 2 b
#> 3: 1 c
#> 4: 2 d
dt[dt[, .I[.N > 0L], by = grp]$V1]
#> grp x
#> <int> <char>
#> 1: 1 a
#> 2: 1 c
#> 3: 2 b
#> 4: 2 d
Should the `having` argument return results reordered according to `by`?
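With the `.I` idiom, either ordering is available today, which may help frame the question. The sketch below assumes the same four-row table with deterministic values; `rows`, `by_group` and `in_place` are illustrative names.

```r
library(data.table)

dt <- data.table(grp = c(1L, 2L, 1L, 2L), x = c("a", "b", "c", "d"))

# Row numbers come back group by group, so the subset is ordered by group.
rows <- dt[, .I[.N > 0L], by = grp]$V1
by_group <- dt[rows]

# Sorting the row numbers first restores the original row order.
in_place <- dt[sort(rows)]

print(by_group)
print(in_place)
```

Whether a `having` argument should behave like the first or the second form is exactly the design choice being asked about.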
@ColeMiller1 Fwiw, I think `having=` would only appear when `by=` appears, so the result would be grouped, as with `...$V1` in your example.
Yes, I think the ordering would be consistent:
DT[i, j, by, having]
# <==>
DT[i, if (having) j, by]
I do not think we ever agreed on the API, in particular on adding a new `having` argument to `[`. @mattdowle wdyt?
The current approach using `DT[, if (.N > 1L) .SD, col1]` is a good one: it is not that complicated and it is easy to extend, although it is a little harder to optimize.
My idea was to use `having` as a function call in `i` (`DT[having(N > 1L), .N, col1]`), but then a regular subset cannot be supplied to `i`.
Alternatively, the new argument could be a sub-argument of `by`. I have not thought about it much, but something like `DT[, .N, by=.(col1, .having = N > 1L)]`, so that additional grouping-related arguments are encapsulated in the `by` argument. That is a proper way to scale the number of arguments.