Pandas: BUG: groupby(..., as_index=False) with TimeGrouper

Created on 8 Aug 2017  ·  6 comments  ·  Source: pandas-dev/pandas

Hello,

I am doing some basic aggregation and analysis and ran into a strange bug.
Here is a reprex.

import pandas as pd
import numpy as np


idx2=[pd.to_datetime('2016-08-31 22:08:12.000') , 
     pd.to_datetime('2016-08-31 22:09:12.200'),
     pd.to_datetime('2016-08-31 22:20:12.400')]

test=pd.DataFrame({'quant':[1.0,1.0,3.0], 
                   'quant2':[1.0,1.0,3.0],
                   'time2':[pd.to_datetime('2016-08-31 22:08:12.000') , 
                             pd.to_datetime('2016-08-31 22:09:12.200'),
                             pd.to_datetime('2016-08-31 22:20:12.400')]}, 
                    index=idx2)
test.reset_index(inplace = True)

test
Out[22]: 
                    index  quant  quant2                   time2
0 2016-08-31 22:08:12.000    1.0     1.0 2016-08-31 22:08:12.000
1 2016-08-31 22:09:12.200    1.0     1.0 2016-08-31 22:09:12.200
2 2016-08-31 22:20:12.400    3.0     3.0 2016-08-31 22:20:12.400

df= test.groupby(pd.Grouper(key='time2', freq='1T', closed = 'left', label = 'left'),as_index = False).agg(
                     {'quant' : 'sum',
                      'quant2' : 'sum'})

gives

  File "<ipython-input-20-c09863316397>", line 19, in <module>
    'quant2' : 'sum'})

  File "C:\\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 4036, in aggregate
    return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)

  File "C:\\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 3491, in aggregate
    self._insert_inaxis_grouper_inplace(result)

  File "C:\\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 4090, in _insert_inaxis_grouper_inplace
    self.grouper.get_group_levels(),

  File "C:\\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 1911, in get_group_levels
    if not self.compressed and len(self.groupings) == 1:

AttributeError: 'BinGrouper' object has no attribute 'compressed'

์˜ˆ์ƒ๋˜๋Š” ์ผ์ž…๋‹ˆ๊นŒ? as_index = False ์—์„œ์ด ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•˜๋Š” ์ด์œ ๋Š” ๋ฌด์—‡์ž…๋‹ˆ๊นŒ?

Bug Groupby Resample

All 6 comments

@randomgambit : Thanks for reporting this! Unfortunately, I can't reproduce this because you didn't specify what df is. Could you provide that in your sample code?

@randomgambit : Thanks! Could you move that code into your initial issue report? That way it is easier for us to read at a glance.

@gfyoung any ideas?

cc @jreback

as_index is not supported when using a TimeGrouper.

Your example is simply:

test.resample('1T', on='time2').sum()

I suppose this should work.
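For reference, a minimal sketch of that suggestion, rebuilding the reporter's frame without the extra index. The frequency alias 'min' is an assumption on my part: it is the long-standing spelling of '1T', which newer pandas releases deprecate. That the two forms agree is what the comment above implies, not something the thread itself verifies.

```python
import pandas as pd

# Rebuild the frame from the report (the duplicated 'index' column is omitted).
times = pd.to_datetime([
    "2016-08-31 22:08:12.000",
    "2016-08-31 22:09:12.200",
    "2016-08-31 22:20:12.400",
])
test = pd.DataFrame({
    "quant": [1.0, 1.0, 3.0],
    "quant2": [1.0, 1.0, 3.0],
    "time2": times,
})

# The resample spelling suggested in the thread ('min' assumed in place of '1T').
by_resample = test.resample("min", on="time2").sum()

# The same one-minute binning through groupby, leaving as_index at its default.
by_grouper = test.groupby(
    pd.Grouper(key="time2", freq="min", closed="left", label="left")
).sum()

# Compare the two results and show the bins that actually contain data.
print(by_resample.equals(by_grouper))
print(by_resample[by_resample["quant"] > 0])
```

Both results carry the bin label as a DatetimeIndex named time2 rather than as a column, which is exactly what as_index = False was meant to avoid.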

It has been 3 years since someone called dibs, but if you want a quick workaround, I was able to call .reset_index() after the aggregation to project the index col into its own column.

df.groupby(pd.Grouper(key='date_col', freq='1d')).count().reset_index()
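
Applied to the frame from the original report, that workaround might look like the sketch below. It assumes the 'min' alias in place of '1T' and drops the as_index argument entirely, so take it as an illustration rather than a confirmed fix.

```python
import pandas as pd

# Frame from the original report, without the extra 'index' column.
test = pd.DataFrame({
    "quant": [1.0, 1.0, 3.0],
    "quant2": [1.0, 1.0, 3.0],
    "time2": pd.to_datetime([
        "2016-08-31 22:08:12.000",
        "2016-08-31 22:09:12.200",
        "2016-08-31 22:20:12.400",
    ]),
})

# One-minute bins on 'time2' ('min' assumed as the equivalent of '1T').
grouper = pd.Grouper(key="time2", freq="min", closed="left", label="left")

# Aggregate with the default as_index=True, then move the bin labels
# from the index into an ordinary 'time2' column.
result = (
    test.groupby(grouper)
        .agg({"quant": "sum", "quant2": "sum"})
        .reset_index()
)
print(result.columns.tolist())
```

The result has time2 as a regular column next to the aggregated values, with a default integer index, which is the shape as_index = False would normally produce.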
์ด ํŽ˜์ด์ง€๊ฐ€ ๋„์›€์ด ๋˜์—ˆ๋‚˜์š”?
0 / 5 - 0 ๋“ฑ๊ธ‰