Pandas: set_index with drop not working

Created on 14 Jul 2016 · 6 comments · Source: pandas-dev/pandas

Code sample, a copy-pastable example if possible

from io import StringIO
from pandas import read_csv

dtf = read_csv(StringIO("DATE_TIME,A\n2/8/2015  6:00:30,1"))

print(dtf)

# Pass the column itself (a Series) rather than the label 'DATE_TIME'
dtf.set_index(dtf.DATE_TIME, drop=True, inplace=True)
print(dtf.columns)
print(dtf)

Current output

           DATE_TIME  A
0  2/8/2015  6:00:30  1
Index(['DATE_TIME', 'A'], dtype='object')
                           DATE_TIME  A
DATE_TIME                              
2/8/2015  6:00:30  2/8/2015  6:00:30  1

Expected output

           DATE_TIME  A
0  2/8/2015  6:00:30  1
Index(['A'], dtype='object')
                           A
DATE_TIME                              
2/8/2015  6:00:30  1

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: None
pip: 8.1.2
setuptools: 20.6.7
Cython: None
numpy: 1.11.1
scipy: 0.16.1
statsmodels: None
xarray: None
IPython: 4.0.1
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.0
openpyxl: 2.3.5
xlrd: 1.0.0
xlwt: 1.0.0
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: None
httplib2: 0.9.2
apiclient: 1.5.0
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None
None

Bug Error Reporting Reshaping

Source

VelizarVESSELINOV

All 6 comments

Thanks, this looks like a bug. If the input is a Series sliced from the original frame, the corresponding column should be dropped.

Passing the column name works correctly.

dtf.set_index('DATE_TIME', drop=True, inplace=True)
dtf.columns
# Index(['A'], dtype='object')

sinhrks on 14 Jul 2016

👍2
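
The difference between the two call styles can be reproduced with the short sketch below. It assumes a reasonably recent pandas release (one where `DataFrame.drop` accepts `columns=`); the variable names `by_label`, `by_series` and `fixed` are illustrative only and do not come from the thread. The last step is one possible workaround, not an official recommendation.

```python
from io import StringIO

import pandas as pd

CSV = "DATE_TIME,A\n2/8/2015  6:00:30,1"

# Key given as a column label: the column moves into the index and is dropped.
by_label = pd.read_csv(StringIO(CSV)).set_index("DATE_TIME", drop=True)

# Key given as a Series: its values become the index, but the column stays,
# even with drop=True.
dtf = pd.read_csv(StringIO(CSV))
by_series = dtf.set_index(dtf["DATE_TIME"], drop=True)

# Possible workaround when only the Series is at hand: drop the column explicitly.
fixed = by_series.drop(columns="DATE_TIME")

print(list(by_label.columns))   # ['A']
print(list(by_series.columns))  # ['DATE_TIME', 'A']
print(list(fixed.columns))      # ['A']
```

Passing the label keeps the intent unambiguous, which is why it is the form that behaves as the reporter expected.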

This is not a bug - this violates the guarantees of set_index.

Passing an actual column here is not valid.

It is not the same as actually assigning the index.

jreback on 14 Jul 2016

😕1

There is a PR that attempts to do this, but it is inherently ambiguous.

I am not even sure we can warn about this
(though in my opinion using inplace and drop together is an error).

jreback on 14 Jul 2016

😕1
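
To make the ambiguity concrete, here is a rough sketch of what "drop the column when a Series is passed" might look like. The helper `set_index_from_series` is hypothetical: it is not part of pandas and is not the approach taken by the PR mentioned above. It only drops the same-named column when that column holds exactly the values of the key, which shows why a Series carrying a column's name is not necessarily that column.

```python
import pandas as pd


def set_index_from_series(df, key):
    """Hypothetical helper: index `df` by the Series `key` and drop the
    same-named column only if it is identical to `key`."""
    result = df.set_index(key)
    if key.name in df.columns and df[key.name].equals(key):
        result = result.drop(columns=key.name)
    return result


df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# The key is the frame's own column: dropping "A" is what the reporter expects.
same = set_index_from_series(df, df["A"])

# The key merely shares the name "A" but holds other values: dropping the
# column here would silently discard data, so it is kept.
other = set_index_from_series(df, pd.Series([9, 8], name="A"))

print(list(same.columns))   # ['B']
print(list(other.columns))  # ['A', 'B']
print(list(other.index))    # [9, 8]
```

Whether an equality check like this is the right rule is exactly the kind of judgment call that makes the behavior hard to define inside `set_index` itself.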

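> This is not a bug - this violates the guarantees of set_index.

Could you elaborate on the guarantees of set_index? Explicitly passing drop=True is confusing when no error is raised even though the drop is, for some reason, not allowed or not possible.

michaelaye on 13 Oct 2016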

👍1

Huh?

Passing a list of keys is, by definition, setting the index. However, you may be thinking that [58] is the actual result of [57].

In [55]: df = pd.DataFrame({'A':range(2),'B':range(2),'C':range(2)})

In [56]: df
Out[56]: 
   A  B  C
0  0  0  0
1  1  1  1

In [57]: df.set_index(['A','B'])
Out[57]: 
     C
A B   
0 0  0
1 1  1

In [58]: df.index=['A','B']

In [59]: df
Out[59]: 
   A  B  C
A  0  0  0
B  1  1  1

In [54]: DataFrame.set_index?
Signature: DataFrame.set_index(self, keys, drop=True, append=False, inplace=False, verify_integrity=False)
Docstring:
Set the DataFrame index (row labels) using one or more existing
columns. By default yields a new object.

Parameters
----------
keys : column label or list of column labels / arrays
drop : boolean, default True
    Delete columns to be used as the new index
append : boolean, default False
    Whether to append columns to existing index
inplace : boolean, default False
    Modify the DataFrame in place (do not create a new object)
verify_integrity : boolean, default False
    Check the new index for duplicates. Otherwise defer the check until
    necessary. Setting to False will improve the performance of this
    method

Examples
--------
>>> indexed_df = df.set_index(['A', 'B'])
>>> indexed_df2 = df.set_index(['A', [0, 1, 2, 0, 1, 2]])
>>> indexed_df3 = df.set_index([[0, 1, 2, 0, 1, 2]])

Returns
-------
dataframe : DataFrame

jreback on 13 Oct 2016

😕1
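
The docstring above describes `keys` as "column label or list of column labels / arrays". A minimal sketch of how those two kinds of key differ with respect to `drop`, assuming a recent pandas release and reusing the three-column frame from the session above (the names `by_labels`, `mixed` and `assigned` are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"A": range(2), "B": range(2), "C": range(2)})

# Label keys: the named columns are moved into the index and dropped.
by_labels = df.set_index(["A", "B"])

# Mixed keys: "A" is a label (dropped); the inner list is raw data for a
# second, unnamed index level, so there is no column to drop for it.
mixed = df.set_index(["A", [10, 20]])

# Plain assignment replaces the row labels and leaves every column in place.
assigned = df.copy()
assigned.index = ["A", "B"]

print(list(by_labels.columns))  # ['C']
print(list(mixed.columns))      # ['B', 'C']
print(list(mixed.index.names))  # ['A', None]
print(list(assigned.columns))   # ['A', 'B', 'C']
```

In this reading, `drop` only applies to keys that are column labels; array-like keys are treated as data for the index.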

Are there any plans to fix this?

ron819 on 27 Nov 2018

👍1
