Pandas: df.duplicated and drop_duplicates raise TypeError with set and list values.

Created on 22 Mar 2016  ·  3 comments  ·  Source: pandas-dev/pandas

์—:

import pandas as pd
df = pd.DataFrame([[{'a', 'b'}], [{'b','c'}], [{'b', 'a'}]])
df

Out:

    0
0   {a, b}
1   {c, b}
2   {a, b}

์—:

df.duplicated()

Out:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-77-7cc63ba1ed41> in <module>()
----> 1 df.duplicated()

venv/lib/python3.5/site-packages/pandas/util/decorators.py in wrapper(*args, **kwargs)
     89                 else:
     90                     kwargs[new_arg_name] = new_arg_value
---> 91             return func(*args, **kwargs)
     92         return wrapper
     93     return _deprecate_kwarg

venv/lib/python3.5/site-packages/pandas/core/frame.py in duplicated(self, subset, keep)
   3100 
   3101         vals = (self[col].values for col in subset)
-> 3102         labels, shape = map(list, zip(*map(f, vals)))
   3103 
   3104         ids = get_group_index(labels, shape, sort=False, xnull=False)

TypeError: type object argument after * must be a sequence, not map

I expect:

0    False
1    False
2     True
dtype: bool

pd.show_versions() output:

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.3.0-1-amd64
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: ru_RU.UTF-8

pandas: 0.18.0
nose: None
pip: 1.5.6
setuptools: 18.8
Cython: None
numpy: 1.10.4
scipy: None
statsmodels: None
xarray: None
IPython: 4.1.2
sphinx: None
patsy: None
dateutil: 2.5.1
pytz: 2016.1
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: None
Bug Missing-data

All 3 comments

I guess. You are using list-like values inside a cell of a frame. This is quite inefficient and not generally supported. Pull requests to fix this are accepted in any event.

Current pandas gives a slightly different TypeError (TypeError: unhashable type: 'set'), which does get to the point: how would you deduplicate sets or lists? Unlike tuples and primitive types, these are not hashable (sets could be converted to frozensets, which are hashable), so you have to come up with a deduplication strategy.
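The hashability distinction above can be checked in plain Python, without pandas. This is only a small sketch; is_hashable is a hypothetical helper name introduced for illustration, not part of pandas or the standard library.

```python
def is_hashable(value):
    """Hypothetical helper: return True if hash(value) succeeds."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


samples = {
    "tuple": ("a", "b"),
    "frozenset": frozenset({"a", "b"}),
    "set": {"a", "b"},
    "list": ["a", "b"],
}

# Map each sample's name to whether it can be hashed.
hashable = {name: is_hashable(value) for name, value in samples.items()}
print(hashable)

# Equal frozensets compare equal regardless of construction order,
# which is the property a hash-based deduplication relies on.
same = frozenset({"a", "b"}) == frozenset({"b", "a"})
print(same)
```

Note that a tuple is only hashable when all of its elements are, so a tuple containing a list would still fail.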

In any case, since you are dealing with an object dtype, there is no guarantee that the next row won't contain a set or a list, so deduplication only gets worse from there. pandas therefore treats each value separately and processes the values as long as they are hashable. Try a column with three tuples and it will work; change the last one to a set and it will fail on that very value.

๋”ฐ๋ผ์„œ ๋ชฉ๋ก์— ํ•ด์‹œ ๊ฐ€๋Šฅ์„ฑ์ด ์—†๋‹ค๋Š” ์ ์„ ๊ฐ์•ˆํ•  ๋•Œ ์—ฌ๊ธฐ์—์„œ ์ž‘๋™ํ•˜๋Š” ๊ฒฌ๊ณ ํ•œ ๊ตฌํ˜„์ด ์žˆ๋Š”์ง€ ํ™•์‹คํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ํ•ด์‹œ ๋งต ์‚ฝ์ž… ์‹œ frozenset์œผ๋กœ ๋ณ€ํ™˜๋˜๋Š” ์„ธํŠธ์— ๋Œ€ํ•œ ์ˆ˜์ •์ด ์ž ์žฌ์ ์œผ๋กœ ์žˆ์„ ์ˆ˜ ์žˆ์ง€๋งŒ ์ด๋Š” ํ•ดํ‚น๋˜๊ณ  ์ž„์˜์ ์œผ๋กœ ๋ณด์ž…๋‹ˆ๋‹ค. .

How about ignoring unhashable columns for the purposes of dropping duplicates?
For example, add a kwarg 'unhashable_type' whose default is 'raise' (the current behavior) but which can be set to 'ignore' (at the risk of dropping rows that are not entirely duplicated).

์ด ํŽ˜์ด์ง€๊ฐ€ ๋„์›€์ด ๋˜์—ˆ๋‚˜์š”?
0 / 5 - 0 ๋“ฑ๊ธ‰