Pandas: Cannot use apply on a Series with Timestamp values

에 λ§Œλ“  2017λ…„ 08μ›” 03일  Β·  3μ½”λ©˜νŠΈ  Β·  좜처: pandas-dev/pandas

Code sample, a copy-pastable example if possible

import pandas as pd

ts = pd.Series([pd.Timestamp('2017-07-31 20:08:46.110998-04:00'), 
                pd.Timestamp('2017-08-01 20:08:46.110998-04:00'), 
                pd.Timestamp('2017-08-02 20:08:46.110998-04:00')])

def func(elem):
    print(type(elem))
    return elem
print(type(ts))
print(type(ts[0]))

ts.apply(func);

# Prints out:
# <class 'pandas.core.series.Series'>
# <class 'pandas._libs.tslib.Timestamp'>
# <class 'pandas.core.indexes.datetimes.DatetimeIndex'>

Problem description

I have a Series with Timestamps as its values rather than as its index. I expect the apply method to be called on each element, but it is not; instead it is called on a DatetimeIndex.

Expected output





Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.6.0.final.0
python-bits : 64
OS : Darwin
OS-release : 16.7.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_CA.UTF-8
LOCALE : en_CA.UTF-8

pandas : 0.20.2
pytest : 3.0.5
pip : 9.0.1
setuptools : 35.0.1
Cython : None
numpy : 1.13.0
scipy : 0.19.1
xarray : 0.9.6
IPython : 6.0.0
sphinx : 1.5.3
patsy : None
dateutil : 2.6.0
pytz : 2016.10
blosc : None
bottleneck : 1.2.0
tables : None
numexpr : None
feather : None
matplotlib : 2.0.0
openpyxl : 2.4.8
xlrd : 1.0.0
xlwt : None
xlsxwriter : 0.9.8
lxml : None
bs4 : None
html5lib : 0.999
sqlalchemy : None
pymysql : None
psycopg2 : None
jinja2 : 2.9.5
s3fs : None
pandas_gbq : None
pandas_datareader : None

Timezones Usage Question

All 3 comments

Also, some information about my use case:

I want to apply the tz_localize method to each Timestamp in the Series. I originally tried tz_localize on the Series itself, but got:

TypeError: index is not a valid DatetimeIndex or PeriodIndex

I know I can achieve this with reindex, but I am wondering whether it can be done with the Timestamps as Series values.

@nathanielatom you can use tz_localize / tz_convert on a Series through the dt accessor:

In [19]: ts.dt.tz_convert('UTC')
Out[19]: 
0   2017-08-01 00:08:46.110998+00:00
1   2017-08-02 00:08:46.110998+00:00
2   2017-08-03 00:08:46.110998+00:00
dtype: datetime64[ns, UTC]
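
For the tz_localize case asked about above, a minimal sketch may help. It assumes naive (timezone-unaware) timestamps and uses "America/New_York" purely as an illustrative zone; the variable names are hypothetical and not from the original report.

```python
import pandas as pd

# Hypothetical naive (timezone-unaware) timestamps stored as Series values.
naive = pd.Series(pd.to_datetime(["2017-07-31 20:08:46", "2017-08-01 20:08:46"]))

# Series.tz_localize acts on the index; the default integer index is not
# datetime-like, so this raises the TypeError quoted above.
index_error = None
try:
    naive.tz_localize("America/New_York")
except TypeError as exc:
    index_error = exc

# The .dt accessor acts on the values instead.
localized = naive.dt.tz_localize("America/New_York")
as_utc = localized.dt.tz_convert("UTC")
print(as_utc)
```

Note that tz_localize attaches a zone to naive values, while tz_convert moves already-aware values to another zone, so the two are usually chained in this order.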

Also, the reason you get the output shown with apply is that apply first tries to call the function on all the values at once (here they arrive as a DatetimeIndex, as the printed type shows), and only if that fails does it call the function on each element.

If you adapt the function slightly so that it raises when it does not receive a scalar value, you get the expected output:

In [21]: def func(elem):
    ...:     assert not hasattr(elem, 'ndim')
    ...:     print(type(elem))
    ...:     return elem
    ...: 

In [22]: ts.apply(func)
<class 'pandas._libs.tslib.Timestamp'>
<class 'pandas._libs.tslib.Timestamp'>
<class 'pandas._libs.tslib.Timestamp'>
Out[22]: 
0   2017-07-31 20:08:46.110998-04:00
1   2017-08-01 20:08:46.110998-04:00
2   2017-08-02 20:08:46.110998-04:00
dtype: datetime64[ns, pytz.FixedOffset(-240)]
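
If relying on apply's fallback feels fragile, one alternative is to iterate over the Series explicitly. This is only a sketch: type_name is a hypothetical helper, and how apply dispatches may differ between pandas versions, whereas plain iteration hands the function one value at a time.

```python
import pandas as pd

ts = pd.Series([
    pd.Timestamp("2017-07-31 20:08:46.110998-04:00"),
    pd.Timestamp("2017-08-01 20:08:46.110998-04:00"),
    pd.Timestamp("2017-08-02 20:08:46.110998-04:00"),
])

def type_name(elem):
    # Hypothetical helper: report the type of whatever it receives.
    return type(elem).__name__

# Iterating over the Series yields one Timestamp per value, so the function
# is never handed the whole array.
names = pd.Series([type_name(elem) for elem in ts], index=ts.index)
print(names.tolist())
```

This trades vectorized speed for predictability, so it is best kept for small Series or for functions that genuinely need scalar input.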
이 νŽ˜μ΄μ§€κ°€ 도움이 λ˜μ—ˆλ‚˜μš”?
0 / 5 - 0 λ“±κΈ‰