Pandas: Cannot use apply on Series with Timestamp values

Created on 3 Aug 2017  ·  3 Comments  ·  Source: pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import pandas as pd

ts = pd.Series([pd.Timestamp('2017-07-31 20:08:46.110998-04:00'), 
                pd.Timestamp('2017-08-01 20:08:46.110998-04:00'), 
                pd.Timestamp('2017-08-02 20:08:46.110998-04:00')])

def func(elem):
    print(type(elem))
    return elem

print(type(ts))
print(type(ts[0]))

ts.apply(func);

# Prints out:
# <class 'pandas.core.series.Series'>
# <class 'pandas._libs.tslib.Timestamp'>
# <class 'pandas.core.indexes.datetimes.DatetimeIndex'>

Problem description

I have a Series whose values (not its index) are Timestamps. I expected apply to call the function once per element, but instead the function is called once with a DatetimeIndex.

Expected Output





Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.20.2
pytest: 3.0.5
pip: 9.0.1
setuptools: 35.0.1
Cython: None
numpy: 1.13.0
scipy: 0.19.1
xarray: 0.9.6
IPython: 6.0.0
sphinx: 1.5.3
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: None
feather: None
matplotlib: 2.0.0
openpyxl: 2.4.8
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.8
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.5
s3fs: None
pandas_gbq: None
pandas_datareader: None

Labels: Timezones, Usage Question

All 3 comments

Also, some info on my use case:

I want to apply the tz_localize method to each Timestamp in the series. I originally tried tz_localize on the series itself, but that raised

TypeError: index is not a valid DatetimeIndex or PeriodIndex

I realize this can be achieved by using reindex, but I was wondering whether it is also possible when the Timestamps are the Series values.

@nathanielatom you can use tz_localize/tz_convert on the Series through the dt accessor:

In [19]: ts.dt.tz_convert('UTC')
Out[19]: 
0   2017-08-01 00:08:46.110998+00:00
1   2017-08-02 00:08:46.110998+00:00
2   2017-08-03 00:08:46.110998+00:00
dtype: datetime64[ns, UTC]
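
For the tz_localize case from the question, a minimal sketch along the same lines is shown below. It assumes timezone-naive values and uses "America/New_York" as an illustrative zone; neither appears in the original report, where the timestamps already carry a -04:00 offset. Series.tz_localize operates on the index, which is why it raises here, while the dt accessor operates on the values.

```python
import pandas as pd

# Timezone-naive timestamps stored as Series values, not as the index.
# (Illustrative data; the Series in the report is already timezone-aware.)
naive = pd.Series(pd.to_datetime(["2017-07-31 20:08:46", "2017-08-01 20:08:46"]))

# Series.tz_localize targets the index, which is a plain integer index here.
try:
    naive.tz_localize("America/New_York")
except TypeError as exc:
    print("Series.tz_localize failed:", exc)

# The .dt accessor localizes the values instead.
localized = naive.dt.tz_localize("America/New_York")

# Once the values are timezone-aware, .dt.tz_convert switches the zone.
as_utc = localized.dt.tz_convert("UTC")

print(localized.dtype)
print(as_utc.iloc[0])
```

tz_localize attaches a zone to naive values, and tz_convert only works on values that already have one, so which of the two to call depends on how the Series was built.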

Further, the reason you see that output with apply is that apply first tries to invoke the function on all the values at once (under the hood they are held as a DatetimeIndex, even though they are the values of the Series), and only if that fails does it call the function on each element.

If you adapt the function slightly so that it raises when it does not receive a scalar value, you get the expected output:

In [21]: def func(elem):
    ...:     assert not hasattr(elem, 'ndim')
    ...:     print(type(elem))
    ...:     return elem
    ...: 

In [22]: ts.apply(func)
<class 'pandas._libs.tslib.Timestamp'>
<class 'pandas._libs.tslib.Timestamp'>
<class 'pandas._libs.tslib.Timestamp'>
Out[22]: 
0   2017-07-31 20:08:46.110998-04:00
1   2017-08-01 20:08:46.110998-04:00
2   2017-08-02 20:08:46.110998-04:00
dtype: datetime64[ns, pytz.FixedOffset(-240)]
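
As a rough sketch of how the guard could be combined with the original use case, the example below converts each element to UTC and compares the result with the dt accessor. The helper name to_utc and the isinstance check are assumptions made for illustration (a variant of the hasattr(elem, 'ndim') assertion above), and whether apply ever passes the whole array first may depend on the pandas version.

```python
import pandas as pd

ts = pd.Series([
    pd.Timestamp("2017-07-31 20:08:46.110998-04:00"),
    pd.Timestamp("2017-08-01 20:08:46.110998-04:00"),
    pd.Timestamp("2017-08-02 20:08:46.110998-04:00"),
])

def to_utc(elem):
    # Hypothetical helper: refuse anything that is not a single Timestamp,
    # so the function only ever does per-element work.
    if not isinstance(elem, pd.Timestamp):
        raise TypeError("expected a scalar Timestamp, got %s" % type(elem))
    return elem.tz_convert("UTC")

# Element-wise conversion through apply.
elementwise = ts.apply(to_utc)

# Vectorized conversion through the dt accessor.
vectorized = ts.dt.tz_convert("UTC")

# Both routes should agree on every value.
print((elementwise == vectorized).all())
```

When the operation exists on the dt accessor, the vectorized form is usually the simpler choice; the guarded apply is mainly useful for functions that have no vectorized equivalent.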