Pandas: Interpolate (upsample) non-equispaced timeseries into equispaced 18.0rc1

Created on 7 Mar 2016  ·  3 Comments  ·  Source: pandas-dev/pandas

I want to interpolate (upsample) a non-equispaced time series to obtain an equispaced time series.

Currently I am doing it in the following way:

  1. Take the original time series.
  2. Create a new time series with NaN values at 30-second intervals (using resample('30S').asfreq()).
  3. Concatenate the original time series and the new time series.
  4. Sort the result to restore chronological order (this I do not like: sorting has O(n log n) complexity).
  5. Interpolate.
  6. Remove the original points from the time series.
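The six steps above can be sketched roughly as follows. The data is made up for illustration (samples lying on a straight line, so the result is easy to check), and the names `grid`, `placeholders`, and `equispaced` are chosen for this sketch only.

```python
import numpy as np
import pandas as pd

# Step 1: original, non-equispaced series (illustrative data, value = seconds - 10)
ts = pd.Series(
    [0.0, 40.0, 90.0, 115.0],
    index=pd.to_datetime([
        "2015-01-04 00:00:10",
        "2015-01-04 00:00:50",
        "2015-01-04 00:01:40",
        "2015-01-04 00:02:05",
    ]),
)

# Step 2: NaN placeholders on a regular 30-second grid
grid = ts.resample("30s").asfreq().index
# Skip grid points that coincide with original timestamps to avoid duplicates
placeholders = pd.Series(np.nan, index=grid.difference(ts.index))

# Steps 3 and 4: concatenate, then sort to restore chronological order
combined = pd.concat([ts, placeholders]).sort_index()

# Step 5: interpolate linearly in time (uses the actual timestamps as x values)
interpolated = combined.interpolate(method="time")

# Step 6: keep only the grid points, dropping the original samples
equispaced = interpolated.reindex(grid)
print(equispaced)
```

The first grid point (00:00:00) lies before the first sample, so it stays NaN; `interpolate` does not extrapolate backwards by default.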

Is there a simpler way? In MATLAB, you keep the original time series and pass the new times as a parameter to the interpolation function to receive values at the desired times. Ideally I would like to have a function such as

origTimeSeries.interpolate(newIndex=newTimeIndex, method='spline')

Note that the times of the original time series might not be a subset of the times of the desired time series.
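Something close to the requested call can be approximated with a small helper built on an index union. This is only a sketch: `interpolate_onto` is a hypothetical name, not a pandas function, and it assumes the original index has no duplicate timestamps. The union still has to order the timestamps internally, so it mainly hides the sort rather than removing it.

```python
import pandas as pd


def interpolate_onto(series, new_index, method="time", **kwargs):
    """Interpolate `series` at the timestamps in `new_index` (hypothetical helper).

    Works whether or not the original timestamps are a subset of `new_index`.
    """
    # The union keeps every original sample plus every requested timestamp, in order
    union = series.index.union(new_index)
    # New timestamps become NaN, get filled by interpolation, then are selected
    return series.reindex(union).interpolate(method=method, **kwargs).reindex(new_index)


# Illustrative data: value = seconds - 10
ts = pd.Series(
    [0.0, 40.0, 90.0, 115.0],
    index=pd.to_datetime([
        "2015-01-04 00:00:10",
        "2015-01-04 00:00:50",
        "2015-01-04 00:01:40",
        "2015-01-04 00:02:05",
    ]),
)
new_index = pd.date_range("2015-01-04 00:00:30", periods=4, freq="30s")
result = interpolate_onto(ts, new_index)
print(result)
```

Extra keyword arguments are forwarded to `Series.interpolate`, so `method='spline'` should be usable by also passing `order=...`; that path needs SciPy installed and is not exercised here.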

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

values = [271238, 329285, 50, 260260, 263711]
timestamps = pd.to_datetime(['2015-01-04 08:29:4',
                             '2015-01-04 08:37:05',
                             '2015-01-04 08:41:07',
                             '2015-01-04 08:43:05',
                             '2015-01-04 08:49:05'])

ts = pd.Series(values, index=timestamps)
ts[ts == -1] = np.nan

lines, labels = plt.gca().get_legend_handles_labels()
labels = ['original values (nonequispaced)', 'original + interpolated at new frequency (nonequispaced)', 'interpolated values without original values (equispaced!)']
plt.legend(lines, labels, loc='best')


Labels: Enhancement, Resample, Timeseries

All 3 comments

Use ordered_merge rather than concat and sort.
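A rough sketch of that suggestion follows. As far as I know, `ordered_merge` is exposed as `pd.merge_ordered` in later pandas releases, so that name is used here. The column names `time` and `value` and the sample data are assumptions made for illustration.

```python
import pandas as pd

# Illustrative data: value = seconds - 10
ts = pd.Series(
    [0.0, 40.0, 90.0, 115.0],
    index=pd.to_datetime([
        "2015-01-04 00:00:10",
        "2015-01-04 00:00:50",
        "2015-01-04 00:01:40",
        "2015-01-04 00:02:05",
    ]),
)
grid = pd.date_range("2015-01-04 00:00:30", periods=4, freq="30s")

# merge_ordered works on DataFrames, so move the timestamps into a column
orig_df = ts.rename("value").rename_axis("time").reset_index()
grid_df = pd.DataFrame({"time": grid})

# Outer join by default; the result comes back ordered by the key column
merged = pd.merge_ordered(orig_df, grid_df, on="time")

# Back to a time-indexed Series, then interpolate and keep only the grid
combined = merged.set_index("time")["value"]
equispaced = combined.interpolate(method="time").reindex(grid)
print(equispaced)
```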

It would be nice to do it without a merge altogether, since I do not really need the merged time series; I only need the resulting equispaced time series. Is the way I described (enhanced with ordered_merge) the most efficient way to do this? Maybe using scipy directly would be better, then.
scipy allows doing it MATLAB style: keep the original time series and pass the new index to obtain the new time series.
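The "pass the new times directly" style can be sketched without any merge by handing plain numeric arrays to an interpolation routine. The example below uses `numpy.interp` (linear only) as a stand-in so that it runs without SciPy. SciPy's interpolators take the same kind of numeric x/y arrays and would be the place to look for splines. The data and variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative data: value = seconds - 10
ts = pd.Series(
    [0.0, 40.0, 90.0, 115.0],
    index=pd.to_datetime([
        "2015-01-04 00:00:10",
        "2015-01-04 00:00:50",
        "2015-01-04 00:01:40",
        "2015-01-04 00:02:05",
    ]),
)
new_index = pd.date_range("2015-01-04 00:00:30", periods=4, freq="30s")

# Convert timestamps to seconds elapsed since the first sample
origin = ts.index[0]
x_old = (ts.index - origin).total_seconds().to_numpy()
x_new = (new_index - origin).total_seconds().to_numpy()

# Linear interpolation at the new times; NaN outside the sampled range
y_new = np.interp(x_new, x_old, ts.to_numpy(), left=np.nan, right=np.nan)
equispaced = pd.Series(y_new, index=new_index)
print(equispaced)
```

Without the `left` and `right` arguments, `np.interp` returns the first or last sample value for times outside the sampled range instead of NaN.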

Also, I will be working with online data, so the original time series will grow and I will need to interpolate the new data and append the results to the interpolated (equispaced) time series.
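For the online case, one possible approach is to keep the last original sample around and interpolate only the grid points that a new chunk covers. The sketch below assumes linear interpolation in time, a non-empty equispaced series, and that the equispaced series already reaches the last grid point at or before the previous chunk's final sample. `extend_equispaced` is a hypothetical helper name and the data is made up.

```python
import pandas as pd


def extend_equispaced(equispaced, last_sample, new_chunk, freq="30s"):
    """Append grid points covered by `new_chunk` (hypothetical helper).

    `last_sample` is a one-element Series holding the final original sample of
    the previous chunk; it bridges the gap between the two chunks.
    """
    known = pd.concat([last_sample, new_chunk])
    start = equispaced.index[-1] + pd.Timedelta(freq)
    # Only grid points that are bracketed by known samples
    new_grid = pd.date_range(start, known.index[-1], freq=freq)
    union = known.index.union(new_grid)
    fresh = known.reindex(union).interpolate(method="time").reindex(new_grid)
    # Return the extended series and the sample to carry into the next call
    return pd.concat([equispaced, fresh]), known.iloc[[-1]]


# First chunk (illustrative data: value = seconds - 10)
first_chunk = pd.Series(
    [0.0, 40.0, 90.0],
    index=pd.to_datetime([
        "2015-01-04 00:00:10",
        "2015-01-04 00:00:50",
        "2015-01-04 00:01:40",
    ]),
)
grid = pd.date_range("2015-01-04 00:00:30", "2015-01-04 00:01:30", freq="30s")
equispaced = (
    first_chunk.reindex(first_chunk.index.union(grid))
    .interpolate(method="time")
    .reindex(grid)
)
last_sample = first_chunk.iloc[[-1]]

# A later chunk arrives and only the newly covered grid points are computed
new_chunk = pd.Series(
    [115.0, 150.0],
    index=pd.to_datetime(["2015-01-04 00:02:05", "2015-01-04 00:02:40"]),
)
equispaced, last_sample = extend_equispaced(equispaced, last_sample, new_chunk)
print(equispaced)
```

Carrying a single sample is enough for linear interpolation. A spline would need more context from the previous chunk, and its earlier values could change as new data arrives.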

This gets you pretty close:

In [42]: ts.reindex(ts.resample('60s').asfreq().index, method='nearest', tolerance=pd.Timedelta('60s')).interpolate('time')
2015-01-04 08:29:00    271238.000000
2015-01-04 08:30:00    271238.000000
2015-01-04 08:31:00    279530.428571
2015-01-04 08:32:00    287822.857143
2015-01-04 08:33:00    296115.285714
2015-01-04 08:34:00    304407.714286
2015-01-04 08:35:00    312700.142857
2015-01-04 08:36:00    320992.571429
2015-01-04 08:37:00    329285.000000
2015-01-04 08:38:00    329285.000000
2015-01-04 08:39:00    219540.000000
2015-01-04 08:40:00    109795.000000
2015-01-04 08:41:00        50.000000
2015-01-04 08:42:00        50.000000
2015-01-04 08:43:00    260260.000000
2015-01-04 08:44:00    260260.000000
2015-01-04 08:45:00    260950.200000
2015-01-04 08:46:00    261640.400000
2015-01-04 08:47:00    262330.600000
2015-01-04 08:48:00    263020.800000
2015-01-04 08:49:00    263711.000000
Freq: 60S, dtype: float64