python - pandas df.apply TypeError: data type not understood
I'm trying to apply an operation to every value in a datetime series. I've reduced the lambda to a print to illustrate the problem. The same call works on a similar DataFrame, so why not on this one? Python version 3.5.1, pandas version 0.17.1.
Some more padding to satisfy the question verbosity requirement.
print(dfy.info())
print(dfy)
dfy.apply(lambda rr: print(rr['predicted_time']), 1)
Output:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 21 entries, 0 to 20
Data columns (total 1 columns):
predicted_time    21 non-null datetime64[ns, pytz.FixedOffset(60)]
dtypes: datetime64[ns, pytz.FixedOffset(60)](1)
memory usage: 336.0 bytes
None
               predicted_time
0   2005-02-01 02:40:00+01:00
1   2005-02-01 02:40:00+01:00
2   2005-02-01 02:40:00+01:00
3   2005-02-01 02:40:00+01:00
4   2005-02-01 02:43:00+01:00
5   2005-02-01 02:43:00+01:00
6   2005-02-01 02:43:00+01:00
<snip>
19  2005-02-01 02:50:00+01:00
20  2005-02-01 02:50:00+01:00
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-43-8ae0cf570812> in <module>()
      1 print(dfy.info())
      2 print(dfy)
----> 3 dfy.apply(lambda rr: print(rr['predicted_time']), 1)

/.../projects/software/timetillcomplete/venv/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   3970                 if reduce is None:
   3971                     reduce = True
-> 3972                 return self._apply_standard(f, axis, reduce=reduce)
   3973             else:
   3974                 return self._apply_broadcast(f, axis)

/.../projects/software/timetillcomplete/venv/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4017             # Create a dummy Series from an empty array
   4018             index = self._get_axis(axis)
-> 4019             empty_arr = np.empty(len(index), dtype=values.dtype)
   4020             dummy = Series(empty_arr, index=self._get_axis(axis),
   4021                            dtype=values.dtype)

TypeError: data type not understood
I don't know what's going on, but as a workaround I can get the expected output by calling apply() on the column:
dfy['predicted_time'].apply(lambda rr: print(rr))
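For anyone who wants to try the column-wise workaround without my data, here is a minimal sketch. The frame below is a hypothetical stand-in for dfy (a single time zone aware column with a fixed +01:00 offset), built with the standard library's datetime.timezone rather than pytz, and the lambda returns the minute instead of printing so the result can be inspected. I have only reasoned about this against the documented behaviour of Series.apply, not against every pandas version.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for dfy: one tz-aware datetime column at +01:00.
# (The original frame used pytz.FixedOffset(60); datetime.timezone is an
# assumption made here to keep the example dependency-free.)
plus_one = datetime.timezone(datetime.timedelta(hours=1))
times = pd.date_range("2005-02-01 02:40", periods=5, freq="min", tz=plus_one)
dfy = pd.DataFrame({"predicted_time": times})

# Column-wise apply: the function receives each Timestamp directly,
# so no row Series has to be built for the tz-aware dtype.
minutes = dfy["predicted_time"].apply(lambda ts: ts.minute)
print(minutes.tolist())
```

Because the function is called on the Series rather than on each row of the DataFrame, this avoids the _apply_standard code path shown in the traceback.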
Edit: It looks like I hit a bug in pandas. The issue is triggered by using time zone aware timestamps in a DataFrame. Using a Series works, as seen above. Using naive timestamps also works:
# .values drops the time zone, leaving naive datetime64[ns] values
df = pd.DataFrame(pd.Series(dfy['predicted_time'].values), columns=['predicted_time'])
df.apply(lambda rr: print(rr['predicted_time']), 1)
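One caveat with going through .values: as far as I can tell it hands back the timestamps converted to UTC, so the wall-clock times shift by the offset. A possible alternative, sketched below under the assumption that the .dt accessor is available in your pandas version, is to strip the time zone explicitly. The frame is again a hypothetical stand-in for dfy, and the lambda returns the hour rather than printing.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for dfy with a fixed +01:00 offset.
plus_one = datetime.timezone(datetime.timedelta(hours=1))
times = pd.date_range("2005-02-01 02:40", periods=3, freq="min", tz=plus_one)
dfy = pd.DataFrame({"predicted_time": times})

# tz_localize(None) drops the zone but keeps the local wall-clock time.
naive_local = dfy["predicted_time"].dt.tz_localize(None)

# tz_convert(None) converts to UTC first, then drops the zone.
naive_utc = dfy["predicted_time"].dt.tz_convert(None)

# Row-wise apply on a frame that holds only naive timestamps.
df = pd.DataFrame({"predicted_time": naive_local})
hours = df.apply(lambda rr: rr["predicted_time"].hour, axis=1)
print(hours.tolist())
```

Which of the two conversions is appropriate depends on whether the later processing expects local times or UTC.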