python - Rolling standard deviation using parts of data in dataframe with Pandas
I would like to be able to calculate a rolling standard deviation based on part of the data in a DataFrame. The example below explains what I want to accomplish.
                   a         b         c
2000-01-01  0.425615  1.679789 -1.903056
2000-01-02  0.791313  0.562471  0.098124
2000-01-03  1.223165 -0.548387 -1.558204
2000-01-04  0.354931 -0.685773  0.647817
2000-01-05  1.137434  1.000594  0.428180
2000-01-06 -0.265311 -1.807045  0.533477
2000-01-07  0.717505  1.647540 -0.141123
2000-01-08 -2.405664  1.278410  1.043872
2000-01-09  0.463943  0.982042 -0.382241
2000-01-10 -0.403267 -0.615421  0.583384
2000-01-11 -0.714163  0.470505 -0.291396
2000-01-12  0.209979 -0.118331 -0.369776
2000-01-13 -0.779638  0.924612 -0.477497
2000-01-14  0.149868 -0.376292  0.747637
2000-01-15 -0.464360  0.821400  1.412874
This is what I would like to be able to do:
- The calculation should be done on a rolling basis for each column.
- I want to calculate the rolling standard deviation using data from every n-th date in the DataFrame. If n=3, I want to calculate the standard deviation for 2000-01-15 using the values from the following dates: 2000-01-15, 2000-01-12, 2000-01-09, 2000-01-06, 2000-01-03. For 2000-01-14, I would use 2000-01-14, 2000-01-11, 2000-01-08, 2000-01-05, 2000-01-02. The same logic applies to the other dates in the rolling standard deviation.
- It would be great if this logic could be applied to other calculations. I can't figure out how to switch between different resolutions of time.
>>> window_step_size = 3
>>> rolling_window = 3
>>> pd.rolling_std(df.ix[df.index[::-1][::window_step_size][::-1]], window=rolling_window)
                   a         b         c
2000-01-03       NaN       NaN       NaN
2000-01-06       NaN       NaN       NaN
2000-01-09  0.744288  1.396749  1.048535
2000-01-12  0.370182  1.404848  0.525129
2000-01-15  0.479753  0.594379  1.032831
df.index[::-1] reverses the dates in the index so that the most recent date comes first. df.index[::-1][::window_step_size] takes every n-th value of the index (e.g. every third date). Finally, df.index[::-1][::window_step_size][::-1] re-sorts the index so that the oldest date comes first.
>>> df.index[::-1][::window_step_size][::-1]
Index([u'2000-01-03', u'2000-01-06', u'2000-01-09', u'2000-01-12', u'2000-01-15'], dtype='object')
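The three slicing steps can be tried in isolation. The following is a minimal sketch that assumes a daily DatetimeIndex built with pd.date_range rather than the string index shown in the post; the variable names newest_first, every_nth and oldest_first are made up for illustration.

```python
import pandas as pd

# Assumption: a 15-day daily index covering the same dates as the example.
idx = pd.date_range("2000-01-01", periods=15, freq="D")
window_step_size = 3

newest_first = idx[::-1]                      # most recent date first
every_nth = newest_first[::window_step_size]  # every third date, counted from the end
oldest_first = every_nth[::-1]                # back to chronological order

selected_dates = [d.strftime("%Y-%m-%d") for d in oldest_first]
print(selected_dates)
# ['2000-01-03', '2000-01-06', '2000-01-09', '2000-01-12', '2000-01-15']
```

Slicing from the reversed index is what guarantees that the most recent date is always part of the selection, regardless of how many rows the DataFrame has.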
Based on this new index, select the values from the DataFrame:
>>> df.ix[df.index[::-1][::window_step_size][::-1]]
                   a         b         c
2000-01-03  1.223165 -0.548387 -1.558204
2000-01-06 -0.265311 -1.807045  0.533477
2000-01-09  0.463943  0.982042 -0.382241
2000-01-12  0.209979 -0.118331 -0.369776
2000-01-15 -0.464360  0.821400  1.412874
You can then use the regular pd.rolling_std function with the chosen rolling window.
pd.rolling_std(df.ix[df.index[::-1][::window_step_size][::-1]], window=rolling_window)
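pd.rolling_std and the .ix indexer were removed from later pandas releases, so the line above may not run on a current installation. A sketch of what should be an equivalent call with .loc and .rolling(...).std() follows. It assumes the three columns are named a, b and c and that the index is a DatetimeIndex; the data values are copied from the question.

```python
import pandas as pd

# Data copied from the question; the column names and DatetimeIndex are assumptions.
df = pd.DataFrame(
    {
        "a": [0.425615, 0.791313, 1.223165, 0.354931, 1.137434,
              -0.265311, 0.717505, -2.405664, 0.463943, -0.403267,
              -0.714163, 0.209979, -0.779638, 0.149868, -0.464360],
        "b": [1.679789, 0.562471, -0.548387, -0.685773, 1.000594,
              -1.807045, 1.647540, 1.278410, 0.982042, -0.615421,
              0.470505, -0.118331, 0.924612, -0.376292, 0.821400],
        "c": [-1.903056, 0.098124, -1.558204, 0.647817, 0.428180,
              0.533477, -0.141123, 1.043872, -0.382241, 0.583384,
              -0.291396, -0.369776, -0.477497, 0.747637, 1.412874],
    },
    index=pd.date_range("2000-01-01", periods=15, freq="D"),
)

window_step_size = 3
rolling_window = 3

# Every third date, counted back from the most recent one, oldest first.
sampled_index = df.index[::-1][::window_step_size][::-1]

# .loc stands in for .ix, and .rolling(...).std() for pd.rolling_std.
sampled_std = df.loc[sampled_index].rolling(window=rolling_window).std()
print(sampled_std)
```

The first two rows of the result are NaN because a window of three needs three sampled dates before it can produce a value.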
Edit: To get daily values, you can compute the result for each offset and concatenate them.
def roll_sd(df, rolling_window, window_step_size):
    # Rolling std over every n-th row, counted back from the last row of df.
    return pd.rolling_std(df.ix[df.index[::-1][::window_step_size][::-1]], window=rolling_window)

# Drop the last n rows for each offset n, so that every date ends one of the sampled series.
df_sd = pd.concat([roll_sd(df.iloc[0:len(df)-n], rolling_window, window_step_size) for n in range(window_step_size)])

>>> df_sd.sort_index()
                   a         b         c
2000-01-01       NaN       NaN       NaN
2000-01-02       NaN       NaN       NaN
2000-01-03       NaN       NaN       NaN
2000-01-04       NaN       NaN       NaN
2000-01-05       NaN       NaN       NaN
2000-01-06       NaN       NaN       NaN
2000-01-07  0.192205  1.356544  1.305998
2000-01-08  1.953373  0.360948  0.480009
2000-01-09  0.744288  1.396749  1.048535
2000-01-10  0.571905  1.327296  0.438081
2000-01-11  1.772152  0.410464  0.668307
2000-01-12  0.370182  1.404848  0.525129
2000-01-13  0.778805  1.155806  0.542145
2000-01-14  1.299902  0.827427  0.701223
2000-01-15  0.479753  0.594379  1.032831
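The same concatenation idea can be extended to other calculations, which the question also asks about. The sketch below is one possible way to do that with the current pandas rolling API. The function name strided_rolling and its func parameter are hypothetical, and the demo data is random rather than the data from the question.

```python
import numpy as np
import pandas as pd


def strided_rolling(df, rolling_window, window_step_size, func="std"):
    """Rolling aggregation over every n-th row, computed for every row offset.

    Hypothetical helper: `func` is the name of a method on the pandas
    Rolling object, e.g. "std", "mean" or "max".
    """
    pieces = []
    for n in range(window_step_size):
        # Drop the last n rows, then keep every n-th row counted from the end.
        trimmed = df.iloc[: len(df) - n]
        sampled = trimmed.iloc[::-1].iloc[::window_step_size].iloc[::-1]
        # Look the aggregation up by name on the Rolling object and call it.
        roller = sampled.rolling(window=rolling_window)
        pieces.append(getattr(roller, func)())
    # Each date appears in exactly one piece; restore chronological order.
    return pd.concat(pieces).sort_index()


# Assumption: random stand-in data with the same shape as the example.
rng = np.random.default_rng(0)
idx = pd.date_range("2000-01-01", periods=15, freq="D")
df = pd.DataFrame(rng.standard_normal((15, 3)), index=idx, columns=["a", "b", "c"])

daily_std = strided_rolling(df, rolling_window=3, window_step_size=3)
daily_mean = strided_rolling(df, rolling_window=3, window_step_size=3, func="mean")
print(daily_std)
```

Passing the aggregation by name keeps the sampling logic in one place, so switching from a standard deviation to a mean or a maximum only changes the func argument.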