python - Rolling standard deviation using parts of data in dataframe with Pandas -


I would like to be able to calculate a rolling standard deviation based on part of the data in a dataframe. Here is an example to explain what I want to accomplish.

                       a         b         c
    2000-01-01  0.425615  1.679789 -1.903056
    2000-01-02  0.791313  0.562471  0.098124
    2000-01-03  1.223165 -0.548387 -1.558204
    2000-01-04  0.354931 -0.685773  0.647817
    2000-01-05  1.137434  1.000594  0.428180
    2000-01-06 -0.265311 -1.807045  0.533477
    2000-01-07  0.717505  1.647540 -0.141123
    2000-01-08 -2.405664  1.278410  1.043872
    2000-01-09  0.463943  0.982042 -0.382241
    2000-01-10 -0.403267 -0.615421  0.583384
    2000-01-11 -0.714163  0.470505 -0.291396
    2000-01-12  0.209979 -0.118331 -0.369776
    2000-01-13 -0.779638  0.924612 -0.477497
    2000-01-14  0.149868 -0.376292  0.747637
    2000-01-15 -0.464360  0.821400  1.412874

This is what I would like to be able to do:

  1. The calculation should be done on a rolling basis for each column.
  2. I want to calculate the rolling standard deviation using data from every n:th date in the dataframe. If n=3, I want to calculate the standard deviation for 2000-01-15 using the values from the following dates: 2000-01-15, 2000-01-12, 2000-01-09, 2000-01-06, 2000-01-03. For 2000-01-14 I would use 2000-01-14, 2000-01-11, 2000-01-08, 2000-01-05, 2000-01-02. The same logic applies to the rolling standard deviation for the other dates.
  3. It would be great if the logic could be applied to other calculations as well. I can't figure out how to switch between different resolutions of time.

    window_step_size = 3
    rolling_window = 3

    >>> pd.rolling_std(df.ix[df.index[::-1][::window_step_size][::-1]], window=rolling_window)
                       a         b         c
    2000-01-03       NaN       NaN       NaN
    2000-01-06       NaN       NaN       NaN
    2000-01-09  0.744288  1.396749  1.048535
    2000-01-12  0.370182  1.404848  0.525129
    2000-01-15  0.479753  0.594379  1.032831

df.index[::-1] reverses the dates in the index so that the most recent date comes first. df.index[::-1][::window_step_size] then takes every nth value of that index (e.g. every third date). Finally, df.index[::-1][::window_step_size][::-1] re-sorts the index so that the oldest date comes first.

    >>> df.index[::-1][::window_step_size][::-1]
    Index([u'2000-01-03', u'2000-01-06', u'2000-01-09', u'2000-01-12', u'2000-01-15'], dtype='object')

Based on this new index, you select the corresponding rows from the dataframe:

    >>> df.ix[df.index[::-1][::window_step_size][::-1]]
                       a         b         c
    2000-01-03  1.223165 -0.548387 -1.558204
    2000-01-06 -0.265311 -1.807045  0.533477
    2000-01-09  0.463943  0.982042 -0.382241
    2000-01-12  0.209979 -0.118331 -0.369776
    2000-01-15 -0.464360  0.821400  1.412874

You can then use the regular pd.rolling_std function with the chosen rolling window.

    pd.rolling_std(df.ix[df.index[::-1][::window_step_size][::-1]], window=rolling_window)
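For readers on a recent pandas release, where pd.rolling_std and .ix are no longer available, the same selection can be sketched with .loc and the .rolling(...).std() method. The data below is randomly generated for illustration only (it is not the data from the question), and the names sampled_index and sampled_sd are introduced here just for the example.

```python
import numpy as np
import pandas as pd

# Illustrative data only: 15 daily rows, three columns, fixed seed.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((15, 3)),
    index=pd.date_range("2000-01-01", periods=15),
    columns=["a", "b", "c"],
)

window_step_size = 3
rolling_window = 3

# Every third date, counted back from the most recent, oldest first.
sampled_index = df.index[::-1][::window_step_size][::-1]

# .loc stands in for .ix, and .rolling(...).std() for pd.rolling_std.
sampled_sd = df.loc[sampled_index].rolling(window=rolling_window).std()
print(sampled_sd)
```

As in the original call, the first rolling_window - 1 rows of the result are NaN because the window is not yet full.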

EDIT: To get daily values, you can compute the result for each offset and concatenate.

    def roll_sd(df, rolling_window, window_step_size):
        return pd.rolling_std(df.ix[df.index[::-1][::window_step_size][::-1]],
                              window=rolling_window)

    df_sd = pd.concat([roll_sd(df.iloc[0:len(df)-n], rolling_window, window_step_size)
                       for n in range(window_step_size)])

    df_sd.sort_index()
                       a         b         c
    2000-01-01       NaN       NaN       NaN
    2000-01-02       NaN       NaN       NaN
    2000-01-03       NaN       NaN       NaN
    2000-01-04       NaN       NaN       NaN
    2000-01-05       NaN       NaN       NaN
    2000-01-06       NaN       NaN       NaN
    2000-01-07  0.192205  1.356544  1.305998
    2000-01-08  1.953373  0.360948  0.480009
    2000-01-09  0.744288  1.396749  1.048535
    2000-01-10  0.571905  1.327296  0.438081
    2000-01-11  1.772152  0.410464  0.668307
    2000-01-12  0.370182  1.404848  0.525129
    2000-01-13  0.778805  1.155806  0.542145
    2000-01-14  1.299902  0.827427  0.701223
    2000-01-15  0.479753  0.594379  1.032831
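The question also asks whether the logic can be reused for other calculations. One possible sketch, written against the current pandas API rather than pd.rolling_std, is a small helper that takes the aggregation as a parameter. The helper name strided_rolling and its func argument are hypothetical, and the data is again randomly generated for illustration.

```python
import numpy as np
import pandas as pd

def strided_rolling(df, window, step, func="std"):
    """Rolling aggregate over every `step`-th row, returned for all rows."""
    # df.iloc[k::step] picks rows k, k + step, k + 2 * step, ...; rows that
    # are `step` apart always land in the same slice, so each slice can be
    # rolled on its own and the pieces stitched back together.
    parts = [df.iloc[k::step].rolling(window).agg(func) for k in range(step)]
    return pd.concat(parts).sort_index()

# Illustrative data only: 15 daily rows, three columns, fixed seed.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((15, 3)),
    index=pd.date_range("2000-01-01", periods=15),
    columns=["a", "b", "c"],
)

daily_sd = strided_rolling(df, window=3, step=3)                 # standard deviation
daily_mean = strided_rolling(df, window=3, step=3, func="mean")  # same logic, other statistic
```

With window=3 and step=3 the first six rows are NaN, matching the shape of the concatenated output above, since each of the three slices needs three observations before its window is full.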
