r - Optimizing rollapplyr custom function -


i have following data:

y <- data.table(cbind(week = rep(1:61,5352), id = rep(1:5352, each = 61), w = runif(326472), v = runif(326472))) y$v[sample(1:326472, 10000, replace=false)] <- na 

for i'm running code bellow creates rolling mean of variable v, ignoring outliers , nas. code working, poor perfomance. i'm sure there more efficient way run using apply or similar, i've been unsuccessful in creating faster version. can shed light on how make more efficient?

ids <- unique(y$id) y$vol_m12 <- 0  (i in 1:length(ids)) {   x <- y[id==ids[i]]    outlier <- 0.2   w_outlier <- quantile(x$w, c(outlier), na.rm = t)   v_outlier <-quantile(x$v, c(1 - outlier), na.rm = t)  # ignore outliers         x$v_temp <- x$v   x$v_temp[((x$v_temp >= v_outlier)                  & (x$w <= w_outlier))] <- na  # creating rolling mean   y$vol_m12[y$id==ids[i]] <- x[, rollapplyr(v_temp, 12, (mean), fill = na, na.rm=t)] } 

thanks replies. following 42 advice, i've produced following code:

library(rcpproll) # ignore outliers y[, w_out := quantile(w, c(outlier), na.rm = t), by=id] y[, v_out := quantile(v, c(1-outlier), na.rm = t), by=id] y[((v <= v_out) & (w >= w_out)), v_temp := v] y[,w_out := null] y[,v_out := null]  y[, v_m12 := roll_mean(as.matrix(v_temp), n =12l, fill = na,                      align = c("right"), normalize = true, na.rm = t), = id] 

system time .59 seconds against 10.36 solution bellow, uses rollapplyr (but possible make outlier removal more efficient).

y[, v_m12 :=rollapplyr(v_temp, 12, (mean), fill = na, na.rm=t), = id] 

Comments

Popular posts from this blog

sublimetext3 - what keyboard shortcut is to comment/uncomment for this script tag in sublime -

java - No use of nillable="0" in SOAP Webservice -

ubuntu - Laravel 5.2 quickstart guide gives Not Found Error -