r - Optimizing rollapplyr custom function -
i have following data:
y <- data.table(cbind(week = rep(1:61,5352), id = rep(1:5352, each = 61), w = runif(326472), v = runif(326472))) y$v[sample(1:326472, 10000, replace=false)] <- na
for i'm running code bellow creates rolling mean of variable v, ignoring outliers , nas. code working, poor perfomance. i'm sure there more efficient way run using apply or similar, i've been unsuccessful in creating faster version. can shed light on how make more efficient?
ids <- unique(y$id) y$vol_m12 <- 0 (i in 1:length(ids)) { x <- y[id==ids[i]] outlier <- 0.2 w_outlier <- quantile(x$w, c(outlier), na.rm = t) v_outlier <-quantile(x$v, c(1 - outlier), na.rm = t) # ignore outliers x$v_temp <- x$v x$v_temp[((x$v_temp >= v_outlier) & (x$w <= w_outlier))] <- na # creating rolling mean y$vol_m12[y$id==ids[i]] <- x[, rollapplyr(v_temp, 12, (mean), fill = na, na.rm=t)] }
thanks replies. following 42 advice, i've produced following code:
library(rcpproll) # ignore outliers y[, w_out := quantile(w, c(outlier), na.rm = t), by=id] y[, v_out := quantile(v, c(1-outlier), na.rm = t), by=id] y[((v <= v_out) & (w >= w_out)), v_temp := v] y[,w_out := null] y[,v_out := null] y[, v_m12 := roll_mean(as.matrix(v_temp), n =12l, fill = na, align = c("right"), normalize = true, na.rm = t), = id]
system time .59 seconds against 10.36 solution bellow, uses rollapplyr (but possible make outlier removal more efficient).
y[, v_m12 :=rollapplyr(v_temp, 12, (mean), fill = na, na.rm=t), = id]
Comments
Post a Comment