Garbage collection - not all RAM is released after gc() when using an ffdf object in R
I am running the following script:
library(ff)
library(ffbase)
setwd("d:/my_package/personal/r/reading")

x <- cbind(rnorm(1:100000000), rnorm(1:100000000), 1:100000000)
system.time(write.csv2(x, "test.csv", row.names = FALSE))

# make an ffdf object with minimal RAM overhead
system.time(x <- read.csv2.ffdf(file = "test.csv", header = TRUE,
                                first.rows = 1000, next.rows = 10000, levels = NULL))

# increase column #1 of the ffdf object 'x' by 5, using a chunk approach
chunk_size <- 100
m <- numeric(chunk_size)

# list of chunks
chunks <- chunk(x, length.out = chunk_size)

# for loop: increase column #1 by 5
system.time(
  for (i in seq_along(chunks)) {
    x[chunks[[i]], ][[1]] <- x[chunks[[i]], ][[1]] + 5
  }
)

# output of x
print(x)

# clear the RAM used
rm(list = ls(all = TRUE))
gc()

# another option: run the garbage collector explicitly
gc(reset = TRUE)
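For context, read.csv2.ffdf does not load the data into R's heap; the ff package keeps it in memory-mapped files on disk and pages chunks in as needed. A minimal way to see where those backing files live (the column name V1 is only an assumption, based on what write.csv2() assigns to an unnamed matrix):

# directory holding the memory-mapped backing files
getOption("fftempdir")
dir(getOption("fftempdir"))

# file backing the first column of the ffdf (assuming it is named V1)
filename(x$V1)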
The issue is that RAM is still not released even though the objects and functions have been swept away from the current environment.
Moreover, each subsequent run of the script increases the portion of unreleased RAM, as if it accumulates (according to Task Manager on Win7 64-bit).
However, if I make a non-ffdf object and sweep it away, the output of rm() and gc() is OK.
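For comparison, the non-ffdf case I mean is roughly this sketch (the size is illustrative): a plain in-RAM vector whose memory is released right after rm() and gc().

# non-ffdf comparison: an ordinary in-RAM vector
y <- rnorm(10000000)
print(object.size(y), units = "Mb")   # memory held inside R
rm(y)
gc()                                  # Task Manager shows the RAM being returned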
So my guess is that the unreleased RAM is connected with the specifics of ffdf objects and the ff package.
So far the only effective way I have found to clear the RAM is to quit the current R session and start it again, which is not convenient.
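For reference, ff also exposes close() and delete() for releasing the mapping and removing the backing files; I have not verified whether calling them before rm() changes what Task Manager reports (or whether delete() has a method for ffdf objects, as it does for plain ff vectors), but the sketch would be:

close(x)    # close the memory-mapped files behind the ffdf
delete(x)   # remove the backing files on disk (assumes an ffdf method exists)
rm(x)
gc()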
I have scanned a bunch of posts on memory cleaning, including this one:
Tricks to manage the available memory in an R session
I have not found a clear explanation of this situation or an effective way to overcome it (without restarting the R session). I would be grateful for any comments.