R - Simple and efficient way to select non-NA data range in data frames


Suppose I have the following data frame:

    dat <- data.frame(a = c(1:3, NA), b = c(letters[1:3], NA), c = NA)

    > dat
       a    b  c
    1  1    a NA
    2  2    b NA
    3  3    c NA
    4 NA <NA> NA

How can I select the non-NA region in an efficient way?

This is what I use:

    ensurenonnarange <- function(dat) {
      # Columns that are not entirely NA
      idx_col <- !sapply(dat, function(ii) all(is.na(ii)))
      # Rows that are not entirely NA
      idx_row <- !sapply(1:nrow(dat), function(ii) all(is.na(unlist(dat[ii, ]))))
      dat[idx_row, idx_col]
    }

    > ensurenonnarange(dat)
      a b
    1 1 a
    2 2 b
    3 3 c

Since someone pointed me today to the useful function type.convert, which I hadn't known about before, I thought there might be a neat "off-the-shelf" function for this task in base R.

Update

Some comparisons based on the answers/comments I got:

    ensurenonnarange2 <- function(dat) {
      dat[rowSums(!is.na(dat)) != 0, colSums(!is.na(dat)) != 0]
    }

    microbenchmark::microbenchmark(
      a = ensurenonnarange(dat),
      b = ensurenonnarange2(dat)
    )

    Unit: microseconds
     expr     min       lq     mean   median       uq     max neval
        a 296.178 310.1070 346.2259 329.0210 349.9875 680.035   100
        b 112.313 120.0845 134.1716 125.6555 133.7200 338.112   100

While there may yet be a built-in function for this, you can do it with subsetting.

When is.na is passed an entire data.frame, it returns a Boolean mask. If you sum the rows and columns of !is.na(dat) (i.e. count the TRUE values, which mark entries that are not NA), a sum of 0 identifies a row or column that contains only NAs.
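To make the mask idea concrete, here is a small sketch that prints the intermediate counts for the example data frame from the question. The variable names `mask`, `row_counts` and `col_counts` are only illustrative and do not appear in the original post.

```r
# Same example data frame as in the question
dat <- data.frame(a = c(1:3, NA), b = c(letters[1:3], NA), c = NA)

# TRUE where a value is present, FALSE where it is NA
mask <- !is.na(dat)

# Number of non-NA values per row and per column
row_counts <- rowSums(mask)
col_counts <- colSums(mask)

print(row_counts)  # 2 2 2 0 : row 4 holds only NAs
print(col_counts)  # a = 3, b = 3, c = 0 : column c holds only NAs
```

A count of 0 is what marks row 4 and column c for removal.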

Thus, if you subset where the row and column sums are != 0, you are left with the rows and columns that have non-NA values:

    > dat[rowSums(!is.na(dat)) != 0, colSums(!is.na(dat)) != 0]
      a b
    1 1 a
    2 2 b
    3 3 c

If not all values in a row or column are NA, this approach keeps that row/column:

    > dat[2,2] <- NA
    > dat[rowSums(!is.na(dat)) != 0, colSums(!is.na(dat)) != 0]
      a    b
    1 1    a
    2 2 <NA>
    3 3    c

(If you'd rather ditch rows/columns that contain any NAs, adjust the exclamation points, or use complete.cases.)
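One possible reading of that stricter variant is sketched below. It assumes you first drop the all-NA columns, because otherwise column c would make every row count as incomplete. The names `kept`, `strict` and `strict2` are hypothetical.

```r
dat <- data.frame(a = c(1:3, NA), b = c(letters[1:3], NA), c = NA)
dat[2, 2] <- NA

# Drop columns that are entirely NA first
kept <- dat[, colSums(!is.na(dat)) != 0, drop = FALSE]

# Keep only rows without any NA, using complete.cases
strict <- kept[complete.cases(kept), , drop = FALSE]

# Same selection with the "exclamation point" moved: count NAs instead of non-NAs
strict2 <- kept[rowSums(is.na(kept)) == 0, , drop = FALSE]

print(strict)
```

Only rows 1 and 3 remain here, since row 2 now has an NA in column b and row 4 is entirely NA.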

Further, it should be pretty fast, because rowSums and colSums are highly optimized, so it should still work on huge data structures.
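If you want to check this on something larger than the four-row example, a rough sketch could look like the following. The data frame `big`, its size `n` and its columns are made up for illustration, and timings will vary by machine, so none are shown.

```r
set.seed(1)
n <- 2000  # hypothetical size; increase it to stress the comparison

# Made-up test data: two partly-NA columns and one all-NA column
big <- data.frame(
  x = sample(c(1:5, NA), n, replace = TRUE),
  y = sample(c(letters[1:3], NA), n, replace = TRUE),
  z = NA,
  stringsAsFactors = FALSE
)
big[n, ] <- NA  # force at least one all-NA row

# The two functions from the question and the update
ensurenonnarange <- function(dat) {
  idx_col <- !sapply(dat, function(ii) all(is.na(ii)))
  idx_row <- !sapply(1:nrow(dat), function(ii) all(is.na(unlist(dat[ii, ]))))
  dat[idx_row, idx_col]
}
ensurenonnarange2 <- function(dat) {
  dat[rowSums(!is.na(dat)) != 0, colSums(!is.na(dat)) != 0]
}

# Time each approach once and keep the results for comparison
time_a <- system.time(res_a <- ensurenonnarange(big))
time_b <- system.time(res_b <- ensurenonnarange2(big))
print(time_a)
print(time_b)
```

Both functions should return the same subset. Only the time needed to compute it should differ.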

