machine learning - How to transform (calculate) two or several variables into one using R?


I'm having difficulty combining two or more variables in my data. I can do it in Excel, but I can't figure out how to do the same thing in R.

Basically, I want to create two combined variables from the variables below:

Data 1: creating the variable "combinea1+b1"

    country  year  a1  b1  combinea1+b1
    usa      2002   0   0             0
    usa      2003   1   1             2
    usa      2004  NA   1             1
    usa      2005   0   0             0
    usa      2006   0   1             1
    usa      2007   0   0             0
    usa      2008   0   1             1
    usa      2009  NA  NA            NA
    usa      2010   0   1             1
    usa      2011  NA   0             0
    usa      2012   0   1             1
    usa      2013   0   0             0
    usa      2014   0   1             1

Creating the variable "combinea1+b1" seems simple: I just need to add the two variables (a1 and b1). In Excel this is easy, and I guess it is in R as well. However, the NA values cause problems when adding the two variables. So how do I create the combinea1+b1 variable like the one above?

If both a1 and b1 are NA, combinea1+b1 should be NA. If one is NA and the other is 1 or 0, the result should be that number (see, for example, usa 2004).

I'd also like to create a second combined variable: "combinea1+b1+c1+d1".

Data 2: creating the variable "combinea1+b1+c1+d1"

    country  year  a1  b1  c1  d1  combineabcd
    usa      2002   0   0   0   0            0
    usa      2003   1   1   0   0            2
    usa      2004  NA   1   0   0            1
    usa      2005   0   0   0   0            0
    usa      2006   0   1   0   0            1
    usa      2007   0   0   0   0            0
    usa      2008   0   1   1   0            2
    usa      2009  NA  NA  NA  NA           NA
    usa      2010   0   1   1   0            2
    usa      2011  NA   0   0   0            0
    usa      2012   0   1   1   0            2
    usa      2013   0   0   0   0            0
    usa      2014   0   1   1   0            2

I guess once I know how to create the first combined variable I'll be able to do this one as well, although I'm not sure how these NAs can be handled.

I'd be grateful for any suggestions on how to add these variables properly.

With a little bit of searching, I found this article. I take no credit for the code.

    # Sum the values in a row, ignoring NAs; return NA only when every value is NA.
    mysum <- function(x) if (all(is.na(x))) NA else sum(x, na.rm = TRUE)

    # Apply mysum across the a1 and b1 columns, one row at a time.
    df$combineda1b1 <- apply(df[, c("a1", "b1")], 1, mysum)

    df
    #    country year a1 b1 combineda1b1
    # 1      usa 2002  0  0            0
    # 2      usa 2003  1  1            2
    # 3      usa 2004 NA  1            1
    # 4      usa 2005  0  0            0
    # 5      usa 2006  0  1            1
    # 6      usa 2007  0  0            0
    # 7      usa 2008  0  1            1
    # 8      usa 2009 NA NA           NA
    # 9      usa 2010  0  1            1
    # 10     usa 2011 NA  0            0
    # 11     usa 2012  0  1            1
    # 12     usa 2013  0  0            0
    # 13     usa 2014  0  1            1
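The same rule carries over to the four-column case from the question. What follows is a minimal sketch rather than code from the linked article: it rebuilds the second table by hand under the assumed name `df2`, and it swaps the row-wise `apply()` for base R's `rowSums()`, blanking out the rows where every column is missing. The names `df2`, `cols`, `total` and `all_missing` are illustrative only.

```r
# Rebuild the second example table from the question (values copied from it).
# df2 is an assumed name; the question does not name its data frames.
df2 <- data.frame(
  country = "usa",
  year    = 2002:2014,
  a1 = c(0, 1, NA, 0, 0, 0, 0, NA, 0, NA, 0, 0, 0),
  b1 = c(0, 1, 1,  0, 1, 0, 1, NA, 1, 0,  1, 0, 1),
  c1 = c(0, 0, 0,  0, 0, 0, 1, NA, 1, 0,  1, 0, 1),
  d1 = c(0, 0, 0,  0, 0, 0, 0, NA, 0, 0,  0, 0, 0)
)

# Columns to combine; add or remove names here for other combinations.
cols <- c("a1", "b1", "c1", "d1")

# Row-wise sum that treats NA as 0 ...
total <- rowSums(df2[, cols], na.rm = TRUE)

# ... then restore NA for rows in which every selected column is NA.
all_missing <- rowSums(!is.na(df2[, cols])) == 0
total[all_missing] <- NA

df2$combineabcd <- unname(total)
df2
```

Because only `cols` changes between the two cases, setting it to `c("a1", "b1")` should give the two-variable result as well. On large data frames `rowSums()` tends to be faster than `apply()` with a custom function, though for a table this size either approach is fine.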
