machine learning - How to transform (calculate) two or several variables into one using R?
I'm having difficulties merging two or several variables in my data. I'm able to do it in Excel but can't figure out how to perform the same thing in R.
Basically, I want to create two combined variables using the variables below:
Data 1: creating the variable combinea1+b1
country year a1 b1 combinea1+b1
usa     2002 0  0  0
usa     2003 1  1  2
usa     2004 NA 1  1
usa     2005 0  0  0
usa     2006 0  1  1
usa     2007 0  0  0
usa     2008 0  1  1
usa     2009 NA NA NA
usa     2010 0  1  1
usa     2011 NA 0  0
usa     2012 0  1  1
usa     2013 0  0  0
usa     2014 0  1  1
Creating the variable "combinea1+b1" seems simple: I just need to add the two (a1 and b1). In Excel it is simple, and I guess it is in R as well. However, NA values create problems when adding the two variables. So, how do I create the combinea1+b1 variable like the one above?
If both a1 and b1 are NA, combinea1+b1 should be NA. If one is NA and the other is 1 or 0, it should give the respective number (see e.g. usa 2004).
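To make the NA problem concrete, here is a small sketch of why neither plain addition nor a sum that drops NAs gives the rule described above. The vectors are just three rows (2003, 2004, 2009) copied from the table, and the names plain_sum and narm_sum are made up for illustration.

```r
# Illustrative vectors: rows 2003, 2004 and 2009 of the table above
a1 <- c(1, NA, NA)
b1 <- c(1, 1, NA)

# Plain addition: any NA makes the whole result NA (2004 is lost)
plain_sum <- a1 + b1

# Dropping NAs: 2004 works, but the all-NA row (2009) turns into 0
narm_sum <- rowSums(cbind(a1, b1), na.rm = TRUE)

print(plain_sum)
print(narm_sum)
```

So the wanted behaviour sits between the two: drop NAs when summing, but keep NA when every input in the row is NA.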
I'd also like to create a combined variable: "combinea1+b1+c1+d1"
Data 2: creating the variable "combinea1+b1+c1+d1"
country year a1 b1 c1 d1 combineabcd
usa     2002 0  0  0  0  0
usa     2003 1  1  0  0  2
usa     2004 NA 1  0  0  1
usa     2005 0  0  0  0  0
usa     2006 0  1  0  0  1
usa     2007 0  0  0  0  0
usa     2008 0  1  1  0  2
usa     2009 NA NA NA NA NA
usa     2010 0  1  1  0  2
usa     2011 NA 0  0  0  0
usa     2012 0  1  1  0  2
usa     2013 0  0  0  0  0
usa     2014 0  1  1  0  2
I guess once I know how to create the first combined variable, I'll be able to do this one as well, although I'm not sure how these NAs can be handled.
I'd be grateful for any suggestions you can come up with to add these variables properly.
With a little bit of searching, I found this article. I take no credit for the code.
# Return NA when every value in the row is NA; otherwise sum the non-NA values
mysum <- function(x) if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
# Apply mysum row by row (MARGIN = 1) over the columns to combine
df$combineda1b1 <- apply(df[, c("a1", "b1")], 1, mysum)
df
#    country year a1 b1 combineda1b1
# 1      usa 2002  0  0            0
# 2      usa 2003  1  1            2
# 3      usa 2004 NA  1            1
# 4      usa 2005  0  0            0
# 5      usa 2006  0  1            1
# 6      usa 2007  0  0            0
# 7      usa 2008  0  1            1
# 8      usa 2009 NA NA           NA
# 9      usa 2010  0  1            1
# 10     usa 2011 NA  0            0
# 11     usa 2012  0  1            1
# 12     usa 2013  0  0            0
# 13     usa 2014  0  1            1
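The same idea should carry over to the four-variable case. Below is a minimal sketch, not taken from the linked article, that uses rowSums instead of apply. The data frame df2 is typed in by hand from the second table in the question, and the names df2, cols, m and total are assumptions made for this example.

```r
# Data 2 from the question, rebuilt by hand for illustration
df2 <- data.frame(
  country = "usa",
  year = 2002:2014,
  a1 = c(0, 1, NA, 0, 0, 0, 0, NA, 0, NA, 0, 0, 0),
  b1 = c(0, 1, 1, 0, 1, 0, 1, NA, 1, 0, 1, 0, 1),
  c1 = c(0, 0, 0, 0, 0, 0, 1, NA, 1, 0, 1, 0, 1),
  d1 = c(0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0)
)

cols <- c("a1", "b1", "c1", "d1")
m <- as.matrix(df2[, cols])

# Sum each row while ignoring NAs (an all-NA row gives 0 at this point)
total <- rowSums(m, na.rm = TRUE)

# Put NA back for rows that have no non-NA value at all
total[rowSums(!is.na(m)) == 0] <- NA

df2$combineabcd <- total
print(df2)
```

Passing all four column names to the apply call shown above, as in apply(df[, c("a1", "b1", "c1", "d1")], 1, mysum), should give the same column. The rowSums version just avoids calling a function once per row, which may matter on larger data.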