R split data to frequency



I have a data set like this, where one variable ("item") contains comma-separated codes:

    id  item
    1   102, 103,401,
    2   108,102,301
    3   103, 108 , 405, 505, 708

For each id, I want the frequencies of each separate item, like this:

    id  102  103  104  108  301  401 ...
    1     1    1                   1
    2     1              1    1
    3          1         1

How can I do that?

We can use mtabulate from the qdapTools package:

    library(qdapTools)
    # split 'item' on commas (ignoring surrounding whitespace) and tabulate per row
    cbind(dat['id'], mtabulate(strsplit(dat$item, '\\s*,\\s*')))
    #  id 102 103 108 301 401 405 505 708
    #1  1   1   1   0   0   1   0   0   0
    #2  2   1   0   1   1   0   0   0   0
    #3  3   0   1   1   0   0   1   1   1

Note: the data is taken from @thelatemail's post.


Or another option (if we need a sparseMatrix):

    library(Matrix)
    # split the 'item' column into a `list`
    lst <- strsplit(dat$item, '\\s*,\\s*')
    # get the `unique` elements after `unlist`ing
    un1 <- sort(unique(unlist(lst)))
    # create the `sparseMatrix` by specifying the row and
    # column index along with the dim names (if needed)
    sm <- sparseMatrix(rep(dat$id, lengths(lst)),
                       match(unlist(lst), un1), x = 1,
                       dimnames = list(dat$id, un1))
    sm
    # 3 x 8 sparse Matrix of class "dgCMatrix"
    #   102 103 108 301 401 405 505 708
    #1   1   1   .   .   1   .   .   .
    #2   1   .   1   1   .   .   .   .
    #3   .   1   1   .   .   1   1   1

It can be converted to a regular matrix by wrapping it with as.matrix:

    as.matrix(sm)
    #   102 103 108 301 401 405 505 708
    #1   1   1   0   0   1   0   0   0
    #2   1   0   1   1   0   0   0   0
    #3   0   1   1   0   0   1   1   1
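
The snippets above use a data frame `dat` that is never defined on this page. The sketch below rebuilds it from the question's example (an assumption, since the original data comes from another post) and shows a possible base R alternative using `table()`, which is not part of the original answer. It assumes R 3.2.0 or later for `trimws()` and `lengths()`.

```r
# Assumed reconstruction of the example data from the question;
# the original `dat` is defined in another post and is not shown here.
dat <- data.frame(
  id   = 1:3,
  item = c("102, 103,401,", "108,102,301", "103, 108 , 405, 505, 708"),
  stringsAsFactors = FALSE
)

# Split each 'item' string on commas, ignoring surrounding whitespace.
# strsplit() drops the empty string left by a trailing comma.
lst <- strsplit(trimws(dat$item), "\\s*,\\s*")

# Cross-tabulate id against item code; each cell is a count.
freq <- table(id = rep(dat$id, lengths(lst)), item = unlist(lst))

# Drop the "table" class to get a plain integer matrix
# (rows = id, columns = item code).
freq_mat <- unclass(freq)
freq_mat
```

Unlike the sparseMatrix version, this does not rely on the ids being the row positions 1..n, because `table()` treats `id` as a label rather than an index.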
