sparse matrix - Spark MLlib RowMatrix from SparseVector


I am trying to create a RowMatrix from an RDD of SparseVectors, but I am getting the following error:

    <console>:37: error: type mismatch;
     found   : datarows.type (with underlying type org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.SparseVector])
     required: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector]
    Note: org.apache.spark.mllib.linalg.SparseVector <: org.apache.spark.mllib.linalg.Vector (and datarows.type <: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.SparseVector]), but class RDD is invariant in type T.
    You may wish to define T as +T instead. (SLS 4.5)
           val svd = new RowMatrix(datarows.persist()).computeSVD(20, computeU = true)

My code is:

    import org.apache.spark.mllib.linalg.distributed.RowMatrix
    import org.apache.spark.mllib.linalg._
    import org.apache.spark.{SparkConf, SparkContext}

    val data_file_dir = "/user/cloudera/data/"
    val data_file_name = "dataoct.txt"

    // Parse each line into a dense vector, then convert it to a SparseVector.
    // The element type is inferred as SparseVector, so this is an RDD[SparseVector].
    val datarows = sc.textFile(data_file_dir.concat(data_file_name)).map(line => Vectors.dense(line.split(" ").map(_.toDouble)).toSparse)

    // Fails to compile: RowMatrix expects an RDD[Vector].
    val svd = new RowMatrix(datarows.persist()).computeSVD(20, computeU = true)

My input data file is approximately 150 rows by 50,000 columns of space-separated integers.

I am running:

    Spark: version 1.5.0-cdh5.5.1
    Java: 1.7.0_67

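RDD is invariant in its type parameter, so an RDD[SparseVector] is not accepted where an RDD[Vector] is required, even though SparseVector is a subtype of Vector. Just provide an explicit type annotation, either for the RDD: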

    val datarows: org.apache.spark.rdd.RDD[Vector] = ???

or for the result of the anonymous function:

    ...
      .map(line => Vectors.dense(line.split(" ").map(_.toDouble)).toSparse: Vector)
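To see why both annotations work, here is a minimal sketch that reproduces the situation without Spark. The names `Vec`, `SparseVec`, `Box`, `rowCount` and `toSparse` are hypothetical stand-ins (for `Vector`, `SparseVector`, `RDD`, the `RowMatrix` constructor and the parsing lambda respectively), not MLlib APIs. The only assumption carried over is the one the compiler reports: the container is invariant in its type parameter.

```scala
object InvarianceDemo {
  // Hypothetical stand-ins for the MLlib types; only the subtype relation matters.
  sealed trait Vec
  final case class SparseVec(size: Int, indices: Seq[Int], values: Seq[Double]) extends Vec

  // Invariant in T (no +T), like RDD[T].
  final class Box[T](val items: Seq[T]) {
    def map[U](f: T => U): Box[U] = new Box(items.map(f))
  }

  // Accepts only Box[Vec], like RowMatrix accepts only RDD[Vector].
  def rowCount(rows: Box[Vec]): Int = rows.items.length

  // Parse a space-separated line and keep only the non-zero entries.
  def toSparse(line: String): SparseVec = {
    val values = line.split(" ").map(_.toDouble)
    val nonZero = values.zipWithIndex.filter { case (v, _) => v != 0.0 }
    SparseVec(values.length, nonZero.map(_._2).toSeq, nonZero.map(_._1).toSeq)
  }

  def main(args: Array[String]): Unit = {
    val lines = new Box(Seq("1 0 2", "0 0 3"))

    // Without an annotation the result is a Box[SparseVec], and
    // rowCount(lines.map(line => toSparse(line))) is a type mismatch.

    // Fix 1: annotate the value, so U is inferred as Vec.
    val annotated: Box[Vec] = lines.map(line => toSparse(line))

    // Fix 2: ascribe the lambda's result, so U is inferred as Vec.
    val ascribed = lines.map(line => toSparse(line): Vec)

    println(rowCount(annotated))
    println(rowCount(ascribed))
  }
}
```

In both fixes the elements are still sparse at runtime; only the static element type of the container is widened, which is all the invariant parameter requires.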
