Connect to two different HDFS servers with different usernames in Spark
Is there a way to read data from HDFS (e.g. with sc.textFile) as two separate usernames in the same Spark job? For instance, if I have a file on hdfs-server-1.com that the user alice has permission to view, and a file on hdfs-server-2.com that the user bob has permission to view, I'd like to be able to do something like:
val rdd1 = sc.textFile("hdfs://hdfs-server-1.com:9000/file.txt", user = "alice")
val rdd2 = sc.textFile("hdfs://hdfs-server-2.com:9000/file.txt", user = "bob")
Is there a way to do this? Or can Spark only connect to HDFS as the same username it's running as?
As far as I know (and I've tried this in the past, on Spark 1.4.0), this isn't possible: by default, Spark accesses HDFS as the user running the driver process. That user can be overridden with the HADOOP_USER_NAME VM option when launching the driver application (e.g. -DHADOOP_USER_NAME=alice). This option is read when the SparkContext is constructed and can't be changed afterwards, so a single job can't read as two different HDFS users.
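A minimal sketch of that override, applied for the whole job (the object name, app name, and HDFS path are placeholders; this assumes spark-core is on the classpath, and that Hadoop's UserGroupInformation honors the HADOOP_USER_NAME system property as well as the environment variable):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SingleUserJob {
  def main(args: Array[String]): Unit = {
    // Must be set BEFORE the SparkContext is constructed (equivalent
    // to launching with -DHADOOP_USER_NAME=alice). It applies to every
    // HDFS access in the job and cannot be changed per-read afterwards.
    System.setProperty("HADOOP_USER_NAME", "alice")

    val sc = new SparkContext(new SparkConf().setAppName("single-user-job"))

    // Both reads happen as "alice"; textFile has no per-call user parameter.
    val rdd1 = sc.textFile("hdfs://hdfs-server-1.com:9000/file.txt")
    val rdd2 = sc.textFile("hdfs://hdfs-server-2.com:9000/file.txt")

    sc.stop()
  }
}
```

To read as two different users you would need two separate driver processes (two jobs), each launched with its own HADOOP_USER_NAME.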