Connect to two different HDFS servers with different usernames in Spark
Is there a way to read data from HDFS (e.g. with sc.textFile) as two separate usernames in the same Spark job? For instance, if I have a file on hdfs-server-1.com that the user alice has permission to view, and a file on hdfs-server-2.com that the user bob has permission to view, I'd like to be able to do something like:
val rdd1 = sc.textFile("hdfs://hdfs-server-1.com:9000/file.txt", user = "alice")
val rdd2 = sc.textFile("hdfs://hdfs-server-2.com:9000/file.txt", user = "bob")
Is there a way to do this? Or can Spark only connect to HDFS as the same username it's running as?
As far as I know (and I've tried this in the past, on Spark 1.4.0), this isn't possible: by default, Spark accesses HDFS as the user running the driver process. That user can be overridden with the HADOOP_USER_NAME VM option when launching the driver application (e.g. -DHADOOP_USER_NAME=alice). This option is read when the SparkContext is constructed and can't be changed afterwards, so a single job can't read as two different HDFS users.
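A minimal sketch of that override, applied for the whole job (the object name, app name, and HDFS path are placeholders; this assumes spark-core is on the classpath, and that Hadoop's UserGroupInformation honors the HADOOP_USER_NAME system property as well as the environment variable):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SingleUserJob {
  def main(args: Array[String]): Unit = {
    // Must be set BEFORE the SparkContext is constructed (equivalent
    // to launching with -DHADOOP_USER_NAME=alice). It applies to every
    // HDFS access in the job and cannot be changed per-read afterwards.
    System.setProperty("HADOOP_USER_NAME", "alice")

    val sc = new SparkContext(new SparkConf().setAppName("single-user-job"))

    // Both reads happen as "alice"; textFile has no per-call user parameter.
    val rdd1 = sc.textFile("hdfs://hdfs-server-1.com:9000/file.txt")
    val rdd2 = sc.textFile("hdfs://hdfs-server-2.com:9000/file.txt")

    sc.stop()
  }
}
```

To read as two different users you would need two separate driver processes (two jobs), each launched with its own HADOOP_USER_NAME.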