scala - Applying a function to a Spark DataFrame column


Coming from R, I am used to easily doing operations on columns. Is there any easy way to take this function that I've written in Scala

def round_tenths_place(un_rounded: Double): Double = {
  val rounded = BigDecimal(un_rounded).setScale(1, BigDecimal.RoundingMode.HALF_UP).toDouble
  return rounded
}

and apply it to one column of a DataFrame? This is kind of what I hoped would work:

bid_results.withColumn("bid_price_bucket", round_tenths_place(bid_results("bid_price")))

I haven't found an easy way and am struggling to figure out how to do this. There's got to be an easier way than converting the DataFrame to an RDD, selecting the right field from the RDD of rows, and mapping my function across all of the values, yeah? And also something more succinct than creating a SQL table and then doing this with a SparkSQL UDF?
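For reference, the "convert to an RDD, map, convert back" detour mentioned above would look roughly like this. This is a sketch only, assuming Spark 2.x with a SparkSession named spark and its implicits in scope; it also illustrates why the route is awkward, since the other columns are lost and would need to be re-joined:

import spark.implicits._

val bucketed = bid_results
  .select("bid_price")
  .rdd                                              // RDD[Row]
  .map(row => round_tenths_place(row.getDouble(0))) // apply the plain Scala function
  .toDF("bid_price_bucket")                         // single-column DataFrame; every other column is gone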

You can define a UDF as follows:

import org.apache.spark.sql.functions.udf

val round_tenths_place_udf = udf(round_tenths_place _)
bid_results.withColumn("bid_price_bucket", round_tenths_place_udf($"bid_price"))
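Note that the $"bid_price" column syntax requires the SQL implicits to be in scope (import spark.implicits._ on Spark 2.x, or sqlContext.implicits._ on 1.x); bid_results("bid_price") works just as well without them.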

That said, the built-in round expression uses exactly the same logic as your function and should be more than enough, not to mention that it is much more efficient:

import org.apache.spark.sql.functions.round

bid_results.withColumn("bid_price_bucket", round($"bid_price", 1))
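Putting it all together as something you can paste into a spark-shell session. This is a minimal sketch: the toy data and column values are purely illustrative and not from the original question, and it assumes Spark 2.x, where the shell already provides spark and its implicits:

import org.apache.spark.sql.functions.{round, udf}
import spark.implicits._

def round_tenths_place(un_rounded: Double): Double =
  BigDecimal(un_rounded).setScale(1, BigDecimal.RoundingMode.HALF_UP).toDouble

// Toy stand-in for bid_results (illustrative values only).
val bid_results = Seq(1.23, 4.56, 7.85).toDF("bid_price")

val round_tenths_place_udf = udf(round_tenths_place _)

bid_results
  .withColumn("bid_price_bucket_udf", round_tenths_place_udf($"bid_price"))
  .withColumn("bid_price_bucket", round($"bid_price", 1))
  .show()

The second withColumn is the preferable path: round is a built-in Catalyst expression that Spark can optimize, whereas a Scala UDF is a black box to the optimizer.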
