Applying a function to a Spark DataFrame column (Scala)
Coming from R, I'm used to easily doing operations on columns. Is there an easy way to take a function I've written in Scala,
def round_tenths_place(un_rounded: Double): Double = {
  val rounded = BigDecimal(un_rounded).setScale(1, BigDecimal.RoundingMode.HALF_UP).toDouble
  rounded
}
and apply it to one column of a DataFrame? Kind of like I hoped I could do:

bid_results.withColumn("bid_price_bucket", round_tenths_place(bid_results("bid_price")))
I haven't found an easy way and am struggling to figure out how to do this. There's got to be an easier way than converting the DataFrame to an RDD, selecting the right field out of the RDD of Rows, and mapping the function across all of the values, yeah? And also something more succinct than creating a SQL table and then doing it with a Spark SQL UDF?
You can define a UDF as follows:
import org.apache.spark.sql.functions.udf

val round_tenths_place_udf = udf(round_tenths_place _)
bid_results.withColumn("bid_price_bucket", round_tenths_place_udf($"bid_price"))
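If you do want the Spark SQL route the question mentions, the same function can also be registered as a SQL UDF. A minimal sketch (the SparkSession name `spark` and the temp view name are assumptions for illustration, not part of the original answer):

// Register the same Scala function for use in SQL queries
// (assumes a Spark 2.x SparkSession named `spark`).
spark.udf.register("round_tenths_place", round_tenths_place _)
bid_results.createOrReplaceTempView("bid_results")
spark.sql("SELECT *, round_tenths_place(bid_price) AS bid_price_bucket FROM bid_results").show()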
Although the built-in round expression uses the same logic as your function and should be more than enough, not to mention more efficient:
import org.apache.spark.sql.functions.round

bid_results.withColumn("bid_price_bucket", round($"bid_price", 1))
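For reference, here is a minimal, self-contained sketch putting both approaches together; the sample data and the SparkSession name `spark` are assumptions for illustration only:

// Toy example (assumes a SparkSession named `spark`, e.g. in spark-shell).
import org.apache.spark.sql.functions.{round, udf}
import spark.implicits._

val bid_results = Seq(1.23, 4.56, 7.89).toDF("bid_price")

// UDF built from round_tenths_place
val round_tenths_place_udf = udf(round_tenths_place _)
bid_results.withColumn("bid_price_bucket", round_tenths_place_udf($"bid_price")).show()

// Built-in round expression, same result with less overhead
bid_results.withColumn("bid_price_bucket", round($"bid_price", 1)).show()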