scala - Applying a function to a Spark DataFrame column


Coming from R, I am used to easily doing operations on columns. Is there any easy way to take this function that I've written in Scala

def round_tenths_place(un_rounded: Double): Double = {
  val rounded = BigDecimal(un_rounded).setScale(1, BigDecimal.RoundingMode.HALF_UP).toDouble
  return rounded
}

and apply it to one column of a DataFrame? This is kind of what I hoped would work:

bid_results.withColumn("bid_price_bucket", round_tenths_place(bid_results("bid_price")))

I haven't found an easy way and am struggling to figure out how to do this. There's got to be an easier way than converting the DataFrame to an RDD, selecting the right field from the RDD of rows, and mapping my function across all of the values, yeah? And also something more succinct than creating a SQL table and then doing this with a SparkSQL UDF?
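For reference, the "convert to an RDD, map, convert back" detour mentioned above would look roughly like this. This is a sketch only, assuming Spark 2.x with a SparkSession named spark and its implicits in scope; it also illustrates why the route is awkward, since the other columns are lost and would need to be re-joined:

import spark.implicits._

val bucketed = bid_results
  .select("bid_price")
  .rdd                                              // RDD[Row]
  .map(row => round_tenths_place(row.getDouble(0))) // apply the plain Scala function
  .toDF("bid_price_bucket")                         // single-column DataFrame; every other column is gone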

You can define a UDF as follows:

import org.apache.spark.sql.functions.udf

val round_tenths_place_udf = udf(round_tenths_place _)
bid_results.withColumn("bid_price_bucket", round_tenths_place_udf($"bid_price"))
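Note that the $"bid_price" column syntax requires the SQL implicits to be in scope (import spark.implicits._ on Spark 2.x, or sqlContext.implicits._ on 1.x); bid_results("bid_price") works just as well without them.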

That said, the built-in round expression uses exactly the same logic as your function and should be more than enough, not to mention that it is much more efficient:

import org.apache.spark.sql.functions.round

bid_results.withColumn("bid_price_bucket", round($"bid_price", 1))
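Putting it all together as something you can paste into a spark-shell session. This is a minimal sketch: the toy data and column values are purely illustrative and not from the original question, and it assumes Spark 2.x, where the shell already provides spark and its implicits:

import org.apache.spark.sql.functions.{round, udf}
import spark.implicits._

def round_tenths_place(un_rounded: Double): Double =
  BigDecimal(un_rounded).setScale(1, BigDecimal.RoundingMode.HALF_UP).toDouble

// Toy stand-in for bid_results (illustrative values only).
val bid_results = Seq(1.23, 4.56, 7.85).toDF("bid_price")

val round_tenths_place_udf = udf(round_tenths_place _)

bid_results
  .withColumn("bid_price_bucket_udf", round_tenths_place_udf($"bid_price"))
  .withColumn("bid_price_bucket", round($"bid_price", 1))
  .show()

The second withColumn is the preferable path: round is a built-in Catalyst expression that Spark can optimize, whereas a Scala UDF is a black box to the optimizer.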
