Get a specific row from a Spark DataFrame
Is there an alternative to df[100, c("column")] for Scala Spark DataFrames? I want to select a specific row from a column of a Spark DataFrame, for example the 100th row, as in the R code above.
Firstly, you must understand that DataFrames are distributed, which means you can't access them by position in the typical procedural way; Spark has to run an analysis over the data first. Although you are asking about Scala, I suggest you read the PySpark documentation, because it has more examples than the other language documentations.
Continuing with my explanation, I would use some methods of the RDD API, because every DataFrame exposes an RDD as an attribute. Please see my example below, and notice how I take the 2nd record.
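    # Assumes an existing sqlContext, as in the PySpark shell.
    df = sqlContext.createDataFrame([("a", 1), ("b", 2), ("c", 3)], ["letter", "name"])
    myindex = 1  # zipWithIndex is 0-based, so 1 selects the 2nd record
    values = (df.rdd.zipWithIndex()                    # pairs of (row, index)
              .filter(lambda pair: pair[1] == myindex)   # keep only the wanted index
              .map(lambda pair: (pair[0][0], pair[0][1]))  # drop the index, keep (letter, name)
              .collect())
    print(values[0])
    # ('b', 2)

Hopefully, someone gives another solution with fewer steps.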
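To make it clearer why the index has to be attached before a row can be picked, here is a rough plain-Python model of what zipWithIndex does. It is only a sketch and does not use Spark: the helper names zip_with_index and row_at, and the two-partition layout, are made up for illustration. It follows the documented ordering, where indices go by partition order first and then by position within each partition.

```python
from itertools import accumulate


def zip_with_index(partitions):
    """Plain-Python model of RDD.zipWithIndex over a list of partitions."""
    # Pass 1: count the rows per partition to find each partition's starting index.
    sizes = [len(part) for part in partitions]
    offsets = [0] + list(accumulate(sizes))[:-1]
    # Pass 2: each partition numbers its own rows, starting from its offset.
    return [
        [(row, offset + i) for i, row in enumerate(part)]
        for part, offset in zip(partitions, offsets)
    ]


def row_at(partitions, index):
    """Filter on the attached index, then gather the matches (like collect)."""
    return [
        row
        for part in zip_with_index(partitions)
        for row, i in part
        if i == index
    ]


# Hypothetical layout: the three example rows split over two partitions.
partitions = [[("a", 1)], [("b", 2), ("c", 3)]]
result = row_at(partitions, 1)
print(result[0])
```

No single partition knows a row's global position on its own, which is why the sizes have to be counted first. In this model the 2nd record sits in the second partition and still comes back under index 1.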