pandas - Converting float into range in Python
I am doing some data analysis with pandas and am struggling to find a nice, clean way of summing up a range of numbers. I have a data frame with a column of floats, but I am not interested in the exact number, only the rough range. I want to run a pivot and count how many values fall in each range. Ideally, therefore, I want to create a new column in the data frame that converts the column of floats to a range, e.g. df[number] = 3.5, df[range] = 0-10.
The ranges should be 0-10, 10-20, ... >100.
This may sound arbitrary, but I've been struggling to find an answer on this. Many thanks.
pandas has a cut function for this:
    In [18]: s = pd.Series(np.random.uniform(0, 110, 100))

    In [19]: s
    Out[19]:
    0     57.614427
    1     30.576853
    2     95.578943
    3     53.010340
    4     63.947381
    ...
    95    42.252644
    96    14.814418
    97    81.271527
    98     5.732966
    99    90.932890

    In [12]: s = pd.Series(np.random.uniform(0, 110, 100))

    In [13]: s
    Out[13]:
    0      2.652461
    1     46.536276
    2      6.455352
    3      6.075963
    4     40.013378
    ...
    95    39.775493
    96    99.688307
    97    41.064469
    98    91.401904
    99    60.580600
    dtype: float64

    In [14]: cuts = np.arange(0, 101, 10)

    In [15]: pd.cut(s, cuts)
    Out[15]:
    0       (0, 10]
    1      (40, 50]
    2       (0, 10]
    3       (0, 10]
    4      (40, 50]
    ...
    95     (30, 40]
    96    (90, 100]
    97     (40, 50]
    98    (90, 100]
    99     (60, 70]
    dtype: category
    Categories (10, object): [(0, 10] < (10, 20] < (20, 30] < (30, 40] ... (60, 70] < (70, 80] < (80, 90] < (90, 100]]

See the docs for controlling what happens at the endpoints.
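With edges that stop at 100, any value above 100 falls outside every bin and comes back as NaN. The question also asks for a ">100" bucket and plain labels such as "0-10", so one possible way to get there is sketched below. The data frame, its values, and the names edges and labels are made up for illustration; only the column names number and range come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column name "number" follows the question.
df = pd.DataFrame({"number": [3.5, 10.0, 47.2, 99.9, 100.0, 250.0]})

# Edges 0, 10, ..., 100, plus +inf so anything above 100 gets its own bin.
edges = list(range(0, 101, 10)) + [np.inf]

# One label per bin: "0-10", "10-20", ..., "90-100", then ">100".
labels = [f"{lo}-{lo + 10}" for lo in range(0, 100, 10)] + [">100"]

# Bins are closed on the right by default, so 10.0 lands in "0-10".
# include_lowest=True keeps an exact 0.0 in the first bin instead of NaN.
df["range"] = pd.cut(df["number"], bins=edges, labels=labels, include_lowest=True)

print(df)
```

Passing right=False instead would make the bins closed on the left, so that 10.0 lands in "10-20"; which convention fits better depends on the data.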
Note that in 0.18 (coming out soonish) the result will be an IntervalIndex instead of a categorical, which will make things nicer.
To get the counts per interval, use the value_counts method:
    In [17]: pd.cut(s, cuts).value_counts()
    Out[17]:
    (30, 40]     15
    (40, 50]     13
    (50, 60]     12
    (60, 70]     10
    (0, 10]      10
    (90, 100]     8
    (70, 80]      8
    (80, 90]      7
    (10, 20]      6
    (20, 30]      3
    dtype: int64
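value_counts sorts by frequency, which is why the intervals above are out of order. If the counts should follow the bin order instead, sorting the result by its index is one option. A minimal sketch with a small, made-up series (the names values, binned, and counts are illustrative):

```python
import numpy as np
import pandas as pd

# Small hypothetical sample so the counts are easy to check by hand.
values = pd.Series([2.6, 46.5, 6.4, 6.0, 40.0, 99.7, 91.4, 60.6])
cuts = np.arange(0, 101, 10)

# Assign each value to its interval, as in the answer above.
binned = pd.cut(values, cuts)

# Count per interval, then order the rows by interval rather than by count.
# Because the result is categorical, empty intervals appear with a count of 0.
counts = binned.value_counts().sort_index()

print(counts)
```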