python - Is there a better way to collect unique index values in pandas?
I've got data that looks like this:
>>> print totals.sample(4)
                                                 start            end  \
time                    region_type
2016-01-24 02:17:10.238 stack guard        79940452352    79940665344
2016-01-23 20:14:17.043 malloc metadata    64688259072    64688996352
2016-01-22 23:20:53.752 iokit              47857778688    47861174272
2016-01-23 08:17:06.561 __data           3711964667904  3711979212800

                                            vsize    rsdnt   dirty     swap
time                    region_type
2016-01-24 02:17:10.238 stack guard        212992        0       0        0
2016-01-23 20:14:17.043 malloc metadata    737280    81920   81920     8192
2016-01-22 23:20:53.752 iokit             3395584    24576   24576  3371008
2016-01-23 08:17:06.561 __data           14544896  4907008  618496  4780032
I want to know the region_type of every row where dirty + swap is greater than 1e7.
This works, but it seems pretty verbose:
>>> print totals[(totals.dirty + totals.swap) > 1e7].groupby(level='region_type').\
...     apply(lambda x: 'lol').index.tolist()
['malloc_nano', 'malloc_small']
Is there a better way?
I would have thought this would work, but it gives all the region_types in the data set, not just the ones selected:
totals[(totals.dirty + totals.swap) > 1e7].index.levels[1].tolist()
Use index.get_level_values (which returns the values actually used by the rows), not index.levels (which returns every value the index knows about, including ones left over from rows that were filtered out):
mask = totals['dirty']+totals['swap'] > 1e7
result = mask.loc[mask]
region_types = result.index.get_level_values('region_type').unique()
For example,
In [243]: mask = totals['dirty']+totals['swap'] > 1e3; mask
Out[243]:
time                     region_type
2016-01-24 02:17:10.238  stack guard        False
2016-01-23 20:14:17.043  malloc metadata     True
2016-01-22 23:20:53.752  iokit               True
2016-01-23 08:17:06.561  __data              True
dtype: bool

In [244]: result = mask.loc[mask]; result
Out[244]:
time                     region_type
2016-01-23 20:14:17.043  malloc metadata    True
2016-01-22 23:20:53.752  iokit              True
2016-01-23 08:17:06.561  __data             True
dtype: bool

In [245]: result.index.get_level_values('region_type').unique()
Out[245]: array(['malloc metadata', 'iokit', '__data'], dtype=object)
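If you want to try the difference between index.levels and index.get_level_values yourself, here is a minimal, self-contained sketch. The frame is rebuilt by hand from the four sample rows in the question (only the dirty and swap columns), and the 1e6 threshold is an assumption chosen so that some rows are filtered out. The last step uses MultiIndex.remove_unused_levels, which was not part of the original answer and needs a newer pandas (0.20 or later, as far as I know).

```python
import pandas as pd

# Hand-built stand-in for `totals`, using the four sample rows from the question.
index = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2016-01-24 02:17:10.238"), "stack guard"),
        (pd.Timestamp("2016-01-23 20:14:17.043"), "malloc metadata"),
        (pd.Timestamp("2016-01-22 23:20:53.752"), "iokit"),
        (pd.Timestamp("2016-01-23 08:17:06.561"), "__data"),
    ],
    names=["time", "region_type"],
)
totals = pd.DataFrame(
    {"dirty": [0, 81920, 24576, 618496], "swap": [0, 8192, 3371008, 4780032]},
    index=index,
)

# Threshold of 1e6 is an assumption so that this tiny sample has matches.
mask = totals["dirty"] + totals["swap"] > 1e6
selected = totals[mask]

# index.levels keeps every label the original index knew about,
# even those that no longer appear in any selected row.
all_known = selected.index.levels[1].tolist()

# get_level_values returns only the labels on the selected rows,
# in row order: ['iokit', '__data']
used = selected.index.get_level_values("region_type").unique().tolist()

# Alternative: prune the stale labels first, then read index.levels.
pruned = selected.index.remove_unused_levels().levels[1].tolist()

print(all_known)
print(used)
print(pruned)
```

Note that on recent pandas versions .unique() on an index returns an Index rather than the NumPy array shown in the session above, which is why the sketch calls .tolist() before comparing.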