python - Is there a better way to collect unique index values in pandas?


I've got data that looks like this:

>>> print totals.sample(4)
                                                 start            end  \
time                    region_type
2016-01-24 02:17:10.238 stack guard        79940452352    79940665344
2016-01-23 20:14:17.043 malloc metadata    64688259072    64688996352
2016-01-22 23:20:53.752 iokit              47857778688    47861174272
2016-01-23 08:17:06.561 __data           3711964667904  3711979212800

                                            vsize    rsdnt   dirty     swap
time                    region_type
2016-01-24 02:17:10.238 stack guard        212992        0       0        0
2016-01-23 20:14:17.043 malloc metadata    737280    81920   81920     8192
2016-01-22 23:20:53.752 iokit             3395584    24576   24576  3371008
2016-01-23 08:17:06.561 __data           14544896  4907008  618496  4780032

I want to know which region_type values have at least one row where dirty + swap is greater than 1e7.

This works, but it seems pretty verbose:

>>> print totals[(totals.dirty + totals.swap) > 1e7].groupby(level='region_type').\
...     apply(lambda x: 'lol').index.tolist()
['malloc_nano', 'malloc_small']

Is there a better way?

I would have thought this would work, but it gives all the region_types in the data set, not just the ones selected:

totals[(totals.dirty + totals.swap) > 1e7].index.levels[1].tolist() 

Use index.get_level_values (which returns only the values actually used by the remaining rows), not index.levels (which returns every value the index knows about, including ones that were filtered out):

mask = totals['dirty'] + totals['swap'] > 1e7
result = mask.loc[mask]
region_types = result.index.get_level_values('region_type').unique()

For example,

In [243]: mask = totals['dirty']+totals['swap'] > 1e3; mask
Out[243]:
time                     region_type
2016-01-24 02:17:10.238  stack guard        False
2016-01-23 20:14:17.043  malloc metadata     True
2016-01-22 23:20:53.752  iokit               True
2016-01-23 08:17:06.561  __data              True
dtype: bool

In [244]: result = mask.loc[mask]; result
Out[244]:
time                     region_type
2016-01-23 20:14:17.043  malloc metadata    True
2016-01-22 23:20:53.752  iokit              True
2016-01-23 08:17:06.561  __data             True
dtype: bool

In [245]: result.index.get_level_values('region_type').unique()
Out[245]: array(['malloc metadata', 'iokit', '__data'], dtype=object)
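
To see the difference between the two attributes side by side, here is a minimal self-contained sketch. The small `totals` frame and its numbers are made up for illustration (they are not the asker's data), and the 1e6 threshold is chosen only so that some rows are filtered out.

```python
import pandas as pd

# Hypothetical miniature of `totals`: a (time, region_type) MultiIndex
# with invented dirty/swap values.
index = pd.MultiIndex.from_tuples(
    [
        ("2016-01-24 02:17:10", "stack guard"),
        ("2016-01-23 20:14:17", "malloc metadata"),
        ("2016-01-22 23:20:53", "iokit"),
        ("2016-01-23 08:17:06", "__data"),
        ("2016-01-23 09:00:00", "iokit"),
    ],
    names=["time", "region_type"],
)
totals = pd.DataFrame(
    {
        "dirty": [0, 81920, 24576, 618496, 500000],
        "swap": [0, 8192, 3371008, 4780032, 3000000],
    },
    index=index,
)

# Keep only the rows whose dirty + swap exceeds the threshold.
mask = totals["dirty"] + totals["swap"] > 1e6
selected = totals[mask]

# .levels still lists every label the original index was built with.
all_known = selected.index.levels[1].tolist()

# get_level_values yields one label per remaining row; unique() drops repeats.
used = selected.index.get_level_values("region_type").unique().tolist()

print(all_known)
print(used)
```

Here `used` should contain only 'iokit' and '__data', while `all_known` still includes 'stack guard' and 'malloc metadata' even though no selected row carries them.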
