Grouping in Python
I have a list of dictionaries (loaded from a CSV file), and I want to run a "group by" equivalent based on one of the "columns". I am trying to group on teamid and sum the "r" column for each group.
I am trying the following code:
import itertools

for key, group in itertools.groupby(batting, lambda item: item["teamid"]):
    print key, sum([item["r"] for item in group])
However, I am not seeing them grouped correctly. There are multiple instances of the same team ID.
For example:
rc1 30
cl1 28
ws3 28
rc1 29
fw1 9
rc1 0
bs1 66
fw1 1
bs1 13
cl1 18
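As Padric said in a comment, itertools.groupby() needs ordered data to do what you want: it only groups consecutive items that share the same key, so a team ID that reappears later in the list starts a new group. The simplest solution (as in the fewest code edits) would be: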
import itertools

# Sort by the same key that groupby uses, so equal team IDs are adjacent.
key_func = lambda item: item["teamid"]
for key, group in itertools.groupby(sorted(batting, key=key_func), key_func):
    print key, sum([item["r"] for item in group])
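To make the effect of sorting concrete, here is a minimal Python 3 sketch. The sample rows and the helper names team_of and grouped_sums are made up for illustration and are not part of the original question; the rows only imitate the shape of the batting data.

```python
from itertools import groupby

# Hypothetical sample rows standing in for the CSV data.
batting = [
    {"teamid": "rc1", "r": 30},
    {"teamid": "cl1", "r": 28},
    {"teamid": "rc1", "r": 29},
    {"teamid": "cl1", "r": 18},
]

def team_of(item):
    """Key function shared by sorted() and groupby()."""
    return item["teamid"]

def grouped_sums(rows):
    """Return (team, sum of r) pairs in the order groupby yields them."""
    return [(key, sum(item["r"] for item in group))
            for key, group in groupby(rows, team_of)]

unsorted_result = grouped_sums(batting)
sorted_result = grouped_sums(sorted(batting, key=team_of))

print(unsorted_result)  # [('rc1', 30), ('cl1', 28), ('rc1', 29), ('cl1', 18)]
print(sorted_result)    # [('cl1', 46), ('rc1', 59)]
```

On the unsorted list every row ends up in its own group because no two neighbours share a team ID, which matches the symptom described in the question.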
If your data is relatively big, you may want to consider something more efficient that doesn't require a duplicate sorted copy in memory. The defaultdict mentioned in a comment may be a good choice.
from collections import defaultdict

d = defaultdict(int)
for item in batting:
    # A missing or empty "r" value counts as 0.
    d[item['teamid']] += item.get('r', 0) or 0

for team, r_sum in sorted(d.items(), key=lambda x: x[0]):
    print team, r_sum
The code may need slight adjustments for Python 3; for example, print is a function there, so the print statements above need parentheses.
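As a rough idea of what that adjustment might look like, here is a Python 3 sketch of the defaultdict approach. The sample rows and the function name sum_runs_by_team are hypothetical, and the sketch assumes the "r" values are already numbers (or None or missing) rather than raw strings from the CSV.

```python
from collections import defaultdict

# Hypothetical rows; the None and the missing "r" imitate gaps in the data.
batting = [
    {"teamid": "rc1", "r": 30},
    {"teamid": "fw1", "r": 9},
    {"teamid": "rc1", "r": None},
    {"teamid": "fw1"},
    {"teamid": "rc1", "r": 29},
]

def sum_runs_by_team(rows):
    """Accumulate "r" per team in one pass, without sorting the rows."""
    totals = defaultdict(int)
    for item in rows:
        # A missing or None "r" counts as 0.
        totals[item["teamid"]] += item.get("r", 0) or 0
    return dict(totals)

totals = sum_runs_by_team(batting)

# Only the (much smaller) set of teams is sorted for display.
for team, r_sum in sorted(totals.items()):
    print(team, r_sum)
```

Only the per-team totals are sorted here, not the full list of rows, which is where the memory saving over the sorted() plus groupby() version comes from.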