python - Finding the most common pair, trio, etc. in a list of 20 random numbers, played 100 times -

so have csv 100 rows, 20 numbers on each row separated comma such:

23, 52, 63, 76, 23, 45, ... 39, 52, 83, 33, 35, 23, ... etc.

i write algorithm finds different combinations of pair, , list amount of times has occurred, , same trio, quad, etc.

obviously each line there lot of combinations (since order wouldn't matter)

this answer every line

so 15 combinations per line.. can imagine how many there if 20 instead of 6, let's not worry now.

since there 15, wouldn't want program show me every single combination, "count" higher 1, "count" being number of times have occurred.

so if ran program 2 lines above, 6 numbers instead of 20, program return

23, 52: 2

see how pair occurred more once shown, that's i'd do, easy, not show if count equal 1.

anyway, how started on creating algorithm? don't know start, guess begin scraping every line, , getting every pair, how this?

thank in advance, , no doubt keep hacking @ problem, if find solution post (as code). thank again.

code annotated inline.

import itertools  lines = [[23, 52, 63, 76, 23, 45],          [39, 52, 83, 33, 35, 23]]  # set store unique elements # across lines uniq = set()  # list hold dict each line # these dicts contain frequency # of each unique number in line  freq = [{} _ in range(len(lines))] index, line in enumerate(lines):     j in line:          freq[index][j] = freq[index].get(j, 0) + 1         uniq.add(j)  # store frequency of each possible pair counter = {}  # there k * (k - 1) / 2 combinations, # k number of unique pairs  in itertools.combinations(uniq, 2):     j in range(len(lines)):          freq1 = freq[j].get(i[0], 0)         freq2 = freq[j].get(i[1], 0)          # multiplying frequencies gives number         # of pairs these numbers in line          counter[i] = counter.get(i, 0) + (freq1 * freq2)  # performs descending sort on pairs sol = sorted(counter.items(), key=lambda value: value[1], reverse=true) print(sol)

output:

   [((52, 23), 3), ((45, 23), 2), ((23, 63), 2), ((76, 23), 2), ((33, 35), 1),     ((76, 45), 1), ((52, 63), 1), ((39, 83), 1), ((76, 52), 1), ((45, 52), 1),     ((35, 39), 1), ((39, 23), 1), ((33, 52), 1), ((39, 52), 1), ((33, 23), 1),     ((35, 23), 1), ((35, 52), 1), ((76, 63), 1), ((33, 83), 1), ((35, 83), 1),     ((33, 39), 1), ((83, 23), 1), ((45, 63), 1), ((83, 52), 1), ((83, 63), 0),     ((35, 76), 0), ((33, 76), 0), ((35, 45), 0), ((33, 45), 0), ((45, 83), 0),     ((39, 63), 0), ((33, 63), 0), ((39, 76), 0), ((39, 45), 0), ((76, 83), 0),     ((35, 63), 0)]

beware! algorithm run in ~ o(n^4) if list consists of unique values.

edit: here version should perform better (it maintains uniq set every line, , hence avoids zero-size combinations).

import functools import itertools import operator  lines = [[23, 52, 63, 76, 23, 45],          [39, 52, 83, 33, 35, 23]]  # set variable before use! permlength = 2  uniq = [set() _ in range(len(lines))] freq = [{} _ in range(len(lines))]  index, line in enumerate(lines):     j in line:         freq[index][j] = freq[index].get(j, 0) + 1         uniq[index].add(j)  counter = {}  print("total number of lines: {}".format(len(lines)))  in range(len(lines)):     print("line {}...".format(i + 1))      j in itertools.combinations(uniq[i], permlength):          freqp = [freq[i].get(j[x], 0) x in range(permlength)]         counter[j] = counter.get(j, 0) + functools.reduce(operator.mul, freqp)  sol = sorted(counter.items(), key=lambda value: value[1], reverse=true)  print(sol)

output:

[((52, 23), 3), ((45, 23), 2), ((63, 23), 2), ((76, 23), 2), ((76, 52), 1),  ((76, 45), 1), ((52, 63), 1), ((39, 83), 1), ((35, 39), 1), ((45, 52), 1),  ((33, 52), 1), ((39, 52), 1), ((33, 23), 1), ((35, 23), 1), ((35, 52), 1),  ((39, 23), 1), ((76, 63), 1), ((33, 83), 1), ((33, 39), 1), ((83, 23), 1),  ((45, 63), 1), ((83, 52), 1), ((33, 35), 1), ((35, 83), 1)]

Search This Blog

Ben

python - Finding the most common pair, trio, etc. in a list of 20 random numbers, played 100 times -

Comments

Post a Comment

Popular posts from this blog

routing - AngularJS State management ->load multiple states in one page -

python - GRASS parser() error -

Swift game error message -