Python - Variation on [k] * n for list generation
Given a list of elements [1, 27, 10, ...], I need to generate a list of n repetitions of each element, as in [1, 1, ..., 1, 27, 27, ..., 27, 10, ..., 10].
What is the most elegant, Pythonic, and fastest way of doing this?
Answer
NumPy gives the fastest and most concise solution.
np.repeat(my_list, n)
looks the most Pythonic (credit to b.m.), while flattening a NumPy array seems marginally faster in my tests.
Also have a look at the numba alternative below in b.m.'s post.
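As a minimal sketch of the np.repeat one-liner (the sample values and n = 3 are assumptions chosen for illustration, not taken from the original post): np.repeat returns a NumPy array rather than a list, so a call to .tolist() is needed if a plain Python list is required.

```python
import numpy as np

my_list = [1, 27, 10]  # sample input, assumed for illustration
n = 3                  # repetitions per element, assumed for illustration

# Each element is repeated n times, in order; the result is an ndarray.
repeated = np.repeat(my_list, n)

# Convert back to a plain Python list when a list is what is needed.
as_list = repeated.tolist()
print(as_list)  # [1, 1, 1, 27, 27, 27, 10, 10, 10]
```

The conversion has a cost of its own, so it is worth including in any timing if the final result must be a list.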
More detail
I tested three approaches: i) double looping, ii) single looping with an indexing function, and iii) flattening a NumPy array. (Edit: a 4th approach from Mike using extend, a 5th approach using np.repeat from b.m., a 6th approach using a comprehension from gsb-eng, and a 7th approach using itertools.)
Surprisingly, I find that flattening the array is the fastest method on my machine in Python 2.7. However, on other machines and in Python 3 you might want to test itertools and comprehensions. You can copy/paste the Python 2 code below for a quick check. Sorted, the timeit results are:
flattened array: 8.8ms
numpy repeat: 10.87ms
extend list: 14.37ms
itertools repeat: 14.91ms
itertools chain comprehension: 18.72ms
itertools chain: 18.73ms
double loop : 58.4ms
single loop + index division: 251.29ms
double loop + comprehension: 255.76ms
And the code that generates this result:
# Python 2 code (uses xrange and the print statement)
import itertools
import timeit

import numpy as np

n = 100
my_list = range(10)
n_elements = len(my_list)

# === double loop =============================================================
def double_loop():
    my_long_list = []
    for list_element in my_list:
        my_long_list += [list_element] * n
    return my_long_list

# === double loop comprehension ===============================================
def double_loop_comp():
    # list comprehension
    return [i for i in my_list for j in xrange(n)]

# === single loop indexing function ===========================================
def one_loop_with_indexing():
    my_long_list = []
    for i in range(n*n_elements):
        my_long_list.append(my_list[i // n])
    return my_long_list

# === flattened array =========================================================
def flattened_array():
    my_array = np.zeros([n_elements, n])
    for i in range(n_elements):
        my_array[i,:] = my_list[i]
    return my_array.flatten()

# === extend list =============================================================
def extend_list():
    my_long_list = []
    for list_element in my_list:
        my_long_list.extend([list_element] * n)
    return my_long_list

# === numpy repeat ============================================================
def numpy_repeat():
    return np.repeat(my_list, n)

# === itertools repeat ========================================================
def iter_repeat():
    my_long_list = []
    for x in my_list:
        my_long_list.extend( itertools.repeat(x,n) )
    return my_long_list

# === itertools chain =========================================================
def iter_chain():
    return list( itertools.chain.from_iterable( itertools.repeat(x,n) for x in my_list ) )

# === itertools chain comp ====================================================
def iter_chain_comp():
    return list( itertools.chain.from_iterable( [itertools.repeat(x,n) for x in my_list] ) )

time_double_loop = timeit.timeit(double_loop, number=1000)
time_double_loop_comp = timeit.timeit(double_loop_comp, number=1000)
time_single_loop = timeit.timeit(one_loop_with_indexing, number=1000)
time_flattened_array = timeit.timeit(flattened_array, number=1000)
time_extend_list = timeit.timeit(extend_list, number=1000)
time_np_repeat = timeit.timeit(numpy_repeat, number=1000)
time_it_repeat = timeit.timeit(iter_repeat, number=1000)
time_it_chain = timeit.timeit(iter_chain, number=1000)
time_it_chain_comp = timeit.timeit(iter_chain_comp, number=1000)

print 'double loop : ' + str(round(time_double_loop*1000,2))+'ms'
print 'double loop + comprehension: ' + str(round(time_double_loop_comp*1000,2))+'ms'
print 'single loop + index division: ' + str(round(time_single_loop*1000,2))+'ms'
print 'flattened array: ' + str(round(time_flattened_array*1000,2))+'ms'
print 'extend list: ' + str(round(time_extend_list*1000,2))+'ms'
print 'numpy repeat: ' + str(round(time_np_repeat*1000,2))+'ms'
print 'itertools repeat: ' + str(round(time_it_repeat*1000,2))+'ms'
print 'itertools chain: ' + str(round(time_it_chain*1000,2))+'ms'
print 'itertools chain comprehension: ' + str(round(time_it_chain_comp*1000,2))+'ms'
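For readers on Python 3, the pure-Python variants can be checked with a sketch like the one below. It is an adaptation rather than the original benchmark: the function name comprehension and the candidates/timings helpers are made up here for illustration, and no timings are claimed because they depend on the machine and interpreter.

```python
import itertools
import timeit

n = 100
my_list = list(range(10))

def extend_list():
    # Build the result by extending with n copies of each element.
    out = []
    for x in my_list:
        out.extend([x] * n)
    return out

def comprehension():
    # Nested comprehension: the outer loop walks elements, the inner repeats.
    return [x for x in my_list for _ in range(n)]

def iter_chain():
    # Lazily repeat each element and chain the pieces together.
    return list(itertools.chain.from_iterable(
        itertools.repeat(x, n) for x in my_list))

candidates = [extend_list, comprehension, iter_chain]

# Sanity check: every variant must produce the same list.
expected = extend_list()
assert all(f() == expected for f in candidates)

# Total time for 1000 calls of each variant, printed fastest first.
timings = {f.__name__: timeit.timeit(f, number=1000) for f in candidates}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print('{}: {:.2f}ms'.format(name, seconds * 1000))
```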
In the fast category, with a = np.array(my_list) (100 elements in my tests):
Readable:
In [12]: %timeit np.repeat(a,100)
10000 loops, best of 3: 80.4 µs per loop
Tricky:
In [13]: %timeit np.lib.stride_tricks.as_strided(a,(100,100),(a.itemsize,0)).ravel()
10000 loops, best of 3: 29.5 µs per loop
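The strided version works because a stride of 0 along the second axis makes every column of a row point at the same element, so nothing is copied until ravel() materializes the result. A small sketch of that idea, with a 3-element array and n = 4 chosen here purely for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.array([1, 27, 10])  # sample input, assumed for illustration
n = 4                      # repetitions per element, assumed for illustration

# Row i starts at element i (stride = itemsize); moving along a row
# advances by 0 bytes, so all n columns of row i alias a[i].
view = as_strided(a, shape=(a.size, n), strides=(a.itemsize, 0))

# The view is not contiguous, so ravel() returns a fresh, flat copy.
flat = view.ravel()
print(flat.tolist())  # [1, 1, 1, 1, 27, 27, 27, 27, 10, 10, 10, 10]

# Same result as the readable version.
assert np.array_equal(flat, np.repeat(a, n))
```

as_strided performs no bounds checking, so a wrong shape or stride can read arbitrary memory; the view should also be treated as read-only, since its cells alias one another.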
Just-in-time compilation with numba (after conda install numba):
import numpy as np
from numba import jit

@jit
def numbarep(a,n):
    # preallocate the output, then fill n slots per input element
    res=np.empty(a.size*n,dtype=a.dtype)
    offset=0
    for e in a:
        for k in range(offset,offset+n):
            res[k]=e
        offset+=n
    return res

In [14]: %timeit numbarep(a,100)
100000 loops, best of 3: 14.8 µs per loop