Python - Variation on [k] * n for list generation


Given a list of elements [1, 27, 10, ...], I need to generate a list with n repetitions of each element, i.e. [1, 1, ..., 1, 27, 27, ..., 27, 10, ..., 10].

What is the most elegant, Pythonic, and fastest way of doing this?

Answer

NumPy gives the fastest and most concise solution.

np.repeat(my_list, n) looks the most Pythonic (credit to B. M.), while flattening a NumPy array seems marginally faster.
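As a small illustration of the two NumPy options just mentioned, here is a minimal sketch, assuming NumPy is installed. The names `grid`, `repeated`, and `flattened` are chosen only for this example and do not come from the original posts.

```python
import numpy as np

my_list = [1, 27, 10]
n = 3

# Option 1: np.repeat repeats each element n times, preserving order.
repeated = np.repeat(my_list, n)

# Option 2: fill a (len(my_list), n) array one row per element, then flatten it.
grid = np.empty((len(my_list), n), dtype=repeated.dtype)
for i, value in enumerate(my_list):
    grid[i, :] = value
flattened = grid.flatten()

print(repeated.tolist())   # [1, 1, 1, 27, 27, 27, 10, 10, 10]
print(flattened.tolist())  # same contents as above
```

Both options return a NumPy array rather than a list. Call `.tolist()` if a plain Python list is required, and keep in mind that this conversion adds its own cost to any timing.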

Also have a look at the Numba alternative below, in B. M.'s post.

More detail

I tested three approaches: i) double looping, ii) single looping with an indexing function, and iii) flattening a NumPy array. (Edit: a 4th approach from Mike using extend, a 5th approach using np.repeat from B. M., a 6th approach using a comprehension from gsb-eng, and a 7th approach using itertools.)

Surprisingly, I find flattening an array to be the fastest method on my machine in Python 2.7, and far faster than the plain loops. However, on other machines and in Python 3 you might want to test itertools and comprehensions. You can copy and paste the Python 2 code below for a quick check. The sorted timeit results are:

flattened array: 8.8ms
numpy repeat: 10.87ms
extend list: 14.37ms
itertools repeat: 14.91ms
itertools chain comprehension: 18.72ms
itertools chain: 18.73ms
double loop : 58.4ms
single loop + index division: 251.29ms
double loop + comprehension: 255.76ms

And the code that generates these results:

# Python 2 code (uses xrange and print statements)
import itertools
import timeit

import numpy as np

n = 100

my_list = range(10)
n_elements = len(my_list)

# === double loop =============================================================
def double_loop():
    my_long_list = []
    for list_element in my_list:
        my_long_list += [list_element] * n
    return my_long_list

# === double loop comprehension ===============================================
def double_loop_comp():
    # list comprehension
    return [i for i in my_list for j in xrange(n)]

# === single loop with indexing function ======================================
def one_loop_with_indexing():
    my_long_list = []
    for i in range(n*n_elements):
        my_long_list.append(my_list[i // n])
    return my_long_list

# === flattened array =========================================================
def flattened_array():
    # np.zeros defaults to float64, so the result holds floats
    my_array = np.zeros([n_elements, n])
    for i in range(n_elements):
        my_array[i,:] = my_list[i]
    return my_array.flatten()

# === extend list =============================================================
def extend_list():
    my_long_list = []
    for list_element in my_list:
        my_long_list.extend([list_element] * n)
    return my_long_list

# === numpy repeat ============================================================
def numpy_repeat():
    return np.repeat(my_list, n)

# === itertools repeat ========================================================
def iter_repeat():
    my_long_list = []
    for x in my_list:
        my_long_list.extend( itertools.repeat(x,n) )
    return my_long_list

# === itertools chain =========================================================
def iter_chain():
    return list( itertools.chain.from_iterable( itertools.repeat(x,n) for x in my_list ) )

# === itertools chain comp ====================================================
def iter_chain_comp():
    return list( itertools.chain.from_iterable( [itertools.repeat(x,n) for x in my_list] ) )

# Time 1000 calls of each approach
time_double_loop = timeit.timeit(double_loop, number=1000)
time_double_loop_comp = timeit.timeit(double_loop_comp, number=1000)
time_single_loop = timeit.timeit(one_loop_with_indexing, number=1000)
time_flattened_array = timeit.timeit(flattened_array, number=1000)
time_extend_list = timeit.timeit(extend_list, number=1000)
time_np_repeat = timeit.timeit(numpy_repeat, number=1000)
time_it_repeat = timeit.timeit(iter_repeat, number=1000)
time_it_chain = timeit.timeit(iter_chain, number=1000)
time_it_chain_comp = timeit.timeit(iter_chain_comp, number=1000)

print 'double loop : ' + str(round(time_double_loop*1000,2))+'ms'
print 'double loop + comprehension: ' + str(round(time_double_loop_comp*1000,2))+'ms'
print 'single loop + index division: ' + str(round(time_single_loop*1000,2))+'ms'
print 'flattened array: ' + str(round(time_flattened_array*1000,2))+'ms'
print 'extend list: ' + str(round(time_extend_list*1000,2))+'ms'
print 'numpy repeat: ' + str(round(time_np_repeat*1000,2))+'ms'
print 'itertools repeat: ' + str(round(time_it_repeat*1000,2))+'ms'
print 'itertools chain: ' + str(round(time_it_chain*1000,2))+'ms'
print 'itertools chain comprehension: ' + str(round(time_it_chain_comp*1000,2))+'ms'
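For readers on Python 3, the pure-Python variants can be checked with a shorter harness. This is only a sketch under that assumption: it covers three of the approaches, the helper names `candidates`, `expected`, `results_match`, and `timings` are made up for this example, and the timings it prints will differ from the Python 2.7 numbers quoted above.

```python
import itertools
import timeit

n = 100
my_list = list(range(10))

def comprehension():
    # Outer loop walks the elements, inner loop repeats each one n times.
    return [x for x in my_list for _ in range(n)]

def extend_list():
    out = []
    for x in my_list:
        out.extend([x] * n)
    return out

def iter_chain():
    # Lazily repeat each element, then chain the repeats into one list.
    return list(itertools.chain.from_iterable(
        itertools.repeat(x, n) for x in my_list))

candidates = [comprehension, extend_list, iter_chain]

# All variants must produce the same list before timings are worth comparing.
expected = comprehension()
results_match = all(f() == expected for f in candidates)

# Time 1000 calls of each variant and print them fastest first.
timings = {f.__name__: timeit.timeit(f, number=1000) for f in candidates}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print('{}: {:.2f}ms'.format(name, seconds * 1000))
```

Checking that the outputs agree first guards against timing a variant that is fast only because it does the wrong thing.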

In the fast category, with a = np.array(my_list) (100 elements in my tests):

Readable:

In [12]: %timeit np.repeat(a,100)
10000 loops, best of 3: 80.4 µs per loop

Tricky:

In [13]: %timeit np.lib.stride_tricks.as_strided(a,(100,100),(a.itemsize,0)).ravel()
10000 loops, best of 3: 29.5 µs per loop
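The strided version is terse, so here is a small sketch of what it does, assuming a contiguous 1-D array. The names `view` and `tiled` and the small sizes are illustrative only, and `writeable=False` is added here as a safety measure that the original one-liner does not use.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(5)
n = 3

# View `a` as a (len(a), n) array without copying: stepping along axis 0
# advances one element, stepping along axis 1 advances 0 bytes, so every
# row shows the same element n times.
view = as_strided(a, shape=(a.size, n), strides=(a.strides[0], 0),
                  writeable=False)

# ravel() must copy here, because the zero-stride view is not contiguous.
tiled = view.ravel()

print(view)
print(tiled.tolist())  # [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
```

Because as_strided does not validate the shape and strides it is given, a wrong value can read memory outside the array, which is why the np.repeat version is usually the safer default.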

Just-in-time compilation with Numba (after conda install numba):

import numpy as np
from numba import jit

@jit
def numbarep(a, n):
    # preallocate the output, then write each element n times in a row
    res = np.empty(a.size*n, dtype=a.dtype)
    offset = 0
    for e in a:
        for k in range(offset, offset+n):
            res[k] = e
        offset += n
    return res

In [14]: %timeit numbarep(a,100)
100000 loops, best of 3: 14.8 µs per loop
