Python - memory leak when using multiprocessing
As in the title, I'm struggling with a memory leak when using multiprocessing. I know this question has been asked before, but I still cannot find the right solution to my problem.
I have a list of RGB images (30,000 in total). I need to read each image, process its 3 RGB channels, and keep the results in memory (to be saved in 1 big file later).
I'm trying to use this:
import multiprocessing as mp
import random
import numpy as np

# define an output queue to store the results
output = mp.Queue()

# define an example function
def read_and_process_image(id, output):
    result = np.random.randint(256, size=(100, 100, 3))  # fake image
    output.put(result)

# set up a list of processes that we want to run
processes = [mp.Process(target=read_and_process_image, args=(id, output)) for id in range(30000)]

# run the processes
for p in processes:
    p.start()

# # exit the completed processes
# for p in processes:
#     p.join()

# process the results from the output queue
results = [output.get() for p in processes]

print(results)
This code uses a lot of memory. This answer explained the problem, but I cannot find a way to apply it to my code. Any suggestions? Thanks!
Edit: I tried joblib and the Pool class, but the code won't use the cores as expected (I see no difference between using a normal for loop and these 2 cases).
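For reference, a minimal sketch of what a joblib-based version could look like (assuming joblib's Parallel and delayed, with the same fake-image stand-in as above; not my exact code):

from joblib import Parallel, delayed
import numpy as np

def read_and_process_image(_id):
    # stand-in for reading an image and processing its 3 RGB channels
    return np.random.randint(256, size=(100, 100, 3))

# n_jobs=-1 asks joblib to use all available cores; results come back as a list
results = Parallel(n_jobs=-1)(delayed(read_and_process_image)(_id) for _id in range(100))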
I'd use a Pool to limit the number of processes spawned. I've written a demonstration relying on your code:
import multiprocessing as mp
import os
import numpy as np

# define an example function
def read_and_process_image(_id):
    print("process %d working" % os.getpid())
    return np.random.randint(256, size=(100, 100, 3))

# set up a list of arguments to run the function with
taskargs = [(_id) for _id in range(100)]

# open a pool of processes
pool = mp.Pool(max(1, mp.cpu_count() // 2))

# run the processes
results = pool.map(read_and_process_image, taskargs)

print(results)
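One thing I'd add once the work is done is closing the pool and waiting for the workers, so they get cleaned up promptly:

# prevent any more tasks from being submitted and wait for the workers to exit
pool.close()
pool.join()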
I know the arguments are not used here, but I thought you'd want to see how to pass them in case you need to (also, I've changed id to _id, since id is a builtin).
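To tie it back to your actual task, here is a sketch of how the same Pool could be fed real image paths and the results saved to 1 big file afterwards; the paths list, the image-loading step, and the use of np.save are placeholders, not something from your code:

import multiprocessing as mp
import numpy as np

def read_and_process_image(path):
    # placeholder: load the image at `path` (e.g. with imageio/PIL)
    # and process its 3 RGB channels; a random array stands in here
    return np.random.randint(256, size=(100, 100, 3))

if __name__ == "__main__":
    # hypothetical list of 30,000 image paths
    paths = ["image_%05d.png" % i for i in range(30000)]

    # the context manager closes the pool automatically when the block ends
    with mp.Pool(max(1, mp.cpu_count() // 2)) as pool:
        # chunksize batches the tasks to cut down inter-process overhead
        results = pool.map(read_and_process_image, paths, chunksize=64)

    # stack everything into one array and write it to a single file
    np.save("all_images.npy", np.stack(results))

With chunksize set, each worker pulls batches of paths instead of one task at a time, which matters when you have 30,000 small jobs.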