linux - Easy way to tell apart python multiprocessing's OS processes -
Summary
I'd like to use Python's multiprocessing module to run multiple jobs in parallel on a Linux server. Further, I'd like to be able to look at the running processes with top or ps, and to kill one of them while letting the others run.
However, I'm seeing that every process launched with Python's multiprocessing module looks identical in the output of the ps -f command.
All I'm seeing is this:
fermion:workspace ross$ ps -f
  UID   PID  PPID   C STIME    TTY      TIME     CMD
  501 32257 32256   0  8:52PM  ttys000  0:00.04  -bash
  501 32333 32257   0  9:05PM  ttys000  0:00.04  python ./parallel_jobs.py
  501 32334 32333   0  9:05PM  ttys000  0:00.00  python ./parallel_jobs.py
  501 32335 32333   0  9:05PM  ttys000  0:00.00  python ./parallel_jobs.py
  501 32336 32333   0  9:05PM  ttys000  0:00.00  python ./parallel_jobs.py
  501 32272 32271   0  8:53PM  ttys001  0:00.05  -bash
Is there a way to make the CMD column more descriptive? Do I need to keep track of the PIDs in log files? Or is there another option?
Background
I am doing batch processing of jobs that can run for hours, and I need to be able to run some of those jobs in parallel to save time. All the parallel jobs need to complete before I can run another job that depends on them all. However, if one job misbehaves, I want to be able to kill it while letting the others complete... and so it goes: one job, then some parallel jobs, a few more jobs in sequence, then more parallel jobs...
Example code
This dummy code outlines the concept of what I'm trying to do.
#!/usr/bin/env python
import sys
import time
import multiprocessing


def open_zoo_cages():
    print('opening zoo cages...')


def crossing_road(animal, sleep_time):
    print('an ' + animal + ' is crossing the road')
    for _ in range(5):
        print("it's a wide road for the " + animal + " to cross...")
        time.sleep(sleep_time)
    print('the ' + animal + ' is across.')


def aardvark():
    crossing_road('aardvark', 2)


def badger():
    crossing_road('badger', 4)


def cougar():
    crossing_road('cougar', 3)


def clean_the_road():
    print('cleaning the road of animal droppings...')


def print_exit_code(process):
    print(process.name + " exit code: " + str(process.exitcode))


def main():
    # Run a single job that must finish before the parallel jobs start
    open_zoo_cages()

    # Run jobs in parallel
    amos = multiprocessing.Process(name='aardvark amos', target=aardvark)
    betty = multiprocessing.Process(name='badger betty', target=badger)
    carl = multiprocessing.Process(name='cougar carl', target=cougar)

    amos.start()
    betty.start()
    carl.start()

    amos.join()
    betty.join()
    carl.join()

    print_exit_code(amos)
    print_exit_code(betty)
    print_exit_code(carl)

    # Run the next job (clean_the_road) only if all the parallel jobs
    # finished successfully; otherwise end in error.
    if amos.exitcode == 0 and betty.exitcode == 0 and carl.exitcode == 0:
        clean_the_road()
    else:
        sys.exit('not all the animals finished crossing')


if __name__ == '__main__':
    main()
Also, I noted that putting one of the functions in a separate Python module doesn't change what goes into the ps command column for the associated process.
Output
fermion:workspace ross$ ./parallel_jobs.py
opening zoo cages...
an aardvark is crossing the road
it's a wide road for the aardvark to cross...
an badger is crossing the road
it's a wide road for the badger to cross...
an cougar is crossing the road
it's a wide road for the cougar to cross...
it's a wide road for the aardvark to cross...
it's a wide road for the cougar to cross...
it's a wide road for the aardvark to cross...
it's a wide road for the badger to cross...
it's a wide road for the cougar to cross...
it's a wide road for the aardvark to cross...
it's a wide road for the badger to cross...
it's a wide road for the aardvark to cross...
it's a wide road for the cougar to cross...
the aardvark is across.
it's a wide road for the badger to cross...
it's a wide road for the cougar to cross...
the cougar is across.
it's a wide road for the badger to cross...
the badger is across.
aardvark amos exit code: 0
badger betty exit code: 0
cougar carl exit code: 0
cleaning the road of animal droppings...
The nice easy answer: have each process open a descriptive file handle, and then use lsof.
f = open('/tmp/hippo.txt','w')
This will give you the PID of the process:
lsof | grep "hippo"
It's not a pythonic answer, but hey :)
My initial answer was the easy way. Here is an incomplete tiny example of a larger concept: adding a signal handler to the class being run in the subprocess. That lets you send the process a signal (the sketch below uses SIGTERM, i.e. kill -TERM <pid>) to dump out info on demand... you can use it to dump out the progress of how much is left to process in a given subprocess.
import logging
import os
import queue
import signal


class Foo():
    def __init__(self, name):
        self.myname = name
        # Dump progress info on demand, e.g. via `kill -TERM <pid>`
        signal.signal(signal.SIGTERM, self.my_callback)
        self.myqueue = queue.Queue()

    def my_callback(self, signum, frame):
        # Signal handlers receive (signum, frame) in addition to self
        logging.error("%s %s %s", self.myname, os.getpid(), self.myqueue.qsize())
Or you can do something like this, which I think may be what you want:
import multiprocessing
import time


def foo():
    time.sleep(60)


if __name__ == "__main__":
    processes = [
        multiprocessing.Process(name="a", target=foo),
        multiprocessing.Process(name="b", target=foo),
        multiprocessing.Process(name="c", target=foo),
    ]
    for p in processes:
        p.start()
    for p in processes:
        print(p.name, p.pid)
    for p in processes:
        p.join()