
Why Zombie Processes Occur in Python

2020-01-04 17:02:09
Source: reposted
Contributed by: site user

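On Unix and Unix-like systems, a child process becomes a zombie process as soon as it exits. If the parent process does not read the child's exit status through the wait system call, the child remains in the zombie state.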

The Wikipedia article "Zombie process" describes it as follows:

On Unix and Unix-like computer operating systems, a zombie process or defunct process is a process that has completed execution (via the exit system call) but still has an entry in the process table: it is a process in the "Terminated state". This occurs for child processes, where the entry is still needed to allow the parent process to read its child's exit status: once the exit status is read via the wait system call, the zombie's entry is removed from the process table and it is said to be "reaped". A child process always first becomes a zombie before being removed from the resource table. In most cases, under normal system operation zombies are immediately waited on by their parent and then reaped by the system – processes that stay zombies for a long time are generally an error and cause a resource leak.

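Moreover, a zombie process cannot be removed with the kill command: it has already terminated, so there is nothing left to act on the signal.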

This article looks at how to create a zombie process by hand and how to clean one up.

Creating a zombie process by hand

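To make the cleanup methods easier to explain later, we create the zombie with the multiprocessing module, which is common in everyday development (strictly speaking, we create a child process that stays in the zombie state for a long time):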

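$ cat test_a.py
from multiprocessing import Process, current_process
import logging
import os
import time

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)-15s - %(levelname)s - %(message)s'
)


def run():
    # Child: log its own PID, then exit immediately with status 3.
    logging.info('exit child process %s', current_process().pid)
    os._exit(3)


# The guard keeps the script working when multiprocessing uses the
# "spawn" start method, which re-imports this module in the child.
if __name__ == '__main__':
    p = Process(target=run)
    p.start()
    # Parent: sleep without ever waiting on the child,
    # so the exited child stays a zombie.
    time.sleep(100)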

Test:

$ python test_a.py &
[1] 10091
$ 2017-07-20 21:28:14,792 - INFO - exit child process 10106
$ ps aux |grep 10106
mozillazg       10126  0.0 0.0 2434836  740 s006 R+  0:00.00 grep 10106
mozillazg       10106  0.0 0.0    0   0 s006 Z   0:00.00 (Python)

As you can see, the child process 10106 (whose parent is 10091) has become a zombie: ps reports its state as Z.
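With a zombie in hand, the earlier claim that kill cannot remove it can be checked directly. The following is a minimal sketch, not part of the original article: it uses os.fork instead of multiprocessing to stay short, the helper names pid_exists and kill_does_not_reap are hypothetical, and the short sleeps are a simplifying assumption to give the child time to exit.

```python
import os
import signal
import time


def pid_exists(pid):
    """Return True if the process table still has an entry for pid."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
        return True
    except ProcessLookupError:
        return False


def kill_does_not_reap():
    pid = os.fork()
    if pid == 0:
        os._exit(3)  # child: exit immediately with status 3

    # Parent: assume half a second is enough for the child to exit
    # and turn into a zombie.
    time.sleep(0.5)

    # Sending SIGKILL to the zombie is accepted but changes nothing.
    os.kill(pid, signal.SIGKILL)
    time.sleep(0.1)
    present_after_kill = pid_exists(pid)

    # Only waiting on the child removes its process table entry.
    _, status = os.waitpid(pid, 0)
    present_after_wait = pid_exists(pid)

    return (present_after_kill,
            os.WIFEXITED(status),
            os.WEXITSTATUS(status),
            present_after_wait)


if __name__ == '__main__':
    print(kill_does_not_reap())
```

The status returned by os.waitpid still reports a normal exit with code 3 rather than termination by a signal, which suggests the SIGKILL never took effect; the entry disappears only after the wait.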

Now that we can produce a zombie process on demand, we can move on to cleaning it up.

There are two ways to clean up a zombie process:

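• The first is to terminate the parent process. When the parent exits, the zombie is typically adopted by init (or another reaper process), which waits on it, so it is cleaned up shortly afterwards.
• The second is to read the child's exit status with a wait call. We can handle the SIGCHLD signal and call the wait system call inside the handler to clean up the zombie.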
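When the parent is free to block, the second approach does not need a signal handler: joining the Process object waits on the child and collects its exit status. The sketch below is an illustration under assumptions rather than the article's own code. The helper name start_and_join is hypothetical, and the "fork" start method is requested explicitly so the example stays self-contained on Unix.

```python
import multiprocessing
import os


def run():
    os._exit(3)  # child: exit immediately with status 3


def start_and_join():
    # Assumption: use the "fork" start method (Unix only).
    ctx = multiprocessing.get_context('fork')
    p = ctx.Process(target=run)
    p.start()
    p.join()  # blocks until the child exits, then reaps it

    # If the child has already been reaped, waiting on it again fails
    # with ECHILD, which Python raises as ChildProcessError.
    try:
        os.waitpid(p.pid, os.WNOHANG)
        already_reaped = False
    except ChildProcessError:
        already_reaped = True
    return p.exitcode, already_reaped


if __name__ == '__main__':
    print(start_and_join())
```

Blocking in join is only suitable when the parent has nothing else to do in the meantime; a SIGCHLD handler avoids the blocking at the cost of a little more code.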

Handling the SIGCHLD signal

When a child process exits, the system sends a SIGCHLD signal to the parent. The parent can register a SIGCHLD handler and call the wait system call inside it to clean up the zombie.

$ cat test_b.py
import errno
from multiprocessing import Process, current_process
import logging
import os
import signal
import time

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)-15s - %(levelname)s - %(message)s'
)


def run():
    exitcode = 3
    logging.info('exit child process %s with exitcode %s',
                 current_process().pid, exitcode)
    os._exit(exitcode)


def wait_child(signum, frame):
    logging.info('receive SIGCHLD')
    try:
        while True:
            # -1 means "any child process".
            # os.WNOHANG makes waitpid return immediately instead of blocking
            # when no child has an exit status ready to be collected.
            cpid, status = os.waitpid(-1, os.WNOHANG)
            if cpid == 0:
                logging.info('no child process was immediately available')
                break
            # For a normal exit, the high byte of status holds the exit code.
            exitcode = status >> 8
            logging.info('child process %s exit with exitcode %s', cpid, exitcode)
    except OSError as e:
        if e.errno == errno.ECHILD:
            logging.error('current process has no existing unwaited-for child processes.')
        else:
            raise
    logging.info('handle SIGCHLD end')


# The guard keeps the script working when multiprocessing uses the
# "spawn" start method, which re-imports this module in the child.
if __name__ == '__main__':
    signal.signal(signal.SIGCHLD, wait_child)
    p = Process(target=run)
    p.start()
    while True:
        time.sleep(100)

Result:

$ python test_b.py &
[1] 10159
$ 2017-07-20 21:28:56,085 - INFO - exit child process 10174 with exitcode 3
2017-07-20 21:28:56,088 - INFO - receive SIGCHLD
2017-07-20 21:28:56,089 - INFO - child process 10174 exit with exitcode 3
2017-07-20 21:28:56,090 - ERROR - current process has no existing unwaited-for child processes.
2017-07-20 21:28:56,090 - INFO - handle SIGCHLD end
$ ps aux |grep 10174
mozillazg       10194  0.0 0.0 2432788  556 s006 R+  0:00.00 grep 10174

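As you can see, once the child exited and became a zombie, the system sent SIGCHLD to the parent. The SIGCHLD handler then invoked the wait system call through os.waitpid, so the child did not remain a zombie, and ps no longer lists it. The ERROR line in the log is expected: after the only child had been reaped, the next os.waitpid call in the loop raised ECHILD because no children were left.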
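The handler derives the exit code with status >> 8, which is meaningful only when the child exited normally. As a hedged sketch (the helpers describe_status and run_child are hypothetical names, and os.fork is used instead of multiprocessing for brevity), the os.WIF* helpers from the standard library can tell a normal exit apart from termination by a signal:

```python
import os
import signal


def describe_status(status):
    """Decode a wait status into a (kind, value) pair."""
    if os.WIFEXITED(status):
        return 'exited', os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return 'signaled', os.WTERMSIG(status)
    return 'other', status


def run_child(action):
    """Fork, run action in the child, and return the decoded status."""
    pid = os.fork()
    if pid == 0:
        action()
        os._exit(0)  # reached only if action returns
    _, status = os.waitpid(pid, 0)
    return describe_status(status)


def exit_with_3():
    os._exit(3)


def kill_self():
    os.kill(os.getpid(), signal.SIGKILL)


if __name__ == '__main__':
    print(run_child(exit_with_3))
    print(run_child(kill_self))
```

For a child killed by a signal, status >> 8 would typically evaluate to 0 and look like a successful exit, which is why checking os.WIFSIGNALED first can be the safer choice in a handler that logs exit codes.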

 