PG: How to restart bgworker
Code path in the postmaster:
->ServerLoop
->WaitEventSetWait
->process_pm_child_exit
->waitpid(-1, )
->CleanupBackgroundWorker()
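Details of CleanupBackgroundWorker:
- If the bgworker exits with status 0, it is treated as a normal exit and takes no part in the restart logic.
- If it exits with status 1, it goes through the normal restart logic that follows.
- Otherwise, it is treated as a system-level crash and the whole instance is restarted.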
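The three-way decision above can be sketched in plain POSIX C. This is a minimal illustration, not PostgreSQL's code: the enum, `classify_worker_exit`, and `wait_status_for_exit_code` are hypothetical names, and the sketch uses the standard `WIFEXITED`/`WEXITSTATUS` macros rather than whatever helpers the postmaster uses internally.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names for illustration; not PostgreSQL identifiers. */
typedef enum
{
    WORKER_EXIT_CLEAN,      /* exit code 0: worker is done, not restarted */
    WORKER_EXIT_RESTART,    /* exit code 1: eligible for a normal restart */
    WORKER_EXIT_CRASH       /* anything else: treat as an instance-wide crash */
} WorkerExitKind;

/* Classify a raw wait status as returned by waitpid(). */
WorkerExitKind
classify_worker_exit(int exitstatus)
{
    if (WIFEXITED(exitstatus) && WEXITSTATUS(exitstatus) == 0)
        return WORKER_EXIT_CLEAN;
    if (WIFEXITED(exitstatus) && WEXITSTATUS(exitstatus) == 1)
        return WORKER_EXIT_RESTART;
    /* other exit codes, or death by signal, land here */
    return WORKER_EXIT_CRASH;
}

/*
 * Helper for trying the classifier: fork a child that exits with `code`
 * and return the wait status the parent sees (-1 on failure).
 */
int
wait_status_for_exit_code(int code)
{
    int   status = 0;
    pid_t pid = fork();

    if (pid == 0)
        _exit(code);            /* child: terminate immediately */
    if (pid < 0 || waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

Calling `classify_worker_exit(wait_status_for_exit_code(1))` should yield `WORKER_EXIT_RESTART`, while an exit code of 2 or a fatal signal falls through to the crash branch.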
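Key variables:
- RegisteredBgWorker.rw_crashed_at: a non-zero value means the worker has crashed.
- HaveCrashedWorker: whether the postmaster detected a bgworker crash during this iteration of its loop.
- StartWorkerNeeded: whether a bgworker needs to be restarted right now. A bgworker can specify a restart interval, so the postmaster does not necessarily restart every crashed bgworker on each iteration.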
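To make the restart-interval point concrete, here is a small sketch of how one pass over the workers might decide what is due now and what has to wait. It is an assumption-laden model, not the postmaster's implementation: `WorkerSlot`, `ScanResult`, and `scan_crashed_workers` are hypothetical, timestamps are plain microsecond counters, and the two result flags only loosely echo the roles of the variables described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one registered worker. */
typedef struct
{
    int64_t crashed_at_us;      /* 0 = not crashed, like rw_crashed_at */
    int     restart_interval_s; /* assumed per-worker restart interval */
} WorkerSlot;

typedef struct
{
    bool start_now;             /* some crashed worker is due for restart */
    bool still_waiting;         /* some crashed worker must wait longer */
} ScanResult;

/* One pass over all workers at time now_us (microseconds). */
ScanResult
scan_crashed_workers(const WorkerSlot *slots, size_t n, int64_t now_us)
{
    ScanResult r = {false, false};

    for (size_t i = 0; i < n; i++)
    {
        int64_t due_at;

        if (slots[i].crashed_at_us == 0)
            continue;           /* not crashed: nothing to restart */

        due_at = slots[i].crashed_at_us
            + (int64_t) slots[i].restart_interval_s * 1000000;
        if (now_us >= due_at)
            r.start_now = true;     /* interval elapsed: restart it */
        else
            r.still_waiting = true; /* too early: check again later */
    }
    return r;
}
```

With two workers that crashed at the same moment but have intervals of 5 and 60 seconds, a pass 10 seconds later reports both `start_now` and `still_waiting`, which is why a single loop iteration does not restart everything.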
appendix
waitpid
(AI-generated; I have not verified this myself.)
while ((pid = waitpid(-1, &exitstatus, WNOHANG)) > 0)
If multiple child processes have already terminated before this loop runs:
- The loop will collect them one by one, in whatever order the operating system reports them (POSIX leaves this order unspecified)
- Each iteration of the loop will retrieve one zombie process
- The loop will continue until all terminated child processes have been collected
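The behaviour described above can be tried in isolation. The following sketch is not taken from PostgreSQL; `spawn_exited_children` and `reap_exited_children` are hypothetical helpers. The first forks a few short-lived children and uses `waitid()` with `WNOWAIT` to block until each has terminated without collecting it, so the children are guaranteed to be waiting as zombies when the `WNOHANG` loop runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork n children that exit immediately, then block until each one has
 * terminated. WNOWAIT leaves the status uncollected, so every child is
 * still a zombie afterwards. Returns the number of such children.
 */
int
spawn_exited_children(int n)
{
    int started = 0;

    for (int i = 0; i < n; i++)
    {
        pid_t pid = fork();

        if (pid == 0)
            _exit(i % 2);       /* child: exit right away with code 0 or 1 */
        if (pid > 0)
        {
            siginfo_t info;

            /* wait for termination without reaping the child */
            if (waitid(P_PID, (id_t) pid, &info, WEXITED | WNOWAIT) == 0)
                started++;
        }
    }
    return started;
}

/* Reap every child that has already exited; never blocks. */
int
reap_exited_children(void)
{
    int   reaped = 0;
    int   exitstatus;
    pid_t pid;

    while ((pid = waitpid(-1, &exitstatus, WNOHANG)) > 0)
        reaped++;               /* a real caller would inspect exitstatus here */

    /* pid == 0: children exist but none has exited yet;
     * pid == -1: error, typically ECHILD (no children left) */
    return reaped;
}
```

After `spawn_exited_children(3)`, one call to `reap_exited_children()` should collect all three children, and an immediate second call should collect none.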
When a child process terminates in a Unix/Linux system, it doesn’t immediately disappear from the system. Instead, it enters a “zombie” state (sometimes called a “defunct” state). In this state:
- The process has finished execution
- Most resources have been freed
- But an entry in the process table is kept to allow the parent to retrieve the child’s exit status
This zombie state persists until the parent process “reaps” the child by calling wait() or waitpid() to collect its exit status.