I have a situation where I need to kill a process after a timeout. I start the process and then
case <-time.After(timeout):
if err := cmd.Process.Kill(); err != nil {
return 0, fmt.Errorf("Failed to kill process: %v", err)
}
kills the process. But it only kills the parent process, not the 5-10 child processes that the main process starts. I also tried creating a process group and then calling
syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
to kill all the processes, but that is not working. I need help from the experts.
cmd := exec.Command(execPath, args...)
cmd.Dir = workingDir
cmd.Env = env
//create a new process group
cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
and then I am trying:
if err := syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL); err != nil {
return 0, fmt.Errorf("Failed to kill process group %d: %v", cmd.Process.Pid, err)
}
@Justin_Israel is right. I put together a little test program and the group was created (and killable as a set) when I set cmd.SysProcAttr.Setpgid=false (the default). Setting that to true creates a unique process group for the pid, which means it’s not killable with its parent.
@dlclark, I assume it worked because the process was created as a process group leader and then it goes and creates a bunch of child processes which are also inheriting that process group. Then killing the pgid of the process that was started would take down all the children that were in that group. But like you said after having performed a test… killing that main program (the launcher of the first process) wouldn’t take down the child because the child started a new group.
@dlclark @Justin_Israel Well, it passed the tests, which basically run a process that creates 1-2 child processes. I still need to test it in production (after code review etc.). But I would like to learn more: isn’t cmd.SysProcAttr.Setpgid=true for setting a group ID so that you can kill the whole group instead of orphaning the child processes? Or is there a lack of understanding on my side?
@Tamas_Gulacsi you suggest using *os.Process to kill the process related to the command, right? If I understand correctly, could my approach still leave some child processes alive?
As I understood, you spawn a process (son), which will spawn several children (grandchildren); you want to kill the son, and want it to take down its children (the grandchildren), too.
For this, you have to start the son via exec.Command with cmd.SysProcAttr.Setpgid = true, as you found.
What I don’t know is how the grandchildren are spawned. If that is controlled by you, then take care not to set Setpgid for them, as they need to be in their parent’s group, not in a new one.
Well, the children are not spawned by me. This is a build server, and it spawns additional processes for tests on the fly, a test postgres db etc. So generally it is one main process and then 5-6 child and grandchild processes.
You can’t kill them if they don’t want you to find them. Systemd uses cgroups for this reason: processes can’t escape the cgroup, even with daemonization (double fork).
So if you want to be sure, use cgroups.
See https://github.com/kawamuray/cgrun for example.
But if I run a process such as cmd := exec.Command("/bin/sh", "-c", "watch date > date.txt")
then instead of killing the process on timeout, it just returns with success and the process is reparented to PPID = 1, which is problematic in my case. I want the child processes to be killed too, not just the parent, and in my case there is a 5-6 level deep process tree.
On Linux, the trick is to kill the whole process group, not just one process in it. This is achieved by sending a KILL signal to the negative PID of the process group’s leader.
So if you start a new process with Setpgid=true, then THAT child process will become its own process group’s leader, and also independent of its parent’s process group! So in this case killing the parent won’t kill this child.
You’re saying this is a 5-6 level deep process tree - if you want to kill all of it, kill the process group’s leader via its negative PID, and ensure that none of the children created a separate process group!