Killing child process on timeout in Go code

I have a situation where I need to kill a process after some time. I start the process and then

case <-time.After(timeout):
		if err := cmd.Process.Kill(); err != nil {
			return 0, fmt.Errorf("Failed to kill process: %v", err)
		}

kills the process. But it only kills the parent process, not the 5-10 child processes that the main process starts. I also tried creating a process group and then doing

syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)

to kill all the processes, but that is not working either. I would appreciate help from the experts.

Can you not trap the kill signal in the parent process, and have it reap its children, before terminating itself?

What was your code to create the process group? I’ve had success with that method in the past.

@dlclark:

cmd := exec.Command(execPath, args...)
cmd.Dir = workingDir
cmd.Env = env

//create a new process group
cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}

and then I am trying:

	if err := syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL); err != nil {
		return 0, fmt.Errorf("Failed to kill process %d: %v", cmd.Process.Pid, err)
	}

Ah, you want to pass the negative of the process group ID to syscall.Kill(), not the process ID:

pgid, err := syscall.Getpgid(cmd.Process.Pid)
if err == nil {
    if err := syscall.Kill(-pgid, syscall.SIGKILL); err != nil {
        ...
    }
}
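For readers who want the pieces in one place, here is a minimal sketch that combines the timeout select from the question with the Getpgid lookup above. The helper name runWithTimeout and its return values are made up for illustration, and it assumes a Unix-like system where syscall.Kill and Setpgid are available.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// runWithTimeout is a hypothetical helper: it starts name in a new process
// group and kills the whole group if the command has not exited after
// timeout. The bool result reports whether the timeout fired.
func runWithTimeout(timeout time.Duration, name string, args ...string) (bool, error) {
	cmd := exec.Command(name, args...)
	// Put the child into a new process group that it leads. Processes it
	// spawns stay in that group unless they deliberately leave it.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}

	if err := cmd.Start(); err != nil {
		return false, fmt.Errorf("failed to start process: %v", err)
	}

	// Wait in a goroutine so the select can race it against the timer.
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	select {
	case err := <-done:
		return false, err
	case <-time.After(timeout):
		pgid, err := syscall.Getpgid(cmd.Process.Pid)
		if err != nil {
			return true, fmt.Errorf("failed to look up process group: %v", err)
		}
		// A negative target makes kill(2) signal every process in the group.
		if err := syscall.Kill(-pgid, syscall.SIGKILL); err != nil {
			return true, fmt.Errorf("failed to kill process group %d: %v", pgid, err)
		}
		<-done // reap the direct child so it does not linger as a zombie
		return true, nil
	}
}
```

Calling runWithTimeout(2*time.Second, "/bin/sh", "-c", "sleep 30 & sleep 30") should report a timeout and take down both sleep processes along with the shell. There is a small window where the command exits just as the timer fires, in which case the lookup or the kill can fail with "no such process"; the sketch simply returns that error.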

If the kill is being done from the same parent process that created the
group, wouldn’t the pid and pgid be the same?

@dlclark: This one worked, I learned something today :smile:

@Justin_Israel is right. I put together a little test program and the group was created (and killable as a set) when I set cmd.SysProcAttr.Setpgid=false (the default). Setting that to true creates a unique process group for the pid, which means it’s not killable with its parent.

@VarunKSaini well that’s kind of surprising to me, but glad it worked. :smile:

@dlclark, I assume it worked because the process was created as a process group leader and then it goes and creates a bunch of child processes which are also inheriting that process group. Then killing the pgid of the process that was started would take down all the children that were in that group. But like you said after having performed a test… killing that main program (the launcher of the first process) wouldn’t take down the child because the child started a new group.
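The pid/pgid question can be checked directly. The sketch below is only an illustration (groupInfo is a made-up helper, and it assumes a Unix-like system with a sleep binary on the PATH): it starts a short-lived child with or without Setpgid and reports the child's process group, the child's PID, and the caller's own process group.

```go
package main

import (
	"os/exec"
	"syscall"
)

// groupInfo is a hypothetical helper: it starts "sleep 5" with the given
// Setpgid value and returns the child's process group ID, the child's PID,
// and the process group ID of the calling program.
func groupInfo(setpgid bool) (childPgid, childPid, parentPgid int, err error) {
	cmd := exec.Command("sleep", "5")
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: setpgid}
	if err = cmd.Start(); err != nil {
		return
	}
	// Clean up the helper process once the IDs have been read.
	defer func() {
		_ = cmd.Process.Kill()
		_ = cmd.Wait()
	}()

	childPid = cmd.Process.Pid
	childPgid, err = syscall.Getpgid(childPid)
	if err != nil {
		return
	}
	parentPgid = syscall.Getpgrp()
	return
}
```

With Setpgid set to true the child's group ID should equal its own PID and differ from the caller's group; with false the child should simply share the caller's group. That matches the explanation above: the started process becomes a group leader, and whatever it spawns inherits that group.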

@dlclark @Justin_Israel Well, it passed the tests, which basically run a process that creates 1-2 child processes. I still need to test it in production (after code review etc.). But I would like to learn more: isn’t cmd.SysProcAttr.Setpgid=true for setting a group ID so that you can kill the whole group instead of orphaning the child processes? Or is there a lack of understanding on my side?

See https://github.com/tgulacsi/go/tree/master/proc

You should start a new process group for your subprocess, which will start children. But you have to clear Setpgid for those children!

@Tamas_Gulacsi you suggest using *os.Process to kill the process related to the command, right? If I understand correctly, my approach could still leave some child processes alive?

As I understood, you spawn a process (son), which will spawn several children (grandchildren); you want to kill the son, and want it to take down its children (the grandchildren), too.

For this, you have to call exec.Command with Cmd.SysProcAttr.Setpgid = true, as you found.

What I don’t know is how the grandchildren are spawned. If that is controlled by you, then be careful not to set Setpgid for them, as they need to be in their parent’s group, not in a new one.

Well, the children are not spawned by me. This is a build server, and it spawns additional processes for tests on the fly (a test postgres db etc.). So generally it is one main process and then 5-6 child and grandchild processes.

You can’t kill them if they don’t want you to find them. Systemd’s usage of
cgroups is for this reason: they can’t escape the cgroup, even with
daemonization (double fork).

So if you want to be sure, use cgroups.
See https://github.com/kawamuray/cgrun for example.

Varun Saini forum@golangbridge.org wrote (on Sun, Oct 4, 2015, 22:51):

Thanks for the pointer. But I will see if it works with the group ID; if not, then I have to implement the complex one anyway :smile:

Hey @Tamas_Gulacsi,

Question: I tried to do the same as you are doing in your https://github.com/tgulacsi/go/proc/procutils.go

But if I run a process such as
cmd := exec.Command("/bin/sh", "-c", "watch date > date.txt")

instead of killing the process on timeout, it just returns with success and the process moves to PPID = 1, which is problematic in my case. I want the child processes to be killed too, not only the parent, and in my case there is a 5-6 level deep process tree.

Any help is appreciated.

Can you share some code?

On Linux, the trick is to kill the whole process group, not just a process of it. This is achieved by sending a KILL signal to the negative PID of the process group’s leader.

So if you start a new process with Setpgid=true, then THAT child process will become its own process group’s leader, and also independent of its parent’s process group! So in this case killing the parent won’t kill this child.

You’re saying this is a 5-6 level deep process tree - if you want to kill all of it, signal the negative PID of the process group’s leader, and ensure that none of the children creates a separate process group!
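As a rough illustration of the deep-tree case, the sketch below runs a shell script whose processes all stay in the group of the shell started with Setpgid=true, then kills that group. The helper name killTree and the script in the usage note are invented for the example, and it assumes a Unix-like system with /bin/sh. Because stdout is captured through a pipe, cmd.Wait only returns once every descendant holding that pipe has gone away, which gives a simple check that the whole tree was taken down.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// killTree is a hypothetical helper: it runs script under /bin/sh in a new
// process group, lets it run for d, then kills the whole group and returns
// whatever the tree wrote to stdout before it died.
func killTree(script string, d time.Duration) (string, error) {
	var out bytes.Buffer
	cmd := exec.Command("/bin/sh", "-c", script)
	cmd.Stdout = &out // every descendant inherits the write end of this pipe
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return "", fmt.Errorf("failed to start: %v", err)
	}

	time.Sleep(d)

	// The shell is the group leader, so its PID is also the group ID.
	if err := syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL); err != nil {
		return "", fmt.Errorf("failed to kill process group: %v", err)
	}

	// Wait returns only after all writers have closed the pipe, i.e. after
	// every descendant that kept stdout open is gone. The "signal: killed"
	// error is expected here, so it is ignored.
	_ = cmd.Wait()
	return out.String(), nil
}
```

For example, killTree("echo started; (sleep 30 & sleep 30) & sleep 30", 500*time.Millisecond) should return shortly after the half second has passed instead of hanging for 30 seconds. If one of the descendants had moved itself into a different process group, it would survive the kill and Wait would block until that process exited or closed its stdout.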
