[os/exec] How to make sure that Wait() always completes?

jlory · January 9, 2019, 4:59pm

Hello, I have a weird issue: I spawn a process, and the cmd.Wait() method never returns even though the spawned Linux process is dead. I saw this in the docs:

If any of c.Stdin, c.Stdout or c.Stderr are not an *os.File, Wait also waits for the respective I/O loop copying to or from the process to complete.

Wait waits for the command to exit and waits for any copying to stdin or copying from stdout or stderr to complete.

In my code I have this:

cmd.Stdout = os.Stdout
cmd.Stderr = os.Stdout

How can I make sure that the goroutines that do the I/O copy always return when the process dies, even if the copy has not completed? To add a bit more context, I think that the subprocess calls fork() when it crashes, so that might cause an issue with stderr / stdout.

Thanks,

johandalabacka · January 10, 2019, 7:07am

Hi. Do you have any example code?

jlory · January 10, 2019, 7:05pm

It looks something like this:

// Start the server
err = srv.Start()
if err != nil {
	return
}

// Setup traps for Signals
// TODO: maybe we need a larger buffer for the channel?
signalCh := make(chan os.Signal, 10)
signal.Notify(signalCh, syscall.SIGINT, syscall.SIGHUP, syscall.SIGTERM, syscall.SIGUSR1, syscall.SIGCHLD)

go func() {
	// Named sig so the loop variable does not shadow the signal package.
	for sig := range signalCh {
		switch sig {
		case syscall.SIGINT:
			srv.sendSignal(syscall.SIGINT)
		case syscall.SIGHUP:
			srv.sendSignal(syscall.SIGHUP)
		case syscall.SIGTERM:
			srv.sendSignal(syscall.SIGTERM)
		case syscall.SIGUSR1:
			srv.sendSignal(syscall.SIGUSR1)
		default:
			// SIGCHLD is registered above and also ends up here.
			logger.Warn(fmt.Sprintf("received an unhandled signal: %v", sig))
		}
	}
}()

// Wait for the server to crash / exit gracefully
err = srv.cmd.Wait()
if err != nil {
	// do A
} else {
	// do B
}

srv.cmd.Wait() never returns even though the spawned process crashed (I confirmed this on Linux using ps). After reading the Wait() doc again: since os.Stdout is an *os.File, it shouldn’t block on the I/O copy, so it’s probably a different issue.

johandalabacka · January 10, 2019, 7:42pm

What happens if you run the command with CombinedOutput()? Does it return if the program crashes?
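
For reference, a minimal sketch of what such a test could look like, assuming a Unix system with `sh` on the PATH. `runCombined` is a hypothetical helper, and the shell snippet stands in for the real server binary. Note that CombinedOutput captures into an in-memory buffer rather than an `*os.File`, so the I/O-copy wait described in the Wait documentation applies to it.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// runCombined (hypothetical helper) runs a shell snippet and returns its
// merged stdout/stderr plus its exit code. The code is -1 if the command
// could not be run; ExitCode also reports -1 for a process killed by a signal.
func runCombined(script string) (string, int, error) {
	out, err := exec.Command("sh", "-c", script).CombinedOutput()

	var exitErr *exec.ExitError
	switch {
	case err == nil:
		return string(out), 0, nil
	case errors.As(err, &exitErr):
		// The process ran and exited unsuccessfully.
		return string(out), exitErr.ExitCode(), err
	default:
		// The process could not be started, or an I/O error occurred.
		return string(out), -1, err
	}
}

func main() {
	out, code, err := runCombined("echo out; echo err 1>&2; exit 3")
	fmt.Printf("output=%q code=%d err=%v\n", out, code, err)
}
```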

jlory · January 10, 2019, 8:16pm

It’s actually very hard to reproduce. I’ve seen it three times, but I can’t trigger it on my side; it works 99% of the time.

johandalabacka · January 10, 2019, 9:03pm

Hard to test any solutions then. One way to do it on Unix systems is to get the pid from the command’s Process field:

pid := srv.cmd.Process.Pid

and then check whether the process exists by sending signal 0 to it. No signal is actually delivered, but an error is returned if the pid doesn’t exist:

killErr := syscall.Kill(pid, syscall.Signal(0))
procExists := killErr == nil
