Runtime: epollwait on fd 3 failed with 9; A Go bug?

Greetings, friends! I am brand new to the forum and so I’ll be glad to accept pointers on how to get around and be a good participant. I’ve come with an issue and to ask for your advice.

I’m running go version go1.14.1 linux/amd64 on Ubuntu 18.04.5 LTS with stock kernel 5.8.0-41-generic. I build with plain old go build and no special environment. My program dies with a fatal runtime error: epollwait() on fd 3 fails with errno 9 (EBADF). I could be misreading the backtraces, but the failure appears under the “runtime stack” section rather than in any of my program’s goroutines. Normally I don’t even see a runtime stack section in a panic() dump.

runtime: epollwait on fd 3 failed with 9
fatal error: runtime: netpoll failed
                                           
runtime stack:                        
runtime.throw(0xa18a04, 0x17)      
        /usr/local/go/src/runtime/panic.go:1114 +0x72
runtime.netpoll(0x0, 0x0)                     
        /usr/local/go/src/runtime/netpoll_epoll.go:123 +0x363
runtime.findrunnable(0xc000034000, 0x0)    
        /usr/local/go/src/runtime/proc.go:2126 +0xc60
runtime.schedule()                                     
        /usr/local/go/src/runtime/proc.go:2520 +0x2fc
runtime.park_m(0xc000001680)
        /usr/local/go/src/runtime/proc.go:2690 +0x9d
runtime.mcall(0x0)
        /usr/local/go/src/runtime/asm_amd64.s:318 +0x5b
                                                     
goroutine 1 [select]:               
net/http.(*Transport).getConn(0xc0000eb540, 0xc00012abd0, 0x0, 0xc0000d86c0, 0x5, 0xc0000d8700, 0x13, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/net/http/transport.go:1291 +0x57b
net/http.(*Transport).roundTrip(0xc0000eb540, 0xc0000c5300, 0xc0000a6d60, 0xc00011b1f0, 0x40e488)
        /usr/local/go/src/net/http/transport.go:552 +0x726                         
net/http.(*Transport).RoundTrip(0xc0000eb540, 0xc0000c5300, 0xc0000eb540, 0x0, 0x0)
        /usr/local/go/src/net/http/roundtrip.go:17 +0x35                            
net/http.send(0xc0000c5300, 0xad9e40, 0xc0000eb540, 0x0, 0x0, 0x0, 0xc0000b4140, 0xc, 0x1, 0x0)
        /usr/local/go/src/net/http/client.go:252 +0x43e
net/http.(*Client).send(0xc00012a900, 0xc0000c5300, 0x0, 0x0, 0x0, 0xc0000b4140, 0x0, 0x1, 0xd)
        /usr/local/go/src/net/http/client.go:176 +0xfa                                      
[... down through Hashicorp Vault client and application code ...]
main.main()
        /home/jmarks1/projects/20200124-loadvac/src/load_vac/load_vac.go:31 +0x53

goroutine 19 [chan receive]:
net/http.(*persistConn).addTLS(0xc0000c7440, 0xc0000d8700, 0xf, 0x0, 0xc0000d8710, 0x3)
        /usr/local/go/src/net/http/transport.go:1459 +0x1d3
net/http.(*Transport).dialConn(0xc0000eb540, 0xae4d40, 0xc0000b6de0, 0x0, 0xc0000d86c0, 0x5, 0xc0000d8700, 0x13, 0x0, 0xc0000c7440, ...)
        /usr/local/go/src/net/http/transport.go:1529 +0x1c5d
net/http.(*Transport).dialConnFor(0xc0000eb540, 0xc0000e0580)
        /usr/local/go/src/net/http/transport.go:1365 +0xc6
created by net/http.(*Transport).queueForDial
        /usr/local/go/src/net/http/transport.go:1334 +0x3fe

goroutine 5 [runnable]:
encoding/base64.(*Encoding).Decode(0xc0000ba000, 0xc000397200, 0x5d6, 0x5d6, 0xc00046601c, 0x7ca, 0x9e5, 0x0, 0x200, 0x0)
        /usr/local/go/src/encoding/base64/base64.go:471 +0x744
encoding/pem.Decode(0xc000466000, 0x801, 0xa01, 0x801, 0xa01, 0x0, 0x0)
        /usr/local/go/src/encoding/pem/pem.go:168 +0x766
crypto/x509.(*CertPool).AppendCertsFromPEM(0xc00007e780, 0xc000466000, 0x801, 0xa01, 0xa01)
        /usr/local/go/src/crypto/x509/cert_pool.go:131 +0x64
crypto/x509.loadSystemRoots(0x0, 0x7f1f0936ca88, 0xc000117628)
        /usr/local/go/src/crypto/x509/root_unix.go:75 +0x504
[... through X509 and associated code ...]
crypto/tls.(*Conn).clientHandshake(0xc000050e00, 0x0, 0x0)
        /usr/local/go/src/crypto/tls/handshake_client.go:206 +0x5ef
crypto/tls.(*Conn).Handshake(0xc000050e00, 0x0, 0x0)
        /usr/local/go/src/crypto/tls/conn.go:1340 +0xcc
net/http.(*persistConn).addTLS.func2(0x0, 0xc000050e00, 0xc0000200f0, 0xc0000740c0)
        /usr/local/go/src/net/http/transport.go:1453 +0x42
created by net/http.(*persistConn).addTLS
        /usr/local/go/src/net/http/transport.go:1449 +0x1aa

This is a Hashicorp Vault client program, and the failure only seems to happen when sending specific data to Vault, and only against one Vault server in particular. I speculate that this server version does not like the input and that perhaps the server thread or process is crashing and dropping the connection mid-flight. (I don’t know for sure, and I don’t have direct access to the server.)

The interesting thing is that I cannot recover() from this particular failure. The deferred function in my simple main function below is not invoked when it occurs. If I uncomment the annotated panic line, so that the program panics before it ever reaches the epollwait() failure, that panic is recovered as expected.

func main() {
        defer func() {
                fmt.Fprintln(os.Stderr, "DEFERRED FUNC RUNNING")

                if rval := recover(); rval != nil {
                        errQuit(errors.Errorf("%s", rval), "Failed")
                }
        }()

        // If I uncomment the following line the deferred function does run
        // panic("fooooo")

        if err := cmd.RootCmd.Execute(); err != nil {
                errQuit(err, "Failed")
        }
}

Does this “magic unrecoverable panic” look like a Golang bug, or just a kind of program bug? Why is the panic unrecoverable? Is there something I’m just doing plain wrong that you can see from these snippets?

Thanks very much for your help!
