Goroutines and I/O

Recently, during an interview, I was asked: “How many goroutines would you create to send 1M HTTP requests on a machine with 16 CPUs?” After some time thinking, I realized I don’t know the answer, because it mostly depends on many different factors:

  • latency of the requests (how many goroutines will be parked, waiting)
  • size of the payload and response (CPU load needed for marshaling/unmarshaling)
  • etc.

So, the answer off the top of my head is “it depends” and “it needs to be tested”.

I’d appreciate hearing your reasoning about this question.

1 Like

Is it a trick question? It’s my understanding that a new goroutine is created for each request (assuming we’re using the standard library’s HTTP server).

EDIT: Oops. Sending requests; not receiving :upside_down_face:

The question was to “send requests”, not to “receive requests”.

Anyway, for “how many requests would you do in parallel” questions, I usually start off with the following assumptions:

  1. There is infinite bandwidth available (network and disk!)
  2. There is zero latency
  3. There is no connection limit on the remote
  4. We have infinite file descriptors
  5. We fully own the CPU

Based on these assumptions, I usually start with 2 to 3 times numprocs and tweak from there based on experiments and observations.
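
For concreteness, here is a minimal sketch of what that starting point could look like: a fixed pool of 2× NumCPU worker goroutines draining a channel of URLs. The target URL is made up and error handling is kept to a minimum:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"runtime"
	"sync"
)

func main() {
	workers := 2 * runtime.NumCPU() // the "2x numprocs" starting point
	urls := make(chan string)       // task queue
	client := &http.Client{}        // share one client so connections are reused

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for url := range urls {
				resp, err := client.Get(url)
				if err != nil {
					fmt.Println("request failed:", err)
					continue
				}
				io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
				resp.Body.Close()
			}
		}()
	}

	for i := 0; i < 1_000_000; i++ {
		urls <- fmt.Sprintf("https://example.com/item/%d", i) // hypothetical target
	}
	close(urls)
	wg.Wait()
}
```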

I go through similar considerations for any other “how large should the pool be” kind of question.

Explaining the chain of thought, the foundational assumptions, and the reasoning behind the first number is usually much more important to the interviewer than the concrete answer you give.

If I were the interviewer asking the question, and you just answered “32 to 48, then tweak”, that would be a negative. I’d perhaps try to push you to give me some reasoning, but if I have to literally ask “why do you think that’s appropriate?”, that’d be a no-go.

And if the interviewer is really just interested in you saying the number as it is written on their questionnaire, then the job wasn’t worth your time interviewing for anyway…

3 Likes

It was a 15-minute, 7-question questionnaire provided by a recruiter, just a preliminary check by the tech lead before moving on to the actual tech interview :slight_smile: That’s why I was confused and started digging, trying to figure out what I was missing.
Thank you for the thorough explanation.

Hi @Mikhail_Bolshakov, IMHO the answer to this question is: we don’t have enough information. The goroutine scheduler is not deterministic, and we aren’t talking about a fixed algorithm where we could more or less calculate the correct number (divide and conquer, for instance). Making so many assumptions is far from reality. Theory works for fixed algorithms, where you can impose abstract concepts. Here you would at least need to know the time spent scheduling on that CPU, preparing and sending each request, and receiving and discarding the result… It’s totally useless, and no real engineer would ask something like this as a serious question.

3 Likes

Yeah, for a pre-screening this question is definitely underspecified and far too open-ended.

2 Likes

To send 1M concurrent HTTP requests you will need exactly 1M goroutines.
Whether the necessary resources, such as bandwidth, CPU, and memory, are available is another discussion. The bandwidth depends on how much information you send, the memory mostly on how much information you receive; the number of processors does not matter, as they are reused (goroutines are not threads!). The number of file descriptors in the operating system must be increased to support 1M requests, and so on. But the number of goroutines is always equal to the number of concurrent requests.
If we are not talking about concurrent requests, but about 1M requests within a time interval, then some calculations need to be done in line with what I said above.
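
As a minimal sketch of that one-goroutine-per-request shape (the URLs are hypothetical, and error handling is elided; resource limits are, as noted, a separate concern):

```go
package main

import (
	"io"
	"net/http"
	"sync"
)

// sendAll fires one goroutine per request, so the goroutine count equals
// the number of concurrent requests by construction.
func sendAll(urls []string) {
	var wg sync.WaitGroup
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			resp, err := http.Get(u)
			if err != nil {
				return // a real program would record the error
			}
			io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
			resp.Body.Close()
		}(u)
	}
	wg.Wait()
}

func main() {
	sendAll([]string{"https://example.com/a", "https://example.com/b"}) // hypothetical targets
}
```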

1 Like

@geosoft1 I don’t think so. A goroutine isn’t just an async call; it isn’t just an abstraction like a promise in JS. Goroutines aren’t free: they have physical costs, such as stack size. So you have to pay attention to the environment (CPU, memory, etc.) your code is running on when dealing with goroutines. While your assumption may be reasonable in theory, it is impractical in my view.

16 goroutines are enough. Add the requests to a task queue.

Some parameters which might influence your decision:

  1. What is a “Request”?

    1. A single small UDP packet, fire-and-forget, will mostly be bound by the throughput of your network interface.
    2. If it is an HTTPS request over TCP, your goroutine will have to wait for the handshake and the response; depending on the target servers that may take a long time, and you will be limited by the number of open connections/outbound ports.
    3. If the request includes data (e.g. a file from your drive) you will have additional I/O boundaries (number of open files, random disk access speed).
    4. If all requests go to the same server, they may reuse a single HTTP/2 connection for all traffic (see the client sketch after this list).
  2. What is your environment?

    1. Are you CPU, memory or network I/O bound?
    2. Are you in a pay-by-use cloud environment? What is the most economical utilization? Are spikes in CPU/traffic very expensive?
  3. What is your goal?

    1. Is this a one-off script which just should run as fast as possible?
    2. Is this a service which should be easy to maintain, debug and start/stop and run regularly?
    3. Are errors/reconnects a big factor?

Depending on all of these, the optimal number of routines might range from a few to literally 1M.
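
On the connection-reuse point in particular (item 1.4 above): with the standard library, sharing one http.Client across all goroutines lets the transport pool and reuse connections, and HTTP/2 multiplexes many requests over one connection when the server supports it. A hedged sketch, with purely illustrative numbers:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newClient returns a client tuned for many concurrent requests to a small
// set of hosts. The numbers are illustrative starting points, not advice.
func newClient() *http.Client {
	t := http.DefaultTransport.(*http.Transport).Clone()
	t.MaxIdleConns = 256        // total idle connections kept around for reuse
	t.MaxIdleConnsPerHost = 256 // the default of 2 is far too low for this workload
	t.MaxConnsPerHost = 256     // hard cap on connections per host
	return &http.Client{Transport: t, Timeout: 30 * time.Second}
}

func main() {
	resp, err := newClient().Get("https://example.com/") // hypothetical target
	if err == nil {
		fmt.Println(resp.Status)
		resp.Body.Close()
	}
}
```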

Some limiting factors:

A single goroutine currently uses a minimum stack size of 2 KB. Your actual code will likely also allocate some additional memory on the heap for each goroutine (e.g. for JSON serialization or similar). This means 1M goroutines could easily require 2-4 GB of RAM (which should be OK for an average environment).
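
If you want to sanity-check the per-goroutine cost on your own machine, a rough experiment along these lines works (results vary by Go version and OS; this is not a benchmark):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Park n goroutines and compare how much memory the runtime has obtained
// from the OS before and after.
func main() {
	const n = 1_000_000
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	var started sync.WaitGroup
	block := make(chan struct{})
	started.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			started.Done()
			<-block // park, like a goroutine waiting on network I/O
		}()
	}
	started.Wait() // every goroutine is now running

	runtime.ReadMemStats(&after)
	fmt.Printf("~%d KB per goroutine\n", (after.Sys-before.Sys)/n/1024)
	close(block)
}
```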

Most OSs limit the number of open connections in various ways. For TCP/IP there is usually a limit of ephemeral ports per interface; this is about 28K on many modern systems. There is usually an additional limit per process (e.g. ulimit for the number of open file descriptors), which defaults to around 1024. So without changing the OS configuration you will have a maximum of roughly 1024 concurrent connections on Linux.
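
You can check those limits from inside the process rather than guessing; a small Unix-only sketch using the standard syscall package:

```go
package main

import (
	"fmt"
	"syscall"
)

// Read the process's file-descriptor limit before picking a concurrency
// cap, and leave headroom for files, DNS lookups, etc.
func main() {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		fmt.Println("getrlimit:", err)
		return
	}
	fmt.Printf("soft fd limit: %d (hard: %d)\n", rl.Cur, rl.Max)
}
```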

So depending on the system you should probably not create more than ~1000 goroutines, because requests might start failing with “too many open files” or packets might even be dropped.

If you increase the limits, you are still bound by the ~28K ephemeral ports from a single IP address. So if all 1M requests use a single outbound address, this could be your upper limit for the number of goroutines.

So reasonable answers could be 32 (16 cores + hyper-threading = 32 concurrent threads), around 1024, or around 28K, depending on the configuration.

I think, in general, if you don’t know the answer, you want to start exploring.

I would start by writing a 1M-goroutine implementation with no limits, and see if and how it breaks.

Generally speaking, in my experience, the first wall you will hit is rate limiting on the server you’re making the requests to.
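
If client-side pacing helps with that, golang.org/x/time/rate provides a token bucket; a hedged sketch (the rate and burst are made-up numbers, and the target URL is hypothetical):

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"sync"

	"golang.org/x/time/rate"
)

// Smooth the outbound rate so you approach the server's limit gracefully
// instead of slamming into it.
func main() {
	limiter := rate.NewLimiter(rate.Limit(500), 50) // ~500 req/s, bursts of 50
	ctx := context.Background()

	var wg sync.WaitGroup
	for i := 0; i < 1_000_000; i++ {
		if err := limiter.Wait(ctx); err != nil { // blocks until a token is available
			break
		}
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			resp, err := http.Get(fmt.Sprintf("https://example.com/item/%d", i)) // hypothetical target
			if err != nil {
				return
			}
			resp.Body.Close()
		}(i)
	}
	wg.Wait()
}
```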
