Tracking progress of sequences of tasks

I am writing a Go program that performs a sequence of tasks that take up to a few minutes to complete. Some of the tasks are themselves composed of multiple parallel sub-tasks, and the overall task is not considered “complete” until the individual sub-tasks complete.

I would like to build a library that allows these individual tasks to report progress, which should then be aggregated into a form so that an external observer can get some representation of the overall progress of the whole pipeline, e.g. in the form of a float between 0 and 1.

Now the individual tasks are not aware of where in the overall pipeline they fit, so while they can report on their own progress, to compute overall progress one needs to know how many tasks have already been completed and how many are still to come. Furthermore, some tasks are expected to take longer than other tasks, and there should be some way to provide some manual expectations and use that in computation of overall progress.

Also, when computing the progress of a set of parallel sub-tasks, each sub-task can report on its own progress but the sub-tasks may not all complete at the same time, so I need a way to sensibly compute overall progress in this situation.

There are many more issues to think through. Does anything exist already in this space?

Check it:

package main

import (
        "log"
        "sync"
)

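// RootTask runs its sub-tasks sequentially and signals the WaitGroup when done.
func RootTask(task int, wg *sync.WaitGroup) {
        defer wg.Done() // runs even if a sub-task panics
        for i := 0; i < 5; i++ {
                SubTask(task, i)
        }
        log.Println("Task", task, "finished with all sub tasks!")
}

// SubTask stands in for one unit of real work.
func SubTask(root int, i int) {
        log.Println("sub Task", i, "of root", root, "Running")
}

func main() {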
        var wg sync.WaitGroup
        for i := 0; i < 5; i++ {
                wg.Add(1)
                go RootTask(i, &wg)
        }
        wg.Wait()
        log.Println("All root tasks finished!")
}

Yup I know about wait groups and goroutines - what I’m looking for is the ability to track progress of the overall pipeline (e.g. if SubTask takes several minutes to complete, I want to get a 0…100 progress indicator from main showing progress towards completion)

Instead of using a WaitGroup (as in the example in the earlier reply), use a channel and send a value on it (an empty struct will do) each time a subtask completes. If you know how many steps there are, you can compute the progress simply by counting the values received from the channel.
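To make that concrete, here is a minimal sketch of the channel-counting idea. The names `runSteps` and `collect` are made up for illustration, and the "work" is a placeholder; it only shows how completed/total falls out of counting receives.

```go
package main

import "fmt"

// runSteps starts n hypothetical steps; each one sends on done when it finishes.
func runSteps(n int, done chan<- struct{}) {
	for i := 0; i < n; i++ {
		go func() {
			// ... the real work of one step would happen here ...
			done <- struct{}{}
		}()
	}
}

// collect receives exactly n completion signals and calls report with the
// fraction of steps completed after each one.
func collect(n int, done <-chan struct{}, report func(fraction float64)) {
	for completed := 1; completed <= n; completed++ {
		<-done
		report(float64(completed) / float64(n))
	}
}

func main() {
	const steps = 4
	done := make(chan struct{})
	runSteps(steps, done)
	collect(steps, done, func(f float64) {
		fmt.Printf("progress: %.0f%%\n", f*100)
	})
}
```

Note that this treats every step as equally expensive and only moves when a step finishes, so it cannot show progress inside a long-running step.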

I don’t know of any package already in this space, but I would have done something like this:

type ProgressTracker interface {
  NewTask(weight float64) Task
  Current() float64 // overall progress, 0.0 to 1.0
  Done() bool       // true once every task is done
}

type Task interface {
  Progress(v float64) // 0.0 to 1.0, 1.0 indicates the task is done
  Done() // because floats suck, maybe better to have a real indicator when we are done
}

Assuming you have a couple of tasks of differing weights (that is, expected durations) and goroutines to handle them, you would do something like

p := NewProgressTracker()

t1 := p.NewTask(5.0)
go doSomeTask(t1, ...)

t2 := p.NewTask(2.0)
go doSomeOtherTask(t2, ...)

for !p.Done() {
  fmt.Printf("Current progress: %.1f %%\n", p.Current()*100)
  time.Sleep(time.Second)
}

The actual task routines would do something like

func doSomeTask(t Task, ...) {
  defer t.Done()
  for {
    // work intensifies
    t.Progress(...)
  }
}

The implementations of ProgressTracker and Task are left as exercises, of course. :) The ProgressTracker.Current() method would just sum each task’s progress multiplied by its weight and divide by the total weight; the Done() method is an && of all the tasks’ done states.
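For what it’s worth, here is one possible way to fill in that exercise. Treat it as a sketch under assumptions, not the definitive implementation: the concrete types `Tracker` and `weightedTask` are names I made up, a single mutex guards everything, progress values are clamped to 0..1, and a tracker with no tasks reports 0 progress.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is the interface from the post above.
type Task interface {
	Progress(v float64) // 0.0 to 1.0
	Done()
}

// Tracker is a hypothetical concrete ProgressTracker.
type Tracker struct {
	mu    sync.Mutex
	tasks []*weightedTask
}

// weightedTask is a hypothetical Task that shares its tracker's mutex.
type weightedTask struct {
	mu       *sync.Mutex
	weight   float64
	progress float64
	done     bool
}

func NewProgressTracker() *Tracker { return &Tracker{} }

// NewTask registers a task with the given weight (expected relative duration).
func (p *Tracker) NewTask(weight float64) Task {
	p.mu.Lock()
	defer p.mu.Unlock()
	t := &weightedTask{mu: &p.mu, weight: weight}
	p.tasks = append(p.tasks, t)
	return t
}

// Progress records the task's own progress, clamped to [0, 1].
func (t *weightedTask) Progress(v float64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.done {
		return // a finished task stays at 1.0
	}
	if v < 0 {
		v = 0
	}
	if v > 1 {
		v = 1
	}
	t.progress = v
}

// Done marks the task finished regardless of the last reported float.
func (t *weightedTask) Done() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.done = true
	t.progress = 1
}

// Current returns sum(progress*weight) / sum(weight).
func (p *Tracker) Current() float64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	var sum, total float64
	for _, t := range p.tasks {
		sum += t.progress * t.weight
		total += t.weight
	}
	if total == 0 {
		return 0
	}
	return sum / total
}

// Done reports whether every registered task has called Done.
func (p *Tracker) Done() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, t := range p.tasks {
		if !t.done {
			return false
		}
	}
	return true
}

func main() {
	p := NewProgressTracker()
	t1 := p.NewTask(5.0)
	t2 := p.NewTask(2.0)
	t1.Progress(0.5)
	t2.Done()
	fmt.Printf("%.1f%% done=%v\n", p.Current()*100, p.Done())
}
```

Because parallel sub-tasks are just tasks with their own weights, ones that finish early simply contribute their full weight while the slower ones keep moving the total.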

One could imagine other interfaces, involving contexts and waitgroups and stuff. Maybe you could register a task that waits for a waitgroup, so the actual task doesn’t need to understand the Task interface if it’s short lived. Maybe the Progress should include waitgroup handling so you can Wait() for it to be done.

There could be other implementations of Task, such as ones that know a total size and implement Reader and Writer to track the progress of reads and writes. And so on.
