Improving OCR Accuracy for 2-Digit CAPTCHA Recognition in Go

Hello Go community,

I am working on a bot that needs to pass a simple CAPTCHA verification system. The system presents a 2-digit number as an image every 15 minutes, and I need to capture this number and input it correctly.

Using Go, I take a screenshot of the screen region where the CAPTCHA appears and run OCR (Optical Character Recognition) on it with Tesseract. However, the results are inconsistent:

  • Sometimes only one digit is detected,
  • Sometimes digits are misread,
  • And sometimes no text is recognized at all.

Although perfect accuracy isn’t required, I would like to improve the OCR performance as much as possible.
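
Because the CAPTCHA is always a 2-digit number, two of the three failure modes above (one digit, or no text) can at least be detected before the value is submitted. The sketch below is only an illustration under that assumption; isTwoDigits is a hypothetical helper name, not something from my code, and it cannot catch a misread such as 25 being returned as 28.

```go
package main

import "strings"

// isTwoDigits reports whether s, after trimming surrounding whitespace,
// consists of exactly two ASCII digits. (Hypothetical helper.)
func isTwoDigits(s string) bool {
	s = strings.TrimSpace(s)
	if len(s) != 2 {
		return false
	}
	for i := 0; i < len(s); i++ {
		if s[i] < '0' || s[i] > '9' {
			return false
		}
	}
	return true
}
```

A caller could retry with different preprocessing settings, or wait for the next CAPTCHA, whenever this check fails.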

To tackle this problem, I’ve also consulted with an AI assistant to get suggestions on preprocessing and OCR best practices, but I’d love to get more practical advice from real-world Go developers.

My questions:

  • What are the best practices for preprocessing the images before feeding them to the OCR engine (e.g., binarization, noise removal, resizing, thresholding)?
  • Are there recommended OCR libraries in Go that perform better for simple numeric captchas?
  • Would training a custom OCR model or using ML-based approaches be beneficial for such a task?
  • Any tips or example projects would be highly appreciated!

Below, I’m sharing the three functions I use for image processing and OCR, along with a few sample result images. In those samples, the OCR correctly reads 52 as 52, but misreads 25 as 28 and 43 as 45.

Thank you in advance for your help!

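// Imports needed by the three functions below. The imaging import path is
// an assumption based on the calls used (github.com/disintegration/imaging).
import (
	"fmt"
	"image"
	"os/exec"
	"strings"

	"github.com/disintegration/imaging"
)

// gorselIsle opens the screenshot, preprocesses it for OCR, and returns the
// path of the processed file.
func gorselIsle(dosyaYolu string) (string, error) {
	img, err := imaging.Open(dosyaYolu)
	if err != nil {
		return "", fmt.Errorf("failed to open image: %w", err)
	}

	// Upscale 10x; a height of 0 preserves the aspect ratio.
	img = imaging.Resize(img, img.Bounds().Dx()*10, 0, imaging.Lanczos)
	// img = imaging.Grayscale(img)

	// Raise contrast to the maximum setting.
	img = imaging.AdjustContrast(img, 100)

	// Keep only near-black pixels; everything else becomes white.
	img = pikselleriFiltrele(img)

	tempYol := "temp_grayscale.jpg"
	err = imaging.Save(img, tempYol)
	if err != nil {
		return "", fmt.Errorf("failed to save image: %w", err)
	}
	return tempYol, nil
}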
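
// pikselleriFiltrele returns a black-and-white copy of img in which pixels
// close to the target color (black) are black and all others are white.
func pikselleriFiltrele(img image.Image) *image.NRGBA {
	bounds := img.Bounds()
	filtered := imaging.New(bounds.Dx(), bounds.Dy(), image.White)

	// Maximum allowed per-channel distance from the target color.
	tolerans := 25

	// Target color: pure black.
	hedefR, hedefG, hedefB := uint32(0), uint32(0), uint32(0)

	for y := bounds.Min.Y; y < bounds.Max.Y; y++ {
		for x := bounds.Min.X; x < bounds.Max.X; x++ {
			r, g, b, _ := img.At(x, y).RGBA()

			// RGBA() returns 16-bit channels (0-65535); convert to 8-bit.
			r8 := r >> 8
			g8 := g >> 8
			b8 := b >> 8

			// Check whether the pixel is within tolerance of the target color.
			// absDiff is not shown here; it is assumed to return |a - b| as an int.
			if absDiff(r8, hedefR) <= tolerans &&
				absDiff(g8, hedefG) <= tolerans &&
				absDiff(b8, hedefB) <= tolerans {
				// Close to the target color: paint it black.
				filtered.Set(x, y, image.Black)
			}
		}
	}

	return filtered
}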
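
// metinTani runs the Tesseract CLI on the preprocessed image and returns the
// recognized text.
func metinTani(dosyaYolu string) (string, error) {
	// --psm 8 treats the image as a single word; the whitelist restricts
	// output to digits. "-l digits" requires a digits.traineddata file to be
	// installed.
	cmd := exec.Command("tesseract", dosyaYolu, "stdout", "--psm", "8", "-l", "digits", "-c", "tessedit_char_whitelist=0123456789")

	var out strings.Builder
	var stderr strings.Builder
	cmd.Stdout = &out
	cmd.Stderr = &stderr

	err := cmd.Run()
	if err != nil {
		return "", fmt.Errorf("tesseract error: %w - %s", err, stderr.String())
	}

	return strings.TrimSpace(out.String()), nil
}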
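
pikselleriFiltrele calls absDiff, which is not included in the code above. A minimal version might look like the sketch below, assuming it takes two 8-bit channel values stored as uint32 and returns an int so that the result can be compared directly with tolerans. The signature is a guess based on how the function is called.

```go
package main

// absDiff returns the absolute difference of two channel values as an int.
// Comparing before subtracting avoids unsigned wrap-around when b > a.
func absDiff(a, b uint32) int {
	if a > b {
		return int(a - b)
	}
	return int(b - a)
}
```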