Deskewing an image with background (Python)

I am working on a project where I run OCR on the text of a label. My job is to deskew the image so that Tesseract can read it.

this one

So far I have been using this approach: it converts the picture to greyscale and thresholds it, collects the coordinates of the black pixels, fits a minAreaRect around them, and then corrects the skew using the angle of that rectangle. This works on images that contain only placeholder text, but not on images with a background, like the one above. There, it calculates a skew angle of 0.0 and does not rotate the image. (Expected result: 17°)

black pixels in the background

I suspect this happens because there are black pixels in the background. Because of them, the minAreaRect encloses the whole picture, which results in a skew angle of 0.
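To illustrate that suspicion, here is a minimal sketch in pure Python. It does not use OpenCV: `min_area_rect_angle` is a hypothetical brute-force stand-in for `cv2.minAreaRect` that simply tries many angles and keeps the one with the smallest bounding box. The point coordinates, the 400x300 image size, and the helper `rotated_block` are made up for the demonstration. Angles are measured in image coordinates (y pointing down).

```python
import math

def min_area_rect_angle(points, step=0.5):
    """Brute-force stand-in for a minimum-area rectangle fit (not cv2.minAreaRect).

    Tries angles in [-45, 45) degrees and returns the one for which the
    bounding box of the points, measured in a frame tilted by that angle,
    has the smallest area.
    """
    best_angle, best_area = 0.0, float("inf")
    for i in range(int(round(90 / step))):
        angle = -45 + i * step
        t = math.radians(angle)
        c, s = math.cos(t), math.sin(t)
        # Rotate every point by -angle, then measure the axis-aligned box.
        xs = [x * c + y * s for x, y in points]
        ys = [-x * s + y * c for x, y in points]
        area = (max(xs) - min(xs)) * (max(ys) - min(ys))
        if area < best_area:
            best_angle, best_area = angle, area
    return best_angle

def rotated_block(cx, cy, w, h, deg, spacing=10):
    """Hypothetical test data: grid of 'black pixels' in a w-by-h block rotated by deg."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    pts = []
    for i in range(0, w + 1, spacing):
        for j in range(0, h + 1, spacing):
            x, y = i - w / 2, j - h / 2
            pts.append((cx + x * c - y * s, cy + x * s + y * c))
    return pts

# A text block skewed by 17 degrees inside an assumed 400x300 image.
text = rotated_block(200, 150, 200, 40, 17)
# A few dark background pixels in the image corners.
background = [(0, 0), (399, 0), (0, 299), (399, 299)]

print(min_area_rect_angle(text))               # 17.0
print(min_area_rect_angle(text + background))  # 0.0
```

With only the text pixels, the fitted rectangle follows the text and the angle is recovered. As soon as a handful of dark pixels near the image border are included, the smallest rectangle that contains everything is the image frame itself, so the angle collapses to 0.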

I tried removing the background, but couldn't find a method that works well enough to leave only the label with the text.

Another approach I tried was clustering the pixels with k-means clustering. But even when choosing a good k manually, the cluster with the text still contains parts of the background.

See here.

Not to mention that I would still need another method that goes through all the clusters and uses some sort of heuristic to determine which cluster is text and which is background, which would cost a lot of runtime.

What is the best way to deskew an image that has background?


Solution 1:

You can try deep-learning-based natural scene text detection methods. These methods give you a rotated bounding box for each piece of text. From those boxes, compute a rotated bounding rectangle that covers all of them, then use the 4 corners of that rectangle to correct the image.
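The geometry of that last step can be sketched as follows, in pure Python. This is only an illustration under assumptions: the boxes are taken to be lists of four (x, y) corners in order around the box, which is not necessarily the output format of any particular detector, and the skew is estimated as the median angle of the boxes' longer sides (a simple choice made here, not something the detectors prescribe). `make_box` is a hypothetical helper that fabricates detector output for the demo.

```python
import math
from statistics import median

def box_angle(box):
    """Angle (degrees) of the longer side of a 4-corner box, folded into (-90, 90]."""
    (x0, y0), (x1, y1), (x2, y2) = box[0], box[1], box[2]
    side_a = (x1 - x0, y1 - y0)
    side_b = (x2 - x1, y2 - y1)
    dx, dy = side_a if math.hypot(*side_a) >= math.hypot(*side_b) else side_b
    angle = math.degrees(math.atan2(dy, dx))
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle

def estimate_skew(boxes):
    """Median long-side angle over all boxes (robust to a few odd boxes)."""
    return median(box_angle(b) for b in boxes)

def enclosing_rect(boxes, angle):
    """Corners of the rectangle tilted by `angle` degrees that covers every box corner."""
    t = math.radians(angle)
    c, s = math.cos(t), math.sin(t)
    # Express all corners in a frame rotated by -angle, where the rectangle is axis-aligned.
    us = [x * c + y * s for box in boxes for x, y in box]
    vs = [-x * s + y * c for box in boxes for x, y in box]
    u0, u1, v0, v1 = min(us), max(us), min(vs), max(vs)
    # Map the four axis-aligned corners back into image coordinates.
    return [(u * c - v * s, u * s + v * c)
            for u, v in ((u0, v0), (u1, v0), (u1, v1), (u0, v1))]

def make_box(cx, cy, w, h, deg):
    """Hypothetical helper: fabricate one rotated box as a detector might return it."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    local = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in local]

# Three made-up word boxes, all tilted by 17 degrees.
boxes = [make_box(120, 100, 80, 20, 17),
         make_box(220, 130, 100, 20, 17),
         make_box(160, 150, 60, 20, 17)]

skew = estimate_skew(boxes)
print(round(skew, 2))                # 17.0
print(enclosing_rect(boxes, skew))   # 4 corners of the covering rectangle
```

The four corners could then be handed to a warp, for example OpenCV's `cv2.getPerspectiveTransform` followed by `cv2.warpPerspective`, to produce the upright crop. That step is not shown here.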

RRPN_plusplus

Judging by the sample image, RRPN_plusplus seems to do quite well on extreme angles.


EAST

Pyimagesearch has a tutorial on the EAST scene text detector, though I am not sure EAST will do well with extreme angles.

https://www.pyimagesearch.com/2018/08/20/opencv-text-detection-east-text-detector/

Example image from https://github.com/argman/EAST.

These pages should help you find more recent and better repositories and methods:

  • https://github.com/topics/scene-text-detection
  • https://paperswithcode.com/task/scene-text-detection
  • https://paperswithcode.com/task/curved-text-detection