Improving threshold result for Tesseract























I am stuck on this problem, and I know there are many similar questions on Stack Overflow, but in my case nothing gives the expected result.



The Context:



I am using OpenCV for Android along with Tesseract to read the MRZ area of a passport. When the camera starts, I pass the input frame to an AsyncTask, the frame is processed, and the MRZ area is extracted successfully. I then pass the extracted MRZ area to a function prepareForOCR(inputImage), which takes the MRZ area as a grayscale Mat and outputs a bitmap with the thresholded image that I pass to Tesseract.



The problem:



The problem occurs while thresholding the image. I use adaptive thresholding with blockSize = 13 and C = 15, but the result is not consistent: it varies with the lighting and the general conditions under which the frame is taken.
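For reference, this is the operation in question, written as a minimal Python sketch of the same OpenCV call (the filename is a placeholder for the extracted MRZ crop):

import cv2

# Placeholder input: the grayscale MRZ crop extracted earlier.
gray = cv2.imread('mrz_gray.png', cv2.IMREAD_GRAYSCALE)
# A pixel becomes white (255) if it is above the Gaussian-weighted mean of
# its 13x13 neighborhood minus C = 15, otherwise black. The threshold
# adapts to local brightness, which is why the output changes with lighting.
binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 13, 15)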



What I have tried:



First, I resize the image to a fixed size (871 x 108) so the input is always the same and does not depend on which phone is used.
After resizing, I try different blockSize and C values:



// toOcr contains the extracted MRZ area
Bitmap toOCRBitmap = Bitmap.createBitmap(bitmap);
Mat inputFrame = new Mat();
Mat toOcr = new Mat();
Utils.bitmapToMat(toOCRBitmap, inputFrame);
Imgproc.cvtColor(inputFrame, inputFrame, Imgproc.COLOR_BGR2GRAY);
TesseractResult lastResult = null;
// Try every prime (hence odd) blockSize/C pair in [11, 70) until one
// produces a confident OCR result that parses as a valid MRZ.
for (int B = 11; B < 70; B++) {
    for (int C = 11; C < 70; C++) {
        if (IsPrime(B) && IsPrime(C)) {
            Imgproc.adaptiveThreshold(inputFrame, toOcr, 255, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, B, C);
            Bitmap toOcrBitmap = OpenCVHelper.getBitmap(toOcr);
            TesseractResult result = TesseractInstance.extractFrame(toOcrBitmap, "ocrba");
            if (result.getMeanConfidence() > 70) {
                if (MrzParser.tryParse(result.getText())) {
                    Log.d("Main2Activity", "Best result with " + B + " : " + C);
                    return result;
                }
            }
        }
    }
}


Using the code above, the thresholded result is a black-on-white image that gives a confidence greater than 70. I can't post the whole image for privacy reasons, but here is a clipped one and a dummy passport one.



Clipped image



From the web



Using the MrzParser.tryParse function, which checks each character's position and validity within the MRZ, I am able to correct some occurrences (such as a name containing an 8 instead of a B) and get a good result. But it takes far too long, which is understandable, since I am thresholding almost 255 images in the loop, plus a Tesseract call for each.
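For context, the MRZ validity checks rest on the standard ICAO 9303 check digits. Here is a minimal sketch of that computation (check_digit is an illustrative helper, not the actual MrzParser code):

def check_digit(field):
    # ICAO 9303: digits keep their value, A-Z map to 10..35, '<' counts
    # as 0; weights cycle 7, 3, 1; the check digit is the sum mod 10.
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if ch.isdigit():
            value = int(ch)
        elif ch.isalpha():
            value = ord(ch.upper()) - ord('A') + 10
        else:  # '<' filler
            value = 0
        total += value * weights[i % 3]
    return total % 10

# e.g. the date-of-birth field '520727' yields check digit 3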



I have already tried collecting the C and B values that occur most often, but the results differ from image to image.



The question:



Is there a way to choose C and blockSize values that always give the same result, perhaps by adding more OpenCV calls on the input image, like increasing contrast and so on? I have searched the web for two weeks now and can't find a viable solution; this loop is the only approach that gives accurate results.
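To illustrate the kind of extra OpenCV call I mean by increasing contrast, here is an untested sketch that normalizes local contrast with CLAHE before the threshold (the filename is again a placeholder):

import cv2

gray = cv2.imread('mrz_gray.png', cv2.IMREAD_GRAYSCALE)  # placeholder input
# CLAHE equalizes contrast locally, which may make one fixed
# (blockSize, C) pair behave more consistently across lighting conditions.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
norm = clahe.apply(gray)
binary = cv2.adaptiveThreshold(norm, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 13, 15)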










android opencv ocr tesseract opencv4android

asked Oct 25 at 14:52 by Reda · edited Nov 15 at 9:02 by Eric
  • have you experimented with simple shareholding?
    – Nikiforos
    Nov 7 at 14:46










  • ShareHolding? Yes, I experimented with simple thresholding, if that's what you mean :D, but it never gives the same result unless the image taken by all the users is the same in terms of lighting, contrast and passport texture. So simple thresholding is not a viable solution.
    – Reda
    Nov 8 at 0:00










  • haha yes I meant thresholding, sorry. It would be helpful to post some image examples of yours. For instance, what exact result are you getting that is not satisfying?
    – Nikiforos
    Nov 8 at 8:54















1 Answer



























You can use a clustering algorithm to cluster the pixels based on color. The characters are dark and there is good contrast in the MRZ region, so a clustering method will most probably give you a good segmentation if you apply it to the MRZ region.



Here I demonstrate it with MRZ regions obtained from sample images that can be found on the internet.



I use color images, apply some smoothing, convert to the Lab color space, then cluster the a, b channel data using k-means (k = 2). The code is in Python, but you can easily adapt it to Java. Due to the randomized nature of the k-means algorithm, the segmented characters will have label 0 or 1; you can easily sort this out by inspecting the cluster centers. The cluster center corresponding to the characters should have a dark value in the color space you are using.
I just used the Lab color space here; you can use RGB, HSV or even GRAY and see which one works better for you.



After segmenting like this, I think you can even find good values for B and C of your adaptive-threshold using the properties of the stroke width of the characters (if you think the adaptive-threshold gives a better quality output).



import cv2
import numpy as np

im = cv2.imread('mrz1.png')
# smooth, then convert to the Lab color space
lab = cv2.cvtColor(cv2.GaussianBlur(im, (3, 3), 1), cv2.COLOR_BGR2Lab)

# cluster on the a, b (chroma) channels of the Lab image
im32f = np.array(lab[:, :, 1:3], dtype=np.float32)
k = 2  # 2 clusters: characters and background
term_crit = (cv2.TERM_CRITERIA_EPS, 30, 0.1)
ret, labels, centers = cv2.kmeans(im32f.reshape([im.shape[0]*im.shape[1], -1]),
                                  k, None, term_crit, 10, 0)
# segmented image: one label per pixel, scaled to 0/255 for display
labels = labels.reshape([im.shape[0], im.shape[1]]) * 255
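A small follow-up sketch (an assumption building on the snippet above, not part of the original code): since k-means assigns the two labels arbitrarily, pick the character cluster by comparing mean lightness per cluster, and estimate the stroke width mentioned earlier via the distance transform:

# Decide which label marks the characters by comparing mean lightness
# (Lab L channel) per cluster, then binarize as black text on white,
# which is what Tesseract prefers.
L_chan = lab[:, :, 0]
char_val = 0 if L_chan[labels == 0].mean() < L_chan[labels == 255].mean() else 255
binary = np.where(labels == char_val, 0, 255).astype(np.uint8)

# Rough stroke-width estimate from the character mask via the distance
# transform; the adaptive-threshold blockSize should comfortably exceed it.
mask = (labels == char_val).astype(np.uint8)
dist = cv2.distanceTransform(mask, cv2.DIST_L2, 3)
stroke_width = 2 * dist[dist > 0].mean()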


Some results:



[Input images 1-4 and their segmented outputs]






answered Nov 11 at 10:31 by dhanushka · edited Nov 11 at 11:16
  • Thanks for your time. How does your algorithm handle images with varying contrast? The images are taken with the phone camera, and sometimes the left part of the MRZ is dark while the right part has good contrast. I can't provide a real-life example because I am not in front of my computer right now; just place some shadows on the image and see the result.
    – Reda
    Nov 11 at 14:24










  • @Reda If the clustering doesn't work on these images, you can try a shadow detection and removal method as pre-processing. Since the clustering uses the a and b (chroma) channels, there's a chance that it performs better for these regions. You can even try increasing the cluster count, say to 3.
    – dhanushka
    Nov 12 at 14:47










