Get the “Blank” out of here
Most document scanners today include a blank page removal tool as part of their driver. These tools vary slightly. Some use specific algorithms that detect blank pages not only by the amount of white on the page but also by how a page relates to the other pages in the batch. This can be problematic when the back sides of your documents carry very little text. A second approach is to measure the resulting image file size: a file under a certain number of kilobytes is likely a blank page. It has the same problem of removing pages with very little text, which often occur on the back side of documents. The third and most accurate way is to measure the amount of black or color pixels on the page and treat the page as blank when they fall below a small threshold such as 1% or 2%. This approach requires you to know your documents beforehand, and it may be problematic with greyscale scans or contrast settings that make blank pages slightly gray. Outside the scanner driver, the other option is to have imaging or OCR software remove the pages for you.
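To make the pixel-counting approach concrete, here is a minimal sketch in Python. It is not how any particular scan driver works. It assumes the page has already been decoded into rows of greyscale values (0 for black, 255 for white), and the names `ink_ratio`, `is_blank`, `dark_cutoff` and `max_ink` are made up for illustration. The `dark_cutoff` parameter is one possible way to keep a slightly gray blank page from counting as ink.

```python
def ink_ratio(pixels, dark_cutoff=128):
    """Return the fraction of pixels darker than dark_cutoff.

    pixels is a list of rows, each row a list of greyscale values
    (0 = black, 255 = white). This layout is an assumption of the sketch.
    """
    total = 0
    dark = 0
    for row in pixels:
        for value in row:
            total += 1
            if value < dark_cutoff:
                dark += 1
    # An empty image has no ink at all.
    return dark / total if total else 0.0


def is_blank(pixels, max_ink=0.01, dark_cutoff=128):
    """Treat the page as blank when at most max_ink (1% here) of it is dark."""
    return ink_ratio(pixels, dark_cutoff) <= max_ink


# A 100 x 100 page that scanned slightly gray, with one short dark line.
page = [[235] * 100 for _ in range(100)]
page[10][:50] = [0] * 50          # 50 dark pixels out of 10,000 = 0.5%
print(is_blank(page))             # below the 1% threshold, so treated as blank
```

Both the threshold and the cutoff would need tuning against your own documents, which is the "know your documents beforehand" requirement in practice.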
Some, but not most, OCR applications can also detect blank pages, using a combination of pixel detection and the presence of text. This might slow down your OCR process, but it is a useful tool if it is available. More likely, you can purchase a full imaging application with very robust blank page removal tools, akin to what you would find in a scan driver but usually with more options.
Organizations such as service bureaus often combine methods to ensure that no blanks make it through. Blank page detection tools are very accurate and very useful, and you can start using them today.
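As a rough illustration of combining methods, the sketch below takes three signals for a page (file size, fraction of dark pixels, and the number of characters OCR found) and lets them vote. The function name `classify_page`, the thresholds, and the "review" outcome for pages where the signals disagree are all assumptions for the example, not a description of how any service bureau actually does it.

```python
def classify_page(file_size_bytes, ink_ratio, ocr_char_count,
                  max_blank_bytes=5_000, max_ink=0.01, max_chars=0):
    """Combine three blank-page signals into 'blank', 'keep' or 'review'.

    All thresholds are illustrative placeholders; tune them per batch.
    """
    votes = [
        file_size_bytes <= max_blank_bytes,  # small file suggests a blank
        ink_ratio <= max_ink,                # very few dark pixels
        ocr_char_count <= max_chars,         # OCR found no text
    ]
    blank_votes = sum(votes)
    if blank_votes == len(votes):
        return "blank"    # every method agrees the page is blank
    if blank_votes == 0:
        return "keep"     # every method agrees the page has content
    return "review"       # methods disagree, e.g. a sparse back side


print(classify_page(2_000, 0.001, 0))      # all three signals say blank
print(classify_page(80_000, 0.12, 1_500))  # all three signals say content
print(classify_page(4_000, 0.004, 35))     # small and sparse, but has text
```

Sending the disagreements to a person is one way to handle the back sides with very little text that the size and pixel methods tend to throw away.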
Labels: blank page removal, blank pages, Scanning
