Playing tricks on your images
Up-sampling is the process of taking an image at a lower resolution and converting it to a higher resolution. The technology increases the pixel dimensions of the image, then fills in the new, empty pixels with values predicted from the original image. For data capture and OCR, up-sampling is usually done from 150 DPI or 200 DPI to 300 DPI. Up-sampling technologies have become very impressive and useful, and I often recommend up-sampling over working with the lower-resolution source. But let's talk about the facts, and about how and when you should consider up-sampling.
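To make the idea concrete, here is a minimal sketch in Python. It is not how any particular product does it: the function names are my own, and plain linear interpolation stands in for whatever prediction method a real up-sampling tool uses. The first function works out the new pixel dimensions when the physical page size stays fixed, and the second fills in new pixels along a single row of grayscale values.

```python
from fractions import Fraction


def upsampled_size(width_px, height_px, source_dpi, target_dpi):
    """Pixel dimensions after up-sampling, keeping the physical page size fixed.

    Hypothetical helper for illustration only.
    """
    scale = Fraction(target_dpi, source_dpi)
    return round(width_px * scale), round(height_px * scale)


def upsample_row(row, new_len):
    """Stretch one row of grayscale values to new_len samples.

    New pixels are predicted by linear interpolation between the two
    nearest original pixels. This is an assumed stand-in; real tools
    typically use more sophisticated prediction.
    """
    if new_len == 1 or len(row) == 1:
        return [row[0]] * new_len
    out = []
    for i in range(new_len):
        # Position of the new pixel measured in original pixel coordinates.
        pos = i * (len(row) - 1) / (new_len - 1)
        left = int(pos)
        right = min(left + 1, len(row) - 1)
        frac = pos - left
        out.append(row[left] * (1 - frac) + row[right] * frac)
    return out


# A letter-size page (8.5 x 11 inches) scanned at 150 DPI is 1275 x 1650 pixels.
print(upsampled_size(1275, 1650, 150, 300))
# A black pixel next to a white pixel, with one new pixel predicted in between.
print(upsample_row([0, 255], 3))
```

Note that the new middle pixel comes out gray. It is a guess based on its neighbors, not detail recovered from the page.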
Up-sampling should be considered for documents that have little noise such as watermarks, spills, stains, stamps, or speckling. Essentially, these are documents that are good quality and well scanned but low resolution. You should avoid up-sampling noisy documents, and you should also avoid it on documents with closely spaced elements and crowded text. In those two scenarios it's better to work with the source image as-is and work around the problems.
The bigger the gap between the source resolution and the desired resolution, the greater the risk that fragments exist after up-sampling. For example, 150 DPI to 300 DPI will not yield the quality that 150 DPI to 200 DPI will. This is why going crazy and up-sampling to the highest possible resolution is not a good idea. It's like taking a very small image and zooming in as far as you can to get detail: you probably won't get any. Trying to trick the system will only hurt you. Up-sampling from 150 DPI to 200 DPI and then again to 300 DPI is not better than converting from 150 DPI to 300 DPI directly. In fact, it would be a pretty big mistake. What you do in that case is magnify the mistakes created during the first up-sampling, because they are now propagated a second time. This will likely decrease your quality and can result in bloated characters, fuzzy characters, or an abundance of speckling. The goal is to do as few conversions on the document as possible.
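Here is a toy illustration of the two-step problem, again in Python and again only a sketch. It assumes simple linear interpolation on one row of pixels, and the function name and sample values are made up for the demonstration. The row holds a thin bright stroke; it is up-sampled once directly to double size, and once in two steps through an intermediate size, the way 150 to 200 to 300 DPI would go.

```python
def upsample_row(row, new_len):
    """Stretch one row of grayscale values to new_len samples.

    Uses linear interpolation as an assumed, simplified stand-in for a
    real up-sampling algorithm.
    """
    if new_len == 1 or len(row) == 1:
        return [row[0]] * new_len
    out = []
    for i in range(new_len):
        # Position of the new pixel measured in original pixel coordinates.
        pos = i * (len(row) - 1) / (new_len - 1)
        left = int(pos)
        right = min(left + 1, len(row) - 1)
        frac = pos - left
        out.append(row[left] * (1 - frac) + row[right] * frac)
    return out


# A one-pixel-wide stroke: think of 3 pixels at 150 DPI.
source = [0, 255, 0]

# One conversion: 3 pixels -> 6 pixels (150 DPI -> 300 DPI).
one_pass = upsample_row(source, 6)

# Two conversions: 3 -> 4 -> 6 pixels (150 -> 200 -> 300 DPI).
two_pass = upsample_row(upsample_row(source, 4), 6)

print("one pass:", [round(v) for v in one_pass])
print("two pass:", [round(v) for v in two_pass])
print("peak after one pass:", round(max(one_pass)))
print("peak after two passes:", round(max(two_pass)))
```

In this toy case the stroke peaks at roughly 204 after the direct conversion but only about 170 after the two-step conversion. The second pass is interpolating values that were already guesses, so the stroke ends up flatter and fuzzier. Real up-sampling tools behave differently in the details, but the compounding is the same idea.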
I will always defer to a proper scan over any image-processing technique, but when you do not have control of the scan, up-sampling is one of the image tools to consider. Uneducated use of the technology is unsafe, as is true of all advanced technologies, but if you stick with the facts and pick a great technology, you will be successful.
Labels: best practices, Data Capture, image quality, OCR, scan, up-sampling
