Documents Say “Cheese” - Digital Photo OCR
When you take a photograph of a document, the image can contain several competing focal points: a table, a finger, the floor. Some of these can easily be mistaken for the flat surface of the document. The OCR engine has to determine which layer or focal point is the actual document and where its borders are. It does this primarily through color detection. In a document scan there is only one focal point, because the document is the entirety of the image, so the OCR engine does not need to guess or modify the image to find it. This increases the accuracy of both document analysis and character reading. The next challenge is perspective.
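To make the color-detection idea concrete, here is a minimal sketch in Python. It is not how any particular OCR engine works; it simply assumes a grayscale image stored as a list of rows, a bright page on a darker background, and a hypothetical brightness cutoff (`threshold=200`). The function name `find_document_bounds` is made up for this illustration.

```python
def find_document_bounds(pixels, threshold=200):
    """Return (top, left, bottom, right) of the bright region, or None.

    Assumption: the page is brighter than everything around it, so any
    pixel at or above `threshold` is treated as part of the document.
    """
    # Rows and columns that contain at least one "page-colored" pixel.
    rows = [r for r, row in enumerate(pixels) if any(v >= threshold for v in row)]
    cols = [c for row in pixels for c, v in enumerate(row) if v >= threshold]
    if not rows:
        return None  # nothing in the photo looks like a page
    return (min(rows), min(cols), max(rows), max(cols))


# A tiny 5x6 "photo": dark table (40) surrounding a bright page (230).
photo = [
    [40, 40, 40, 40, 40, 40],
    [40, 230, 230, 230, 40, 40],
    [40, 230, 230, 230, 40, 40],
    [40, 230, 230, 230, 40, 40],
    [40, 40, 40, 40, 40, 40],
]
bounds = find_document_bounds(photo)
print(bounds)
```

Notice how fragile this is: a white tabletop or a pale finger at the edge of the frame would pass the same brightness test and stretch the detected borders. A scan never has this problem, because the page fills the whole image.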
A digital photograph of a document should be taken head on. Think of the LCD screen on your camera as being on the same plane as the piece of paper. Any deviation from this causes distortion; for example, the top edge of the document, measured from left to right, appears shorter than the bottom edge. There are some capture applications out there for the iPhone and other mobile devices that force you to line the document up in brackets. This lets the capture focus only on the document and tells the application, by virtue of the guide, where the borders are, but lining it up is very time consuming. That brings us to the final point: time.
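The top-edge versus bottom-edge distortion described above can be put into a number. The sketch below is only an illustration: it assumes the four page corners have already been located as (x, y) pixel coordinates, and the name `keystone_ratio` is hypothetical. A ratio of 1.0 means the two edges are the same length in the photo, which is what a head-on shot of a rectangular page should give.

```python
import math


def keystone_ratio(top_left, top_right, bottom_left, bottom_right):
    """Return the top-edge length divided by the bottom-edge length.

    Assumption: corners are (x, y) pixel coordinates found by some
    earlier border-detection step.
    """
    top = math.hypot(top_right[0] - top_left[0], top_right[1] - top_left[1])
    bottom = math.hypot(
        bottom_right[0] - bottom_left[0], bottom_right[1] - bottom_left[1]
    )
    if bottom == 0:
        raise ValueError("bottom edge has zero length")
    return top / bottom


# Camera tilted back: the far (top) edge looks narrower than the near edge.
ratio = keystone_ratio((100, 50), (500, 50), (50, 600), (550, 600))
print(round(ratio, 2))  # → 0.8
```

Here the top edge is 400 pixels wide and the bottom edge is 500, so the ratio is 0.8. The further the ratio drifts from 1.0, the more correction the image needs before character reading can begin.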
It would actually take you much more time to capture a 10-page document with a digital photograph than with an ADF or sheet-fed document scanner. Because the quality of the photo is so important when running OCR on a digital photograph, it requires a lot of conscious effort: holding the camera steady, lining up the document, and placing the document on a surface that does not contain many layers or focal points. Because of this additional effort, it does not actually save any time.
I am a fan of blooming technology as well, but for acquiring paper images and converting them there is no better way than a portable or traditional document scanner. In time, digital photographs of documents will become a popular way to capture single-page documents for one-off processing, but as long as paper exists, so will document scanners.
Labels: Accuracy, Digital Photography, OCR, Scanning
