Wednesday, October 14, 2009

Why hot folders are so HOT

We are all guilty of overcomplicating things. In technology products, overcomplication results in more features than you will ever use and less money you could have used; other times it creates new problems in business processes. End users, vendors, and technologists all commonly try to add too many elements to automation projects. One of the areas where overcomplication occurs most in data capture and OCR integrations is passing images and results from one step to another.

When passing images from a capture application to a data capture application, most organizations ask for a connector written specifically to use the chosen imaging application's API to pass images to the chosen data capture application's API. Similarly, when considering export from OCR and data capture processes, most organizations want a special connector to their repository or ECM product. I'm not sure what to blame: the warm fuzzies that come from realizing that an OCR vendor has spent specific effort developing these connectors, or the faith that connectors are somehow more efficient. What I do know is that in almost all cases connectors are overkill and simply not necessary. Why? Because there are hot folders, and they are amazingly powerful and simple.

A hot folder (sometimes called a watch folder) is a directory, virtual or real, that is set up as a staging area or queue where applications put data and pick it up in real time. The best thing about hot folders is that they are free! Almost all imaging, data capture, and content management applications support hot folders. If they don't, you have every right to ask why. When an image capture application scans documents, it can scan them to a directory. The data capture application can automatically read images as soon as they appear in this directory and process them. Data capture and OCR results can be automatically exported to another directory that a content management application automatically picks up from. That is two folders vs. two pricey connectors.
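To make the idea concrete, here is a minimal sketch of the consuming side of a hot folder, written in Python. It is an illustration only, not how any particular capture product implements it. The function names `drain_hot_folder` and `watch`, the two-second polling interval, and the convention that a producer writes to a `.tmp` name until the file is complete are all assumptions made for this example.

```python
import shutil
import time
from pathlib import Path
from typing import Callable, List


def drain_hot_folder(inbox: Path, outbox: Path,
                     handler: Callable[[Path], None]) -> List[str]:
    """Process every finished file currently in `inbox`, then move it to `outbox`."""
    outbox.mkdir(parents=True, exist_ok=True)
    handled = []
    for path in sorted(inbox.iterdir()):
        # Skip subdirectories and files a producer is still writing
        # (assumed convention: unfinished files end in ".tmp").
        if not path.is_file() or path.suffix == ".tmp":
            continue
        handler(path)  # e.g. hand the image to OCR / data capture
        # Moving the file out is what marks it as done.
        shutil.move(str(path), str(outbox / path.name))
        handled.append(path.name)
    return handled


def watch(inbox: Path, outbox: Path, handler: Callable[[Path], None],
          interval_seconds: float = 2.0) -> None:
    """Poll the hot folder forever, draining it on every pass."""
    while True:
        drain_hot_folder(inbox, outbox, handler)
        time.sleep(interval_seconds)
```

Chaining two of these gives the scan-to-capture-to-repository flow described above: the scanner drops images into the first folder, and the capture step's output folder becomes the inbox for the content management step.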

You may think that you are losing functionality such as tracking and security, but there are numerous ways in Windows to monitor folder activity and secure folders. You might be surprised that many “connectors” out there are actually just a hot folder with a settings dialog: a hot folder in disguise.

So when it comes to deciding how to get files from one application process to another, first consider hot folders and try your best to disprove their validity. If you can't, you just saved a bundle of money and probably picked the most efficient method for your OCR solution.

posted by Chris Riley at

5 Comments:

Blogger The Simplist said...

If you want an ECM infrastructure that truly exhibits efficient scaling, an asynchronous, event-driven approach is the way to go.

It tends to work naturally for ECM use-cases, especially data capture and is way better than hub-n-spoke models generally employed by the data capture leaders (Kofax/Captiva etc.).

Cheers,

Zubin - CTO - ImageWork Technologies

October 14, 2009 10:37 AM  
Blogger The Lego Master said...

This post has been removed by the author.

October 14, 2009 10:54 AM  
Blogger The Lego Master said...

Zubin

Thank you again for a great comment. I completely agree with you; however, I believe that hot folders can easily be made asynchronous by deploying many of them and deploying them dynamically. For the largest service bureaus this is preferred because it is very efficient and easiest to maintain without additional overhead. I also can't agree with you on the leaders of data capture. Both products you mention are simply OEMing two of the four engines out there: ABBYY or Oce (now Open Text). Kofax and Captiva are more or less boxed and great for not very complex processing in the enterprise. This is where I see their play, but both have been losing share in the last two years. Some of the newer data capture solutions are more technically advanced. My preference for high volume processing is to stick with the cores. Thanks Again!

Chris Riley

October 14, 2009 2:16 PM  
Blogger The Simplist said...

Chris,

My comment was not directed at OCR engines in general but at data capture vendors, and a conversation about them would be incomplete without discussing Kofax.

I was specifically trying to differentiate between a fully distributed vs. hub-n-spoke model. For example Kofax has Ascent Capture Server in the middle handling job states etc.

What are some of the newer capture solutions you would consider better for complex or high volume processing?

Cheers,

Zubin - CTO - ImageWork Technologies

October 15, 2009 7:12 PM  
Blogger The Lego Master said...

Zubin,

I would be happy to share offline the various new technologies out there. Please contact me at chris.riley@livinganalytics.com. Surprisingly, ABBYY, whose engine Kofax OEMs, has a competitive data capture product that is typically more accurate. Ascent Capture, like you say, is the value!

October 16, 2009 8:05 AM  
