Receipt OCR: Machine + human processing

Different levels of autonomy

Receipt OCR, like a self-driving car, has different levels of machine autonomy, each requiring a different level of human intervention and attentiveness. The more autonomy the machine provides, the less human intervention is needed to perform a task. The table below shows the levels of driving automation published by the Society of Automotive Engineers (SAE International).

| Level | Name | What it means |
| --- | --- | --- |
| 0 | No Driving Automation | Driver is in complete control at all times. |
| 1 | Driver Assistance | Driver is in complete control, with some vehicle assistance such as cruise control or lane guidance. |
| 2 | Partial Driving Automation | Driver is in complete control and can occasionally let the vehicle take over steering and speed under a limited set of conditions, such as on a limited-access highway. |
| 3 | Conditional Driving Automation | Vehicle is in complete control in some situations, with the expectation that the human driver will respond appropriately to a request to intervene. |
| 4 | High Driving Automation | Vehicle is in complete control for the entire trip, even when the human driver does not respond appropriately to a request to intervene. |
| 5 | Full Driving Automation | "Look ma, no driver!" |

The hype vs the reality

There has been an enormous number of breakthroughs and achievements in the fields of machine learning and artificial intelligence. News and media have been raving that the future of self-driving cars is near, and that we will soon be able to fall asleep while our cars drive us to work and drop us home when we have had a bit too much to drink after work. But the reality is still far from the hype. Recently, Elon Musk clarified that Tesla could only have a level 4 autonomous system by the end of 2019. So we still have a long wait before we can impress our moms with a "Look ma, no driver!" self-driving car. Why? Because a commercially viable, fully autonomous car is hard and very, very expensive to build.

The law of diminishing returns

The law of diminishing returns states that if one input to production is increased while the others are held fixed, a point will eventually be reached at which each addition of that input yields a progressively smaller increase in output. Receipt OCR is subject to the same law. For a commercially viable product, the cost and investment needed to achieve 100% machine processing of receipts with zero human intervention can be too high. Taggun achieves 82% accuracy in receipt OCR, and we are always testing new ways to improve accuracy and extract more information. However, the machine does have its limitations. Adding some form of human intervention is not only cost-effective but also a necessary step to complete the user's task.
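The cost argument can be sketched with a small calculation. This is only an illustration: it assumes the accuracy figure means the share of receipts the machine processes without error, that every failed receipt goes to a human, and the function name and per-receipt costs are hypothetical placeholders, not Taggun pricing.

```python
def blended_cost_per_receipt(machine_accuracy: float,
                             machine_cost: float,
                             human_cost: float) -> float:
    """Expected cost per receipt for a machine + human pipeline.

    Assumption (not from the article): every receipt is processed by the
    machine, and only the receipts the machine gets wrong are corrected
    by a human.
    """
    if not 0.0 <= machine_accuracy <= 1.0:
        raise ValueError("machine_accuracy must be between 0 and 1")
    human_share = 1.0 - machine_accuracy  # fraction needing human correction
    return machine_cost + human_share * human_cost


# Hypothetical costs per receipt, for illustration only.
hybrid = blended_cost_per_receipt(0.82, machine_cost=0.01, human_cost=0.10)
human_only = 0.10
print(f"hybrid: {hybrid:.3f} per receipt, human only: {human_only:.3f}")
```

Under these made-up numbers the hybrid pipeline costs well under a third of fully manual entry, while still reaching a complete result. Closing the remaining accuracy gap by machine alone is where the investment grows fastest.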

Different levels of autonomy in OCR solutions

Taggun's API enables companies to implement different levels of automation with OCR. Levels 2 and 3 are the most common in the industry. Smart Receipts integrates with Taggun to reduce the effort users spend capturing information from receipts for their expense reports. There are also services such as Amazon Mechanical Turk that can be used to outsource manual receipt transcription in the background, enabling level 3 to achieve a higher level of autonomy and accuracy when processing receipts. Taggun is not affiliated with these companies, but they are worth considering when implementing an OCR solution that offers a higher level of automation to your users. See the list of crowdsourcing companies that perform receipt transcription tasks here.

| Level | Name | What it means |
| --- | --- | --- |
| 0 | No OCR Automation | User completes all fields in a form. |
| 1 | OCR Assistance | User completes all fields in a form with some assistance, such as auto-complete or receipt validation feedback. |
| 2 | Partial OCR Automation | Machine OCR fills in the fields in a form, while the user validates and corrects any machine errors to complete the form. |
| 3 | Partial OCR + Outsource Automation | Machine OCR fills in the fields, while the task of correcting any machine errors is outsourced for human processing in the background. User validates the overall result to complete the form. |
| 4 | Full OCR Automation | Machine OCR fills in the fields and validates the overall result with minimum user involvement. |
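As a rough illustration of how levels 2 to 4 differ in practice, the sketch below routes one OCR result to the user, to outsourced reviewers, or straight through. Everything here is hypothetical: the `OcrField` type, the per-field `confidence` score, the `0.9` threshold, and `route_receipt` are assumptions made for the example and are not part of Taggun's API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Route(Enum):
    USER_REVIEW = "user_review"            # user validates and corrects
    OUTSOURCE_REVIEW = "outsource_review"  # background human transcription
    AUTO_ACCEPT = "auto_accept"            # no human step needed


@dataclass
class OcrField:
    name: str
    value: str
    confidence: float  # hypothetical score between 0.0 and 1.0


def route_receipt(fields: List[OcrField],
                  automation_level: int,
                  min_confidence: float = 0.9) -> Tuple[Route, List[str]]:
    """Decide who handles a receipt next and which fields look doubtful."""
    if automation_level not in (2, 3, 4):
        raise ValueError("this sketch only covers levels 2, 3 and 4")

    doubtful = [f.name for f in fields if f.confidence < min_confidence]

    if automation_level == 4 and not doubtful:
        # Level 4: the machine fills in and validates the form itself.
        return Route.AUTO_ACCEPT, doubtful
    if automation_level == 3 and doubtful:
        # Level 3: doubtful fields are corrected by outsourced humans first;
        # the user only validates the overall result afterwards.
        return Route.OUTSOURCE_REVIEW, doubtful
    # Level 2, or any remaining case: the user checks the form.
    return Route.USER_REVIEW, doubtful


receipt = [
    OcrField("total", "23.50", 0.97),
    OcrField("date", "2019-03-01", 0.62),
]
route, doubtful = route_receipt(receipt, automation_level=3)
print(route.value, doubtful)
```

The same OCR output produces a different workflow at each level, which is the point of the table: the level describes how much of the correction work is left to the user, not how the OCR itself runs.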