1 receipt transcription engine. 2 OCR Providers.

Receipt Transcription Engine

Strictly speaking, Taggun's receipt transcription engine is more of a Natural Language Processing (NLP) engine than an OCR engine. Taggun uses OCR providers such as Google Vision to perform the image-to-text step. This lets Taggun rely on the speed and accuracy of an external OCR provider to produce raw text from an image. See example below:

Sample receipt

Raw Text output from OCR Provider
NYONYA
199 GRAND ST 
NEW YORK, NY 10013
Tel 212-334-6701
2/3/2017 8:23 PM
1167
DINE IN
Yoke. TBL: B7 GST: 1
2 Roti Cana 7.90
1 Spinach 9.95
Fu Yee
1 Beef Rendang 14.50
1 Cheng Lai stringray 19.95
1 Crispy Chk. Salad 14.95
2 Tiger Beer 10.50
White Rice 6.00
Sub total 
Tax 7.43 
Total 91.18

The raw text output is great, but on its own it is of little use for software integration, because software cannot consume unstructured text as usable data. Taggun is laser-focused on building the best receipt transcription engine to process this raw text into machine-consumable output in JSON format. This allows software in the expense management and digital loyalty space to integrate with Taggun easily. See example below:

JSON Output from Taggun

{
  "totalAmount": {
    "data": 91.18,
    "confidenceLevel": 0.9
  },
  "taxAmount": {
    "data": 7.43,
    "confidenceLevel": 0.9
  },
  "date": {
    "data": "2017-03-02T12:00:00.000Z",
    "confidenceLevel": 0.5324657925862506
  },
  // additional data...
}
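
On the consuming side, the confidenceLevel attached to each field makes it possible to decide which values to accept automatically and which to send for review. The following is a minimal sketch, assuming the response has already been parsed into a Python dict with the field shape shown above. The 0.8 threshold and the helper name split_by_confidence are illustrative assumptions, not part of Taggun's API.

```python
# Fields copied from the sample output above; the elided "additional data" is omitted.
sample_response = {
    "totalAmount": {"data": 91.18, "confidenceLevel": 0.9},
    "taxAmount": {"data": 7.43, "confidenceLevel": 0.9},
    "date": {"data": "2017-03-02T12:00:00.000Z", "confidenceLevel": 0.5324657925862506},
}


def split_by_confidence(response, threshold=0.8):
    """Return (accepted, needs_review), each mapping field name -> data.

    The threshold is an assumed cut-off chosen for illustration only.
    """
    accepted, needs_review = {}, {}
    for name, field in response.items():
        confident = field.get("confidenceLevel", 0.0) >= threshold
        target = accepted if confident else needs_review
        target[name] = field.get("data")
    return accepted, needs_review


accepted, needs_review = split_by_confidence(sample_response)
print("accepted:", accepted)
print("needs review:", needs_review)
```

With the sample above, the total and tax pass the assumed threshold, while the date falls below it and would be flagged for a human to check.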

Great News! +1 OCR Provider

In the past, I received feedback that it is a business continuity risk for Taggun's receipt scanning API to rely on a single provider, Google Vision. Great news: Taggun has now successfully integrated with Microsoft Cognitive Computer Vision, so we no longer depend on a single OCR provider. Customers can now choose between Google Vision and Microsoft Cognitive as their OCR provider.
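
To illustrate why a second provider reduces continuity risk, here is a minimal sketch of a fallback pattern: try the preferred provider first and move on to the next one if it fails. The provider functions are hypothetical stand-ins that do not call the real Google Vision or Microsoft APIs, and this is not necessarily how Taggun routes requests internally.

```python
from typing import Callable, Sequence, Tuple

# An OCR provider is modelled as a function: image bytes -> raw text.
OcrProvider = Callable[[bytes], str]


def ocr_with_fallback(
    image: bytes, providers: Sequence[Tuple[str, OcrProvider]]
) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, raw_text) from the first success."""
    errors = []
    for name, provider in providers:
        try:
            return name, provider(image)
        except Exception as exc:  # a real integration would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all OCR providers failed: " + "; ".join(errors))


# Hypothetical stand-ins for real provider clients.
def fake_google_vision(image: bytes) -> str:
    raise ConnectionError("provider unavailable")


def fake_microsoft_cognitive(image: bytes) -> str:
    return "NYONYA\nTotal 91.18"


name, text = ocr_with_fallback(
    b"<image bytes>",
    [("google", fake_google_vision), ("microsoft", fake_microsoft_cognitive)],
)
print(name)
```

Because the downstream transcription step only needs raw text, the caller does not have to care which provider produced it.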

We also compared the accuracy of the two OCR providers by scanning 285 receipts. Google Vision produced a slightly better result at 83.45%, with Microsoft Cognitive not far behind at 81.42%. That the two scores are so close is a testament to the robustness of Taggun's receipt transcription engine, which can take raw text from either OCR provider and still produce a high-quality result.
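
The post does not describe how these accuracy figures were scored. As one possible approach, the sketch below assumes field-level exact matching against hand-labelled values: each expected field counts as correct when the extracted value equals it. The function name and the tiny data set are made up for illustration and are not the 285-receipt benchmark.

```python
def field_accuracy(predictions, expected):
    """Fraction of expected fields whose predicted value matches exactly."""
    total = 0
    correct = 0
    for predicted, truth in zip(predictions, expected):
        for field, value in truth.items():
            total += 1
            if predicted.get(field) == value:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical hand-labelled receipts.
expected = [
    {"totalAmount": 91.18, "taxAmount": 7.43},
    {"totalAmount": 12.00, "taxAmount": 1.00},
]

# Hypothetical output of the pipeline when fed by one OCR provider.
provider_a = [
    {"totalAmount": 91.18, "taxAmount": 7.43},
    {"totalAmount": 12.00, "taxAmount": 1.10},  # tax misread
]

print(f"{field_accuracy(provider_a, expected):.2%}")
```

Running the same function over each provider's output on the same labelled set would give directly comparable percentages.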

Data Sovereignty Law

An additional benefit of integrating with Microsoft Cognitive Services is that it lets us select the location where the API instance is hosted. This allows Taggun to offer a solution that complies with data sovereignty laws, without sending or storing the data outside the country.