Open Access Dataset

Free Receipt OCR Dataset

Humans in the Loop is excited to publish a new open access dataset for text processing on receipts. The segmentation was done manually by 12 Humans in the Loop trainees in the Democratic Republic of the Congo as part of their training, using the Express Expense Sample receipt Dataset. The dataset consists of 192 images with a total of 3,839 bounding boxes, each labeled with one of the 10 classes listed below.

This receipts dataset is dedicated to the public domain by Humans in the Loop under the CC0 1.0 license.

Dataset size

The dataset includes 192 images.

Classes

  1. Business name
  2. Business address
  3. Business phone
  4. Business other information
  5. Time and date
  6. Item information
  7. Subtotal
  8. Tax
  9. Total
  10. Other
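
For readers who want to work with these labels in code, the class list above can be written as a simple lookup table. This is only a minimal sketch: the 1-based IDs follow the order of the list and are an assumption, since the post does not describe the annotation file format or how classes are encoded. The names `RECEIPT_CLASSES`, `CLASS_TO_ID`, `ID_TO_CLASS`, and `mean_boxes_per_image` are hypothetical helpers, not part of the dataset.

```python
# Class names copied from the list above. The 1-based IDs simply mirror the
# list order; this is an assumption, not the dataset's documented encoding.
RECEIPT_CLASSES = [
    "Business name",
    "Business address",
    "Business phone",
    "Business other information",
    "Time and date",
    "Item information",
    "Subtotal",
    "Tax",
    "Total",
    "Other",
]

# Hypothetical lookup tables in both directions.
CLASS_TO_ID = {name: i for i, name in enumerate(RECEIPT_CLASSES, start=1)}
ID_TO_CLASS = {i: name for name, i in CLASS_TO_ID.items()}

# Figures stated in the post.
NUM_IMAGES = 192
NUM_BOXES = 3839


def mean_boxes_per_image(num_boxes: int = NUM_BOXES, num_images: int = NUM_IMAGES) -> float:
    """Average number of bounding boxes per receipt image."""
    return num_boxes / num_images


if __name__ == "__main__":
    print(CLASS_TO_ID["Total"])
    print(f"{mean_boxes_per_image():.1f} boxes per image on average")
```

With the figures stated above, this works out to roughly 20 annotated boxes per receipt.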

Access the dataset by filling in the form below