Code

AWS Amazon Textract - Extract Text Forms Tables from Images and PDFs with ML

AWS Amazon Textract - Extract Text Forms Tables from Images and PDFs with ML

By
Cart 76 sales





Description

Amazon Textract uses a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract data from forms and tables. Today, many companies manually extract data from scanned documents such as PDFs, Images (PNG | JPEG), Tables, and Forms, or through simple OCR software that requires manual configuration (which often must be updated when the form changes). To overcome these manual and expensive processes, Textract uses ML to read and process any type of document, accurately extracting text, handwriting, tables, and other data with no manual effort. Amazon Textract can extract the data in minutes instead of hours or days.

In addition you can leverage Receipt Analyze feature that can find the vendor name on a receipt even if it’s only indicated within a logo on the page without an explicit label called “vendor”. It can also find and extract item, quantity, and prices that are not labeled with column headers for line items.

As part of the AWS Free Tier, you can get started with Amazon Textract for free. The Free Tier lasts for 3 months, and new AWS customers can analyze up to 1,000 pages per month using Text Tasks and up to 100 pages per month using the Form or Table Tasks.

Online Demo


Features of Amazon Textract

  1. Powered By Amazon Web Services
  2. Support for documents in 6 Languages (EN | FR | DE | IT | PT | ES)
  3. Support for documents in PDF | PNG | JPEG formats
  4. Support for handwritten documents in English language
  5. Identify Key Value Pairs Automatically
  6. Identify Table Values Automatically
  7. Receipt Analysis for Summary and Item data
  8. Support for documents in PNG | JPEG formats up to 10MB in size
  9. Support for documents in PDF format up to 500MB in size
  10. Support for documents in PDF format up to 3000 pages
  11. No Machine Learning expertise required
  12. Extract data quickly & accurately
  13. No code or templates to maintain
  14. Lower document processing costs
  15. Image Scanner
  16. PDF Scanner
  17. Fully Responsive Interface
  18. Closely Monitor Estimated Spending for Textract Services
  19. Powerful Admin Panel
  20. Developed with PHP 7.4.x and Laravel 8.4.x
  21. Detailed and Comprehensive Documentation


Cloud Vendor Textract Prices


Notes

Please note, for the script to work correctly, you need to have valid AWS Account.

Latest Changes

13.02.2022
     - Complete redesign with Laravel Framework
     - Powerful Transcribe User & Admin Panels

20.06.2020 - v1.0.1
     - Update: Documentation 
     - Fix: Lambda function minor fix

08.05.2020 - v1.0.0
     - Initial Release



by
by
by
by
by
by