

OCR Model

$5/hr Starting at $25

Objective: The client had 1 million files containing chemical ingredient information and wanted to retrieve the ingredient names and their values from those files, which were stored in an S3 bucket.

Analytical Approach: We pulled the data from the S3 bucket onto an EC2 instance and ran OCR (Optical Character Recognition) to extract the content of each file. We then developed a model to fetch key-value pairs from the extracted content. For OCR we used ImageMagick, Ghostscript, and pytesseract, among other libraries; for S3 connectivity we used the Boto3 package. We also implemented multiprocessing to reduce OCR execution time.

Tools Used: Python, Jupyter Notebook, Excel, etc.
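To give a feel for the approach described above, here is a minimal sketch of such a pipeline, not the delivered code. It assumes the files are already images that pytesseract can read (the PDF-to-image step with Ghostscript/ImageMagick is left out), and that each ingredient appears on its own line as "name: value unit". The function names, the regular expression, the list of units, and the worker count are all hypothetical choices made for illustration; a simple regex stands in here for the key-value extraction model.

```python
import re
from multiprocessing import Pool

# Hypothetical line format: "Sodium chloride: 12.5 %" or "Water 80".
# The real documents may need a different (or learned) extraction rule.
LINE_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z][\w ()\-]*?)\s*:?\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>%|mg|g|ml)?\s*$"
)


def extract_pairs(text):
    """Return {ingredient name: (value, unit or None)} from OCR text."""
    pairs = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            pairs[match["name"]] = (float(match["value"]), match["unit"])
    return pairs


def download(bucket, key, dest):
    """Copy one object from S3 to a local path (bucket/key are placeholders)."""
    import boto3  # imported lazily so the parser works without AWS libraries

    boto3.client("s3").download_file(bucket, key, dest)


def ocr_file(path):
    """Run Tesseract OCR on one image file and return the recognized text."""
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(path))


def process_file(path):
    """OCR a single file and extract its key-value pairs."""
    return path, extract_pairs(ocr_file(path))


def run(paths, workers=4):
    """Spread OCR over several processes; OCR is CPU-bound, so this helps."""
    with Pool(workers) as pool:
        return dict(pool.imap_unordered(process_file, paths))


if __name__ == "__main__":
    sample = "Batch report\nSodium chloride: 12.5 %\nWater 80\n"
    print(extract_pairs(sample))
```

Running the script directly only exercises the parsing step on a small sample string. Calling `run()` on a list of local image paths would additionally require Tesseract, pytesseract, and Pillow to be installed.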

About

$5/hr Ongoing


Skills & Expertise

ImageMagick, Microsoft Excel, Optical, Optical Character Recognition (OCR), Python

0 Reviews

This Freelancer has not received any feedback.