Optimization of Image Thresholding in Optical Character Recognition for a Document Digitization and Search System

Published: Mar 21, 2020

DOI: https://doi.org/10.33322/petir.v13i1.659

Keywords:

OCR, Image Processing, Document Digitalization

Ridwan Rismanto

Politeknik Negeri Malang

Arief Prasetyo

Dyah Ayu Irawati

Abstract

Administrative activity in an institution is largely carried out using paper-based mail and documents. Managing and archiving these documents therefore takes considerable effort, including providing storage space and maintaining a categorization system. Digitizing documents by scanning them into digital images is one solution that reduces the effort of archiving and categorizing. It also enables a search feature based on metadata that is entered manually during the digitization process. The metadata can contain the document's title, summary, or category. The need to enter this metadata manually can be removed by using Optical Character Recognition (OCR), which converts the text in a document image into machine-readable text stored in the database system. This research focuses on implementing an OCR system to extract text from scanned document images and on optimizing one pre-processing stage, image thresholding. The aim of the optimization is to increase OCR accuracy by tuning the threshold value over a given set of candidate values; the best threshold value found was 0.6. Experiments were performed by extracting text from several scanned documents and achieved an accuracy rate of 92.568%.
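The tuning procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes grayscale pixel values normalized to [0, 1] (suggested by the reported threshold of 0.6), a character-level accuracy based on edit distance (the abstract does not state how accuracy was measured), and a hypothetical `ocr` callable standing in for whichever OCR engine is used. All function names here are illustrative.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]    # grayscale intensities, assumed normalized to [0, 1]
Binary = List[List[int]]     # 0 = black, 1 = white


def binarize(image: Image, threshold: float) -> Binary:
    """Global thresholding: pixels brighter than the threshold become white."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def char_accuracy(recognized: str, reference: str) -> float:
    """Assumed metric: 1 - edit_distance / len(reference), clipped at 0."""
    if not reference:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1.0 - edit_distance(recognized, reference) / len(reference))


def best_threshold(image: Image,
                   reference: str,
                   ocr: Callable[[Binary], str],
                   candidates: Sequence[float]) -> Tuple[float, float]:
    """Try each candidate threshold and return (threshold, accuracy) of the best.

    `ocr` is a hypothetical stand-in for a real OCR engine. On ties, the
    first candidate that reaches the maximum accuracy is returned.
    """
    scored = [(t, char_accuracy(ocr(binarize(image, t)), reference))
              for t in candidates]
    return max(scored, key=lambda pair: pair[1])
```

In practice, `ocr` would wrap a real engine and the accuracy would be averaged over several scanned documents with known reference text, rather than computed on a single image.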

How to Cite

Rismanto, R., Prasetyo, A., & Irawati, D. A. (2020). Optimalisasi Image Thresholding pada Optical Character Recognition Pada Sistem Digitalisasi dan Pencarian Dokumen. PETIR, 13(1), 1–11. https://doi.org/10.33322/petir.v13i1.659

Issue

Vol. 13 No. 1 (2020): PETIR (Jurnal Pengkajian Dan Penerapan Teknik Informatika)

Section

Articles
