Optimal Training Dataset Preparation for AI-Supported Multilanguage Real-Time OCRs Using Visual Methods
Published: 2023-12-08
Volume: 13
Issue: 24
Page: 13107
ISSN: 2076-3417
Container-title: Applied Sciences
Short-container-title: Applied Sciences
Language: en
Author:
Biró Attila (1, 2, 3), Szilágyi Sándor Miklós (1), Szilágyi László (4, 5)
Affiliation:
1. Department of Electrical Engineering and Information Technology, George Emil Palade University of Medicine, Pharmacy, Science, and Technology of Targu Mures, Str. Nicolae Iorga, Nr. 1, 540088 Targu Mures, Romania
2. Department of Physiotherapy, University of Malaga, 29071 Malaga, Spain
3. Biomedical Research Institute of Malaga (IBIMA), 29590 Malaga, Spain
4. Physiological Controls Research Center, Óbuda University, Bécsi út 96/B, 1034 Budapest, Hungary
5. Computational Intelligence Research Group, Sapientia Hungarian University of Transylvania, 540485 Targu Mures, Romania
Abstract
In the realm of multilingual, AI-powered, real-time optical character recognition systems, this research explores the creation of an optimal, vocabulary-based training dataset. This comprehensive endeavor seeks to encompass a range of criteria: comprehensive language representation, high-quality and diverse data, balanced datasets, contextual understanding, domain-specific adaptation, robustness and noise tolerance, and scalability and extensibility. The approach aims to leverage techniques like convolutional neural networks, recurrent neural networks, convolutional recurrent neural networks, and single visual models for scene text recognition. While focusing on English, Hungarian, and Japanese as representative languages, the proposed methodology can be extended to any existing or even synthesized languages. The development of accurate, efficient, and versatile OCR systems is at the core of this research, offering societal benefits by bridging global communication gaps, ensuring reliability in diverse environments, and demonstrating the adaptability of AI to evolving needs. This work not only mirrors the state of the art in the field but also paves new paths for future innovation, accentuating the importance of sustained research in advancing AI’s potential to shape societal development.
Funder
ITware, Hungary
University of Malaga
Department of Electrical Engineering and Information Technology of George Emil Palade University of Medicine, Pharmacy, Science, and Technology of Targu Mures
Consolidator Excellence Researcher Program of Óbuda University, Budapest, Hungary
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science