Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.13087/364
Title: A Multilingual Handwritten Character Dataset: T-H-E Dataset
Authors: Bartos, Gaye Ediboğlu
Hoşcan, Yaşar
Kauer, Andras
Hajnal, Éva Nagyné
Keywords: public dataset
handwritten character dataset
offline character recognition
OCR
multilingual
Issue Date: 2020
Publisher: Budapest Tech
Abstract: The absence of handwritten special Latin character datasets prompted the creation of the T-H-E Dataset (Turkish-Hungarian-English handwritten character dataset), which contributes to the recognition of multilingual handwritten texts. This paper presents a public-domain dataset of handwritten Turkish, Hungarian and English characters collected from 200 participants. The T-H-E Dataset comprises 78 different letters, represented by 156,000 binary characters covering both uppercase and lowercase forms. The dataset can be downloaded from the web in six different versions, enabling users to combine the different alphabets for different recognition purposes. The dataset is evaluated by applying the same deep learning architecture to the T-H-E Dataset and the EMNIST dataset. The dataset is publicly available at https://github.com/bartosgaye/thedataset.
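The figures quoted in the abstract can be sanity-checked with a short sketch. The record does not say how the 78 letters are counted. The reading below is an assumption: it takes the union of the single-character letters of the three alphabets, in both cases, and leaves out Hungarian digraphs such as "cs" or "sz". The per-class and per-participant averages likewise assume that the samples are evenly distributed, which the abstract does not state.

```python
# Assumption: only single-character letters are counted; Hungarian
# digraphs and trigraphs (cs, dz, dzs, gy, ...) are left out.
ENGLISH_LOWER = "abcdefghijklmnopqrstuvwxyz"
TURKISH_LOWER = "abcçdefgğhıijklmnoöprsştuüvyz"
HUNGARIAN_LOWER = "aábcdeéfghiíjklmnoóöőpqrstuúüűvwxyz"

# Uppercase forms are listed explicitly because Turkish casing
# (ı/I and i/İ) does not round-trip through str.upper().
ENGLISH_UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
TURKISH_UPPER = "ABCÇDEFGĞHIİJKLMNOÖPRSŞTUÜVYZ"
HUNGARIAN_UPPER = "AÁBCDEÉFGHIÍJKLMNOÓÖŐPQRSTUÚÜŰVWXYZ"

# Union of the three alphabets, per case.
lower_letters = set(ENGLISH_LOWER) | set(TURKISH_LOWER) | set(HUNGARIAN_LOWER)
upper_letters = set(ENGLISH_UPPER) | set(TURKISH_UPPER) | set(HUNGARIAN_UPPER)
total_letters = len(lower_letters) + len(upper_letters)

# Figures taken from the abstract.
TOTAL_SAMPLES = 156_000
PARTICIPANTS = 200

# Averages under the (assumed) even distribution of samples.
samples_per_letter = TOTAL_SAMPLES // total_letters
samples_per_letter_per_participant = samples_per_letter // PARTICIPANTS

print(total_letters)
print(samples_per_letter)
print(samples_per_letter_per_participant)
```

Under these assumptions the union yields 39 lowercase and 39 uppercase letters, which matches the 78 letters reported, and the totals work out to an average of 2,000 samples per letter, or 10 per letter per participant.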
URI: https://hdl.handle.net/20.500.13087/364
ISSN: 1785-8860
Appears in Collections: WoS İndeksli Yayınlar Koleksiyonu (WoS Indexed Publications Collection)

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.