• In Document Analysis and Recognition – ICDAR 2021
  • Publisher: Springer International Publishing
  • Pages: 507-522

Abstract

Several approaches to automatic handwritten document analysis exist today. Handwritten text recognition (HTR) systems in particular achieve convincing results in layout analysis and text recognition, as well as in more recent tasks such as named-entity recognition, script identification, and manuscript dating. These systems are trained and evaluated on large open and specialized databases. Manual annotation and proofreading of handwritten documents is a key step in training such systems. However, it is a time-consuming task, especially when the formats required by the systems vary considerably or when the interfaces do not handle several levels of information. We propose a new modular, collaborative, ready-to-use online interface for multilevel annotation and quick viewing of handwritten and printed documents, including documents in right-to-left languages. The interface supports the creation of customized projects and the management, conversion, and export of data in the various state-of-the-art formats and standards. It includes automated tasks for layout analysis and text-line extraction, with high-level fine-tuning capabilities. We present this new interface through a case study: the creation of a database for Armenian, an under-resourced language with specific paleographical issues.
