ICFHR2018 Competition on Automated Text Recognition on a READ Dataset. Strauß, T., Leifert, G., Labahn, R., Hodel, T., & Mühlberger, G. In 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pages 477–482, August, 2018.
We summarize the results of a competition on Automated Text Recognition targeting the effective adaptation of recognition engines to essentially new data. The task consists in achieving a minimum character error rate on a previously unknown text corpus from which only a few pages are available for adjusting an already pre-trained recognition engine. This issue addresses a frequent application scenario where only a small amount of task-specific training data is available, because producing this data usually requires much effort. We present the results of five submissions. They show that the task is a challenging issue, but for certain documents 16 pages of transcription are sufficient to adapt a pre-trained recognition system.
@inproceedings{strauss2018a,
	title = {{ICFHR2018} {Competition} on {Automated} {Text} {Recognition} on a {READ} {Dataset}},
	shorttitle = {{ICFHR2018} {Competition} on {Automated} {Text} {Recognition}},
	doi = {10.1109/ICFHR-2018.2018.00089},
	abstract = {We summarize the results of a competition on Automated Text Recognition targeting the effective adaptation of recognition engines to essentially new data. The task consists in achieving a minimum character error rate on a previously unknown text corpus from which only a few pages are available for adjusting an already pre-trained recognition engine. This issue addresses a frequent application scenario where only a small amount of task-specific training data is available, because producing this data usually requires much effort. We present the results of five submissions. They show that the task is a challenging issue, but for certain documents 16 pages of transcription are sufficient to adapt a pre-trained recognition system.},
	language = {en},
	booktitle = {2018 16th {International} {Conference} on {Frontiers} in {Handwriting} {Recognition} ({ICFHR})},
	author = {Strauß, Tobias and Leifert, Gundram and Labahn, Roger and Hodel, Tobias and Mühlberger, Günter},
	month = aug,
	year = {2018},
	keywords = {Computational modeling, Data models, Optical imaging, Task analysis, Text recognition, Training, Training data, automated text recognition, fast adaptation, few shot learning, historical documents},
	pages = {477--482},
}
