ICFHR2018 Competition on Automated Text Recognition on a READ Dataset

ICFHR2018 Competition on Automated Text Recognition on a READ Dataset. Strauß, T., Leifert, G., Labahn, R., Hodel, T., & Mühlberger, G. In 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pages 477–482, August 2018. Keywords: /unread, historical documents, Task analysis, Text recognition, Training, Computational modeling, Optical imaging, automated text recognition, Training data, Data models, fast adaptation, few shot learning

We summarize the results of a competition on Automated Text Recognition targeting the effective adaptation of recognition engines to essentially new data. The task consists in achieving a minimum character error rate on a previously unknown text corpus from which only a few pages are available for adjusting an already pre-trained recognition engine. This issue addresses a frequent application scenario where only a small amount of task-specific training data is available, because producing this data usually requires much effort. We present the results of five submissions. They show that the task is challenging, but that for certain documents 16 pages of transcription are sufficient to adapt a pre-trained recognition system.

@inproceedings{strauss2018a,
	title = {{ICFHR2018} {Competition} on {Automated} {Text} {Recognition} on a {READ} {Dataset}},
	shorttitle = {{ICFHR2018} {READ} {Dataset} {Automated} {Text} {Recognition} {Competition}},
	doi = {10.1109/ICFHR-2018.2018.00089},
	abstract = {We summarize the results of a competition on Automated Text Recognition targeting the effective adaptation of recognition engines to essentially new data. The task consists in achieving a minimum character error rate on a previously unknown text corpus from which only a few pages are available for adjusting an already pre-trained recognition engine. This issue addresses a frequent application scenario where only a small amount of task-specific training data is available, because producing this data usually requires much effort. We present the results of five submissions. They show that the task is challenging, but that for certain documents 16 pages of transcription are sufficient to adapt a pre-trained recognition system.},
	language = {en},
	booktitle = {2018 16th {International} {Conference} on {Frontiers} in {Handwriting} {Recognition} ({ICFHR})},
	author = {Strauß, Tobias and Leifert, Gundram and Labahn, Roger and Hodel, Tobias and Mühlberger, Günter},
	month = aug,
	year = {2018},
	note = {Tags: /unread, historical documents, Task analysis, Text recognition, Training, Computational modeling, Optical imaging, automated text recognition, Training data, Data models, fast adaptation, few shot learning},
	keywords = {/unread, Computational modeling, Data models, Optical imaging, Task analysis, Text recognition, Training, Training data, automated text recognition, fast adaptation, few shot learning, historical documents},
	pages = {477--482},
}

