Bibliography generated by bibbase.org from the Zotero collection feed https://api.zotero.org/groups/2386895/collections/7PPRTB2H/items?format=bibtex&limit=100, grouped by author.
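
The listing below is rendered from that single BibTeX feed served by the Zotero API. As a minimal sketch of how the same feed can be fetched directly (assuming Node 18+ or any runtime with a global fetch; the URL and its limit=100 parameter are taken from the page source, and Zotero caps each request at 100 items, so larger collections would need paginated requests with a start offset):

// Fetch the raw BibTeX that the entries below are rendered from.
const feed =
  "https://api.zotero.org/groups/2386895/collections/7PPRTB2H/items?format=bibtex&limit=100";

fetch(feed)
  .then((res) => {
    // For format=bibtex the Zotero API answers with plain-text BibTeX.
    if (!res.ok) throw new Error("Zotero API returned HTTP " + res.status);
    return res.text();
  })
  .then((bibtex) => console.log(bibtex))
  .catch((err) => console.error(err));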

Aguilar, S. (1)

A Named Entity Recognition Model for Medieval Latin Charters. Chastang, P.; Aguilar, S. T.; and Tannier, X. Digital Humanities Quarterly, 15(4). 2021.

@article{chastang_named_2021,
    title = {A {Named} {Entity} {Recognition} {Model} for {Medieval} {Latin} {Charters}},
    volume = {15},
    issn = {1938-4122},
    url = {http://www.digitalhumanities.org/dhq/vol/15/4/000574/000574.html},
    number = {4},
    journal = {Digital Humanities Quarterly},
    author = {Chastang, Pierre and Aguilar, Sergio Torres and Tannier, Xavier},
    year = {2021},
}

Akhtar, N. (2)

A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023a,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

Almazan, J. (1)

Word Spotting and Recognition with Embedded Attributes. Almazan, J.; Gordo, A.; Fornes, A.; and Valveny, E. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(12): 2552–2566. December 2014.

@article{almazanWordSpottingRecognition2014,
    title = {Word {Spotting} and {Recognition} with {Embedded} {Attributes}},
    volume = {36},
    issn = {0162-8828, 2160-9292},
    url = {http://ieeexplore.ieee.org/document/6857995/},
    doi = {10.1109/TPAMI.2014.2339814},
    number = {12},
    urldate = {2023-11-17},
    journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
    author = {Almazan, Jon and Gordo, Albert and Fornes, Alicia and Valveny, Ernest},
    month = dec,
    year = {2014},
    pages = {2552--2566},
}

Amodei, D. (1)

Language Models are Unsupervised Multitask Learners. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; and Sutskever, I. 2019.

@inproceedings{radford_language_2019,
    title = {Language {Models} are {Unsupervised} {Multitask} {Learners}},
    url = {https://www.semanticscholar.org/paper/Language-Models-are-Unsupervised-Multitask-Learners-Radford-Wu/9405cc0d6169988371b2755e573cc28650d14dfe},
    abstract = {Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.},
    urldate = {2023-02-02},
    author = {Radford, Alec and Wu, Jeff and Child, Rewon and Luan, D. and Amodei, Dario and Sutskever, Ilya},
    year = {2019},
}

Androutsopoulos, I. (1)

Restoring and attributing ancient texts using deep neural networks. Assael, Y.; Sommerschield, T.; Shillingford, B.; Bordbar, M.; Pavlopoulos, J.; Chatzipanagiotou, M.; Androutsopoulos, I.; Prag, J.; and de Freitas, N. Nature, 603(7900): 280–283. March 2022. Number: 7900. Publisher: Nature Publishing Group.

@article{assael_restoring_2022,
    title = {Restoring and attributing ancient texts using deep neural networks},
    volume = {603},
    copyright = {2022 The Author(s)},
    issn = {1476-4687},
    url = {https://www.nature.com/articles/s41586-022-04448-z/},
    doi = {10.1038/s41586-022-04448-z},
    abstract = {Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62\% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25\% to 72\%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71\% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.},
    language = {en},
    number = {7900},
    urldate = {2022-09-28},
    journal = {Nature},
    author = {Assael, Yannis and Sommerschield, Thea and Shillingford, Brendan and Bordbar, Mahyar and Pavlopoulos, John and Chatzipanagiotou, Marita and Androutsopoulos, Ion and Prag, Jonathan and de Freitas, Nando},
    month = mar,
    year = {2022},
    note = {Number: 7900, Publisher: Nature Publishing Group},
    keywords = {Archaeology, Computer science, History},
    pages = {280--283},
}

Anwar, S. (2)

A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023a,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

Assael, Y. (1)

Restoring and attributing ancient texts using deep neural networks. Assael, Y.; Sommerschield, T.; Shillingford, B.; Bordbar, M.; Pavlopoulos, J.; Chatzipanagiotou, M.; Androutsopoulos, I.; Prag, J.; and de Freitas, N. Nature, 603(7900): 280–283. March 2022. Number: 7900. Publisher: Nature Publishing Group.

@article{assael_restoring_2022,
    title = {Restoring and attributing ancient texts using deep neural networks},
    volume = {603},
    copyright = {2022 The Author(s)},
    issn = {1476-4687},
    url = {https://www.nature.com/articles/s41586-022-04448-z/},
    doi = {10.1038/s41586-022-04448-z},
    abstract = {Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62\% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25\% to 72\%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71\% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.},
    language = {en},
    number = {7900},
    urldate = {2022-09-28},
    journal = {Nature},
    author = {Assael, Yannis and Sommerschield, Thea and Shillingford, Brendan and Bordbar, Mahyar and Pavlopoulos, John and Chatzipanagiotou, Marita and Androutsopoulos, Ion and Prag, Jonathan and de Freitas, Nando},
    month = mar,
    year = {2022},
    note = {Number: 7900, Publisher: Nature Publishing Group},
    keywords = {Archaeology, Computer science, History},
    pages = {280--283},
}

Aubry, M. (1)

Unsupervised Layered Image Decomposition into Object Prototypes. Monnier, T.; Vincent, E.; Ponce, J.; and Aubry, M. August 2021. arXiv:2104.14575 [cs].

@misc{monnier_unsupervised_2021,
    title = {Unsupervised {Layered} {Image} {Decomposition} into {Object} {Prototypes}},
    url = {http://arxiv.org/abs/2104.14575},
    doi = {10.48550/arXiv.2104.14575},
    abstract = {We present an unsupervised learning framework for decomposing images into layers of automatically discovered object models. Contrary to recent approaches that model image layers with autoencoder networks, we represent them as explicit transformations of a small set of prototypical images. Our model has three main components: (i) a set of object prototypes in the form of learnable images with a transparency channel, which we refer to as sprites; (ii) differentiable parametric functions predicting occlusions and transformation parameters necessary to instantiate the sprites in a given image; (iii) a layered image formation model with occlusion for compositing these instances into complete images including background. By jointly learning the sprites and occlusion/transformation predictors to reconstruct images, our approach not only yields accurate layered image decompositions, but also identifies object categories and instance parameters. We first validate our approach by providing results on par with the state of the art on standard multi-object synthetic benchmarks (Tetrominoes, Multi-dSprites, CLEVR6). We then demonstrate the applicability of our model to real images in tasks that include clustering (SVHN, GTSRB), cosegmentation (Weizmann Horse) and object discovery from unfiltered social network images. To the best of our knowledge, our approach is the first layered image decomposition algorithm that learns an explicit and shared concept of object type, and is robust enough to be applied to real images.},
    urldate = {2022-09-30},
    publisher = {arXiv},
    author = {Monnier, Tom and Vincent, Elliot and Ponce, Jean and Aubry, Mathieu},
    month = aug,
    year = {2021},
    note = {arXiv:2104.14575 [cs]},
    keywords = {Computer Science - Computer Vision and Pattern Recognition},
}

Barnes, N. (2)

A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023a,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

Barnes, P. (1)

Model Cards for Model Reporting. Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T. Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. January 2019. arXiv:1810.03993.

@article{mitchell_model_2019,
    title = {Model {Cards} for {Model} {Reporting}},
    url = {http://arxiv.org/abs/1810.03993},
    doi = {10.1145/3287560.3287596},
    abstract = {Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.},
    urldate = {2022-01-24},
    journal = {Proceedings of the Conference on Fairness, Accountability, and Transparency},
    author = {Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit},
    month = jan,
    year = {2019},
    note = {arXiv: 1810.03993},
    keywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},
    pages = {220--229},
}

Barrere, K. (1)

A Light Transformer-Based Architecture for Handwritten Text Recognition. Barrere, K.; Soullard, Y.; Lemaitre, A.; and Coüasnon, B. In Uchida, S.; Barney, E.; and Eglin, V., editors, Document Analysis Systems, Lecture Notes in Computer Science, pages 275–290, Cham, 2022. Springer International Publishing.

@inproceedings{barrere_light_2022,
    address = {Cham},
    series = {Lecture {Notes} in {Computer} {Science}},
    title = {A {Light} {Transformer}-{Based} {Architecture} for {Handwritten} {Text} {Recognition}},
    isbn = {978-3-031-06555-2},
    doi = {10.1007/978-3-031-06555-2_19},
    abstract = {Transformer models have been showing ground-breaking results in the domain of natural language processing. More recently, they started to gain interest in many others fields as in computer vision. Traditional Transformer models typically require a significant amount of training data to achieve satisfactory results. However, in the domain of handwritten text recognition, annotated data acquisition remains costly resulting in small datasets compared to those commonly used to train a Transformer-based model. Hence, training Transformer models able to transcribe handwritten text from images remains challenging. We propose a light encoder-decoder Transformer-based architecture for handwriting text recognition, containing a small number of parameters compared to traditional Transformer architectures. We trained our architecture using a hybrid loss, combining the well-known connectionist temporal classification with the cross-entropy. Experiments are conducted on the well-known IAM dataset with and without the use of additional synthetic data. We show that our network reaches state-of-the-art results in both cases, compared with other larger Transformer-based models.},
    language = {en},
    booktitle = {Document {Analysis} {Systems}},
    publisher = {Springer International Publishing},
    author = {Barrere, Killian and Soullard, Yann and Lemaitre, Aurélie and Coüasnon, Bertrand},
    editor = {Uchida, Seiichi and Barney, Elisa and Eglin, Véronique},
    year = {2022},
    keywords = {Handwritten text recognition, Hybrid loss, Light network, Neural networks, Transformer},
    pages = {275--290},
}

Boente, W. (1)

The Adaptability of a Transformer-Based OCR Model for Historical Documents. Ströbel, P. B.; Hodel, T.; Boente, W.; and Volk, M. In Coustaty, M.; and Fornés, A., editors, Document Analysis and Recognition – ICDAR 2023 Workshops, volume 14193, pages 34–48. Springer Nature Switzerland, Cham, 2023. Series Title: Lecture Notes in Computer Science.

@incollection{coustaty_adaptability_2023,
    address = {Cham},
    title = {The {Adaptability} of a {Transformer}-{Based} {OCR} {Model} for {Historical} {Documents}},
    volume = {14193},
    isbn = {978-3-031-41497-8 978-3-031-41498-5},
    url = {https://link.springer.com/10.1007/978-3-031-41498-5_3},
    language = {en},
    urldate = {2023-10-17},
    booktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2023 {Workshops}},
    publisher = {Springer Nature Switzerland},
    author = {Ströbel, Phillip Benjamin and Hodel, Tobias and Boente, Walter and Volk, Martin},
    editor = {Coustaty, Mickael and Fornés, Alicia},
    year = {2023},
    doi = {10.1007/978-3-031-41498-5_3},
    note = {Series Title: Lecture Notes in Computer Science},
    pages = {34--48},
}

Bongard, J. (1)

How the Body Shapes the Way We Think: A New View of Intelligence. Pfeifer, R.; and Bongard, J. 2007.

@book{pfeifer_how_2007,
    title = {How the {Body} {Shapes} the {Way} {We} {Think}: {A} {New} {View} of {Intelligence}},
    isbn = {978-0-262-16239-5},
    abstract = {On Embodiment in AI-development},
    author = {Pfeifer, Rolf and Bongard, Josh},
    year = {2007},
}

Bordbar, M. (1)

Restoring and attributing ancient texts using deep neural networks. Assael, Y.; Sommerschield, T.; Shillingford, B.; Bordbar, M.; Pavlopoulos, J.; Chatzipanagiotou, M.; Androutsopoulos, I.; Prag, J.; and de Freitas, N. Nature, 603(7900): 280–283. March 2022. Number: 7900. Publisher: Nature Publishing Group.

@article{assael_restoring_2022,
    title = {Restoring and attributing ancient texts using deep neural networks},
    volume = {603},
    copyright = {2022 The Author(s)},
    issn = {1476-4687},
    url = {https://www.nature.com/articles/s41586-022-04448-z/},
    doi = {10.1038/s41586-022-04448-z},
    abstract = {Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62\% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25\% to 72\%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71\% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.},
    language = {en},
    number = {7900},
    urldate = {2022-09-28},
    journal = {Nature},
    author = {Assael, Yannis and Sommerschield, Thea and Shillingford, Brendan and Bordbar, Mahyar and Pavlopoulos, John and Chatzipanagiotou, Marita and Androutsopoulos, Ion and Prag, Jonathan and de Freitas, Nando},
    month = mar,
    year = {2022},
    note = {Number: 7900, Publisher: Nature Publishing Group},
    keywords = {Archaeology, Computer science, History},
    pages = {280--283},
}
Bowen, J. (1)

Turing’s Genius – Defining an apt microcosm.
Bowen, J.; Trickett, T.; Green, J. B. A.; and Lomas, A.
In July 2018. BCS Learning & Development.

@inproceedings{bowen_turings_2018,
  title = {Turing’s {Genius} – {Defining} an apt microcosm},
  url = {https://www.scienceopen.com/hosted-document?doi=10.14236/ewic/EVA2018.31},
  doi = {10.14236/ewic/EVA2018.31},
  abstract = {Alan Turing (1912–1954) is widely acknowledged as a genius. As well as codebreaking during World War II and taking a pioneering role in computer hardware design and software after the War, he also wrote three important foundational papers in the fields of theoretical computer science, artificial intelligence, and mathematical biology. He has been called the father of computer science, but he also admired by mathematicians, philosophers, and perhaps more surprisingly biologists, for his wide-ranging ideas. His influence stretches from scientific to cultural and even political impact. For all these reasons, he was a true polymath. This paper considers the genius of Turing from various angles, both scientific and artistic. The four authors provide position statements on how Turing has influenced and inspired their work, together with short biographies, as a starting point for a panel session and visual music performance.},
  urldate = {2023-09-27},
  publisher = {BCS Learning \& Development},
  author = {Bowen, Jonathan and Trickett, Terry and Green, Jeremy B. A. and Lomas, Andy},
  month = jul,
  year = {2018},
}

Brée, S. (1)

Recognition and Information Extraction in Historical Handwritten Tables: Toward Understanding Early 20th Century Paris Census.
Constum, T.; Kempf, N.; Paquet, T.; Tranouez, P.; Chatelain, C.; Brée, S.; and Merveille, F.
In Uchida, S.; Barney, E.; and Eglin, V., editor(s), Document Analysis Systems, pages 143–157, Cham, 2022. Springer International Publishing.

@inproceedings{constumRecognitionInformationExtraction2022,
  address = {Cham},
  title = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}: {Toward} {Understanding} {Early} 20th {Century} {Paris} {Census}},
  isbn = {978-3-031-06555-2},
  shorttitle = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}},
  doi = {10.1007/978-3-031-06555-2_10},
  abstract = {We aim to build a vast database (up to 9 million individuals) from the handwritten tabular nominal census of Paris of 1926, 1931 and 1936, each composed of about 100,000 handwritten simple pages in a tabular format. We created a complete pipeline that goes from the scan of double pages to text prediction while minimizing the need for segmentation labels. We describe how weighted finite state transducers, writer specialization and self-training further improved our results. We also introduce through this communication two annotated datasets for handwriting recognition that are now publicly available, and an open-source toolkit to apply WFST on CTC lattices.},
  language = {en},
  booktitle = {Document {Analysis} {Systems}},
  publisher = {Springer International Publishing},
  author = {Constum, Thomas and Kempf, Nicolas and Paquet, Thierry and Tranouez, Pierrick and Chatelain, Clément and Brée, Sandra and Merveille, François},
  editor = {Uchida, Seiichi and Barney, Elisa and Eglin, Véronique},
  year = {2022},
  keywords = {Document layout analysis, Handwriting recognition, Self-training, Semi-supervised learning, Table analysis, WFST, handwritten text recognition, table recognition},
  pages = {143--157},
}

Camps, J. (2)

Handling Heavily Abbreviated Manuscripts: HTR Engines vs Text Normalisation Approaches.
Camps, J.; Vidal-Gorène, C.; and Vernet, M.
In Barney Smith, E. H.; and Pal, U., editor(s), Document Analysis and Recognition – ICDAR 2021 Workshops, of Lecture Notes in Computer Science, pages 306–316, Cham, 2021. Springer International Publishing.

@inproceedings{campsHandlingHeavilyAbbreviated2021a,
  address = {Cham},
  series = {Lecture {Notes} in {Computer} {Science}},
  title = {Handling {Heavily} {Abbreviated} {Manuscripts}: {HTR} {Engines} vs {Text} {Normalisation} {Approaches}},
  isbn = {978-3-030-86159-9},
  shorttitle = {Handling {Heavily} {Abbreviated} {Manuscripts}},
  doi = {10.1007/978-3-030-86159-9_21},
  abstract = {Although abbreviations are fairly common in handwritten sources, particularly in medieval and modern Western manuscripts, previous research dealing with computational approaches to their expansion is scarce. Yet abbreviations present particular challenges to computational approaches such as handwritten text recognition and natural language processing tasks. Often, pre-processing ultimately aims to lead from a digitised image of the source to a normalised text, which includes expansion of the abbreviations. We explore different setups to obtain such a normalised text, either directly, by training HTR engines on normalised (i.e., expanded, disabbreviated) text, or by decomposing the process into discrete steps, each making use of specialist models for recognition, word segmentation and normalisation. The case studies considered here are drawn from the medieval Latin tradition.},
  language = {en},
  booktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2021 {Workshops}},
  publisher = {Springer International Publishing},
  author = {Camps, Jean-Baptiste and Vidal-Gorène, Chahan and Vernet, Marguerite},
  editor = {Barney Smith, Elisa H. and Pal, Umapada},
  year = {2021},
  keywords = {Abbreviations, Handwritten text recognition, Medieval western manuscripts},
  pages = {306--316},
}

Handling Heavily Abbreviated Manuscripts: HTR Engines vs Text Normalisation Approaches.
Camps, J.; Vidal-Gorène, C.; and Vernet, M.
In Barney Smith, E. H.; and Pal, U., editor(s), Document Analysis and Recognition – ICDAR 2021 Workshops, of Lecture Notes in Computer Science, pages 306–316, Cham, 2021. Springer International Publishing.

@inproceedings{camps_handling_2021,
  address = {Cham},
  series = {Lecture {Notes} in {Computer} {Science}},
  title = {Handling {Heavily} {Abbreviated} {Manuscripts}: {HTR} {Engines} vs {Text} {Normalisation} {Approaches}},
  isbn = {978-3-030-86159-9},
  shorttitle = {Handling {Heavily} {Abbreviated} {Manuscripts}},
  doi = {10.1007/978-3-030-86159-9_21},
  abstract = {Although abbreviations are fairly common in handwritten sources, particularly in medieval and modern Western manuscripts, previous research dealing with computational approaches to their expansion is scarce. Yet abbreviations present particular challenges to computational approaches such as handwritten text recognition and natural language processing tasks. Often, pre-processing ultimately aims to lead from a digitised image of the source to a normalised text, which includes expansion of the abbreviations. We explore different setups to obtain such a normalised text, either directly, by training HTR engines on normalised (i.e., expanded, disabbreviated) text, or by decomposing the process into discrete steps, each making use of specialist models for recognition, word segmentation and normalisation. The case studies considered here are drawn from the medieval Latin tradition.},
  language = {en},
  booktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2021 {Workshops}},
  publisher = {Springer International Publishing},
  author = {Camps, Jean-Baptiste and Vidal-Gorène, Chahan and Vernet, Marguerite},
  editor = {Barney Smith, Elisa H. and Pal, Umapada},
  year = {2021},
  keywords = {Abbreviation, Abbreviations, Handwritten text recognition, Medieval western manuscripts, Text Recognition},
  pages = {306--316},
}

Cave, S. (1)

Hopes and fears for intelligent machines in fiction and reality.
Cave, S.; and Dihal, K.
Nature Machine Intelligence, 1(2): 74–78. February 2019. Number: 2. Publisher: Nature Publishing Group.

@article{cave2019,
  title = {Hopes and fears for intelligent machines in fiction and reality},
  volume = {1},
  copyright = {2019 Springer Nature Limited},
  issn = {2522-5839},
  url = {https://www.nature.com/articles/s42256-019-0020-9},
  doi = {10.1038/s42256-019-0020-9},
  abstract = {This paper categorizes some of the fundamental hopes and fears expressed in imaginings of artificial intelligence (AI), based on a survey of 300 fictional and non-fictional works. The categories are structured into four dichotomies, each comprising a hope and a parallel fear, mediated by the notion of control. These are: the hope for much longer lives (‘immortality’) and the fear of losing one’s identity (‘inhumanity’); the hope for a life free of work (‘ease’), and the fear of becoming redundant (‘obsolescence’); the hope that AI can fulfil one’s desires (‘gratification’), alongside the fear that humans will become redundant to each other (‘alienation’); and the hope that AI offers power over others (‘dominance’), with the fear that it will turn against us (‘uprising’). This Perspective further argues that these perceptions of AI’s possibilities, which may be quite detached from the reality of the technology, can influence how it is developed, deployed and regulated.},
  language = {en},
  number = {2},
  urldate = {2024-01-23},
  journal = {Nature Machine Intelligence},
  author = {Cave, Stephen and Dihal, Kanta},
  month = feb,
  year = {2019},
  note = {Number: 2. Publisher: Nature Publishing Group},
  keywords = {Cultural and media studies, Literature, Science, technology and society},
  pages = {74--78},
}

Chagué, A. (1)

HTR-United: Mutualisons la vérité de terrain!
Chagué, A.; Clérice, T.; and Romary, L.
In DHNord2021-Publier, partager, réutiliser les données de la recherche: les data papers et leurs enjeux, 2021.

@inproceedings{chague_htr-united_2021,
  title = {{HTR}-{United}: {Mutualisons} la vérité de terrain!},
  shorttitle = {{HTR}-{United}},
  url = {https://hal.science/hal-03398740/document},
  urldate = {2023-10-27},
  booktitle = {{DHNord2021}-{Publier}, partager, réutiliser les données de la recherche: les data papers et leurs enjeux},
  author = {Chagué, Alix and Clérice, Thibault and Romary, Laurent},
  year = {2021},
}

Chang, M. (1)

Artificial intelligence: a modern approach.
Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J.
Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.

@book{russell_artificial_2022,
  address = {Harlow},
  edition = {Fourth edition, global edition},
  series = {Pearson series in artificial intelligence},
  title = {Artificial intelligence: a modern approach},
  isbn = {978-1-292-40113-3},
  shorttitle = {Artificial intelligence},
  abstract = {"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control"},
  language = {eng},
  publisher = {Pearson},
  author = {Russell, Stuart J. and Norvig, Peter and Chang, Ming-wei and Devlin, Jacob and Dragan, Anca and Forsyth, David and Goodfellow, Ian and Malik, Jitendra and Mansinghka, Vikas and Pearl, Judea and Wooldridge, Michael J.},
  year = {2022},
}

Chastang, P. (1)

A Named Entity Recognition Model for Medieval Latin Charters.
Chastang, P.; Aguilar, S. T.; and Tannier, X.
Digital Humanities Quarterly, 15(4). 2021.

@article{chastang_named_2021,
  title = {A {Named} {Entity} {Recognition} {Model} for {Medieval} {Latin} {Charters}},
  volume = {15},
  issn = {1938-4122},
  url = {http://www.digitalhumanities.org/dhq/vol/15/4/000574/000574.html},
  number = {4},
  journal = {Digital Humanities Quarterly},
  author = {Chastang, Pierre and Aguilar, Sergio Torres and Tannier, Xavier},
  year = {2021},
}

Chatelain, C. (1)

Recognition and Information Extraction in Historical Handwritten Tables: Toward Understanding Early 20th Century Paris Census.
Constum, T.; Kempf, N.; Paquet, T.; Tranouez, P.; Chatelain, C.; Brée, S.; and Merveille, F.
In Uchida, S.; Barney, E.; and Eglin, V., editor(s), Document Analysis Systems, pages 143–157, Cham, 2022. Springer International Publishing.

Chatzipanagiotou, M. (1)

Restoring and attributing ancient texts using deep neural networks.
Assael, Y.; Sommerschield, T.; Shillingford, B.; Bordbar, M.; Pavlopoulos, J.; Chatzipanagiotou, M.; Androutsopoulos, I.; Prag, J.; and de Freitas, N.
Nature, 603(7900): 280–283. March 2022. Number: 7900. Publisher: Nature Publishing Group.

Child, R. (1)

Language Models are Unsupervised Multitask Learners.
Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; and Sutskever, I.
2019.

@inproceedings{radford_language_2019,
  title = {Language {Models} are {Unsupervised} {Multitask} {Learners}},
  url = {https://www.semanticscholar.org/paper/Language-Models-are-Unsupervised-Multitask-Learners-Radford-Wu/9405cc0d6169988371b2755e573cc28650d14dfe},
  abstract = {Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.},
  urldate = {2023-02-02},
  author = {Radford, Alec and Wu, Jeff and Child, Rewon and Luan, D. and Amodei, Dario and Sutskever, Ilya},
  year = {2019},
}

Christlein, V. (1)

Automatic Writer Identification in Historical Documents: A Case Study.
Christlein, V.; Diem, M.; Kleber, F.; Mühlberger, G.; Schwägerl-Melchior, V.; Van Gelder, E.; and Maier, A.
Zeitschrift für digitale Geisteswissenschaften. 2016. Publisher: HAB - Herzog August Bibliothek.

@article{christleinAutomaticWriterIdentification2016,
  title = {Automatic {Writer} {Identification} in {Historical} {Documents}: {A} {Case} {Study}},
  shorttitle = {Automatic {Writer} {Identification} in {Historical} {Documents}},
  url = {http://www.zfdg.de/2016_002},
  doi = {10.17175/2016_002},
  language = {en},
  urldate = {2023-11-17},
  journal = {Zeitschrift für digitale Geisteswissenschaften},
  author = {Christlein, Vincent and Diem, Markus and Kleber, Florian and Mühlberger, Günter and Schwägerl-Melchior, Verena and Van Gelder, Esther and Maier, Andreas},
  year = {2016},
  note = {Publisher: HAB - Herzog August Bibliothek},
}

Clérice, T. (2)

You Actually Look Twice At it (YALTAi): Using an object detection approach instead of region segmentation within the Kraken engine.
Clérice, T.
arXiv preprint arXiv:2207.11230. 2022.

@article{clerice_you_2022,
  title = {You {Actually} {Look} {Twice} {At} it ({YALTAi}): {Using} an object detection approach instead of region segmentation within the {Kraken} engine},
  shorttitle = {You {Actually} {Look} {Twice} {At} it ({YALTAi})},
  url = {https://arxiv.org/pdf/2207.11230.pdf},
  abstract = {Layout Analysis (the identification of zones and their classification) is the first step along line segmentation in Optical Character Recognition and similar tasks. The ability of identifying main body of text from marginal text or running titles makes the difference between extracting the work full text of a digitized book and noisy outputs. We show that most segmenters focus on pixel classification and that polygonization of this output has not been used as a target for the latest competition on historical document (ICDAR 2017 and onwards), despite being the focus in the early 2010s. We propose to shift, for efficiency, the task from a pixel classification-based polygonization to an object detection using isothetic rectangles. We compare the output of Kraken and YOLOv5 in terms of segmentation and show that the later severely outperforms the first on small datasets (1110 samples and below). We release two datasets for training and evaluation on historical documents as well as a new package, YALTAi, which injects YOLOv5 in the segmentation pipeline of Kraken 4.1.},
  urldate = {2023-10-27},
  journal = {arXiv preprint arXiv:2207.11230},
  author = {Clérice, Thibault},
  year = {2022},
}

HTR-United: Mutualisons la vérité de terrain!
Chagué, A.; Clérice, T.; and Romary, L.
In DHNord2021-Publier, partager, réutiliser les données de la recherche: les data papers et leurs enjeux, 2021.

Constum, T. (1)

Recognition and Information Extraction in Historical Handwritten Tables: Toward Understanding Early 20th Century Paris Census.
Constum, T.; Kempf, N.; Paquet, T.; Tranouez, P.; Chatelain, C.; Brée, S.; and Merveille, F.
In Uchida, S.; Barney, E.; and Eglin, V., editor(s), Document Analysis Systems, pages 143–157, Cham, 2022. Springer International Publishing.

Coüasnon, B. (1)

A Light Transformer-Based Architecture for Handwritten Text Recognition.
Barrere, K.; Soullard, Y.; Lemaitre, A.; and Coüasnon, B.
In Uchida, S.; Barney, E.; and Eglin, V., editor(s), Document Analysis Systems, of Lecture Notes in Computer Science, pages 275–290, Cham, 2022. Springer International Publishing.

@inproceedings{barrere_light_2022,
  address = {Cham},
  series = {Lecture {Notes} in {Computer} {Science}},
  title = {A {Light} {Transformer}-{Based} {Architecture} for {Handwritten} {Text} {Recognition}},
  isbn = {978-3-031-06555-2},
  doi = {10.1007/978-3-031-06555-2_19},
  abstract = {Transformer models have been showing ground-breaking results in the domain of natural language processing. More recently, they started to gain interest in many others fields as in computer vision. Traditional Transformer models typically require a significant amount of training data to achieve satisfactory results. However, in the domain of handwritten text recognition, annotated data acquisition remains costly resulting in small datasets compared to those commonly used to train a Transformer-based model. Hence, training Transformer models able to transcribe handwritten text from images remains challenging. We propose a light encoder-decoder Transformer-based architecture for handwriting text recognition, containing a small number of parameters compared to traditional Transformer architectures. We trained our architecture using a hybrid loss, combining the well-known connectionist temporal classification with the cross-entropy. Experiments are conducted on the well-known IAM dataset with and without the use of additional synthetic data. We show that our network reaches state-of-the-art results in both cases, compared with other larger Transformer-based models.},
  language = {en},
  booktitle = {Document {Analysis} {Systems}},
  publisher = {Springer International Publishing},
  author = {Barrere, Killian and Soullard, Yann and Lemaitre, Aurélie and Coüasnon, Bertrand},
  editor = {Uchida, Seiichi and Barney, Elisa and Eglin, Véronique},
  year = {2022},
  keywords = {Handwritten text recognition, Hybrid loss, Light network, Neural networks, Transformer},
  pages = {275--290},
}

De La Rosa Turbides, T. (1)

Learning-Based Planning.
Jiménez Celorrio, S.; and De La Rosa Turbides, T.
In Rabuñal Dopico, J. R.; Dorado, J.; and Pazos, A., editor(s), Encyclopedia of Artificial Intelligence, pages 1024–1028. IGI Global, 2009.

@incollection{rabunal_dopico_learning-based_2009,
  title = {Learning-{Based} {Planning}},
  isbn = {978-1-59904-849-9 978-1-59904-850-5},
  shorttitle = {Learning-{Based} {Planning}},
  url = {http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/978-1-59904-849-9.ch151},
  abstract = {Automated Planning (AP) studies the generation of action sequences for problem solving. A problem in AP is defined by a state-transition function describing the dynamics of the world, the initial state of the world and the goals to be achieved. According to this definition, AP problems seem to be easily tackled by searching for a path in a graph, which is a well-studied problem. However, the graphs resulting from AP problems are so large that explicitly specifying them is not feasible. Thus, different approaches have been tried to address AP problems. Since the mid 90’s, new planning algorithms have enabled the solution of practical-size AP problems. Nevertheless, domain-independent planners still fail in solving complex AP problems, as solving planning tasks is a PSPACE-Complete problem (Bylander, 94). How do humans cope with this planning-inherent complexity? One answer is that our experience allows us to solve problems more quickly; we are endowed with learning skills that help us plan when problems are selected from a stable population. Inspire by this idea, the field of learning-based planning studies the development of AP systems able to modify their performance according to previous experiences. Since the first days, Artificial Intelligence (AI) has been concerned with the problem of Machine Learning (ML). As early as 1959, Arthur L. Samuel developed a prominent program that learned to improve its play in the game of checkers (Samuel, 1959). It is hardly surprising that ML has often been used to make changes in systems that perform tasks associated with AI, such as perception, robot control or AP. This article analyses the diverse ways ML can be used to improve AP processes. First, we review the major AP concepts and summarize the main research done in learning-based planning. Second, we describe current trends in applying ML to AP. Finally, we comment on the next avenues for combining AP and ML and conclude.},
  urldate = {2023-09-27},
  booktitle = {Encyclopedia of {Artificial} {Intelligence}},
  publisher = {IGI Global},
  author = {Jiménez Celorrio, Sergio and De La Rosa Turbides, Tomás},
  editor = {Rabuñal Dopico, Juan Ramón and Dorado, Julian and Pazos, Alejandro},
  year = {2009},
  doi = {10.4018/978-1-59904-849-9.ch151},
  pages = {1024--1028},
}

Devlin, J. (1)

Artificial intelligence: a modern approach.
Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J.
Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.

Diem, M. (1)

Automatic Writer Identification in Historical Documents: A Case Study.
Christlein, V.; Diem, M.; Kleber, F.; Mühlberger, G.; Schwägerl-Melchior, V.; Van Gelder, E.; and Maier, A.
Zeitschrift für digitale Geisteswissenschaften. 2016. Publisher: HAB - Herzog August Bibliothek.

Dihal, K. (1)

Hopes and fears for intelligent machines in fiction and reality.
Cave, S.; and Dihal, K.
Nature Machine Intelligence, 1(2): 74–78. February 2019. Number: 2. Publisher: Nature Publishing Group.

Dobson, J. (1)

Interpretable Outputs: Criteria for Machine Learning in the Humanities.
Dobson, J.
Digital Humanities Quarterly, 15(2). June 2020.

@article{dobson_interpretable_2020,
  title = {Interpretable {Outputs}: {Criteria} for {Machine} {Learning} in the {Humanities}},
  volume = {15},
  issn = {1938-4122},
  shorttitle = {Interpretable {Outputs}},
  number = {2},
  journal = {Digital Humanities Quarterly},
  author = {Dobson, James},
  month = jun,
  year = {2020},
}

Dragan, A. (1)

Artificial intelligence: a modern approach.
Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J.
Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.

Eberhart, A. (1)

Neuro-symbolic approaches in artificial intelligence.
Hitzler, P.; Eberhart, A.; Ebrahimi, M.; Sarker, M. K.; and Zhou, L.
National Science Review, 9(6): nwac035. June 2022.

@article{hitzler2022,
  title = {Neuro-symbolic approaches in artificial intelligence},
  volume = {9},
  issn = {2095-5138},
  url = {https://doi.org/10.1093/nsr/nwac035},
  doi = {10.1093/nsr/nwac035},
  number = {6},
  urldate = {2024-01-23},
  journal = {National Science Review},
  author = {Hitzler, Pascal and Eberhart, Aaron and Ebrahimi, Monireh and Sarker, Md Kamruzzaman and Zhou, Lu},
  month = jun,
  year = {2022},
  pages = {nwac035},
}

Ebrahimi, M. (1)

Neuro-symbolic approaches in artificial intelligence.
Hitzler, P.; Eberhart, A.; Ebrahimi, M.; Sarker, M. K.; and Zhou, L.
National Science Review, 9(6): nwac035. June 2022.

\n  \n Fink, G.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n PHOCNet: A Deep Convolutional Neural Network for Word Spotting in Handwritten Documents.\n \n \n \n \n\n\n \n Sudholt, S.; and Fink, G. A.\n\n\n \n\n\n\n December 2017.\n arXiv:1604.00187 [cs]\n\n\n\n
\n\n\n\n \n \n \"PHOCNet:Paper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@misc{sudholt2017,\n\ttitle = {{PHOCNet}: {A} {Deep} {Convolutional} {Neural} {Network} for {Word} {Spotting} in {Handwritten} {Documents}},\n\tshorttitle = {{PHOCNet}},\n\turl = {http://arxiv.org/abs/1604.00187},\n\tdoi = {10.48550/arXiv.1604.00187},\n\tabstract = {In recent years, deep convolutional neural networks have achieved state of the art performance in various computer vision task such as classification, detection or segmentation. Due to their outstanding performance, CNNs are more and more used in the field of document image analysis as well. In this work, we present a CNN architecture that is trained with the recently proposed PHOC representation. We show empirically that our CNN architecture is able to outperform state of the art results for various word spotting benchmarks while exhibiting short training and test times.},\n\turldate = {2023-11-17},\n\tpublisher = {arXiv},\n\tauthor = {Sudholt, Sebastian and Fink, Gernot A.},\n\tmonth = dec,\n\tyear = {2017},\n\tnote = {arXiv:1604.00187 [cs]},\n\tkeywords = {Computer Science - Computer Vision and Pattern Recognition},\n}\n\n
\n
\n\n\n
\n In recent years, deep convolutional neural networks have achieved state of the art performance in various computer vision tasks such as classification, detection or segmentation. Due to their outstanding performance, CNNs are more and more used in the field of document image analysis as well. In this work, we present a CNN architecture that is trained with the recently proposed PHOC representation. We show empirically that our CNN architecture is able to outperform state of the art results for various word spotting benchmarks while exhibiting short training and test times.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Fischer, A.\n \n \n (2)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung.\n \n \n \n \n\n\n \n Ströbel, P.; Hodel, T.; Fischer, A.; Scius, A.; Wolf, B.; Janka, A.; Widmer, J.; Scheurer, P.; and Volk, M.\n\n\n \n\n\n\n March 2023.\n Publisher: Zenodo\n\n\n\n
\n\n\n\n \n \n \"BullingersPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{strobel2023a,\n\ttitle = {Bullingers {Briefwechsel} zugänglich machen: {Stand} der {Handschriftenerkennung}},\n\tcopyright = {Creative Commons Attribution 4.0 International, Open Access},\n\tshorttitle = {Bullingers {Briefwechsel} zugänglich machen},\n\turl = {https://zenodo.org/record/7715357},\n\tdoi = {10.5281/ZENODO.7715357},\n\tabstract = {"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt." Ein Beitrag zur 9. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.},\n\turldate = {2024-04-22},\n\tauthor = {Ströbel, Phillip and Hodel, Tobias and Fischer, Andreas and Scius, Anna and Wolf, Beat and Janka, Anna and Widmer, Jonas and Scheurer, Patricia and Volk, Martin},\n\tcollaborator = {Trilcke, Peer and Busch, Anna and Helling, Patrick and Plum, Alistair and Wolter, Vivien and Weis, Joëlle and Chudoba, Hendrik},\n\tmonth = mar,\n\tyear = {2023},\n\tnote = {Publisher: [object Object]},\n\tkeywords = {Annotieren, Bewertung, DHd2023, Data augmentation, Daten, Handschriftenerkennung, Manuskript, Transkription, maschinelles Lernen},\n}\n\n
\n
\n\n\n
\n \"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt.\" Ein Beitrag zur 9. Tagung des Verbands \"Digital Humanities im deutschsprachigen Raum\" - DHd 2023 Open Humanities Open Culture.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n The Bullinger Dataset: A Writer Adaptation Challenge.\n \n \n \n \n\n\n \n Scius-Bertrand, A.; Ströbel, P.; Volk, M.; Hodel, T.; and Fischer, A.\n\n\n \n\n\n\n In Fink, G. A.; Jain, R.; Kise, K.; and Zanibbi, R., editor(s), Document Analysis and Recognition - ICDAR 2023, volume 14187, pages 397–410. Springer Nature Switzerland, Cham, 2023.\n Series Title: Lecture Notes in Computer Science\n\n\n\n
\n\n\n\n \n \n \"ThePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@incollection{fink_bullinger_2023,\n\taddress = {Cham},\n\ttitle = {The {Bullinger} {Dataset}: {A} {Writer} {Adaptation} {Challenge}},\n\tvolume = {14187},\n\tisbn = {978-3-031-41675-0 978-3-031-41676-7},\n\tshorttitle = {The {Bullinger} {Dataset}},\n\turl = {https://link.springer.com/10.1007/978-3-031-41676-7_23},\n\tlanguage = {en},\n\turldate = {2023-08-24},\n\tbooktitle = {Document {Analysis} and {Recognition} - {ICDAR} 2023},\n\tpublisher = {Springer Nature Switzerland},\n\tauthor = {Scius-Bertrand, Anna and Ströbel, Phillip and Volk, Martin and Hodel, Tobias and Fischer, Andreas},\n\teditor = {Fink, Gernot A. and Jain, Rajiv and Kise, Koichi and Zanibbi, Richard},\n\tyear = {2023},\n\tdoi = {10.1007/978-3-031-41676-7_23},\n\tnote = {Series Title: Lecture Notes in Computer Science},\n\tpages = {397--410},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Fornes, A.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Word Spotting and Recognition with Embedded Attributes.\n \n \n \n \n\n\n \n Almazan, J.; Gordo, A.; Fornes, A.; and Valveny, E.\n\n\n \n\n\n\n IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(12): 2552–2566. December 2014.\n \n\n\n\n
\n\n\n\n \n \n \"WordPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{almazanWordSpottingRecognition2014,\n\ttitle = {Word {Spotting} and {Recognition} with {Embedded} {Attributes}},\n\tvolume = {36},\n\tissn = {0162-8828, 2160-9292},\n\turl = {http://ieeexplore.ieee.org/document/6857995/},\n\tdoi = {10.1109/TPAMI.2014.2339814},\n\tnumber = {12},\n\turldate = {2023-11-17},\n\tjournal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},\n\tauthor = {Almazan, Jon and Gordo, Albert and Fornes, Alicia and Valveny, Ernest},\n\tmonth = dec,\n\tyear = {2014},\n\tpages = {2552--2566},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Forsyth, D.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Artificial intelligence: a modern approach.\n \n \n \n\n\n \n Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J.\n\n\n \n\n\n\n Of the Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.\n \n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@book{russell_artificial_2022,\n\taddress = {Harlow},\n\tedition = {Fourth edition, global edition},\n\tseries = {Pearson series in artificial intelligence},\n\ttitle = {Artificial intelligence: a modern approach},\n\tisbn = {978-1-292-40113-3},\n\tshorttitle = {Artificial intelligence},\n\tabstract = {"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control"},\n\tlanguage = {eng},\n\tpublisher = {Pearson},\n\tauthor = {Russell, Stuart J. and Norvig, Peter and Chang, Ming-wei and Devlin, Jacob and Dragan, Anca and Forsyth, David and Goodfellow, Ian and Malik, Jitendra and Mansinghka, Vikas and Pearl, Judea and Wooldridge, Michael J.},\n\tyear = {2022},\n}\n\n
\n
\n\n\n
\n \"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control\"\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Gasparini, A.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Understanding Artificial Intelligence in Research Libraries – Extensive Literature Review.\n \n \n \n \n\n\n \n Gasparini, A.; and Kautonen, H.\n\n\n \n\n\n\n LIBER Quarterly: The Journal of the Association of European Research Libraries, 32(1). January 2022.\n Number: 1\n\n\n\n
\n\n\n\n \n \n \"UnderstandingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@article{gasparini_understanding_2022,\n\ttitle = {Understanding {Artificial} {Intelligence} in {Research} {Libraries} – {Extensive} {Literature} {Review}},\n\tvolume = {32},\n\tcopyright = {Copyright (c) 2022 Andrea Gasparini, Heli Kautonen},\n\tissn = {2213-056X},\n\turl = {https://liberquarterly.eu/article/view/10934},\n\tdoi = {10.53377/lq.10934},\n\tabstract = {Artificial intelligence (AI) now forms a part of various activities in the academic world. AI will also affect how research libraries perform and carry out their services and how the various kinds of data they hold in their repositories will be used in the future. For the moment, the landscape is complex and unclear, and library personnel and leaders are uncertain about where they should lay the path ahead. This extensive literature review provides an overview of how research libraries understand, react to, and work with AI. This paper examines the roles conceived for libraries and librarians, their users, and AI. Finally, design thinking is presented as an approach to solving emerging issues with AI and opening up opportunities for this technology at a more strategic level.},\n\tlanguage = {en},\n\tnumber = {1},\n\turldate = {2023-09-28},\n\tjournal = {LIBER Quarterly: The Journal of the Association of European Research Libraries},\n\tauthor = {Gasparini, Andrea and Kautonen, Heli},\n\tmonth = jan,\n\tyear = {2022},\n\tnote = {Number: 1},\n\tkeywords = {literature review},\n}\n\n
\n
\n\n\n
\n Artificial intelligence (AI) now forms a part of various activities in the academic world. AI will also affect how research libraries perform and carry out their services and how the various kinds of data they hold in their repositories will be used in the future. For the moment, the landscape is complex and unclear, and library personnel and leaders are uncertain about where they should lay the path ahead. This extensive literature review provides an overview of how research libraries understand, react to, and work with AI. This paper examines the roles conceived for libraries and librarians, their users, and AI. Finally, design thinking is presented as an approach to solving emerging issues with AI and opening up opportunities for this technology at a more strategic level.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Gebru, T.\n \n \n (2)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning.\n \n \n \n \n\n\n \n Gebru, T.\n\n\n \n\n\n\n In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, of KDD '20, pages 3609, New York, NY, USA, August 2020. Association for Computing Machinery\n \n\n\n\n
\n\n\n\n \n \n \"LessonsPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{gebru2020,\n\taddress = {New York, NY, USA},\n\tseries = {{KDD} '20},\n\ttitle = {Lessons from {Archives}: {Strategies} for {Collecting} {Sociocultural} {Data} in {Machine} {Learning}},\n\tisbn = {978-1-4503-7998-4},\n\tshorttitle = {Lessons from {Archives}},\n\turl = {https://doi.org/10.1145/3394486.3409559},\n\tdoi = {10.1145/3394486.3409559},\n\tabstract = {A growing body of work shows that many problems in fairness, accountability, transparency, and ethics in machine learning systems are rooted in decisions surrounding the data collection and annotation process. We argue that a new specialization should be formed within machine learning that is focused on methodologies for data collection and annotation: efforts that require institutional frameworks and procedures. Specifically for sociocultural data, parallels can be drawn from archives and libraries. Archives are the longest standing communal effort to gather human information and archive scholars have already developed the language and procedures to address and discuss many challenges pertaining to data collection such as consent, power, inclusivity, transparency, and ethics privacy. We discuss these five key approaches in document collection practices in archives that can inform data collection in sociocultural machine learning.},\n\turldate = {2024-01-02},\n\tbooktitle = {Proceedings of the 26th {ACM} {SIGKDD} {International} {Conference} on {Knowledge} {Discovery} \\& {Data} {Mining}},\n\tpublisher = {Association for Computing Machinery},\n\tauthor = {Gebru, Timnit},\n\tmonth = aug,\n\tyear = {2020},\n\tkeywords = {ai, archives, ethics, fairness, machine learning, sociocultural data},\n\tpages = {3609},\n}\n\n
\n
\n\n\n
\n A growing body of work shows that many problems in fairness, accountability, transparency, and ethics in machine learning systems are rooted in decisions surrounding the data collection and annotation process. We argue that a new specialization should be formed within machine learning that is focused on methodologies for data collection and annotation: efforts that require institutional frameworks and procedures. Specifically for sociocultural data, parallels can be drawn from archives and libraries. Archives are the longest standing communal effort to gather human information and archive scholars have already developed the language and procedures to address and discuss many challenges pertaining to data collection such as consent, power, inclusivity, transparency, and ethics and privacy. We discuss these five key approaches in document collection practices in archives that can inform data collection in sociocultural machine learning.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Model Cards for Model Reporting.\n \n \n \n \n\n\n \n Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T.\n\n\n \n\n\n\n Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. January 2019.\n arXiv: 1810.03993\n\n\n\n
\n\n\n\n \n \n \"ModelPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n\n\n\n
\n
@article{mitchell_model_2019,\n\ttitle = {Model {Cards} for {Model} {Reporting}},\n\turl = {http://arxiv.org/abs/1810.03993},\n\tdoi = {10.1145/3287560.3287596},\n\tabstract = {Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.},\n\turldate = {2022-01-24},\n\tjournal = {Proceedings of the Conference on Fairness, Accountability, and Transparency},\n\tauthor = {Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit},\n\tmonth = jan,\n\tyear = {2019},\n\tnote = {arXiv: 1810.03993},\n\tkeywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},\n\tpages = {220--229},\n}\n\n
\n
\n\n\n
\n Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Girshick, R.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Detectron2.\n \n \n \n \n\n\n \n Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.; and Girshick, R.\n\n\n \n\n\n\n 2019.\n \n\n\n\n
\n\n\n\n \n \n \"Detectron2Paper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@misc{wu2019,\n\ttitle = {Detectron2},\n\turl = {https://github.com/facebookresearch/detectron2},\n\tauthor = {Wu, Yuxin and Kirillov, Alexander and Massa, Francisco and Lo, Wan-Yen and Girshick, Ross},\n\tyear = {2019},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Gomez, A.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Attention Is All You Need.\n \n \n \n \n\n\n \n Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; and Polosukhin, I.\n\n\n \n\n\n\n December 2017.\n arXiv:1706.03762 [cs]\n\n\n\n
\n\n\n\n \n \n \"AttentionPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n\n\n\n
\n
@misc{vaswani2017,\n\ttitle = {Attention {Is} {All} {You} {Need}},\n\turl = {http://arxiv.org/abs/1706.03762},\n\tdoi = {10.48550/arXiv.1706.03762},\n\tabstract = {The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.},\n\turldate = {2023-02-02},\n\tpublisher = {arXiv},\n\tauthor = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},\n\tmonth = dec,\n\tyear = {2017},\n\tnote = {arXiv:1706.03762 [cs]},\n\tkeywords = {Computer Science - Computation and Language, Computer Science - Machine Learning},\n}\n\n
\n
\n\n\n
\n The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Goodfellow, I.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Artificial intelligence: a modern approach.\n \n \n \n\n\n \n Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J.\n\n\n \n\n\n\n Of the Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.\n \n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@book{russell_artificial_2022,\n\taddress = {Harlow},\n\tedition = {Fourth edition, global edition},\n\tseries = {Pearson series in artificial intelligence},\n\ttitle = {Artificial intelligence: a modern approach},\n\tisbn = {978-1-292-40113-3},\n\tshorttitle = {Artificial intelligence},\n\tabstract = {"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control"},\n\tlanguage = {eng},\n\tpublisher = {Pearson},\n\tauthor = {Russell, Stuart J. and Norvig, Peter and Chang, Ming-wei and Devlin, Jacob and Dragan, Anca and Forsyth, David and Goodfellow, Ian and Malik, Jitendra and Mansinghka, Vikas and Pearl, Judea and Wooldridge, Michael J.},\n\tyear = {2022},\n}\n\n
\n
\n\n\n
\n \"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control\"\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Gordo, A.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Word Spotting and Recognition with Embedded Attributes.\n \n \n \n \n\n\n \n Almazan, J.; Gordo, A.; Fornes, A.; and Valveny, E.\n\n\n \n\n\n\n IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(12): 2552–2566. December 2014.\n \n\n\n\n
\n\n\n\n \n \n \"WordPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{almazanWordSpottingRecognition2014,\n\ttitle = {Word {Spotting} and {Recognition} with {Embedded} {Attributes}},\n\tvolume = {36},\n\tissn = {0162-8828, 2160-9292},\n\turl = {http://ieeexplore.ieee.org/document/6857995/},\n\tdoi = {10.1109/TPAMI.2014.2339814},\n\tnumber = {12},\n\turldate = {2023-11-17},\n\tjournal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},\n\tauthor = {Almazan, Jon and Gordo, Albert and Fornes, Alicia and Valveny, Ernest},\n\tmonth = dec,\n\tyear = {2014},\n\tpages = {2552--2566},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Green, J.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Turing’s Genius – Defining an apt microcosm.\n \n \n \n \n\n\n \n Bowen, J.; Trickett, T.; Green, J. B. A.; and Lomas, A.\n\n\n \n\n\n\n In July 2018. BCS Learning & Development\n \n\n\n\n
\n\n\n\n \n \n \"Turing’sPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{bowen_turings_2018,\n\ttitle = {Turing’s {Genius} – {Defining} an apt microcosm},\n\turl = {https://www.scienceopen.com/hosted-document?doi=10.14236/ewic/EVA2018.31},\n\tdoi = {10.14236/ewic/EVA2018.31},\n\tabstract = {Alan Turing (1912–1954) is widely acknowledged as a genius. As well as codebreaking during World War II and taking a pioneering role in computer hardware design and software after the War, he also wrote three important foundational papers in the fields of theoretical computer science, artificial intelligence, and mathematical biology. He has been called the father of computer science, but he also admired by mathematicians, philosophers, and perhaps more surprisingly biologists, for his wide-ranging ideas. His influence stretches from scientific to cultural and even political impact. For all these reasons, he was a true polymath. This paper considers the genius of Turing from various angles, both scientific and artistic. The four authors provide position statements on how Turing has influenced and inspired their work, together with short biographies, as a starting point for a panel session and visual music performance.},\n\turldate = {2023-09-27},\n\tpublisher = {BCS Learning \\& Development},\n\tauthor = {Bowen, Jonathan and Trickett, Terry and Green, Jeremy B. A. and Lomas, Andy},\n\tmonth = jul,\n\tyear = {2018},\n}\n\n
\n
\n\n\n
\n Alan Turing (1912–1954) is widely acknowledged as a genius. As well as codebreaking during World War II and taking a pioneering role in computer hardware design and software after the War, he also wrote three important foundational papers in the fields of theoretical computer science, artificial intelligence, and mathematical biology. He has been called the father of computer science, but he is also admired by mathematicians, philosophers, and perhaps more surprisingly biologists, for his wide-ranging ideas. His influence stretches from scientific to cultural and even political impact. For all these reasons, he was a true polymath. This paper considers the genius of Turing from various angles, both scientific and artistic. The four authors provide position statements on how Turing has influenced and inspired their work, together with short biographies, as a starting point for a panel session and visual music performance.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Grüning, T.\n \n \n (3)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Cells in Multidimensional Recurrent Neural Networks.\n \n \n \n \n\n\n \n Leifert, G.; Strauß, T.; Grüning, T.; Wustlich, W.; and Labahn, R.\n\n\n \n\n\n\n Journal of Machine Learning Research, 17: 97:1–97:37. 2016.\n \n\n\n\n
\n\n\n\n \n \n \"CellsPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{leifert_cells_2016,\n\ttitle = {Cells in {Multidimensional} {Recurrent} {Neural} {Networks}},\n\tvolume = {17},\n\turl = {http://jmlr.org/papers/v17/14-203.html},\n\turldate = {2018-06-29},\n\tjournal = {Journal of Machine Learning Research},\n\tauthor = {Leifert, Gundram and Strauß, Tobias and Grüning, Tobias and Wustlich, Welf and Labahn, Roger},\n\tyear = {2016},\n\tpages = {97:1--97:37},\n}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n System Description of CITlab's Recognition & Retrieval Engine for ICDAR2017 Competition on Information Extraction in Historical Handwritten Records.\n \n \n \n \n\n\n \n Strauss, T.; Weidemann, M.; Michael, J.; Leifert, G.; Grüning, T.; and Labahn, R.\n\n\n \n\n\n\n CoRR, abs/1804.09943. 2018.\n \n\n\n\n
\n\n\n\n \n \n \"SystemPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{strauss_system_2018,\n\ttitle = {System {Description} of {CITlab}'s {Recognition} \\& {Retrieval} {Engine} for {ICDAR2017} {Competition} on {Information} {Extraction} in {Historical} {Handwritten} {Records}},\n\tvolume = {abs/1804.09943},\n\turl = {http://arxiv.org/abs/1804.09943},\n\turldate = {2018-06-29},\n\tjournal = {CoRR},\n\tauthor = {Strauss, Tobias and Weidemann, Max and Michael, Johannes and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},\n\tyear = {2018},\n}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Regular expressions for decoding of neural network outputs.\n \n \n \n \n\n\n \n Strauß, T.; Leifert, G.; Grüning, T.; and Labahn, R.\n\n\n \n\n\n\n CoRR, abs/1509.04438. 2015.\n \n\n\n\n
\n\n\n\n \n \n \"RegularPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{straus_regular_2015,\n\ttitle = {Regular expressions for decoding of neural network outputs},\n\tvolume = {abs/1509.04438},\n\turl = {http://arxiv.org/abs/1509.04438},\n\turldate = {2018-06-29},\n\tjournal = {CoRR},\n\tauthor = {Strauß, Tobias and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},\n\tyear = {2015},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Haverals, W.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n CERberus: guardian against character errors.\n \n \n \n \n\n\n \n Haverals, W.\n\n\n \n\n\n\n 2023.\n \n\n\n\n
\n\n\n\n \n \n \"CERberus:Paper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@misc{haverals_cerberus_2023,\n\ttitle = {{CERberus}: guardian against character errors},\n\turl = {https://github.com/WHaverals/CERberus},\n\tauthor = {Haverals, Wouter},\n\tyear = {2023},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Hitzler, P.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Neuro-symbolic approaches in artificial intelligence.\n \n \n \n \n\n\n \n Hitzler, P.; Eberhart, A.; Ebrahimi, M.; Sarker, M. K.; and Zhou, L.\n\n\n \n\n\n\n National Science Review, 9(6): nwac035. June 2022.\n \n\n\n\n
\n\n\n\n \n \n \"Neuro-symbolicPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{hitzler2022,\n\ttitle = {Neuro-symbolic approaches in artificial intelligence},\n\tvolume = {9},\n\tissn = {2095-5138},\n\turl = {https://doi.org/10.1093/nsr/nwac035},\n\tdoi = {10.1093/nsr/nwac035},\n\tnumber = {6},\n\turldate = {2024-01-23},\n\tjournal = {National Science Review},\n\tauthor = {Hitzler, Pascal and Eberhart, Aaron and Ebrahimi, Monireh and Sarker, Md Kamruzzaman and Zhou, Lu},\n\tmonth = jun,\n\tyear = {2022},\n\tpages = {nwac035},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Hochreiter, S.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n LSTM can solve hard long time lag problems.\n \n \n \n \n\n\n \n Hochreiter, S.; and Schmidhuber, J.\n\n\n \n\n\n\n Advances in neural information processing systems, 9. 1996.\n \n\n\n\n
\n\n\n\n \n \n \"LSTMPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{hochreiter_lstm_1996,\n\ttitle = {{LSTM} can solve hard long time lag problems},\n\tvolume = {9},\n\turl = {https://proceedings.neurips.cc/paper/1996/hash/a4d2f0d23dcc84ce983ff9157f8b7f88-Abstract.html},\n\turldate = {2023-09-27},\n\tjournal = {Advances in neural information processing systems},\n\tauthor = {Hochreiter, Sepp and Schmidhuber, Jürgen},\n\tyear = {1996},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Hodel, T.\n \n \n (5)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung.\n \n \n \n \n\n\n \n Ströbel, P.; Hodel, T.; Fischer, A.; Scius, A.; Wolf, B.; Janka, A.; Widmer, J.; Scheurer, P.; and Volk, M.\n\n\n \n\n\n\n March 2023.\n Publisher: Zenodo\n\n\n\n
\n\n\n\n \n \n \"BullingersPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{strobel2023a,\n\ttitle = {Bullingers {Briefwechsel} zugänglich machen: {Stand} der {Handschriftenerkennung}},\n\tcopyright = {Creative Commons Attribution 4.0 International, Open Access},\n\tshorttitle = {Bullingers {Briefwechsel} zugänglich machen},\n\turl = {https://zenodo.org/record/7715357},\n\tdoi = {10.5281/ZENODO.7715357},\n\tabstract = {"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt." Ein Beitrag zur 9. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.},\n\turldate = {2024-04-22},\n\tauthor = {Ströbel, Phillip and Hodel, Tobias and Fischer, Andreas and Scius, Anna and Wolf, Beat and Janka, Anna and Widmer, Jonas and Scheurer, Patricia and Volk, Martin},\n\tcollaborator = {Trilcke, Peer and Busch, Anna and Helling, Patrick and Plum, Alistair and Wolter, Vivien and Weis, Joëlle and Chudoba, Hendrik},\n\tmonth = mar,\n\tyear = {2023},\n\tnote = {Publisher: [object Object]},\n\tkeywords = {Annotieren, Bewertung, DHd2023, Data augmentation, Daten, Handschriftenerkennung, Manuskript, Transkription, maschinelles Lernen},\n}\n\n
\n
\n\n\n
\n \"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt.\" Ein Beitrag zur 9. Tagung des Verbands \"Digital Humanities im deutschsprachigen Raum\" - DHd 2023 Open Humanities Open Culture.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n The Adaptability of a Transformer-Based OCR Model for Historical Documents.\n \n \n \n \n\n\n \n Ströbel, P. B.; Hodel, T.; Boente, W.; and Volk, M.\n\n\n \n\n\n\n In Coustaty, M.; and Fornés, A., editor(s), Document Analysis and Recognition – ICDAR 2023 Workshops, volume 14193, pages 34–48. Springer Nature Switzerland, Cham, 2023.\n Series Title: Lecture Notes in Computer Science\n\n\n\n
\n\n\n\n \n \n \"ThePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@incollection{coustaty_adaptability_2023,\n\taddress = {Cham},\n\ttitle = {The {Adaptability} of a {Transformer}-{Based} {OCR} {Model} for {Historical} {Documents}},\n\tvolume = {14193},\n\tisbn = {978-3-031-41497-8 978-3-031-41498-5},\n\turl = {https://link.springer.com/10.1007/978-3-031-41498-5_3},\n\tlanguage = {en},\n\turldate = {2023-10-17},\n\tbooktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2023 {Workshops}},\n\tpublisher = {Springer Nature Switzerland},\n\tauthor = {Ströbel, Phillip Benjamin and Hodel, Tobias and Boente, Walter and Volk, Martin},\n\teditor = {Coustaty, Mickael and Fornés, Alicia},\n\tyear = {2023},\n\tdoi = {10.1007/978-3-031-41498-5_3},\n\tnote = {Series Title: Lecture Notes in Computer Science},\n\tpages = {34--48},\n}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n The Bullinger Dataset: A Writer Adaptation Challenge.\n \n \n \n \n\n\n \n Scius-Bertrand, A.; Ströbel, P.; Volk, M.; Hodel, T.; and Fischer, A.\n\n\n \n\n\n\n In Fink, G. A.; Jain, R.; Kise, K.; and Zanibbi, R., editor(s), Document Analysis and Recognition - ICDAR 2023, volume 14187, pages 397–410. Springer Nature Switzerland, Cham, 2023.\n Series Title: Lecture Notes in Computer Science\n\n\n\n
\n\n\n\n \n \n \"ThePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@incollection{fink_bullinger_2023,\n\taddress = {Cham},\n\ttitle = {The {Bullinger} {Dataset}: {A} {Writer} {Adaptation} {Challenge}},\n\tvolume = {14187},\n\tisbn = {978-3-031-41675-0 978-3-031-41676-7},\n\tshorttitle = {The {Bullinger} {Dataset}},\n\turl = {https://link.springer.com/10.1007/978-3-031-41676-7_23},\n\tlanguage = {en},\n\turldate = {2023-08-24},\n\tbooktitle = {Document {Analysis} and {Recognition} - {ICDAR} 2023},\n\tpublisher = {Springer Nature Switzerland},\n\tauthor = {Scius-Bertrand, Anna and Ströbel, Phillip and Volk, Martin and Hodel, Tobias and Fischer, Andreas},\n\teditor = {Fink, Gernot A. and Jain, Rajiv and Kise, Koichi and Zanibbi, Richard},\n\tyear = {2023},\n\tdoi = {10.1007/978-3-031-41676-7_23},\n\tnote = {Series Title: Lecture Notes in Computer Science},\n\tpages = {397--410},\n}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Konsequenzen der Handschriftenerkennung und des maschinellen Lernens für die Geschichtswissenschaft. Anwendung, Einordnung und Methodenkritik.\n \n \n \n \n\n\n \n Hodel, T.\n\n\n \n\n\n\n Historische Zeitschrift, 316(1): 151–180. 2023.\n \n\n\n\n
\n\n\n\n \n \n \"KonsequenzenPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{hodel_konsequenzen_2023,\n\ttitle = {Konsequenzen der {Handschriftenerkennung} und des maschinellen {Lernens} für die {Geschichtswissenschaft}. {Anwendung}, {Einordnung} und {Methodenkritik}},\n\tvolume = {316},\n\turl = {https://doi.org/10.1515/hzhz-2023-0006},\n\tdoi = {10.1515/hzhz-2023-0006},\n\tnumber = {1},\n\tjournal = {Historische Zeitschrift},\n\tauthor = {Hodel, Tobias},\n\tyear = {2023},\n\tpages = {151--180},\n}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Die Maschine und die Geschichtswissenschaft: Der Einfluss von deep learning auf eine Disziplin.\n \n \n \n \n\n\n \n Hodel, T.\n\n\n \n\n\n\n In Döring, K. D.; Haas, S.; König, M.; and Wettlaufer, J., editor(s), Digital History: Konzepte, Methoden und Kritiken Digitaler Geschichtswissenschaft, volume 6, of Studies in Digital History and Hermeneutics, pages 65–80. De Gruyter Oldenbourg, Berlin, Boston, 2022.\n \n\n\n\n
\n\n\n\n \n \n \"DiePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@incollection{hodel_maschine_2022,\n\taddress = {Berlin, Boston},\n\tseries = {Studies in {Digital} {History} and {Hermeneutics}},\n\ttitle = {Die {Maschine} und die {Geschichtswissenschaft}: {Der} {Einfluss} von deep learning auf eine {Disziplin}},\n\tvolume = {6},\n\turl = {https://doi.org/10.1515/9783110757101-004},\n\turldate = {2022-08-25},\n\tbooktitle = {Digital {History}: {Konzepte}, {Methoden} und {Kritiken} {Digitaler} {Geschichtswissenschaft}},\n\tpublisher = {De Gruyter Oldenbourg},\n\tauthor = {Hodel, Tobias},\n\teditor = {Döring, Karoline Dominika and Haas, Stefan and König, Mareike and Wettlaufer, Jörg},\n\tyear = {2022},\n\tdoi = {10.1515/9783110757101-004},\n\tpages = {65--80},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Hutchinson, B.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Model Cards for Model Reporting.\n \n \n \n \n\n\n \n Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T.\n\n\n \n\n\n\n Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. January 2019.\n arXiv: 1810.03993\n\n\n\n
\n\n\n\n \n \n \"ModelPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n\n\n\n
\n
@article{mitchell_model_2019,\n\ttitle = {Model {Cards} for {Model} {Reporting}},\n\turl = {http://arxiv.org/abs/1810.03993},\n\tdoi = {10.1145/3287560.3287596},\n\tabstract = {Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.},\n\turldate = {2022-01-24},\n\tjournal = {Proceedings of the Conference on Fairness, Accountability, and Transparency},\n\tauthor = {Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit},\n\tmonth = jan,\n\tyear = {2019},\n\tnote = {arXiv: 1810.03993},\n\tkeywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},\n\tpages = {220--229},\n}\n\n
\n
\n\n\n
\n Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Jacsont, P.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Impact of Image Enhancement Methods on Automatic Transcription Trainings with eScriptorium.\n \n \n \n \n\n\n \n Jacsont, P.; and Leblanc, E.\n\n\n \n\n\n\n June 2023.\n \n\n\n\n
\n\n\n\n \n \n \"ImpactPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@unpublished{jacsont2023,\n\ttitle = {Impact of {Image} {Enhancement} {Methods} on {Automatic} {Transcription} {Trainings} with {eScriptorium}},\n\turl = {https://hal.science/hal-03831686},\n\tabstract = {This study stems from the Desenrollando el cordel (Untangling the cordel) project, which focuses on 19th-century Spanish prints editing. It evaluates the impact of image enhancement methods on the automatic transcription of low-quality documents, both in terms of printing and digitisation. We compare different methods (binarisation, deblur) and present the results obtained during the training of models with the Kraken tool. We demonstrate that binarisation methods give better results than the other, and that the combination of several techniques did not significantly improve the transcription prediction. This study shows the significance of using image enhancement methods with Kraken. It paves the way for further experiments with larger and more varied corpora to help future projects design their automatic transcription workflow.},\n\turldate = {2024-05-03},\n\tauthor = {Jacsont, Pauline and Leblanc, Elina},\n\tmonth = jun,\n\tyear = {2023},\n\tkeywords = {Spanish literature, binarisation, image enhancement methods, printed documents},\n}\n\n
\n
\n\n\n
\n This study stems from the Desenrollando el cordel (Untangling the cordel) project, which focuses on 19th-century Spanish prints editing. It evaluates the impact of image enhancement methods on the automatic transcription of low-quality documents, both in terms of printing and digitisation. We compare different methods (binarisation, deblur) and present the results obtained during the training of models with the Kraken tool. We demonstrate that binarisation methods give better results than the other method, and that the combination of several techniques did not significantly improve the transcription prediction. This study shows the significance of using image enhancement methods with Kraken. It paves the way for further experiments with larger and more varied corpora to help future projects design their automatic transcription workflow.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Janka, A.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung.\n \n \n \n \n\n\n \n Ströbel, P.; Hodel, T.; Fischer, A.; Scius, A.; Wolf, B.; Janka, A.; Widmer, J.; Scheurer, P.; and Volk, M.\n\n\n \n\n\n\n March 2023.\n Publisher: Zenodo\n\n\n\n
\n\n\n\n \n \n \"BullingersPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{strobel2023a,\n\ttitle = {Bullingers {Briefwechsel} zugänglich machen: {Stand} der {Handschriftenerkennung}},\n\tcopyright = {Creative Commons Attribution 4.0 International, Open Access},\n\tshorttitle = {Bullingers {Briefwechsel} zugänglich machen},\n\turl = {https://zenodo.org/record/7715357},\n\tdoi = {10.5281/ZENODO.7715357},\n\tabstract = {"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt." Ein Beitrag zur 9. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.},\n\turldate = {2024-04-22},\n\tauthor = {Ströbel, Phillip and Hodel, Tobias and Fischer, Andreas and Scius, Anna and Wolf, Beat and Janka, Anna and Widmer, Jonas and Scheurer, Patricia and Volk, Martin},\n\tcollaborator = {Trilcke, Peer and Busch, Anna and Helling, Patrick and Plum, Alistair and Wolter, Vivien and Weis, Joëlle and Chudoba, Hendrik},\n\tmonth = mar,\n\tyear = {2023},\n\tnote = {Publisher: [object Object]},\n\tkeywords = {Annotieren, Bewertung, DHd2023, Data augmentation, Daten, Handschriftenerkennung, Manuskript, Transkription, maschinelles Lernen},\n}\n\n
\n
\n\n\n
\n \"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt.\" Ein Beitrag zur 9. Tagung des Verbands \"Digital Humanities im deutschsprachigen Raum\" - DHd 2023 Open Humanities Open Culture.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Jia, Z.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance.\n \n \n \n \n\n\n \n Tao, Y.; Jia, Z.; Ma, R.; and Xu, S.\n\n\n \n\n\n\n Electronics, 10(22): 2780. January 2021.\n Number: 22 Publisher: Multidisciplinary Digital Publishing Institute\n\n\n\n
\n\n\n\n \n \n \"TRIG:Paper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{tao_trig_2021,\n\ttitle = {{TRIG}: {Transformer}-{Based} {Text} {Recognizer} with {Initial} {Embedding} {Guidance}},\n\tvolume = {10},\n\tcopyright = {http://creativecommons.org/licenses/by/3.0/},\n\tissn = {2079-9292},\n\tshorttitle = {{TRIG}},\n\turl = {https://www.mdpi.com/2079-9292/10/22/2780},\n\tdoi = {10.3390/electronics10222780},\n\tabstract = {Scene text recognition (STR) is an important bridge between images and text, attracting abundant research attention. While convolutional neural networks (CNNS) have achieved remarkable progress in this task, most of the existing works need an extra module (context modeling module) to help CNN to capture global dependencies to solve the inductive bias and strengthen the relationship between text features. Recently, the transformer has been proposed as a promising network for global context modeling by self-attention mechanism, but one of the main short-comings, when applied to recognition, is the efficiency. We propose a 1-D split to address the challenges of complexity and replace the CNN with the transformer encoder to reduce the need for a context modeling module. Furthermore, recent methods use a frozen initial embedding to guide the decoder to decode the features to text, leading to a loss of accuracy. We propose to use a learnable initial embedding learned from the transformer encoder to make it adaptive to different input images. Above all, we introduce a novel architecture for text recognition, named TRansformer-based text recognizer with Initial embedding Guidance (TRIG), composed of three stages (transformation, feature extraction, and prediction). Extensive experiments show that our approach can achieve state-of-the-art on text recognition benchmarks.},\n\tlanguage = {en},\n\tnumber = {22},\n\turldate = {2023-09-29},\n\tjournal = {Electronics},\n\tauthor = {Tao, Yue and Jia, Zhiwei and Ma, Runze and Xu, Shugong},\n\tmonth = jan,\n\tyear = {2021},\n\tnote = {Number: 22\nPublisher: Multidisciplinary Digital Publishing Institute},\n\tkeywords = {1-D split, initial embedding, scene text recognition, self-attention, transformer},\n\tpages = {2780},\n}\n\n
\n
\n\n\n
\n Scene text recognition (STR) is an important bridge between images and text, attracting abundant research attention. While convolutional neural networks (CNNS) have achieved remarkable progress in this task, most of the existing works need an extra module (context modeling module) to help CNN to capture global dependencies to solve the inductive bias and strengthen the relationship between text features. Recently, the transformer has been proposed as a promising network for global context modeling by self-attention mechanism, but one of the main short-comings, when applied to recognition, is the efficiency. We propose a 1-D split to address the challenges of complexity and replace the CNN with the transformer encoder to reduce the need for a context modeling module. Furthermore, recent methods use a frozen initial embedding to guide the decoder to decode the features to text, leading to a loss of accuracy. We propose to use a learnable initial embedding learned from the transformer encoder to make it adaptive to different input images. Above all, we introduce a novel architecture for text recognition, named TRansformer-based text recognizer with Initial embedding Guidance (TRIG), composed of three stages (transformation, feature extraction, and prediction). Extensive experiments show that our approach can achieve state-of-the-art on text recognition benchmarks.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Jiménez Celorrio, S.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Learning-Based Planning.\n \n \n \n \n\n\n \n Jiménez Celorrio, S.; and De La Rosa Turbides, T.\n\n\n \n\n\n\n In Rabuñal Dopico, J. R.; Dorado, J.; and Pazos, A., editor(s), Encyclopedia of Artificial Intelligence, pages 1024–1028. IGI Global, 2009.\n \n\n\n\n
\n\n\n\n \n \n \"Learning-BasedPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@incollection{rabunal_dopico_learning-based_2009,\n\ttitle = {Learning-{Based} {Planning}:},\n\tisbn = {978-1-59904-849-9 978-1-59904-850-5},\n\tshorttitle = {Learning-{Based} {Planning}},\n\turl = {http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/978-1-59904-849-9.ch151},\n\tabstract = {Automated Planning (AP) studies the generation of action sequences for problem solving. A problem in AP is defined by a state-transition function describing the dynamics of the world, the initial state of the world and the goals to be achieved. According to this definition, AP problems seem to be easily tackled by searching for a path in a graph, which is a well-studied problem. However, the graphs resulting from AP problems are so large that explicitly specifying them is not feasible. Thus, different approaches have been tried to address AP problems. Since the mid 90’s, new planning algorithms have enabled the solution of practical-size AP problems. Nevertheless, domain-independent planners still fail in solving complex AP problems, as solving planning tasks is a PSPACE-Complete problem (Bylander, 94). How do humans cope with this planning-inherent complexity? One answer is that our experience allows us to solve problems more quickly; we are endowed with learning skills that help us plan when problems are selected from a stable population. Inspire by this idea, the field of learning-based planning studies the development of AP systems able to modify their performance according to previous experiences. Since the first days, Artificial Intelligence (AI) has been concerned with the problem of Machine Learning (ML). As early as 1959, Arthur L. Samuel developed a prominent program that learned to improve its play in the game of checkers (Samuel, 1959). It is hardly surprising that ML has often been used to make changes in systems that perform tasks associated with AI, such as perception, robot control or AP. This article analyses the diverse ways ML can be used to improve AP processes. First, we review the major AP concepts and summarize the main research done in learning-based planning. Second, we describe current trends in applying ML to AP. Finally, we comment on the next avenues for combining AP and ML and conclude.},\n\turldate = {2023-09-27},\n\tbooktitle = {Encyclopedia of {Artificial} {Intelligence}},\n\tpublisher = {IGI Global},\n\tauthor = {Jiménez Celorrio, Sergio and De La Rosa Turbides, Tomás},\n\teditor = {Rabuñal Dopico, Juan Ramón and Dorado, Julian and Pazos, Alejandro},\n\tyear = {2009},\n\tdoi = {10.4018/978-1-59904-849-9.ch151},\n\tpages = {1024--1028},\n}\n\n
\n
\n\n\n
\n Automated Planning (AP) studies the generation of action sequences for problem solving. A problem in AP is defined by a state-transition function describing the dynamics of the world, the initial state of the world and the goals to be achieved. According to this definition, AP problems seem to be easily tackled by searching for a path in a graph, which is a well-studied problem. However, the graphs resulting from AP problems are so large that explicitly specifying them is not feasible. Thus, different approaches have been tried to address AP problems. Since the mid 90’s, new planning algorithms have enabled the solution of practical-size AP problems. Nevertheless, domain-independent planners still fail in solving complex AP problems, as solving planning tasks is a PSPACE-Complete problem (Bylander, 94). How do humans cope with this planning-inherent complexity? One answer is that our experience allows us to solve problems more quickly; we are endowed with learning skills that help us plan when problems are selected from a stable population. Inspired by this idea, the field of learning-based planning studies the development of AP systems able to modify their performance according to previous experiences. Since the first days, Artificial Intelligence (AI) has been concerned with the problem of Machine Learning (ML). As early as 1959, Arthur L. Samuel developed a prominent program that learned to improve its play in the game of checkers (Samuel, 1959). It is hardly surprising that ML has often been used to make changes in systems that perform tasks associated with AI, such as perception, robot control or AP. This article analyses the diverse ways ML can be used to improve AP processes. First, we review the major AP concepts and summarize the main research done in learning-based planning. Second, we describe current trends in applying ML to AP. Finally, we comment on the next avenues for combining AP and ML and conclude.\n
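The abstract above frames an automated-planning problem as an initial state, a state-transition function, and a goal test, solvable in principle by path search over a graph that is never written out explicitly. The following minimal Python sketch (the helper callables successors and is_goal are hypothetical, not taken from the cited chapter) illustrates breadth-first search over such an implicitly defined state space.

    from collections import deque

    def plan_bfs(initial_state, successors, is_goal):
        """Breadth-first search over an implicitly defined state graph.

        successors(state) -> iterable of (action, next_state) pairs
        is_goal(state)    -> bool
        Returns a list of actions reaching a goal state, or None.
        """
        frontier = deque([(initial_state, [])])
        visited = {initial_state}           # states must be hashable
        while frontier:
            state, plan = frontier.popleft()
            if is_goal(state):
                return plan
            for action, nxt in successors(state):
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [action]))
        return None

The graph is only ever expanded on demand through successors, which is why the state space never has to be specified explicitly, exactly the point the abstract makes about the size of AP problems.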
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Jones, L.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Attention Is All You Need.\n \n \n \n \n\n\n \n Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; and Polosukhin, I.\n\n\n \n\n\n\n December 2017.\n arXiv:1706.03762 [cs]\n\n\n\n
\n\n\n\n \n \n \"AttentionPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n\n\n\n
\n
@misc{vaswani2017,\n\ttitle = {Attention {Is} {All} {You} {Need}},\n\turl = {http://arxiv.org/abs/1706.03762},\n\tdoi = {10.48550/arXiv.1706.03762},\n\tabstract = {The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.},\n\turldate = {2023-02-02},\n\tpublisher = {arXiv},\n\tauthor = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},\n\tmonth = dec,\n\tyear = {2017},\n\tnote = {arXiv:1706.03762 [cs]},\n\tkeywords = {Computer Science - Computation and Language, Computer Science - Machine Learning},\n}\n\n
\n
\n\n\n
\n The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.\n
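The abstract describes the Transformer as an architecture based solely on attention. As a reference point, here is a minimal NumPy sketch of the scaled dot-product attention softmax(QK^T / sqrt(d_k))V at the core of the paper; multi-head projections, masking, and the full encoder-decoder stack are omitted.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Single-head attention: softmax(Q K^T / sqrt(d_k)) V.

        Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)
        """
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                  # (n_queries, n_keys)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ V                               # (n_queries, d_v)

    # toy usage: 3 queries attending over 4 keys/values
    rng = np.random.default_rng(0)
    out = scaled_dot_product_attention(rng.normal(size=(3, 8)),
                                       rng.normal(size=(4, 8)),
                                       rng.normal(size=(4, 16)))
    print(out.shape)  # (3, 16)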
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Jurafsky, D.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition.\n \n \n \n\n\n \n Jurafsky, D.; and Martin, J. H.\n\n\n \n\n\n\n Prentice Hall, Upper Saddle River, NJ, 2nd edition, 2009.\n \n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@book{jurafsky_speech_2009,\n\taddress = {Upper Saddle River, NJ},\n\tedition = {2},\n\ttitle = {Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition},\n\tisbn = {978-0-13-187321-6},\n\tshorttitle = {Speech and language processing},\n\tlanguage = {eng},\n\tpublisher = {Prentice Hall},\n\tauthor = {Jurafsky, Dan and Martin, James H.},\n\tyear = {2009},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Kaiser, L.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Attention Is All You Need.\n \n \n \n \n\n\n \n Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; and Polosukhin, I.\n\n\n \n\n\n\n December 2017.\n arXiv:1706.03762 [cs]\n\n\n\n
\n\n\n\n \n \n \"AttentionPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n\n\n\n
\n
@misc{vaswani2017,\n\ttitle = {Attention {Is} {All} {You} {Need}},\n\turl = {http://arxiv.org/abs/1706.03762},\n\tdoi = {10.48550/arXiv.1706.03762},\n\tabstract = {The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.},\n\turldate = {2023-02-02},\n\tpublisher = {arXiv},\n\tauthor = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},\n\tmonth = dec,\n\tyear = {2017},\n\tnote = {arXiv:1706.03762 [cs]},\n\tkeywords = {Computer Science - Computation and Language, Computer Science - Machine Learning},\n}\n\n
\n
\n\n\n
\n The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Kautonen, H.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Understanding Artificial Intelligence in Research Libraries – Extensive Literature Review.\n \n \n \n \n\n\n \n Gasparini, A.; and Kautonen, H.\n\n\n \n\n\n\n LIBER Quarterly: The Journal of the Association of European Research Libraries, 32(1). January 2022.\n Number: 1\n\n\n\n
\n\n\n\n \n \n \"UnderstandingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@article{gasparini_understanding_2022,\n\ttitle = {Understanding {Artificial} {Intelligence} in {Research} {Libraries} – {Extensive} {Literature} {Review}},\n\tvolume = {32},\n\tcopyright = {Copyright (c) 2022 Andrea Gasparini, Heli Kautonen},\n\tissn = {2213-056X},\n\turl = {https://liberquarterly.eu/article/view/10934},\n\tdoi = {10.53377/lq.10934},\n\tabstract = {Artificial intelligence (AI) now forms a part of various activities in the academic world. AI will also affect how research libraries perform and carry out their services and how the various kinds of data they hold in their repositories will be used in the future. For the moment, the landscape is complex and unclear, and library personnel and leaders are uncertain about where they should lay the path ahead. This extensive literature review provides an overview of how research libraries understand, react to, and work with AI. This paper examines the roles conceived for libraries and librarians, their users, and AI. Finally, design thinking is presented as an approach to solving emerging issues with AI and opening up opportunities for this technology at a more strategic level.},\n\tlanguage = {en},\n\tnumber = {1},\n\turldate = {2023-09-28},\n\tjournal = {LIBER Quarterly: The Journal of the Association of European Research Libraries},\n\tauthor = {Gasparini, Andrea and Kautonen, Heli},\n\tmonth = jan,\n\tyear = {2022},\n\tnote = {Number: 1},\n\tkeywords = {literature review},\n}\n\n
\n
\n\n\n
\n Artificial intelligence (AI) now forms a part of various activities in the academic world. AI will also affect how research libraries perform and carry out their services and how the various kinds of data they hold in their repositories will be used in the future. For the moment, the landscape is complex and unclear, and library personnel and leaders are uncertain about where they should lay the path ahead. This extensive literature review provides an overview of how research libraries understand, react to, and work with AI. This paper examines the roles conceived for libraries and librarians, their users, and AI. Finally, design thinking is presented as an approach to solving emerging issues with AI and opening up opportunities for this technology at a more strategic level.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Kempf, N.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Recognition and Information Extraction in Historical Handwritten Tables: Toward Understanding Early 20th Century Paris Census.\n \n \n \n\n\n \n Constum, T.; Kempf, N.; Paquet, T.; Tranouez, P.; Chatelain, C.; Brée, S.; and Merveille, F.\n\n\n \n\n\n\n In Uchida, S.; Barney, E.; and Eglin, V., editor(s), Document Analysis Systems, pages 143–157, Cham, 2022. Springer International Publishing\n \n\n\n\n
\n\n\n\n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{constumRecognitionInformationExtraction2022,\n\taddress = {Cham},\n\ttitle = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}: {Toward} {Understanding} {Early} \\$\\$20{\\textasciicircum}\\{th\\}\\$\\${Century} {Paris} {Census}},\n\tisbn = {978-3-031-06555-2},\n\tshorttitle = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}},\n\tdoi = {10.1007/978-3-031-06555-2_10},\n\tabstract = {We aim to build a vast database (up to 9 million individuals) from the handwritten tabular nominal census of Paris of 1926, 1931 and 1936, each composed of about 100,000 handwritten simple pages in a tabular format. We created a complete pipeline that goes from the scan of double pages to text prediction while minimizing the need for segmentation labels. We describe how weighted finite state transducers, writer specialization and self-training further improved our results. We also introduce through this communication two annotated datasets for handwriting recognition that are now publicly available, and an open-source toolkit to apply WFST on CTC lattices.},\n\tlanguage = {en},\n\tbooktitle = {Document {Analysis} {Systems}},\n\tpublisher = {Springer International Publishing},\n\tauthor = {Constum, Thomas and Kempf, Nicolas and Paquet, Thierry and Tranouez, Pierrick and Chatelain, Clément and Brée, Sandra and Merveille, François},\n\teditor = {Uchida, Seiichi and Barney, Elisa and Eglin, Véronique},\n\tyear = {2022},\n\tkeywords = {Document layout analysis, Handwriting recognition, Self-training, Semi-supervised learning, Table analysis, WFST, handwritten text recognition, table recognition},\n\tpages = {143--157},\n}\n\n
\n
\n\n\n
\n We aim to build a vast database (up to 9 million individuals) from the handwritten tabular nominal census of Paris of 1926, 1931 and 1936, each composed of about 100,000 handwritten simple pages in a tabular format. We created a complete pipeline that goes from the scan of double pages to text prediction while minimizing the need for segmentation labels. We describe how weighted finite state transducers, writer specialization and self-training further improved our results. We also introduce through this communication two annotated datasets for handwriting recognition that are now publicly available, and an open-source toolkit to apply WFST on CTC lattices.\n
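The abstract mentions applying weighted finite state transducers to CTC lattices. As background for that step, the sketch below shows plain best-path (greedy) CTC decoding, i.e. how per-frame label scores are collapsed into a transcription before any lattice rescoring; it is an illustrative simplification, not the authors' pipeline.

    import numpy as np

    def ctc_greedy_decode(log_probs, alphabet, blank=0):
        """Best-path CTC decoding: take the argmax label per frame,
        collapse consecutive repeats, then drop blanks.

        log_probs: (n_frames, n_labels) per-frame label scores
        alphabet:  list mapping label index -> character (index `blank` unused)
        """
        best_path = log_probs.argmax(axis=-1)
        decoded, prev = [], blank
        for label in best_path:
            if label != prev and label != blank:
                decoded.append(alphabet[label])
            prev = label
        return "".join(decoded)

    # toy usage: alphabet with the blank symbol at index 0
    alphabet = ["<blank>", "a", "b"]
    frames = np.log(np.array([[0.1, 0.8, 0.1],    # a
                              [0.1, 0.8, 0.1],    # a (repeat, collapsed)
                              [0.8, 0.1, 0.1],    # blank
                              [0.1, 0.1, 0.8]]))  # b
    print(ctc_greedy_decode(frames, alphabet))    # prints "ab"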
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Khan, A.\n \n \n (2)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n A Comprehensive Overview of Large Language Models.\n \n \n \n \n\n\n \n Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A.\n\n\n \n\n\n\n December 2023.\n arXiv:2307.06435 [cs]\n\n\n\n
\n\n\n\n \n \n \"APaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@misc{naveed2023,\n\ttitle = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},\n\turl = {http://arxiv.org/abs/2307.06435},\n\tdoi = {10.48550/arXiv.2307.06435},\n\tabstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},\n\turldate = {2024-02-19},\n\tpublisher = {arXiv},\n\tauthor = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},\n\tmonth = dec,\n\tyear = {2023},\n\tnote = {arXiv:2307.06435 [cs]},\n\tkeywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},\n}\n\n
\n
\n\n\n
\n Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n A Comprehensive Overview of Large Language Models.\n \n \n \n \n\n\n \n Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A.\n\n\n \n\n\n\n December 2023.\n arXiv:2307.06435 [cs]\n\n\n\n
\n\n\n\n \n \n \"APaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@misc{naveed2023a,\n\ttitle = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},\n\turl = {http://arxiv.org/abs/2307.06435},\n\tdoi = {10.48550/arXiv.2307.06435},\n\tabstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},\n\turldate = {2024-02-19},\n\tpublisher = {arXiv},\n\tauthor = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},\n\tmonth = dec,\n\tyear = {2023},\n\tnote = {arXiv:2307.06435 [cs]},\n\tkeywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},\n}\n\n
\n
\n\n\n
\n Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Kirillov, A.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Detectron2.\n \n \n \n \n\n\n \n Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.; and Girshick, R.\n\n\n \n\n\n\n 2019.\n \n\n\n\n
\n\n\n\n \n \n \"Detectron2Paper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@misc{wu2019,\n\ttitle = {Detectron2},\n\turl = {https://github.com/facebookresearch/detectron2},\n\tauthor = {Wu, Yuxin and Kirillov, Alexander and Massa, Francisco and Lo, Wan-Yen and Girshick, Ross},\n\tyear = {2019},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Kleber, F.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Automatic Writer Identification in Historical Documents: A Case Study.\n \n \n \n \n\n\n \n Christlein, V.; Diem, M.; Kleber, F.; Mühlberger, G.; Schwägerl-Melchior, V.; Van Gelder, E.; and Maier, A.\n\n\n \n\n\n\n Zeitschrift für digitale Geisteswissenschaften. 2016.\n Publisher: HAB - Herzog August Bibliothek\n\n\n\n
\n\n\n\n \n \n \"AutomaticPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{christleinAutomaticWriterIdentification2016,\n\ttitle = {Automatic {Writer} {Identification} in {Historical} {Documents}: {A} {Case} {Study}},\n\tshorttitle = {Automatic {Writer} {Identification} in {Historical} {Documents}},\n\turl = {http://www.zfdg.de/2016_002},\n\tdoi = {10.17175/2016_002},\n\tlanguage = {en},\n\turldate = {2023-11-17},\n\tjournal = {Zeitschrift für digitale Geisteswissenschaften},\n\tauthor = {Christlein, Vincent and Diem, Markus and Kleber, Florian and Mühlberger, Günter and Schwägerl-Melchior, Verena and Van Gelder, Esther and Maier, Andreas},\n\tyear = {2016},\n\tnote = {Publisher: HAB - Herzog August Bibliothek},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Labahn, R.\n \n \n (3)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Cells in Multidimensional Recurrent Neural Networks.\n \n \n \n \n\n\n \n Leifert, G.; Strauß, T.; Grüning, T.; Wustlich, W.; and Labahn, R.\n\n\n \n\n\n\n Journal of Machine Learning Research, 17: 97:1–97:37. 2016.\n \n\n\n\n
\n\n\n\n \n \n \"CellsPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{leifert_cells_2016,\n\ttitle = {Cells in {Multidimensional} {Recurrent} {Neural} {Networks}},\n\tvolume = {17},\n\turl = {http://jmlr.org/papers/v17/14-203.html},\n\turldate = {2018-06-29},\n\tjournal = {Journal of Machine Learning Research},\n\tauthor = {Leifert, Gundram and Strauß, Tobias and Grüning, Tobias and Wustlich, Welf and Labahn, Roger},\n\tyear = {2016},\n\tpages = {97:1--97:37},\n}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n System Description of CITlab's Recognition & Retrieval Engine for ICDAR2017 Competition on Information Extraction in Historical Handwritten Records.\n \n \n \n \n\n\n \n Strauss, T.; Weidemann, M.; Michael, J.; Leifert, G.; Grüning, T.; and Labahn, R.\n\n\n \n\n\n\n CoRR, abs/1804.09943. 2018.\n \n\n\n\n
\n\n\n\n \n \n \"SystemPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{strauss_system_2018,\n\ttitle = {System {Description} of {CITlab}'s {Recognition} \\& {Retrieval} {Engine} for {ICDAR2017} {Competition} on {Information} {Extraction} in {Historical} {Handwritten} {Records}},\n\tvolume = {abs/1804.09943},\n\turl = {http://arxiv.org/abs/1804.09943},\n\turldate = {2018-06-29},\n\tjournal = {CoRR},\n\tauthor = {Strauss, Tobias and Weidemann, Max and Michael, Johannes and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},\n\tyear = {2018},\n}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Regular expressions for decoding of neural network outputs.\n \n \n \n \n\n\n \n Strauß, T.; Leifert, G.; Grüning, T.; and Labahn, R.\n\n\n \n\n\n\n CoRR, abs/1509.04438. 2015.\n \n\n\n\n
\n\n\n\n \n \n \"RegularPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{straus_regular_2015,\n\ttitle = {Regular expressions for decoding of neural network outputs},\n\tvolume = {abs/1509.04438},\n\turl = {http://arxiv.org/abs/1509.04438},\n\turldate = {2018-06-29},\n\tjournal = {CoRR},\n\tauthor = {Strauß, Tobias and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},\n\tyear = {2015},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Lauc, D.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Inferring standard name form, gender and nobility from historical texts using stable model semantics.\n \n \n \n\n\n \n Lauc, D.; and Vitek, D.\n\n\n \n\n\n\n Digital Humanities Quarterly, 015(1). May 2021.\n \n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@article{lauc_inferring_2021,\n\ttitle = {Inferring standard name form, gender and nobility from historical texts using stable model semantics},\n\tvolume = {015},\n\tissn = {1938-4122},\n\tnumber = {1},\n\tjournal = {Digital Humanities Quarterly},\n\tauthor = {Lauc, Davor and Vitek, Darko},\n\tmonth = may,\n\tyear = {2021},\n\tkeywords = {nlp},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Leblanc, E.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Impact of Image Enhancement Methods on Automatic Transcription Trainings with eScriptorium.\n \n \n \n \n\n\n \n Jacsont, P.; and Leblanc, E.\n\n\n \n\n\n\n June 2023.\n \n\n\n\n
\n\n\n\n \n \n \"ImpactPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@unpublished{jacsont2023,\n\ttitle = {Impact of {Image} {Enhancement} {Methods} on {Automatic} {Transcription} {Trainings} with {eScriptorium}},\n\turl = {https://hal.science/hal-03831686},\n\tabstract = {This study stems from the Desenrollando el cordel (Untangling the cordel) project, which focuses on 19th-century Spanish prints editing. It evaluates the impact of image enhancement methods on the automatic transcription of low-quality documents, both in terms of printing and digitisation. We compare different methods (binarisation, deblur) and present the results obtained during the training of models with the Kraken tool. We demonstrate that binarisation methods give better results than the other, and that the combination of several techniques did not significantly improve the transcription prediction. This study shows the significance of using image enhancement methods with Kraken. It paves the way for further experiments with larger and more varied corpora to help future projects design their automatic transcription workflow.},\n\turldate = {2024-05-03},\n\tauthor = {Jacsont, Pauline and Leblanc, Elina},\n\tmonth = jun,\n\tyear = {2023},\n\tkeywords = {Spanish literature, binarisation, image enhancement methods, printed documents},\n}\n\n
\n
\n\n\n
\n This study stems from the Desenrollando el cordel (Untangling the cordel) project, which focuses on 19th-century Spanish prints editing. It evaluates the impact of image enhancement methods on the automatic transcription of low-quality documents, both in terms of printing and digitisation. We compare different methods (binarisation, deblur) and present the results obtained during the training of models with the Kraken tool. We demonstrate that binarisation methods give better results than the others, and that the combination of several techniques did not significantly improve the transcription prediction. This study shows the significance of using image enhancement methods with Kraken. It paves the way for further experiments with larger and more varied corpora to help future projects design their automatic transcription workflow.\n
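The abstract reports that binarisation outperformed the other enhancement methods tested. For orientation, the following sketch applies two standard binarisation techniques with OpenCV (global Otsu and local adaptive thresholding); the file name is a placeholder and the paper's exact preprocessing may differ.

    import cv2

    # Load a scanned page in greyscale (the path is a placeholder).
    image = cv2.imread("page_scan.png", cv2.IMREAD_GRAYSCALE)

    # Global Otsu thresholding: one binarisation technique of the kind
    # compared in the study (the paper's exact pipeline may differ).
    _, binary = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Adaptive (local) thresholding, often more robust to uneven lighting.
    adaptive = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                     cv2.THRESH_BINARY, 31, 10)

    cv2.imwrite("page_scan_otsu.png", binary)
    cv2.imwrite("page_scan_adaptive.png", adaptive)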
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Lee, B.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Compounded Mediation: A Data Archaeology of the Newspaper Navigator Dataset.\n \n \n \n\n\n \n Lee, B.\n\n\n \n\n\n\n Digital Humanities Quarterly, 015(4).\n \n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{lee_compounded_nodate,\n\ttitle = {Compounded {Mediation}: {A} {Data} {Archaeology} of the {Newspaper} {Navigator} {Dataset}},\n\tvolume = {015},\n\tissn = {1938-4122},\n\tshorttitle = {Compounded {Mediation}},\n\tnumber = {4},\n\tjournal = {Digital Humanities Quarterly},\n\tauthor = {Lee, Benjamin},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Leifert, G.\n \n \n (3)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Cells in Multidimensional Recurrent Neural Networks.\n \n \n \n \n\n\n \n Leifert, G.; Strauß, T.; Grüning, T.; Wustlich, W.; and Labahn, R.\n\n\n \n\n\n\n Journal of Machine Learning Research, 17: 97:1–97:37. 2016.\n \n\n\n\n
\n\n\n\n \n \n \"CellsPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{leifert_cells_2016,\n\ttitle = {Cells in {Multidimensional} {Recurrent} {Neural} {Networks}},\n\tvolume = {17},\n\turl = {http://jmlr.org/papers/v17/14-203.html},\n\turldate = {2018-06-29},\n\tjournal = {Journal of Machine Learning Research},\n\tauthor = {Leifert, Gundram and Strauß, Tobias and Grüning, Tobias and Wustlich, Welf and Labahn, Roger},\n\tyear = {2016},\n\tpages = {97:1--97:37},\n}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n System Description of CITlab's Recognition & Retrieval Engine for ICDAR2017 Competition on Information Extraction in Historical Handwritten Records.\n \n \n \n \n\n\n \n Strauss, T.; Weidemann, M.; Michael, J.; Leifert, G.; Grüning, T.; and Labahn, R.\n\n\n \n\n\n\n CoRR, abs/1804.09943. 2018.\n \n\n\n\n
\n\n\n\n \n \n \"SystemPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{strauss_system_2018,\n\ttitle = {System {Description} of {CITlab}'s {Recognition} \\& {Retrieval} {Engine} for {ICDAR2017} {Competition} on {Information} {Extraction} in {Historical} {Handwritten} {Records}},\n\tvolume = {abs/1804.09943},\n\turl = {http://arxiv.org/abs/1804.09943},\n\turldate = {2018-06-29},\n\tjournal = {CoRR},\n\tauthor = {Strauss, Tobias and Weidemann, Max and Michael, Johannes and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},\n\tyear = {2018},\n}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Regular expressions for decoding of neural network outputs.\n \n \n \n \n\n\n \n Strauß, T.; Leifert, G.; Grüning, T.; and Labahn, R.\n\n\n \n\n\n\n CoRR, abs/1509.04438. 2015.\n \n\n\n\n
\n\n\n\n \n \n \"RegularPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{straus_regular_2015,\n\ttitle = {Regular expressions for decoding of neural network outputs},\n\tvolume = {abs/1509.04438},\n\turl = {http://arxiv.org/abs/1509.04438},\n\turldate = {2018-06-29},\n\tjournal = {CoRR},\n\tauthor = {Strauß, Tobias and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},\n\tyear = {2015},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Lemaitre, A.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n A Light Transformer-Based Architecture for Handwritten Text Recognition.\n \n \n \n\n\n \n Barrere, K.; Soullard, Y.; Lemaitre, A.; and Coüasnon, B.\n\n\n \n\n\n\n In Uchida, S.; Barney, E.; and Eglin, V., editor(s), Document Analysis Systems, of Lecture Notes in Computer Science, pages 275–290, Cham, 2022. Springer International Publishing\n \n\n\n\n
\n\n\n\n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{barrere_light_2022,\n\taddress = {Cham},\n\tseries = {Lecture {Notes} in {Computer} {Science}},\n\ttitle = {A {Light} {Transformer}-{Based} {Architecture} for {Handwritten} {Text} {Recognition}},\n\tisbn = {978-3-031-06555-2},\n\tdoi = {10.1007/978-3-031-06555-2_19},\n\tabstract = {Transformer models have been showing ground-breaking results in the domain of natural language processing. More recently, they started to gain interest in many others fields as in computer vision. Traditional Transformer models typically require a significant amount of training data to achieve satisfactory results. However, in the domain of handwritten text recognition, annotated data acquisition remains costly resulting in small datasets compared to those commonly used to train a Transformer-based model. Hence, training Transformer models able to transcribe handwritten text from images remains challenging. We propose a light encoder-decoder Transformer-based architecture for handwriting text recognition, containing a small number of parameters compared to traditional Transformer architectures. We trained our architecture using a hybrid loss, combining the well-known connectionist temporal classification with the cross-entropy. Experiments are conducted on the well-known IAM dataset with and without the use of additional synthetic data. We show that our network reaches state-of-the-art results in both cases, compared with other larger Transformer-based models.},\n\tlanguage = {en},\n\tbooktitle = {Document {Analysis} {Systems}},\n\tpublisher = {Springer International Publishing},\n\tauthor = {Barrere, Killian and Soullard, Yann and Lemaitre, Aurélie and Coüasnon, Bertrand},\n\teditor = {Uchida, Seiichi and Barney, Elisa and Eglin, Véronique},\n\tyear = {2022},\n\tkeywords = {Handwritten text recognition, Hybrid loss, Light network, Neural networks, Transformer},\n\tpages = {275--290},\n}\n\n
\n
\n\n\n
\n Transformer models have been showing ground-breaking results in the domain of natural language processing. More recently, they started to gain interest in many other fields, such as computer vision. Traditional Transformer models typically require a significant amount of training data to achieve satisfactory results. However, in the domain of handwritten text recognition, annotated data acquisition remains costly, resulting in small datasets compared to those commonly used to train a Transformer-based model. Hence, training Transformer models able to transcribe handwritten text from images remains challenging. We propose a light encoder-decoder Transformer-based architecture for handwriting text recognition, containing a small number of parameters compared to traditional Transformer architectures. We trained our architecture using a hybrid loss, combining the well-known connectionist temporal classification with the cross-entropy. Experiments are conducted on the well-known IAM dataset with and without the use of additional synthetic data. We show that our network reaches state-of-the-art results in both cases, compared with other larger Transformer-based models.\n
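The abstract describes training with a hybrid loss that combines connectionist temporal classification with cross-entropy. The PyTorch sketch below shows one plausible form of such a combination, a weighted sum of CTC on encoder outputs and cross-entropy on decoder outputs; the weighting, tensor shapes, and target handling are illustrative assumptions rather than the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def hybrid_htr_loss(enc_logits, enc_lengths, dec_logits, targets,
                        target_lengths, blank=0, lam=0.5):
        """Weighted sum of CTC (on encoder outputs) and cross-entropy
        (on decoder outputs), as a sketch of a hybrid HTR loss.

        enc_logits: (T, N, C) raw encoder scores over the character set
        dec_logits: (N, L, C) decoder scores aligned with `targets` (N, L)
        """
        log_probs = F.log_softmax(enc_logits, dim=-1)
        ctc = F.ctc_loss(log_probs, targets, enc_lengths, target_lengths,
                         blank=blank, zero_infinity=True)
        ce = F.cross_entropy(dec_logits.transpose(1, 2), targets)
        return lam * ctc + (1.0 - lam) * ce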
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Linzen, T.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Targeted Syntactic Evaluation of Language Models.\n \n \n \n \n\n\n \n Marvin, R.; and Linzen, T.\n\n\n \n\n\n\n August 2018.\n arXiv:1808.09031 [cs]\n\n\n\n
\n\n\n\n \n \n \"TargetedPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@misc{marvin_targeted_2018,\n\ttitle = {Targeted {Syntactic} {Evaluation} of {Language} {Models}},\n\turl = {http://arxiv.org/abs/1808.09031},\n\tdoi = {10.48550/arXiv.1808.09031},\n\tabstract = {We present a dataset for evaluating the grammaticality of the predictions of a language model. We automatically construct a large number of minimally different pairs of English sentences, each consisting of a grammatical and an ungrammatical sentence. The sentence pairs represent different variations of structure-sensitive phenomena: subject-verb agreement, reflexive anaphora and negative polarity items. We expect a language model to assign a higher probability to the grammatical sentence than the ungrammatical one. In an experiment using this data set, an LSTM language model performed poorly on many of the constructions. Multi-task training with a syntactic objective (CCG supertagging) improved the LSTM's accuracy, but a large gap remained between its performance and the accuracy of human participants recruited online. This suggests that there is considerable room for improvement over LSTMs in capturing syntax in a language model.},\n\turldate = {2022-09-02},\n\tpublisher = {arXiv},\n\tauthor = {Marvin, Rebecca and Linzen, Tal},\n\tmonth = aug,\n\tyear = {2018},\n\tnote = {arXiv:1808.09031 [cs]},\n\tkeywords = {Computer Science - Computation and Language},\n}\n
\n
\n\n\n
\n We present a dataset for evaluating the grammaticality of the predictions of a language model. We automatically construct a large number of minimally different pairs of English sentences, each consisting of a grammatical and an ungrammatical sentence. The sentence pairs represent different variations of structure-sensitive phenomena: subject-verb agreement, reflexive anaphora and negative polarity items. We expect a language model to assign a higher probability to the grammatical sentence than the ungrammatical one. In an experiment using this data set, an LSTM language model performed poorly on many of the constructions. Multi-task training with a syntactic objective (CCG supertagging) improved the LSTM's accuracy, but a large gap remained between its performance and the accuracy of human participants recruited online. This suggests that there is considerable room for improvement over LSTMs in capturing syntax in a language model.\n
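The evaluation protocol in the abstract reduces to checking whether a language model assigns higher probability to the grammatical member of a minimal pair. The sketch below scores such a pair with a Hugging Face GPT-2 model as a stand-in scorer (the paper evaluated LSTM language models); the example sentences are illustrative, not drawn from the dataset.

    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    def avg_log_prob(sentence: str) -> float:
        """Mean per-token log-probability of the sentence under the model.
        For minimal pairs of (near-)equal length, the ranking matches the
        total sentence log-probability."""
        enc = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc, labels=enc["input_ids"])
        return -out.loss.item()  # loss is the mean negative log-likelihood

    # a subject-verb agreement minimal pair (illustrative)
    grammatical = "The authors laugh."
    ungrammatical = "The authors laughs."
    print(avg_log_prob(grammatical) > avg_log_prob(ungrammatical))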
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Lo, W.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Detectron2.\n \n \n \n \n\n\n \n Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.; and Girshick, R.\n\n\n \n\n\n\n 2019.\n \n\n\n\n
\n\n\n\n \n \n \"Detectron2Paper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@misc{wu2019,\n\ttitle = {Detectron2},\n\turl = {https://github.com/facebookresearch/detectron2},\n\tauthor = {Wu, Yuxin and Kirillov, Alexander and Massa, Francisco and Lo, Wan-Yen and Girshick, Ross},\n\tyear = {2019},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Lomas, A.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Turing’s Genius – Defining an apt microcosm.\n \n \n \n \n\n\n \n Bowen, J.; Trickett, T.; Green, J. B. A.; and Lomas, A.\n\n\n \n\n\n\n In July 2018. BCS Learning & Development\n \n\n\n\n
\n\n\n\n \n \n \"Turing’sPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{bowen_turings_2018,\n\ttitle = {Turing’s {Genius} – {Defining} an apt microcosm},\n\turl = {https://www.scienceopen.com/hosted-document?doi=10.14236/ewic/EVA2018.31},\n\tdoi = {10.14236/ewic/EVA2018.31},\n\tabstract = {Alan Turing (1912–1954) is widely acknowledged as a genius. As well as codebreaking during World War II and taking a pioneering role in computer hardware design and software after the War, he also wrote three important foundational papers in the fields of theoretical computer science, artificial intelligence, and mathematical biology. He has been called the father of computer science, but he also admired by mathematicians, philosophers, and perhaps more surprisingly biologists, for his wide-ranging ideas. His influence stretches from scientific to cultural and even political impact. For all these reasons, he was a true polymath. This paper considers the genius of Turing from various angles, both scientific and artistic. The four authors provide position statements on how Turing has influenced and inspired their work, together with short biographies, as a starting point for a panel session and visual music performance.},\n\turldate = {2023-09-27},\n\tpublisher = {BCS Learning \\& Development},\n\tauthor = {Bowen, Jonathan and Trickett, Terry and Green, Jeremy B. A. and Lomas, Andy},\n\tmonth = jul,\n\tyear = {2018},\n}\n\n
\n
\n\n\n
\n Alan Turing (1912–1954) is widely acknowledged as a genius. As well as codebreaking during World War II and taking a pioneering role in computer hardware design and software after the War, he also wrote three important foundational papers in the fields of theoretical computer science, artificial intelligence, and mathematical biology. He has been called the father of computer science, but he is also admired by mathematicians, philosophers, and perhaps more surprisingly biologists, for his wide-ranging ideas. His influence stretches from scientific to cultural and even political impact. For all these reasons, he was a true polymath. This paper considers the genius of Turing from various angles, both scientific and artistic. The four authors provide position statements on how Turing has influenced and inspired their work, together with short biographies, as a starting point for a panel session and visual music performance.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Luan, D.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Language Models are Unsupervised Multitask Learners.\n \n \n \n \n\n\n \n Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; and Sutskever, I.\n\n\n \n\n\n\n In 2019. \n \n\n\n\n
\n\n\n\n \n \n \"LanguagePaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{radford_language_2019,\n\ttitle = {Language {Models} are {Unsupervised} {Multitask} {Learners}},\n\turl = {https://www.semanticscholar.org/paper/Language-Models-are-Unsupervised-Multitask-Learners-Radford-Wu/9405cc0d6169988371b2755e573cc28650d14dfe},\n\tabstract = {Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.},\n\turldate = {2023-02-02},\n\tauthor = {Radford, Alec and Wu, Jeff and Child, Rewon and Luan, D. and Amodei, Dario and Sutskever, Ilya},\n\tyear = {2019},\n}\n\n
\n
\n\n\n
\n Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Luger, G.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Artificial intelligence: structures and strategies for complex problem solving.\n \n \n \n\n\n \n Luger, G. F.\n\n\n \n\n\n\n Pearson Addison-Wesley, Boston, 6th edition, 2009.\n OCLC: ocn183611012\n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@book{luger_artificial_2009,\n\taddress = {Boston},\n\tedition = {6th ed},\n\ttitle = {Artificial intelligence: structures and strategies for complex problem solving},\n\tisbn = {978-0-321-54589-3},\n\tshorttitle = {Artificial intelligence},\n\tpublisher = {Pearson Addison-Wesley},\n\tauthor = {Luger, George F.},\n\tyear = {2009},\n\tnote = {OCLC: ocn183611012},\n\tkeywords = {Artificial intelligence, Knowledge representation (Information theory), LISP (Computer program language), Problem solving, Prolog (Computer program language)},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Ma, R.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance.\n \n \n \n \n\n\n \n Tao, Y.; Jia, Z.; Ma, R.; and Xu, S.\n\n\n \n\n\n\n Electronics, 10(22): 2780. January 2021.\n Number: 22 Publisher: Multidisciplinary Digital Publishing Institute\n\n\n\n
\n\n\n\n \n \n \"TRIG:Paper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{tao_trig_2021,\n\ttitle = {{TRIG}: {Transformer}-{Based} {Text} {Recognizer} with {Initial} {Embedding} {Guidance}},\n\tvolume = {10},\n\tcopyright = {http://creativecommons.org/licenses/by/3.0/},\n\tissn = {2079-9292},\n\tshorttitle = {{TRIG}},\n\turl = {https://www.mdpi.com/2079-9292/10/22/2780},\n\tdoi = {10.3390/electronics10222780},\n\tabstract = {Scene text recognition (STR) is an important bridge between images and text, attracting abundant research attention. While convolutional neural networks (CNNS) have achieved remarkable progress in this task, most of the existing works need an extra module (context modeling module) to help CNN to capture global dependencies to solve the inductive bias and strengthen the relationship between text features. Recently, the transformer has been proposed as a promising network for global context modeling by self-attention mechanism, but one of the main short-comings, when applied to recognition, is the efficiency. We propose a 1-D split to address the challenges of complexity and replace the CNN with the transformer encoder to reduce the need for a context modeling module. Furthermore, recent methods use a frozen initial embedding to guide the decoder to decode the features to text, leading to a loss of accuracy. We propose to use a learnable initial embedding learned from the transformer encoder to make it adaptive to different input images. Above all, we introduce a novel architecture for text recognition, named TRansformer-based text recognizer with Initial embedding Guidance (TRIG), composed of three stages (transformation, feature extraction, and prediction). Extensive experiments show that our approach can achieve state-of-the-art on text recognition benchmarks.},\n\tlanguage = {en},\n\tnumber = {22},\n\turldate = {2023-09-29},\n\tjournal = {Electronics},\n\tauthor = {Tao, Yue and Jia, Zhiwei and Ma, Runze and Xu, Shugong},\n\tmonth = jan,\n\tyear = {2021},\n\tnote = {Number: 22\nPublisher: Multidisciplinary Digital Publishing Institute},\n\tkeywords = {1-D split, initial embedding, scene text recognition, self-attention, transformer},\n\tpages = {2780},\n}\n\n

Maier, A. (1)

Automatic Writer Identification in Historical Documents: A Case Study. Christlein, V.; Diem, M.; Kleber, F.; Mühlberger, G.; Schwägerl-Melchior, V.; Van Gelder, E.; and Maier, A. Zeitschrift für digitale Geisteswissenschaften. 2016. Publisher: HAB - Herzog August Bibliothek.

@article{christleinAutomaticWriterIdentification2016,
    title = {Automatic {Writer} {Identification} in {Historical} {Documents}: {A} {Case} {Study}},
    shorttitle = {Automatic {Writer} {Identification} in {Historical} {Documents}},
    url = {http://www.zfdg.de/2016_002},
    doi = {10.17175/2016_002},
    language = {en},
    urldate = {2023-11-17},
    journal = {Zeitschrift für digitale Geisteswissenschaften},
    author = {Christlein, Vincent and Diem, Markus and Kleber, Florian and Mühlberger, Günter and Schwägerl-Melchior, Verena and Van Gelder, Esther and Maier, Andreas},
    year = {2016},
    note = {Publisher: HAB - Herzog August Bibliothek},
}

Malik, J. (1)

Artificial intelligence: a modern approach. Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J. Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.

@book{russell_artificial_2022,
    address = {Harlow},
    edition = {Fourth edition, global edition},
    series = {Pearson series in artificial intelligence},
    title = {Artificial intelligence: a modern approach},
    isbn = {978-1-292-40113-3},
    shorttitle = {Artificial intelligence},
    abstract = {"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control"},
    language = {eng},
    publisher = {Pearson},
    author = {Russell, Stuart J. and Norvig, Peter and Chang, Ming-wei and Devlin, Jacob and Dragan, Anca and Forsyth, David and Goodfellow, Ian and Malik, Jitendra and Mansinghka, Vikas and Pearl, Judea and Wooldridge, Michael J.},
    year = {2022},
}

Mansinghka, V. (1)

Artificial intelligence: a modern approach. Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J. Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.

@book{russell_artificial_2022,
    address = {Harlow},
    edition = {Fourth edition, global edition},
    series = {Pearson series in artificial intelligence},
    title = {Artificial intelligence: a modern approach},
    isbn = {978-1-292-40113-3},
    shorttitle = {Artificial intelligence},
    abstract = {"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control"},
    language = {eng},
    publisher = {Pearson},
    author = {Russell, Stuart J. and Norvig, Peter and Chang, Ming-wei and Devlin, Jacob and Dragan, Anca and Forsyth, David and Goodfellow, Ian and Malik, Jitendra and Mansinghka, Vikas and Pearl, Judea and Wooldridge, Michael J.},
    year = {2022},
}

Martin, J. (1)

Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition. Jurafsky, D.; and Martin, J. H. Prentice Hall, Upper Saddle River, NJ, 2nd edition, 2009.

@book{jurafsky_speech_2009,
    address = {Upper Saddle River, NJ},
    edition = {2},
    title = {Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition},
    isbn = {978-0-13-187321-6},
    shorttitle = {Speech and language processing},
    language = {eng},
    publisher = {Prentice Hall},
    author = {Jurafsky, Dan and Martin, James H.},
    year = {2009},
}

Marvin, R. (1)

Targeted Syntactic Evaluation of Language Models. Marvin, R.; and Linzen, T. August 2018. arXiv:1808.09031 [cs].

@misc{marvin_targeted_2018,
    title = {Targeted {Syntactic} {Evaluation} of {Language} {Models}},
    url = {http://arxiv.org/abs/1808.09031},
    doi = {10.48550/arXiv.1808.09031},
    abstract = {We present a dataset for evaluating the grammaticality of the predictions of a language model. We automatically construct a large number of minimally different pairs of English sentences, each consisting of a grammatical and an ungrammatical sentence. The sentence pairs represent different variations of structure-sensitive phenomena: subject-verb agreement, reflexive anaphora and negative polarity items. We expect a language model to assign a higher probability to the grammatical sentence than the ungrammatical one. In an experiment using this data set, an LSTM language model performed poorly on many of the constructions. Multi-task training with a syntactic objective (CCG supertagging) improved the LSTM's accuracy, but a large gap remained between its performance and the accuracy of human participants recruited online. This suggests that there is considerable room for improvement over LSTMs in capturing syntax in a language model.},
    urldate = {2022-09-02},
    publisher = {arXiv},
    author = {Marvin, Rebecca and Linzen, Tal},
    month = aug,
    year = {2018},
    note = {arXiv:1808.09031 [cs]},
    keywords = {Computer Science - Computation and Language},
}
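
The evaluation idea summarized in the abstract of this entry, namely that a language model should assign higher probability to the grammatical member of a minimal pair, can be sketched in a few lines. The snippet below is only an illustration, not the authors' code: the paper evaluated LSTM language models, whereas here GPT-2 from the Hugging Face transformers library stands in as the scorer, and the sentence pair is a made-up example.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_logprob(sentence: str) -> float:
    # Total log-probability of the sentence under the model:
    # model(..., labels=ids) returns the mean per-token negative log-likelihood,
    # so multiply back by the number of predicted tokens.
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

grammatical = "The authors near the senator are famous."      # illustrative pair,
ungrammatical = "The authors near the senator is famous."     # not from the dataset
# The model "passes" this minimal pair if it prefers the grammatical sentence.
print(sentence_logprob(grammatical) > sentence_logprob(ungrammatical))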

Massa, F. (1)

Detectron2. Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.; and Girshick, R. 2019.

@misc{wu2019,
    title = {Detectron2},
    url = {https://github.com/facebookresearch/detectron2},
    author = {Wu, Yuxin and Kirillov, Alexander and Massa, Francisco and Lo, Wan-Yen and Girshick, Ross},
    year = {2019},
}
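
Because Detectron2 is cited here as a library rather than a paper, a minimal usage sketch may help. It follows the library's usual config-and-predictor pattern; the particular model zoo config, the score threshold, and the input file name are illustrative assumptions, not prescriptions from the repository.

import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
# Load a pre-trained Faster R-CNN from the model zoo (illustrative choice).
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # keep detections above this confidence

predictor = DefaultPredictor(cfg)
image = cv2.imread("page_scan.jpg")   # hypothetical input image
outputs = predictor(image)            # dict with an "instances" field
print(outputs["instances"].pred_classes, outputs["instances"].pred_boxes)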

Merveille, F. (1)

Recognition and Information Extraction in Historical Handwritten Tables: Toward Understanding Early 20th Century Paris Census. Constum, T.; Kempf, N.; Paquet, T.; Tranouez, P.; Chatelain, C.; Brée, S.; and Merveille, F. In Uchida, S.; Barney, E.; and Eglin, V., editors, Document Analysis Systems, pages 143–157, Cham, 2022. Springer International Publishing.

@inproceedings{constumRecognitionInformationExtraction2022,
    address = {Cham},
    title = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}: {Toward} {Understanding} {Early} 20th {Century} {Paris} {Census}},
    isbn = {978-3-031-06555-2},
    shorttitle = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}},
    doi = {10.1007/978-3-031-06555-2_10},
    abstract = {We aim to build a vast database (up to 9 million individuals) from the handwritten tabular nominal census of Paris of 1926, 1931 and 1936, each composed of about 100,000 handwritten simple pages in a tabular format. We created a complete pipeline that goes from the scan of double pages to text prediction while minimizing the need for segmentation labels. We describe how weighted finite state transducers, writer specialization and self-training further improved our results. We also introduce through this communication two annotated datasets for handwriting recognition that are now publicly available, and an open-source toolkit to apply WFST on CTC lattices.},
    language = {en},
    booktitle = {Document {Analysis} {Systems}},
    publisher = {Springer International Publishing},
    author = {Constum, Thomas and Kempf, Nicolas and Paquet, Thierry and Tranouez, Pierrick and Chatelain, Clément and Brée, Sandra and Merveille, François},
    editor = {Uchida, Seiichi and Barney, Elisa and Eglin, Véronique},
    year = {2022},
    keywords = {Document layout analysis, Handwriting recognition, Self-training, Semi-supervised learning, Table analysis, WFST, handwritten text recognition, table recognition},
    pages = {143--157},
}

Mian, A. (2)

A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}
A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023a,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

Michael, J. (1)

System Description of CITlab's Recognition & Retrieval Engine for ICDAR2017 Competition on Information Extraction in Historical Handwritten Records. Strauss, T.; Weidemann, M.; Michael, J.; Leifert, G.; Grüning, T.; and Labahn, R. CoRR, abs/1804.09943. 2018.

@article{strauss_system_2018,
    title = {System {Description} of {CITlab}'s {Recognition} \& {Retrieval} {Engine} for {ICDAR2017} {Competition} on {Information} {Extraction} in {Historical} {Handwritten} {Records}},
    volume = {abs/1804.09943},
    url = {http://arxiv.org/abs/1804.09943},
    urldate = {2018-06-29},
    journal = {CoRR},
    author = {Strauss, Tobias and Weidemann, Max and Michael, Johannes and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},
    year = {2018},
}

Mitchell, M. (1)

Model Cards for Model Reporting. Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T. Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. January 2019. arXiv: 1810.03993.

@article{mitchell_model_2019,
    title = {Model {Cards} for {Model} {Reporting}},
    url = {http://arxiv.org/abs/1810.03993},
    doi = {10.1145/3287560.3287596},
    abstract = {Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.},
    urldate = {2022-01-24},
    journal = {Proceedings of the Conference on Fairness, Accountability, and Transparency},
    author = {Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit},
    month = jan,
    year = {2019},
    note = {arXiv: 1810.03993},
    keywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},
    pages = {220--229},
}
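
As a rough illustration of the reporting framework described in this entry, a model card can be thought of as a small structured record attached to a released model. The field names below are paraphrased from the kinds of sections the abstract mentions (intended use, evaluation factors and metrics, ethical considerations); they are not the paper's exact schema, and the example model is the smile detector mentioned in the abstract.

# A minimal, illustrative model-card skeleton; section names are approximate.
model_card = {
    "model_details": {"name": "smile-detector", "version": "1.0", "type": "image classifier"},
    "intended_use": "Detect smiling faces in consented photographs; not for surveillance.",
    "factors": ["age group", "perceived gender", "Fitzpatrick skin type"],
    "metrics": ["false positive rate", "false negative rate"],
    "evaluation_data": "Held-out benchmark, disaggregated by the factors above",
    "quantitative_analysis": {"overall_accuracy": None, "per_group_accuracy": {}},
    "ethical_considerations": "Document known failure modes and misuse risks.",
}

for section, content in model_card.items():
    print(f"{section}: {content}")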

Mitkov, R. (1)

The Oxford Handbook of Computational Linguistics. Mitkov, R. Oxford University Press, June 2022. Google-Books-ID: CnpzEAAAQBAJ.

@book{mitkov_oxford_2022,
    title = {The {Oxford} {Handbook} of {Computational} {Linguistics}},
    isbn = {978-0-19-162554-1},
    abstract = {Ruslan Mitkov's highly successful Oxford Handbook of Computational Linguistics has been substantially revised and expanded in this second edition. Alongside updated accounts of the topics covered in the first edition, it includes 17 new chapters on subjects such as semantic role-labelling, text-to-speech synthesis, translation technology, opinion mining and sentiment analysis, and the application of Natural Language Processing in educational and biomedical contexts, among many others. The volume is divided into four parts that examine, respectively: the linguistic fundamentals of computational linguistics; the methods and resources used, such as statistical modelling, machine learning, and corpus annotation; key language processing tasks including text segmentation, anaphora resolution, and speech recognition; and the major applications of Natural Language Processing, from machine translation to author profiling. The book will be an essential reference for researchers and students in computational linguistics and Natural Language Processing, as well as those working in related industries.},
    language = {en},
    publisher = {Oxford University Press},
    author = {Mitkov, Ruslan},
    month = jun,
    year = {2022},
    note = {Google-Books-ID: CnpzEAAAQBAJ},
    keywords = {Computers / Artificial Intelligence / Natural Language Processing, Computers / Computer Science, Language Arts \& Disciplines / Linguistics / General, Language Arts \& Disciplines / Linguistics / Syntax, Language Arts \& Disciplines / Translating \& Interpreting},
}

Monnier, T. (1)

Unsupervised Layered Image Decomposition into Object Prototypes. Monnier, T.; Vincent, E.; Ponce, J.; and Aubry, M. August 2021. arXiv:2104.14575 [cs].

@misc{monnier_unsupervised_2021,
    title = {Unsupervised {Layered} {Image} {Decomposition} into {Object} {Prototypes}},
    url = {http://arxiv.org/abs/2104.14575},
    doi = {10.48550/arXiv.2104.14575},
    abstract = {We present an unsupervised learning framework for decomposing images into layers of automatically discovered object models. Contrary to recent approaches that model image layers with autoencoder networks, we represent them as explicit transformations of a small set of prototypical images. Our model has three main components: (i) a set of object prototypes in the form of learnable images with a transparency channel, which we refer to as sprites; (ii) differentiable parametric functions predicting occlusions and transformation parameters necessary to instantiate the sprites in a given image; (iii) a layered image formation model with occlusion for compositing these instances into complete images including background. By jointly learning the sprites and occlusion/transformation predictors to reconstruct images, our approach not only yields accurate layered image decompositions, but also identifies object categories and instance parameters. We first validate our approach by providing results on par with the state of the art on standard multi-object synthetic benchmarks (Tetrominoes, Multi-dSprites, CLEVR6). We then demonstrate the applicability of our model to real images in tasks that include clustering (SVHN, GTSRB), cosegmentation (Weizmann Horse) and object discovery from unfiltered social network images. To the best of our knowledge, our approach is the first layered image decomposition algorithm that learns an explicit and shared concept of object type, and is robust enough to be applied to real images.},
    urldate = {2022-09-30},
    publisher = {arXiv},
    author = {Monnier, Tom and Vincent, Elliot and Ponce, Jean and Aubry, Mathieu},
    month = aug,
    year = {2021},
    note = {arXiv:2104.14575 [cs]},
    keywords = {Computer Science - Computer Vision and Pattern Recognition},
}

Mühlberger, G. (1)

Automatic Writer Identification in Historical Documents: A Case Study. Christlein, V.; Diem, M.; Kleber, F.; Mühlberger, G.; Schwägerl-Melchior, V.; Van Gelder, E.; and Maier, A. Zeitschrift für digitale Geisteswissenschaften. 2016. Publisher: HAB - Herzog August Bibliothek.

@article{christleinAutomaticWriterIdentification2016,
    title = {Automatic {Writer} {Identification} in {Historical} {Documents}: {A} {Case} {Study}},
    shorttitle = {Automatic {Writer} {Identification} in {Historical} {Documents}},
    url = {http://www.zfdg.de/2016_002},
    doi = {10.17175/2016_002},
    language = {en},
    urldate = {2023-11-17},
    journal = {Zeitschrift für digitale Geisteswissenschaften},
    author = {Christlein, Vincent and Diem, Markus and Kleber, Florian and Mühlberger, Günter and Schwägerl-Melchior, Verena and Van Gelder, Esther and Maier, Andreas},
    year = {2016},
    note = {Publisher: HAB - Herzog August Bibliothek},
}

Naveed, H. (2)

A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}
A Comprehensive Overview of Large Language Models. Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. December 2023. arXiv:2307.06435 [cs].

@misc{naveed2023a,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

Nilsson, N. (1)

The Quest for Artificial Intelligence. Nilsson, N. J. Cambridge University Press, Cambridge, 2009.

@book{nilsson2009,
    address = {Cambridge},
    title = {The {Quest} for {Artificial} {Intelligence}},
    isbn = {978-0-521-11639-8},
    url = {https://www.cambridge.org/core/books/quest-for-artificial-intelligence/32C727961B24223BBB1B3511F44F343E},
    abstract = {Artificial intelligence (AI) is a field within computer science that is attempting to build enhanced intelligence into computer systems. This book traces the history of the subject, from the early dreams of eighteenth-century (and earlier) pioneers to the more successful work of today's AI engineers. AI is becoming more and more a part of everyone's life. The technology is already embedded in face-recognizing cameras, speech-recognition software, Internet search engines, and health-care robots, among other applications. The book's many diagrams and easy-to-understand descriptions of AI programs will help the casual reader gain an understanding of how these and other AI systems actually work. Its thorough (but unobtrusive) end-of-chapter notes containing citations to important source materials will be of great use to AI scholars and researchers. This book promises to be the definitive history of a field that has captivated the imaginations of scientists, philosophers, and writers for centuries.},
    urldate = {2024-01-23},
    publisher = {Cambridge University Press},
    author = {Nilsson, Nils J.},
    year = {2009},
    doi = {10.1017/CBO9780511819346},
}

Norvig, P. (1)

Artificial intelligence: a modern approach. Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J. Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.

@book{russell_artificial_2022,
    address = {Harlow},
    edition = {Fourth edition, global edition},
    series = {Pearson series in artificial intelligence},
    title = {Artificial intelligence: a modern approach},
    isbn = {978-1-292-40113-3},
    shorttitle = {Artificial intelligence},
    abstract = {"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control"},
    language = {eng},
    publisher = {Pearson},
    author = {Russell, Stuart J. and Norvig, Peter and Chang, Ming-wei and Devlin, Jacob and Dragan, Anca and Forsyth, David and Goodfellow, Ian and Malik, Jitendra and Mansinghka, Vikas and Pearl, Judea and Wooldridge, Michael J.},
    year = {2022},
}

Papazoglou, A. (1)

Stavronikita Monastery Greek handwritten document Collection no.53 [Data set]. Pratikakis, I.; Papazoglou, A.; Symeonidis, S.; and Tsochatzidis, L. October 2021.

@misc{pratikakis_stavronikita_2021,
    title = {Stavronikita {Monastery} {Greek} handwritten document {Collection} no.53 [{Data} set]},
    url = {https://zenodo.org/record/5595669},
    doi = {10.5281/zenodo.5595669},
    abstract = {The collection is one of the oldest Stavronikita Monastery on Mount Athos. It is a parchment, four-gospel manuscript which has been written between 1301 and 1350. It comprises 54 pages with dimensions that are approximately 250x185 mm. The script is elegant minuscule and the use of majuscule letters is rare. Tachygraphical symbols and abbreviations are encountered in the manuscript as well. Furthermore, the manuscript is enriched with chrysography, elegant epititles and initials. The dataset of ΧΦ53 consists of 1038 lines of text containing 5592 words (2374 unique words) that are distributed over 54 scanned handwritten text pages.},
    urldate = {2023-03-31},
    publisher = {Zenodo},
    author = {Pratikakis, Ioannis and Papazoglou, Aleksandros and Symeonidis, Symeon and Tsochatzidis, Lazaros},
    month = oct,
    year = {2021},
    keywords = {greek, handwritten, miniscule, transcription},
}

Paquet, T. (1)

Recognition and Information Extraction in Historical Handwritten Tables: Toward Understanding Early 20th Century Paris Census. Constum, T.; Kempf, N.; Paquet, T.; Tranouez, P.; Chatelain, C.; Brée, S.; and Merveille, F. In Uchida, S.; Barney, E.; and Eglin, V., editors, Document Analysis Systems, pages 143–157, Cham, 2022. Springer International Publishing.

@inproceedings{constumRecognitionInformationExtraction2022,
    address = {Cham},
    title = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}: {Toward} {Understanding} {Early} 20th {Century} {Paris} {Census}},
    isbn = {978-3-031-06555-2},
    shorttitle = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}},
    doi = {10.1007/978-3-031-06555-2_10},
    abstract = {We aim to build a vast database (up to 9 million individuals) from the handwritten tabular nominal census of Paris of 1926, 1931 and 1936, each composed of about 100,000 handwritten simple pages in a tabular format. We created a complete pipeline that goes from the scan of double pages to text prediction while minimizing the need for segmentation labels. We describe how weighted finite state transducers, writer specialization and self-training further improved our results. We also introduce through this communication two annotated datasets for handwriting recognition that are now publicly available, and an open-source toolkit to apply WFST on CTC lattices.},
    language = {en},
    booktitle = {Document {Analysis} {Systems}},
    publisher = {Springer International Publishing},
    author = {Constum, Thomas and Kempf, Nicolas and Paquet, Thierry and Tranouez, Pierrick and Chatelain, Clément and Brée, Sandra and Merveille, François},
    editor = {Uchida, Seiichi and Barney, Elisa and Eglin, Véronique},
    year = {2022},
    keywords = {Document layout analysis, Handwriting recognition, Self-training, Semi-supervised learning, Table analysis, WFST, handwritten text recognition, table recognition},
    pages = {143--157},
}

Parmar, N. (1)

Attention Is All You Need. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; and Polosukhin, I. December 2017. arXiv:1706.03762 [cs].

@misc{vaswani2017,
    title = {Attention {Is} {All} {You} {Need}},
    url = {http://arxiv.org/abs/1706.03762},
    doi = {10.48550/arXiv.1706.03762},
    abstract = {The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.},
    urldate = {2023-02-02},
    publisher = {arXiv},
    author = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},
    month = dec,
    year = {2017},
    note = {arXiv:1706.03762 [cs]},
    keywords = {Computer Science - Computation and Language, Computer Science - Machine Learning},
}
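
Because the self-attention mechanism named in this entry recurs in several other items in this list (for example the TRIG recognizer and the LLM overview), a minimal numeric sketch of scaled dot-product attention is given below. It is not the paper's code; the array names and sizes are arbitrary illustrations.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K: (seq_len, d_k); V: (seq_len, d_v). Each output row is a weighted
    # average of the rows of V, with weights from a softmax over Q @ K^T / sqrt(d_k).
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # -> (4, 8)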

Pavlopoulos, J. (1)

Restoring and attributing ancient texts using deep neural networks. Assael, Y.; Sommerschield, T.; Shillingford, B.; Bordbar, M.; Pavlopoulos, J.; Chatzipanagiotou, M.; Androutsopoulos, I.; Prag, J.; and de Freitas, N. Nature, 603(7900): 280–283. March 2022. Number: 7900. Publisher: Nature Publishing Group.

@article{assael_restoring_2022,
    title = {Restoring and attributing ancient texts using deep neural networks},
    volume = {603},
    copyright = {2022 The Author(s)},
    issn = {1476-4687},
    url = {https://www.nature.com/articles/s41586-022-04448-z/},
    doi = {10.1038/s41586-022-04448-z},
    abstract = {Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62\% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25\% to 72\%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71\% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.},
    language = {en},
    number = {7900},
    urldate = {2022-09-28},
    journal = {Nature},
    author = {Assael, Yannis and Sommerschield, Thea and Shillingford, Brendan and Bordbar, Mahyar and Pavlopoulos, John and Chatzipanagiotou, Marita and Androutsopoulos, Ion and Prag, Jonathan and de Freitas, Nando},
    month = mar,
    year = {2022},
    note = {Number: 7900
Publisher: Nature Publishing Group},
    keywords = {Archaeology, Computer science, History},
    pages = {280--283},
}
\n Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25% to 72%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.\n

Pearl, J. (1)

Artificial intelligence: a modern approach.
Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J.
Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.

@book{russell_artificial_2022,\n\taddress = {Harlow},\n\tedition = {Fourth edition, global edition},\n\tseries = {Pearson series in artificial intelligence},\n\ttitle = {Artificial intelligence: a modern approach},\n\tisbn = {978-1-292-40113-3},\n\tshorttitle = {Artificial intelligence},\n\tabstract = {"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control"},\n\tlanguage = {eng},\n\tpublisher = {Pearson},\n\tauthor = {Russell, Stuart J. and Norvig, Peter and Chang, Ming-wei and Devlin, Jacob and Dragan, Anca and Forsyth, David and Goodfellow, Ian and Malik, Jitendra and Mansinghka, Vikas and Pearl, Judea and Wooldridge, Michael J.},\n\tyear = {2022},\n}\n\n
\n \"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control\"\n

Pfeifer, R. (1)

How the Body Shapes the Way We Think: A New View of Intelligence.
Pfeifer, R.; and Bongard, J.
2007.

@book{pfeifer_how_2007,\n\ttitle = {How the {Body} {Shapes} the {Way} {We} {Think}: {A} {New} {View} of {Intelligence}},\n\tisbn = {978-0-262-16239-5},\n\tabstract = {On Embodiment in AI-development},\n\tauthor = {Pfeifer, Rolf and Bongard, Josh},\n\tyear = {2007},\n}\n\n
\n On Embodiment in AI-development\n

Pinche, A. (2)

Generic HTR Models for Medieval Manuscripts. The CREMMALab Project.
Pinche, A.
Journal of Data Mining & Digital Humanities, Historical Documents and.... October 2023.

@article{pinche2023a,\n\ttitle = {Generic {HTR} {Models} for {Medieval} {Manuscripts}. {The} {CREMMALab} {Project}},\n\tvolume = {Historical Documents and...},\n\tissn = {2416-5999},\n\turl = {https://jdmdh.episciences.org/10252},\n\tdoi = {10.46298/jdmdh.10252},\n\tabstract = {In the Humanities, the emergence of digital methods has opened up research to quantitative analysis and/or to publication of large corpora. To produce more textual data faster, automatic text recognition technology (ATR)1 is increasingly involved in research projects following precursors such as the Himanis project. However, many research teams have limited resources, either financially or in terms of their expertise in artificial intelligence. It may therefore be difficult to integrate ATR into their project pipeline if they need to train a model or to create data from scratch. The goal here is not to explain how to build or improve a new ATR engine, nor to find a way to automatically align a pre-existing corpus with an image to quickly create ground truths for training. This paper aims to help humanists develop models for medieval manuscripts, create and gather training data by knowing the issues underlying their choices. The objective is also to show the importance of data consistency as a prerequisite for building homogeneous corpora and training more accurate models. We will present an overview of our work and experiment in the CREMMALab project (2021-2022), showing first how we ensure the consistency of the data and then how we have developed a generic model for medieval French manuscripts from the 13th to the 15th century, ready to be shared (more than 94\\% accuracy) and/or fine-tuned by other projects.},\n\tlanguage = {en},\n\turldate = {2024-01-03},\n\tjournal = {Journal of Data Mining \\& Digital Humanities},\n\tauthor = {Pinche, Ariane},\n\tmonth = oct,\n\tyear = {2023},\n}\n\n
\n In the Humanities, the emergence of digital methods has opened up research to quantitative analysis and/or to publication of large corpora. To produce more textual data faster, automatic text recognition technology (ATR)1 is increasingly involved in research projects following precursors such as the Himanis project. However, many research teams have limited resources, either financially or in terms of their expertise in artificial intelligence. It may therefore be difficult to integrate ATR into their project pipeline if they need to train a model or to create data from scratch. The goal here is not to explain how to build or improve a new ATR engine, nor to find a way to automatically align a pre-existing corpus with an image to quickly create ground truths for training. This paper aims to help humanists develop models for medieval manuscripts, create and gather training data by knowing the issues underlying their choices. The objective is also to show the importance of data consistency as a prerequisite for building homogeneous corpora and training more accurate models. We will present an overview of our work and experiment in the CREMMALab project (2021-2022), showing first how we ensure the consistency of the data and then how we have developed a generic model for medieval French manuscripts from the 13th to the 15th century, ready to be shared (more than 94% accuracy) and/or fine-tuned by other projects.\n

Generic HTR Models for Medieval Manuscripts The CREMMALab Project.
Pinche, A.
February 2023.

@misc{pinche_generic_2023,\n\ttitle = {Generic {HTR} {Models} for {Medieval} {Manuscripts} {The} {CREMMALab} {Project}},\n\turl = {https://hal.science/hal-03837519},\n\tlanguage = {en},\n\turldate = {2023-02-21},\n\tauthor = {Pinche, Ariane},\n\tmonth = feb,\n\tyear = {2023},\n\tkeywords = {HTR, dataset, medieval, model, text, transcription},\n}\n\n

Polosukhin, I. (1)

Attention Is All You Need.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; and Polosukhin, I.
December 2017.
arXiv:1706.03762 [cs].

@misc{vaswani2017,\n\ttitle = {Attention {Is} {All} {You} {Need}},\n\turl = {http://arxiv.org/abs/1706.03762},\n\tdoi = {10.48550/arXiv.1706.03762},\n\tabstract = {The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.},\n\turldate = {2023-02-02},\n\tpublisher = {arXiv},\n\tauthor = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},\n\tmonth = dec,\n\tyear = {2017},\n\tnote = {arXiv:1706.03762 [cs]},\n\tkeywords = {Computer Science - Computation and Language, Computer Science - Machine Learning},\n}\n\n
\n The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.\n

Ponce, J. (1)

Unsupervised Layered Image Decomposition into Object Prototypes.
Monnier, T.; Vincent, E.; Ponce, J.; and Aubry, M.
August 2021.
arXiv:2104.14575 [cs].

@misc{monnier_unsupervised_2021,\n\ttitle = {Unsupervised {Layered} {Image} {Decomposition} into {Object} {Prototypes}},\n\turl = {http://arxiv.org/abs/2104.14575},\n\tdoi = {10.48550/arXiv.2104.14575},\n\tabstract = {We present an unsupervised learning framework for decomposing images into layers of automatically discovered object models. Contrary to recent approaches that model image layers with autoencoder networks, we represent them as explicit transformations of a small set of prototypical images. Our model has three main components: (i) a set of object prototypes in the form of learnable images with a transparency channel, which we refer to as sprites; (ii) differentiable parametric functions predicting occlusions and transformation parameters necessary to instantiate the sprites in a given image; (iii) a layered image formation model with occlusion for compositing these instances into complete images including background. By jointly learning the sprites and occlusion/transformation predictors to reconstruct images, our approach not only yields accurate layered image decompositions, but also identifies object categories and instance parameters. We first validate our approach by providing results on par with the state of the art on standard multi-object synthetic benchmarks (Tetrominoes, Multi-dSprites, CLEVR6). We then demonstrate the applicability of our model to real images in tasks that include clustering (SVHN, GTSRB), cosegmentation (Weizmann Horse) and object discovery from unfiltered social network images. To the best of our knowledge, our approach is the first layered image decomposition algorithm that learns an explicit and shared concept of object type, and is robust enough to be applied to real images.},\n\turldate = {2022-09-30},\n\tpublisher = {arXiv},\n\tauthor = {Monnier, Tom and Vincent, Elliot and Ponce, Jean and Aubry, Mathieu},\n\tmonth = aug,\n\tyear = {2021},\n\tnote = {arXiv:2104.14575 [cs]},\n\tkeywords = {Computer Science - Computer Vision and Pattern Recognition},\n}\n\n
\n We present an unsupervised learning framework for decomposing images into layers of automatically discovered object models. Contrary to recent approaches that model image layers with autoencoder networks, we represent them as explicit transformations of a small set of prototypical images. Our model has three main components: (i) a set of object prototypes in the form of learnable images with a transparency channel, which we refer to as sprites; (ii) differentiable parametric functions predicting occlusions and transformation parameters necessary to instantiate the sprites in a given image; (iii) a layered image formation model with occlusion for compositing these instances into complete images including background. By jointly learning the sprites and occlusion/transformation predictors to reconstruct images, our approach not only yields accurate layered image decompositions, but also identifies object categories and instance parameters. We first validate our approach by providing results on par with the state of the art on standard multi-object synthetic benchmarks (Tetrominoes, Multi-dSprites, CLEVR6). We then demonstrate the applicability of our model to real images in tasks that include clustering (SVHN, GTSRB), cosegmentation (Weizmann Horse) and object discovery from unfiltered social network images. To the best of our knowledge, our approach is the first layered image decomposition algorithm that learns an explicit and shared concept of object type, and is robust enough to be applied to real images.\n

Prag, J. (1)

Restoring and attributing ancient texts using deep neural networks.
Assael, Y.; Sommerschield, T.; Shillingford, B.; Bordbar, M.; Pavlopoulos, J.; Chatzipanagiotou, M.; Androutsopoulos, I.; Prag, J.; and de Freitas, N.
Nature, 603(7900): 280–283. March 2022.
Number: 7900. Publisher: Nature Publishing Group.

@article{assael_restoring_2022,\n\ttitle = {Restoring and attributing ancient texts using deep neural networks},\n\tvolume = {603},\n\tcopyright = {2022 The Author(s)},\n\tissn = {1476-4687},\n\turl = {https://www.nature.com/articles/s41586-022-04448-z/},\n\tdoi = {10.1038/s41586-022-04448-z},\n\tabstract = {Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62\\% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25\\% to 72\\%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71\\% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.},\n\tlanguage = {en},\n\tnumber = {7900},\n\turldate = {2022-09-28},\n\tjournal = {Nature},\n\tauthor = {Assael, Yannis and Sommerschield, Thea and Shillingford, Brendan and Bordbar, Mahyar and Pavlopoulos, John and Chatzipanagiotou, Marita and Androutsopoulos, Ion and Prag, Jonathan and de Freitas, Nando},\n\tmonth = mar,\n\tyear = {2022},\n\tnote = {Number: 7900\nPublisher: Nature Publishing Group},\n\tkeywords = {Archaeology, Computer science, History},\n\tpages = {280--283},\n}\n\n
\n Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25% to 72%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.\n

Pratikakis, I. (1)

Stavronikita Monastery Greek handwritten document Collection no.53 [Data set].
Pratikakis, I.; Papazoglou, A.; Symeonidis, S.; and Tsochatzidis, L.
October 2021.

@misc{pratikakis_stavronikita_2021,\n\ttitle = {Stavronikita {Monastery} {Greek} handwritten document {Collection} no.53 [{Data} set]},\n\turl = {https://zenodo.org/record/5595669},\n\tdoi = {10.5281/zenodo.5595669},\n\tabstract = {The collection is one of the oldest Stavronikita Monastery on Mount Athos. It is a parchment, four-gospel manuscript which has been written between 1301 and 1350. It comprises 54 pages with dimensions that are approximately 250x185 mm. The script is elegant minuscule and the use of majuscule letters is rare. Tachygraphical symbols and abbreviations are encountered in the manuscript as well. Furthermore, the manuscript is enriched with chrysography, elegant epititles and initials. The dataset of ΧΦ53 consists of 1038 lines of text containing 5592 words (2374 unique words) that are distributed over 54 scanned handwritten text pages.},\n\turldate = {2023-03-31},\n\tpublisher = {Zenodo},\n\tauthor = {Pratikakis, Ioannis and Papazoglou, Aleksandros and Symeonidis, Symeon and Tsochatzidis, Lazaros},\n\tmonth = oct,\n\tyear = {2021},\n\tkeywords = {greek, handwritten, miniscule, transcription},\n}\n\n
\n The collection is one of the oldest Stavronikita Monastery on Mount Athos. It is a parchment, four-gospel manuscript which has been written between 1301 and 1350. It comprises 54 pages with dimensions that are approximately 250x185 mm. The script is elegant minuscule and the use of majuscule letters is rare. Tachygraphical symbols and abbreviations are encountered in the manuscript as well. Furthermore, the manuscript is enriched with chrysography, elegant epititles and initials. The dataset of ΧΦ53 consists of 1038 lines of text containing 5592 words (2374 unique words) that are distributed over 54 scanned handwritten text pages.\n

Puppe, F. (1)

Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition.
Wick, C.; Reul, C.; and Puppe, F.
Digital Humanities Quarterly, 14(1). 2020.

@article{wick2020,\n\ttitle = {Calamari - {A} {High}-{Performance} {Tensorflow}-based {Deep} {Learning} {Package} for {Optical} {Character} {Recognition}},\n\tvolume = {14},\n\tnumber = {1},\n\tjournal = {Digital Humanities Quarterly},\n\tauthor = {Wick, Christoph and Reul, Christian and Puppe, Frank},\n\tyear = {2020},\n}\n\n

Qiu, S. (2)

A Comprehensive Overview of Large Language Models.
Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A.
December 2023.
arXiv:2307.06435 [cs].

@misc{naveed2023,\n\ttitle = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},\n\turl = {http://arxiv.org/abs/2307.06435},\n\tdoi = {10.48550/arXiv.2307.06435},\n\tabstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},\n\turldate = {2024-02-19},\n\tpublisher = {arXiv},\n\tauthor = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},\n\tmonth = dec,\n\tyear = {2023},\n\tnote = {arXiv:2307.06435 [cs]},\n\tkeywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},\n}\n\n
\n Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.\n

A Comprehensive Overview of Large Language Models.
Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A.
December 2023.
arXiv:2307.06435 [cs].

@misc{naveed2023a,\n\ttitle = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},\n\turl = {http://arxiv.org/abs/2307.06435},\n\tdoi = {10.48550/arXiv.2307.06435},\n\tabstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},\n\turldate = {2024-02-19},\n\tpublisher = {arXiv},\n\tauthor = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},\n\tmonth = dec,\n\tyear = {2023},\n\tnote = {arXiv:2307.06435 [cs]},\n\tkeywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},\n}\n\n
\n Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.\n

Radford, A. (1)

Language Models are Unsupervised Multitask Learners.
Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; and Sutskever, I.
2019.

@inproceedings{radford_language_2019,\n\ttitle = {Language {Models} are {Unsupervised} {Multitask} {Learners}},\n\turl = {https://www.semanticscholar.org/paper/Language-Models-are-Unsupervised-Multitask-Learners-Radford-Wu/9405cc0d6169988371b2755e573cc28650d14dfe},\n\tabstract = {Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.},\n\turldate = {2023-02-02},\n\tauthor = {Radford, Alec and Wu, Jeff and Child, Rewon and Luan, D. and Amodei, Dario and Sutskever, Ilya},\n\tyear = {2019},\n}\n\n
\n Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.\n

Raji, I. (1)

Model Cards for Model Reporting.
Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T.
Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. January 2019.
arXiv: 1810.03993.

@article{mitchell_model_2019,\n\ttitle = {Model {Cards} for {Model} {Reporting}},\n\turl = {http://arxiv.org/abs/1810.03993},\n\tdoi = {10.1145/3287560.3287596},\n\tabstract = {Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.},\n\turldate = {2022-01-24},\n\tjournal = {Proceedings of the Conference on Fairness, Accountability, and Transparency},\n\tauthor = {Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit},\n\tmonth = jan,\n\tyear = {2019},\n\tnote = {arXiv: 1810.03993},\n\tkeywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},\n\tpages = {220--229},\n}\n\n
\n Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.\n

Reul, C. (1)

Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition.
Wick, C.; Reul, C.; and Puppe, F.
Digital Humanities Quarterly, 14(1). 2020.

@article{wick2020,\n\ttitle = {Calamari - {A} {High}-{Performance} {Tensorflow}-based {Deep} {Learning} {Package} for {Optical} {Character} {Recognition}},\n\tvolume = {14},\n\tnumber = {1},\n\tjournal = {Digital Humanities Quarterly},\n\tauthor = {Wick, Christoph and Reul, Christian and Puppe, Frank},\n\tyear = {2020},\n}\n\n

Romary, L. (1)

HTR-United: Mutualisons la vérité de terrain!
Chagué, A.; Clérice, T.; and Romary, L.
In DHNord2021 - Publier, partager, réutiliser les données de la recherche: les data papers et leurs enjeux, 2021.

@inproceedings{chague_htr-united_2021,\n\ttitle = {{HTR}-{United}: {Mutualisons} la vérité de terrain!},\n\tshorttitle = {{HTR}-{United}},\n\turl = {https://hal.science/hal-03398740/document},\n\turldate = {2023-10-27},\n\tbooktitle = {{DHNord2021}-{Publier}, partager, réutiliser les données de la recherche: les data papers et leurs enjeux},\n\tauthor = {Chagué, Alix and Clérice, Thibault and Romary, Laurent},\n\tyear = {2021},\n}\n\n

Russell, S. (1)

Artificial intelligence: a modern approach.
Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J.
Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.

@book{russell_artificial_2022,\n\taddress = {Harlow},\n\tedition = {Fourth edition, global edition},\n\tseries = {Pearson series in artificial intelligence},\n\ttitle = {Artificial intelligence: a modern approach},\n\tisbn = {978-1-292-40113-3},\n\tshorttitle = {Artificial intelligence},\n\tabstract = {"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control"},\n\tlanguage = {eng},\n\tpublisher = {Pearson},\n\tauthor = {Russell, Stuart J. and Norvig, Peter and Chang, Ming-wei and Devlin, Jacob and Dragan, Anca and Forsyth, David and Goodfellow, Ian and Malik, Jitendra and Mansinghka, Vikas and Pearl, Judea and Wooldridge, Michael J.},\n\tyear = {2022},\n}\n\n
\n \"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control\"\n

Saqib, M. (2)

A Comprehensive Overview of Large Language Models.
Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A.
December 2023.
arXiv:2307.06435 [cs].

@misc{naveed2023,\n\ttitle = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},\n\turl = {http://arxiv.org/abs/2307.06435},\n\tdoi = {10.48550/arXiv.2307.06435},\n\tabstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},\n\turldate = {2024-02-19},\n\tpublisher = {arXiv},\n\tauthor = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},\n\tmonth = dec,\n\tyear = {2023},\n\tnote = {arXiv:2307.06435 [cs]},\n\tkeywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},\n}\n\n
\n Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.\n

A Comprehensive Overview of Large Language Models.
Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A.
December 2023.
arXiv:2307.06435 [cs].

@misc{naveed2023a,\n\ttitle = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},\n\turl = {http://arxiv.org/abs/2307.06435},\n\tdoi = {10.48550/arXiv.2307.06435},\n\tabstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},\n\turldate = {2024-02-19},\n\tpublisher = {arXiv},\n\tauthor = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},\n\tmonth = dec,\n\tyear = {2023},\n\tnote = {arXiv:2307.06435 [cs]},\n\tkeywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},\n}\n\n
\n Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.\n

Sarker, M. (1)

Neuro-symbolic approaches in artificial intelligence.
Hitzler, P.; Eberhart, A.; Ebrahimi, M.; Sarker, M. K.; and Zhou, L.
National Science Review, 9(6): nwac035. June 2022.

@article{hitzler2022,\n\ttitle = {Neuro-symbolic approaches in artificial intelligence},\n\tvolume = {9},\n\tissn = {2095-5138},\n\turl = {https://doi.org/10.1093/nsr/nwac035},\n\tdoi = {10.1093/nsr/nwac035},\n\tnumber = {6},\n\turldate = {2024-01-23},\n\tjournal = {National Science Review},\n\tauthor = {Hitzler, Pascal and Eberhart, Aaron and Ebrahimi, Monireh and Sarker, Md Kamruzzaman and Zhou, Lu},\n\tmonth = jun,\n\tyear = {2022},\n\tpages = {nwac035},\n}\n\n

Scheurer, P. (1)

Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung.
Ströbel, P.; Hodel, T.; Fischer, A.; Scius, A.; Wolf, B.; Janka, A.; Widmer, J.; Scheurer, P.; and Volk, M.
March 2023.
Publisher: Zenodo.

@article{strobel2023a,\n\ttitle = {Bullingers {Briefwechsel} zugänglich machen: {Stand} der {Handschriftenerkennung}},\n\tcopyright = {Creative Commons Attribution 4.0 International, Open Access},\n\tshorttitle = {Bullingers {Briefwechsel} zugänglich machen},\n\turl = {https://zenodo.org/record/7715357},\n\tdoi = {10.5281/ZENODO.7715357},\n\tabstract = {"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt." Ein Beitrag zur 9. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.},\n\turldate = {2024-04-22},\n\tauthor = {Ströbel, Phillip and Hodel, Tobias and Fischer, Andreas and Scius, Anna and Wolf, Beat and Janka, Anna and Widmer, Jonas and Scheurer, Patricia and Volk, Martin},\n\tcollaborator = {Trilcke, Peer and Busch, Anna and Helling, Patrick and Plum, Alistair and Wolter, Vivien and Weis, Joëlle and Chudoba, Hendrik},\n\tmonth = mar,\n\tyear = {2023},\n\tnote = {Publisher: [object Object]},\n\tkeywords = {Annotieren, Bewertung, DHd2023, Data augmentation, Daten, Handschriftenerkennung, Manuskript, Transkription, maschinelles Lernen},\n}\n\n
\n \"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt.\" Ein Beitrag zur 9. Tagung des Verbands \"Digital Humanities im deutschsprachigen Raum\" - DHd 2023 Open Humanities Open Culture.\n

Schmidhuber, J. (2)

LSTM can solve hard long time lag problems.
Hochreiter, S.; and Schmidhuber, J.
Advances in neural information processing systems, 9. 1996.

@article{hochreiter_lstm_1996,\n\ttitle = {{LSTM} can solve hard long time lag problems},\n\tvolume = {9},\n\turl = {https://proceedings.neurips.cc/paper/1996/hash/a4d2f0d23dcc84ce983ff9157f8b7f88-Abstract.html},\n\turldate = {2023-09-27},\n\tjournal = {Advances in neural information processing systems},\n\tauthor = {Hochreiter, Sepp and Schmidhuber, Jürgen},\n\tyear = {1996},\n}\n\n

A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks.
Schmidhuber, J.
Connection Science, 1(4): 403–412. January 1989.

@article{schmidhuber_local_1989,\n\ttitle = {A {Local} {Learning} {Algorithm} for {Dynamic} {Feedforward} and {Recurrent} {Networks}},\n\tvolume = {1},\n\tissn = {0954-0091, 1360-0494},\n\turl = {https://www.tandfonline.com/doi/full/10.1080/09540098908915650},\n\tdoi = {10.1080/09540098908915650},\n\tlanguage = {en},\n\tnumber = {4},\n\turldate = {2023-09-27},\n\tjournal = {Connection Science},\n\tauthor = {Schmidhuber, Jurgen},\n\tmonth = jan,\n\tyear = {1989},\n\tpages = {403--412},\n}\n\n

Schwägerl-Melchior, V. (1)

Automatic Writer Identification in Historical Documents: A Case Study.
Christlein, V.; Diem, M.; Kleber, F.; Mühlberger, G.; Schwägerl-Melchior, V.; Van Gelder, E.; and Maier, A.
Zeitschrift für digitale Geisteswissenschaften. 2016.
Publisher: HAB - Herzog August Bibliothek.

@article{christleinAutomaticWriterIdentification2016,\n\ttitle = {Automatic {Writer} {Identification} in {Historical} {Documents}: {A} {Case} {Study}},\n\tshorttitle = {Automatic {Writer} {Identification} in {Historical} {Documents}},\n\turl = {http://www.zfdg.de/2016_002},\n\tdoi = {10.17175/2016_002},\n\tlanguage = {en},\n\turldate = {2023-11-17},\n\tjournal = {Zeitschrift für digitale Geisteswissenschaften},\n\tauthor = {Christlein, Vincent and Diem, Markus and Kleber, Florian and Mühlberger, Günter and Schwägerl-Melchior, Verena and Van Gelder, Esther and Maier, Andreas},\n\tyear = {2016},\n\tnote = {Publisher: HAB - Herzog August Bibliothek},\n}\n\n
Scius, A. (1)

Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung. Ströbel, P.; Hodel, T.; Fischer, A.; Scius, A.; Wolf, B.; Janka, A.; Widmer, J.; Scheurer, P.; and Volk, M. March 2023. Publisher: Zenodo.
@article{strobel2023a,\n\ttitle = {Bullingers {Briefwechsel} zugänglich machen: {Stand} der {Handschriftenerkennung}},\n\tcopyright = {Creative Commons Attribution 4.0 International, Open Access},\n\tshorttitle = {Bullingers {Briefwechsel} zugänglich machen},\n\turl = {https://zenodo.org/record/7715357},\n\tdoi = {10.5281/ZENODO.7715357},\n\tabstract = {"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt." Ein Beitrag zur 9. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.},\n\turldate = {2024-04-22},\n\tauthor = {Ströbel, Phillip and Hodel, Tobias and Fischer, Andreas and Scius, Anna and Wolf, Beat and Janka, Anna and Widmer, Jonas and Scheurer, Patricia and Volk, Martin},\n\tcollaborator = {Trilcke, Peer and Busch, Anna and Helling, Patrick and Plum, Alistair and Wolter, Vivien and Weis, Joëlle and Chudoba, Hendrik},\n\tmonth = mar,\n\tyear = {2023},\n\tnote = {Publisher: [object Object]},\n\tkeywords = {Annotieren, Bewertung, DHd2023, Data augmentation, Daten, Handschriftenerkennung, Manuskript, Transkription, maschinelles Lernen},\n}\n\n
"Using the correspondence of Heinrich Bullinger (1504-1575), which comprises around 10,000 letters, we demonstrate the state of research in automated handwritten text recognition. The letters contain more than one hundred different writers' hands with a very uneven distribution. The corpus is bilingual (Latin/German), and the language sometimes switches within paragraphs or even within sentences. Because of this diversity, the correspondence is an ideal test environment for such algorithms and is instructive for research projects and memory institutions facing similar problems. The paper compares and weighs three methods against each other. Three approaches are tested on the corpus, promising insights into the current state and possible developments in handwritten text recognition. First, Transkribus, an established platform offering two engines (HTR+ and PyLaia), is used. Second, data augmentation is applied in an attempt to improve recognition with the state-of-the-art engine HTRFlor, and third, new Transformer-based models (TrOCR) are employed." A contribution to the 9th conference of the association "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.
Scius-Bertrand, A. (1)

The Bullinger Dataset: A Writer Adaptation Challenge. Scius-Bertrand, A.; Ströbel, P.; Volk, M.; Hodel, T.; and Fischer, A. In Fink, G. A.; Jain, R.; Kise, K.; and Zanibbi, R., editor(s), Document Analysis and Recognition - ICDAR 2023, volume 14187, pages 397–410. Springer Nature Switzerland, Cham, 2023. Series Title: Lecture Notes in Computer Science.
@incollection{fink_bullinger_2023,\n\taddress = {Cham},\n\ttitle = {The {Bullinger} {Dataset}: {A} {Writer} {Adaptation} {Challenge}},\n\tvolume = {14187},\n\tisbn = {978-3-031-41675-0 978-3-031-41676-7},\n\tshorttitle = {The {Bullinger} {Dataset}},\n\turl = {https://link.springer.com/10.1007/978-3-031-41676-7_23},\n\tlanguage = {en},\n\turldate = {2023-08-24},\n\tbooktitle = {Document {Analysis} and {Recognition} - {ICDAR} 2023},\n\tpublisher = {Springer Nature Switzerland},\n\tauthor = {Scius-Bertrand, Anna and Ströbel, Phillip and Volk, Martin and Hodel, Tobias and Fischer, Andreas},\n\teditor = {Fink, Gernot A. and Jain, Rajiv and Kise, Koichi and Zanibbi, Richard},\n\tyear = {2023},\n\tdoi = {10.1007/978-3-031-41676-7_23},\n\tnote = {Series Title: Lecture Notes in Computer Science},\n\tpages = {397--410},\n}\n\n
Shazeer, N. (1)

Attention Is All You Need. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; and Polosukhin, I. December 2017. arXiv:1706.03762 [cs].
@misc{vaswani2017,\n\ttitle = {Attention {Is} {All} {You} {Need}},\n\turl = {http://arxiv.org/abs/1706.03762},\n\tdoi = {10.48550/arXiv.1706.03762},\n\tabstract = {The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.},\n\turldate = {2023-02-02},\n\tpublisher = {arXiv},\n\tauthor = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},\n\tmonth = dec,\n\tyear = {2017},\n\tnote = {arXiv:1706.03762 [cs]},\n\tkeywords = {Computer Science - Computation and Language, Computer Science - Machine Learning},\n}\n\n
\n The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.\n
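To make the central mechanism of the abstract concrete, here is a minimal NumPy sketch of scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V; the toy shapes are assumptions for illustration, not the paper's implementation:

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over the key axis
    return weights @ V                                   # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)       # (3, 8)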
Shillingford, B. (1)

Restoring and attributing ancient texts using deep neural networks. Assael, Y.; Sommerschield, T.; Shillingford, B.; Bordbar, M.; Pavlopoulos, J.; Chatzipanagiotou, M.; Androutsopoulos, I.; Prag, J.; and de Freitas, N. Nature, 603(7900): 280–283. March 2022. Number: 7900. Publisher: Nature Publishing Group.
@article{assael_restoring_2022,\n\ttitle = {Restoring and attributing ancient texts using deep neural networks},\n\tvolume = {603},\n\tcopyright = {2022 The Author(s)},\n\tissn = {1476-4687},\n\turl = {https://www.nature.com/articles/s41586-022-04448-z/},\n\tdoi = {10.1038/s41586-022-04448-z},\n\tabstract = {Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62\\% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25\\% to 72\\%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71\\% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.},\n\tlanguage = {en},\n\tnumber = {7900},\n\turldate = {2022-09-28},\n\tjournal = {Nature},\n\tauthor = {Assael, Yannis and Sommerschield, Thea and Shillingford, Brendan and Bordbar, Mahyar and Pavlopoulos, John and Chatzipanagiotou, Marita and Androutsopoulos, Ion and Prag, Jonathan and de Freitas, Nando},\n\tmonth = mar,\n\tyear = {2022},\n\tnote = {Number: 7900\nPublisher: Nature Publishing Group},\n\tkeywords = {Archaeology, Computer science, History},\n\tpages = {280--283},\n}\n\n
\n Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25% to 72%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.\n
Sommerschield, T. (1)

Restoring and attributing ancient texts using deep neural networks. Assael, Y.; Sommerschield, T.; Shillingford, B.; Bordbar, M.; Pavlopoulos, J.; Chatzipanagiotou, M.; Androutsopoulos, I.; Prag, J.; and de Freitas, N. Nature, 603(7900): 280–283. March 2022. Number: 7900. Publisher: Nature Publishing Group.
@article{assael_restoring_2022,\n\ttitle = {Restoring and attributing ancient texts using deep neural networks},\n\tvolume = {603},\n\tcopyright = {2022 The Author(s)},\n\tissn = {1476-4687},\n\turl = {https://www.nature.com/articles/s41586-022-04448-z/},\n\tdoi = {10.1038/s41586-022-04448-z},\n\tabstract = {Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62\\% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25\\% to 72\\%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71\\% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.},\n\tlanguage = {en},\n\tnumber = {7900},\n\turldate = {2022-09-28},\n\tjournal = {Nature},\n\tauthor = {Assael, Yannis and Sommerschield, Thea and Shillingford, Brendan and Bordbar, Mahyar and Pavlopoulos, John and Chatzipanagiotou, Marita and Androutsopoulos, Ion and Prag, Jonathan and de Freitas, Nando},\n\tmonth = mar,\n\tyear = {2022},\n\tnote = {Number: 7900\nPublisher: Nature Publishing Group},\n\tkeywords = {Archaeology, Computer science, History},\n\tpages = {280--283},\n}\n\n
\n Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25% to 72%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.\n
Soullard, Y. (1)

A Light Transformer-Based Architecture for Handwritten Text Recognition. Barrere, K.; Soullard, Y.; Lemaitre, A.; and Coüasnon, B. In Uchida, S.; Barney, E.; and Eglin, V., editor(s), Document Analysis Systems, of Lecture Notes in Computer Science, pages 275–290, Cham, 2022. Springer International Publishing.
@inproceedings{barrere_light_2022,\n\taddress = {Cham},\n\tseries = {Lecture {Notes} in {Computer} {Science}},\n\ttitle = {A {Light} {Transformer}-{Based} {Architecture} for {Handwritten} {Text} {Recognition}},\n\tisbn = {978-3-031-06555-2},\n\tdoi = {10.1007/978-3-031-06555-2_19},\n\tabstract = {Transformer models have been showing ground-breaking results in the domain of natural language processing. More recently, they started to gain interest in many others fields as in computer vision. Traditional Transformer models typically require a significant amount of training data to achieve satisfactory results. However, in the domain of handwritten text recognition, annotated data acquisition remains costly resulting in small datasets compared to those commonly used to train a Transformer-based model. Hence, training Transformer models able to transcribe handwritten text from images remains challenging. We propose a light encoder-decoder Transformer-based architecture for handwriting text recognition, containing a small number of parameters compared to traditional Transformer architectures. We trained our architecture using a hybrid loss, combining the well-known connectionist temporal classification with the cross-entropy. Experiments are conducted on the well-known IAM dataset with and without the use of additional synthetic data. We show that our network reaches state-of-the-art results in both cases, compared with other larger Transformer-based models.},\n\tlanguage = {en},\n\tbooktitle = {Document {Analysis} {Systems}},\n\tpublisher = {Springer International Publishing},\n\tauthor = {Barrere, Killian and Soullard, Yann and Lemaitre, Aurélie and Coüasnon, Bertrand},\n\teditor = {Uchida, Seiichi and Barney, Elisa and Eglin, Véronique},\n\tyear = {2022},\n\tkeywords = {Handwritten text recognition, Hybrid loss, Light network, Neural networks, Transformer},\n\tpages = {275--290},\n}\n\n
\n Transformer models have been showing ground-breaking results in the domain of natural language processing. More recently, they started to gain interest in many others fields as in computer vision. Traditional Transformer models typically require a significant amount of training data to achieve satisfactory results. However, in the domain of handwritten text recognition, annotated data acquisition remains costly resulting in small datasets compared to those commonly used to train a Transformer-based model. Hence, training Transformer models able to transcribe handwritten text from images remains challenging. We propose a light encoder-decoder Transformer-based architecture for handwriting text recognition, containing a small number of parameters compared to traditional Transformer architectures. We trained our architecture using a hybrid loss, combining the well-known connectionist temporal classification with the cross-entropy. Experiments are conducted on the well-known IAM dataset with and without the use of additional synthetic data. We show that our network reaches state-of-the-art results in both cases, compared with other larger Transformer-based models.\n
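As a rough sketch of the hybrid loss described above (a weighted combination of connectionist temporal classification and cross-entropy), written with standard PyTorch losses; the weighting factor and tensor shapes are assumptions, not the authors' code:

import torch.nn as nn

ctc_criterion = nn.CTCLoss(blank=0, zero_infinity=True)
ce_criterion = nn.CrossEntropyLoss()

def hybrid_loss(ctc_log_probs, targets, input_lengths, target_lengths,
                decoder_logits, decoder_targets, alpha=0.5):
    # CTC branch: log-probabilities of shape (T, N, C) over the image time steps
    l_ctc = ctc_criterion(ctc_log_probs, targets, input_lengths, target_lengths)
    # Cross-entropy branch: decoder logits of shape (N, C, L) against (N, L) character ids
    l_ce = ce_criterion(decoder_logits, decoder_targets)
    # alpha is a hypothetical weighting between the two terms
    return alpha * l_ctc + (1.0 - alpha) * l_ce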
Spitzer, E. (1)

Model Cards for Model Reporting. Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T. Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. January 2019. arXiv: 1810.03993.
@article{mitchell_model_2019,\n\ttitle = {Model {Cards} for {Model} {Reporting}},\n\turl = {http://arxiv.org/abs/1810.03993},\n\tdoi = {10.1145/3287560.3287596},\n\tabstract = {Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.},\n\turldate = {2022-01-24},\n\tjournal = {Proceedings of the Conference on Fairness, Accountability, and Transparency},\n\tauthor = {Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit},\n\tmonth = jan,\n\tyear = {2019},\n\tnote = {arXiv: 1810.03993},\n\tkeywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},\n\tpages = {220--229},\n}\n\n
\n Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.\n
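Purely to illustrate the kind of structured documentation the abstract proposes, here is a minimal sketch of a model card as a Python data structure; the section names are paraphrased from the proposal and the example values are invented placeholders, not reported results:

from dataclasses import dataclass, field

@dataclass
class ModelCard:
    # A small subset of the sections proposed for model cards
    model_details: str
    intended_use: str
    out_of_scope_use: str
    evaluation_data: str
    metrics: dict = field(default_factory=dict)   # e.g. disaggregated, per-group scores
    ethical_considerations: str = ""

# Placeholder content only, to show how the fields fit together.
card = ModelCard(
    model_details="Handwritten text recognition model, version 1.0",
    intended_use="Transcription of historical correspondence for research",
    out_of_scope_use="Identity verification or forensic writer analysis",
    evaluation_data="Held-out pages from writers not seen in training",
    metrics={"CER, seen writers": "placeholder", "CER, unseen writers": "placeholder"},
)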
Strauss, T. (1)

System Description of CITlab's Recognition & Retrieval Engine for ICDAR2017 Competition on Information Extraction in Historical Handwritten Records. Strauss, T.; Weidemann, M.; Michael, J.; Leifert, G.; Grüning, T.; and Labahn, R. CoRR, abs/1804.09943. 2018.
@article{strauss_system_2018,\n\ttitle = {System {Description} of {CITlab}'s {Recognition} \\& {Retrieval} {Engine} for {ICDAR2017} {Competition} on {Information} {Extraction} in {Historical} {Handwritten} {Records}},\n\tvolume = {abs/1804.09943},\n\turl = {http://arxiv.org/abs/1804.09943},\n\turldate = {2018-06-29},\n\tjournal = {CoRR},\n\tauthor = {Strauss, Tobias and Weidemann, Max and Michael, Johannes and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},\n\tyear = {2018},\n}\n\n
Strauß, T. (2)

Cells in Multidimensional Recurrent Neural Networks. Leifert, G.; Strauß, T.; Grüning, T.; Wustlich, W.; and Labahn, R. Journal of Machine Learning Research, 17: 97:1–97:37. 2016.
\n
@article{leifert_cells_2016,\n\ttitle = {Cells in {Multidimensional} {Recurrent} {Neural} {Networks}},\n\tvolume = {17},\n\turl = {http://jmlr.org/papers/v17/14-203.html},\n\turldate = {2018-06-29},\n\tjournal = {Journal of Machine Learning Research},\n\tauthor = {Leifert, Gundram and Strauß, Tobias and Grüning, Tobias and Wustlich, Welf and Labahn, Roger},\n\tyear = {2016},\n\tpages = {97:1--97:37},\n}\n\n
Regular expressions for decoding of neural network outputs. Strauß, T.; Leifert, G.; Grüning, T.; and Labahn, R. CoRR, abs/1509.04438. 2015.
@article{straus_regular_2015,\n\ttitle = {Regular expressions for decoding of neural network outputs},\n\tvolume = {abs/1509.04438},\n\turl = {http://arxiv.org/abs/1509.04438},\n\turldate = {2018-06-29},\n\tjournal = {CoRR},\n\tauthor = {Strauß, Tobias and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},\n\tyear = {2015},\n}\n\n
Ströbel, P. (3)

Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung. Ströbel, P.; Hodel, T.; Fischer, A.; Scius, A.; Wolf, B.; Janka, A.; Widmer, J.; Scheurer, P.; and Volk, M. March 2023. Publisher: Zenodo.
@article{strobel2023a,\n\ttitle = {Bullingers {Briefwechsel} zugänglich machen: {Stand} der {Handschriftenerkennung}},\n\tcopyright = {Creative Commons Attribution 4.0 International, Open Access},\n\tshorttitle = {Bullingers {Briefwechsel} zugänglich machen},\n\turl = {https://zenodo.org/record/7715357},\n\tdoi = {10.5281/ZENODO.7715357},\n\tabstract = {"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt." Ein Beitrag zur 9. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.},\n\turldate = {2024-04-22},\n\tauthor = {Ströbel, Phillip and Hodel, Tobias and Fischer, Andreas and Scius, Anna and Wolf, Beat and Janka, Anna and Widmer, Jonas and Scheurer, Patricia and Volk, Martin},\n\tcollaborator = {Trilcke, Peer and Busch, Anna and Helling, Patrick and Plum, Alistair and Wolter, Vivien and Weis, Joëlle and Chudoba, Hendrik},\n\tmonth = mar,\n\tyear = {2023},\n\tnote = {Publisher: [object Object]},\n\tkeywords = {Annotieren, Bewertung, DHd2023, Data augmentation, Daten, Handschriftenerkennung, Manuskript, Transkription, maschinelles Lernen},\n}\n\n
"Using the correspondence of Heinrich Bullinger (1504-1575), which comprises around 10,000 letters, we demonstrate the state of research in automated handwritten text recognition. The letters contain more than one hundred different writers' hands with a very uneven distribution. The corpus is bilingual (Latin/German), and the language sometimes switches within paragraphs or even within sentences. Because of this diversity, the correspondence is an ideal test environment for such algorithms and is instructive for research projects and memory institutions facing similar problems. The paper compares and weighs three methods against each other. Three approaches are tested on the corpus, promising insights into the current state and possible developments in handwritten text recognition. First, Transkribus, an established platform offering two engines (HTR+ and PyLaia), is used. Second, data augmentation is applied in an attempt to improve recognition with the state-of-the-art engine HTRFlor, and third, new Transformer-based models (TrOCR) are employed." A contribution to the 9th conference of the association "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.
The Adaptability of a Transformer-Based OCR Model for Historical Documents. Ströbel, P. B.; Hodel, T.; Boente, W.; and Volk, M. In Coustaty, M.; and Fornés, A., editor(s), Document Analysis and Recognition – ICDAR 2023 Workshops, volume 14193, pages 34–48. Springer Nature Switzerland, Cham, 2023. Series Title: Lecture Notes in Computer Science.
@incollection{coustaty_adaptability_2023,\n\taddress = {Cham},\n\ttitle = {The {Adaptability} of a {Transformer}-{Based} {OCR} {Model} for {Historical} {Documents}},\n\tvolume = {14193},\n\tisbn = {978-3-031-41497-8 978-3-031-41498-5},\n\turl = {https://link.springer.com/10.1007/978-3-031-41498-5_3},\n\tlanguage = {en},\n\turldate = {2023-10-17},\n\tbooktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2023 {Workshops}},\n\tpublisher = {Springer Nature Switzerland},\n\tauthor = {Ströbel, Phillip Benjamin and Hodel, Tobias and Boente, Walter and Volk, Martin},\n\teditor = {Coustaty, Mickael and Fornés, Alicia},\n\tyear = {2023},\n\tdoi = {10.1007/978-3-031-41498-5_3},\n\tnote = {Series Title: Lecture Notes in Computer Science},\n\tpages = {34--48},\n}\n\n
The Bullinger Dataset: A Writer Adaptation Challenge. Scius-Bertrand, A.; Ströbel, P.; Volk, M.; Hodel, T.; and Fischer, A. In Fink, G. A.; Jain, R.; Kise, K.; and Zanibbi, R., editor(s), Document Analysis and Recognition - ICDAR 2023, volume 14187, pages 397–410. Springer Nature Switzerland, Cham, 2023. Series Title: Lecture Notes in Computer Science.
@incollection{fink_bullinger_2023,\n\taddress = {Cham},\n\ttitle = {The {Bullinger} {Dataset}: {A} {Writer} {Adaptation} {Challenge}},\n\tvolume = {14187},\n\tisbn = {978-3-031-41675-0 978-3-031-41676-7},\n\tshorttitle = {The {Bullinger} {Dataset}},\n\turl = {https://link.springer.com/10.1007/978-3-031-41676-7_23},\n\tlanguage = {en},\n\turldate = {2023-08-24},\n\tbooktitle = {Document {Analysis} and {Recognition} - {ICDAR} 2023},\n\tpublisher = {Springer Nature Switzerland},\n\tauthor = {Scius-Bertrand, Anna and Ströbel, Phillip and Volk, Martin and Hodel, Tobias and Fischer, Andreas},\n\teditor = {Fink, Gernot A. and Jain, Rajiv and Kise, Koichi and Zanibbi, Richard},\n\tyear = {2023},\n\tdoi = {10.1007/978-3-031-41676-7_23},\n\tnote = {Series Title: Lecture Notes in Computer Science},\n\tpages = {397--410},\n}\n\n
Sudholt, S. (1)

PHOCNet: A Deep Convolutional Neural Network for Word Spotting in Handwritten Documents. Sudholt, S.; and Fink, G. A. December 2017. arXiv:1604.00187 [cs].
@misc{sudholt2017,\n\ttitle = {{PHOCNet}: {A} {Deep} {Convolutional} {Neural} {Network} for {Word} {Spotting} in {Handwritten} {Documents}},\n\tshorttitle = {{PHOCNet}},\n\turl = {http://arxiv.org/abs/1604.00187},\n\tdoi = {10.48550/arXiv.1604.00187},\n\tabstract = {In recent years, deep convolutional neural networks have achieved state of the art performance in various computer vision task such as classification, detection or segmentation. Due to their outstanding performance, CNNs are more and more used in the field of document image analysis as well. In this work, we present a CNN architecture that is trained with the recently proposed PHOC representation. We show empirically that our CNN architecture is able to outperform state of the art results for various word spotting benchmarks while exhibiting short training and test times.},\n\turldate = {2023-11-17},\n\tpublisher = {arXiv},\n\tauthor = {Sudholt, Sebastian and Fink, Gernot A.},\n\tmonth = dec,\n\tyear = {2017},\n\tnote = {arXiv:1604.00187 [cs]},\n\tkeywords = {Computer Science - Computer Vision and Pattern Recognition},\n}\n\n
\n In recent years, deep convolutional neural networks have achieved state of the art performance in various computer vision task such as classification, detection or segmentation. Due to their outstanding performance, CNNs are more and more used in the field of document image analysis as well. In this work, we present a CNN architecture that is trained with the recently proposed PHOC representation. We show empirically that our CNN architecture is able to outperform state of the art results for various word spotting benchmarks while exhibiting short training and test times.\n
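A simplified sketch of the PHOC (pyramidal histogram of characters) representation the abstract refers to: a binary vector recording which characters occur in which horizontal splits of the word. The alphabet, the pyramid levels, and the plain-overlap criterion are assumptions for illustration; the original definition uses more levels, bigrams, and an occupancy threshold:

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"
LEVELS = (2, 3, 4)   # number of splits per pyramid level (assumption)

def phoc(word: str) -> list[int]:
    """Binary vector: for each level, each split, each character -> present or not."""
    word = word.lower()
    vec = []
    n = len(word)
    for level in LEVELS:
        for split in range(level):
            lo, hi = split / level, (split + 1) / level
            chars_in_split = {
                c for i, c in enumerate(word)
                # character i occupies [i/n, (i+1)/n); count it if it overlaps this split
                if n and max(lo, i / n) < min(hi, (i + 1) / n)
            }
            vec.extend(int(c in chars_in_split) for c in ALPHABET)
    return vec

print(len(phoc("bullinger")))   # (2 + 3 + 4) * 36 = 324 dimensions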
Sutskever, I. (1)

Language Models are Unsupervised Multitask Learners. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; and Sutskever, I. 2019.
@inproceedings{radford_language_2019,\n\ttitle = {Language {Models} are {Unsupervised} {Multitask} {Learners}},\n\turl = {https://www.semanticscholar.org/paper/Language-Models-are-Unsupervised-Multitask-Learners-Radford-Wu/9405cc0d6169988371b2755e573cc28650d14dfe},\n\tabstract = {Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.},\n\turldate = {2023-02-02},\n\tauthor = {Radford, Alec and Wu, Jeff and Child, Rewon and Luan, D. and Amodei, Dario and Sutskever, Ilya},\n\tyear = {2019},\n}\n\n
\n Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.\n
Symeonidis, S. (1)

Stavronikita Monastery Greek handwritten document Collection no.53 [Data set]. Pratikakis, I.; Papazoglou, A.; Symeonidis, S.; and Tsochatzidis, L. October 2021.
@misc{pratikakis_stavronikita_2021,\n\ttitle = {Stavronikita {Monastery} {Greek} handwritten document {Collection} no.53 [{Data} set]},\n\turl = {https://zenodo.org/record/5595669},\n\tdoi = {10.5281/zenodo.5595669},\n\tabstract = {The collection is one of the oldest Stavronikita Monastery on Mount Athos. It is a parchment, four-gospel manuscript which has been written between 1301 and 1350. It comprises 54 pages with dimensions that are approximately 250x185 mm. The script is elegant minuscule and the use of majuscule letters is rare. Tachygraphical symbols and abbreviations are encountered in the manuscript as well. Furthermore, the manuscript is enriched with chrysography, elegant epititles and initials. The dataset of ΧΦ53 consists of 1038 lines of text containing 5592 words (2374 unique words) that are distributed over 54 scanned handwritten text pages.},\n\turldate = {2023-03-31},\n\tpublisher = {Zenodo},\n\tauthor = {Pratikakis, Ioannis and Papazoglou, Aleksandros and Symeonidis, Symeon and Tsochatzidis, Lazaros},\n\tmonth = oct,\n\tyear = {2021},\n\tkeywords = {greek, handwritten, miniscule, transcription},\n}\n\n
\n The collection is one of the oldest Stavronikita Monastery on Mount Athos. It is a parchment, four-gospel manuscript which has been written between 1301 and 1350. It comprises 54 pages with dimensions that are approximately 250x185 mm. The script is elegant minuscule and the use of majuscule letters is rare. Tachygraphical symbols and abbreviations are encountered in the manuscript as well. Furthermore, the manuscript is enriched with chrysography, elegant epititles and initials. The dataset of ΧΦ53 consists of 1038 lines of text containing 5592 words (2374 unique words) that are distributed over 54 scanned handwritten text pages.\n
Tannier, X. (1)

A Named Entity Recognition Model for Medieval Latin Charters. Chastang, P.; Aguilar, S. T.; and Tannier, X. Digital Humanities Quarterly, 15(4). 2021.
@article{chastang_named_2021,\n\ttitle = {A {Named} {Entity} {Recognition} {Model} for {Medieval} {Latin} {Charters}},\n\tvolume = {15},\n\tissn = {1938-4122},\n\turl = {http://www.digitalhumanities.org/dhq/vol/15/4/000574/000574.html},\n\tnumber = {4},\n\tjournal = {Digital Humanities Quarterly},\n\tauthor = {Chastang, Pierre and Aguilar, Sergio Torres and Tannier, Xavier},\n\tyear = {2021},\n}\n\n
Tao, Y. (1)

TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance. Tao, Y.; Jia, Z.; Ma, R.; and Xu, S. Electronics, 10(22): 2780. January 2021. Number: 22. Publisher: Multidisciplinary Digital Publishing Institute.
@article{tao_trig_2021,\n\ttitle = {{TRIG}: {Transformer}-{Based} {Text} {Recognizer} with {Initial} {Embedding} {Guidance}},\n\tvolume = {10},\n\tcopyright = {http://creativecommons.org/licenses/by/3.0/},\n\tissn = {2079-9292},\n\tshorttitle = {{TRIG}},\n\turl = {https://www.mdpi.com/2079-9292/10/22/2780},\n\tdoi = {10.3390/electronics10222780},\n\tabstract = {Scene text recognition (STR) is an important bridge between images and text, attracting abundant research attention. While convolutional neural networks (CNNS) have achieved remarkable progress in this task, most of the existing works need an extra module (context modeling module) to help CNN to capture global dependencies to solve the inductive bias and strengthen the relationship between text features. Recently, the transformer has been proposed as a promising network for global context modeling by self-attention mechanism, but one of the main short-comings, when applied to recognition, is the efficiency. We propose a 1-D split to address the challenges of complexity and replace the CNN with the transformer encoder to reduce the need for a context modeling module. Furthermore, recent methods use a frozen initial embedding to guide the decoder to decode the features to text, leading to a loss of accuracy. We propose to use a learnable initial embedding learned from the transformer encoder to make it adaptive to different input images. Above all, we introduce a novel architecture for text recognition, named TRansformer-based text recognizer with Initial embedding Guidance (TRIG), composed of three stages (transformation, feature extraction, and prediction). Extensive experiments show that our approach can achieve state-of-the-art on text recognition benchmarks.},\n\tlanguage = {en},\n\tnumber = {22},\n\turldate = {2023-09-29},\n\tjournal = {Electronics},\n\tauthor = {Tao, Yue and Jia, Zhiwei and Ma, Runze and Xu, Shugong},\n\tmonth = jan,\n\tyear = {2021},\n\tnote = {Number: 22\nPublisher: Multidisciplinary Digital Publishing Institute},\n\tkeywords = {1-D split, initial embedding, scene text recognition, self-attention, transformer},\n\tpages = {2780},\n}\n\n
\n Scene text recognition (STR) is an important bridge between images and text, attracting abundant research attention. While convolutional neural networks (CNNS) have achieved remarkable progress in this task, most of the existing works need an extra module (context modeling module) to help CNN to capture global dependencies to solve the inductive bias and strengthen the relationship between text features. Recently, the transformer has been proposed as a promising network for global context modeling by self-attention mechanism, but one of the main short-comings, when applied to recognition, is the efficiency. We propose a 1-D split to address the challenges of complexity and replace the CNN with the transformer encoder to reduce the need for a context modeling module. Furthermore, recent methods use a frozen initial embedding to guide the decoder to decode the features to text, leading to a loss of accuracy. We propose to use a learnable initial embedding learned from the transformer encoder to make it adaptive to different input images. Above all, we introduce a novel architecture for text recognition, named TRansformer-based text recognizer with Initial embedding Guidance (TRIG), composed of three stages (transformation, feature extraction, and prediction). Extensive experiments show that our approach can achieve state-of-the-art on text recognition benchmarks.\n
Tranouez, P. (1)

Recognition and Information Extraction in Historical Handwritten Tables: Toward Understanding Early 20th Century Paris Census. Constum, T.; Kempf, N.; Paquet, T.; Tranouez, P.; Chatelain, C.; Brée, S.; and Merveille, F. In Uchida, S.; Barney, E.; and Eglin, V., editor(s), Document Analysis Systems, pages 143–157, Cham, 2022. Springer International Publishing.
@inproceedings{constumRecognitionInformationExtraction2022,\n\taddress = {Cham},\n\ttitle = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}: {Toward} {Understanding} {Early} \\$\\$20{\\textasciicircum}\\{th\\}\\$\\${Century} {Paris} {Census}},\n\tisbn = {978-3-031-06555-2},\n\tshorttitle = {Recognition and {Information} {Extraction} in {Historical} {Handwritten} {Tables}},\n\tdoi = {10.1007/978-3-031-06555-2_10},\n\tabstract = {We aim to build a vast database (up to 9 million individuals) from the handwritten tabular nominal census of Paris of 1926, 1931 and 1936, each composed of about 100,000 handwritten simple pages in a tabular format. We created a complete pipeline that goes from the scan of double pages to text prediction while minimizing the need for segmentation labels. We describe how weighted finite state transducers, writer specialization and self-training further improved our results. We also introduce through this communication two annotated datasets for handwriting recognition that are now publicly available, and an open-source toolkit to apply WFST on CTC lattices.},\n\tlanguage = {en},\n\tbooktitle = {Document {Analysis} {Systems}},\n\tpublisher = {Springer International Publishing},\n\tauthor = {Constum, Thomas and Kempf, Nicolas and Paquet, Thierry and Tranouez, Pierrick and Chatelain, Clément and Brée, Sandra and Merveille, François},\n\teditor = {Uchida, Seiichi and Barney, Elisa and Eglin, Véronique},\n\tyear = {2022},\n\tkeywords = {Document layout analysis, Handwriting recognition, Self-training, Semi-supervised learning, Table analysis, WFST, handwritten text recognition, table recognition},\n\tpages = {143--157},\n}\n\n
\n We aim to build a vast database (up to 9 million individuals) from the handwritten tabular nominal census of Paris of 1926, 1931 and 1936, each composed of about 100,000 handwritten simple pages in a tabular format. We created a complete pipeline that goes from the scan of double pages to text prediction while minimizing the need for segmentation labels. We describe how weighted finite state transducers, writer specialization and self-training further improved our results. We also introduce through this communication two annotated datasets for handwriting recognition that are now publicly available, and an open-source toolkit to apply WFST on CTC lattices.\n
Trickett, T. (1)

Turing’s Genius – Defining an apt microcosm. Bowen, J.; Trickett, T.; Green, J. B. A.; and Lomas, A. July 2018. BCS Learning & Development.
@inproceedings{bowen_turings_2018,\n\ttitle = {Turing’s {Genius} – {Defining} an apt microcosm},\n\turl = {https://www.scienceopen.com/hosted-document?doi=10.14236/ewic/EVA2018.31},\n\tdoi = {10.14236/ewic/EVA2018.31},\n\tabstract = {Alan Turing (1912–1954) is widely acknowledged as a genius. As well as codebreaking during World War II and taking a pioneering role in computer hardware design and software after the War, he also wrote three important foundational papers in the fields of theoretical computer science, artificial intelligence, and mathematical biology. He has been called the father of computer science, but he also admired by mathematicians, philosophers, and perhaps more surprisingly biologists, for his wide-ranging ideas. His influence stretches from scientific to cultural and even political impact. For all these reasons, he was a true polymath. This paper considers the genius of Turing from various angles, both scientific and artistic. The four authors provide position statements on how Turing has influenced and inspired their work, together with short biographies, as a starting point for a panel session and visual music performance.},\n\turldate = {2023-09-27},\n\tpublisher = {BCS Learning \\& Development},\n\tauthor = {Bowen, Jonathan and Trickett, Terry and Green, Jeremy B. A. and Lomas, Andy},\n\tmonth = jul,\n\tyear = {2018},\n}\n\n
Alan Turing (1912–1954) is widely acknowledged as a genius. As well as codebreaking during World War II and taking a pioneering role in computer hardware design and software after the War, he also wrote three important foundational papers in the fields of theoretical computer science, artificial intelligence, and mathematical biology. He has been called the father of computer science, but he is also admired by mathematicians, philosophers, and, perhaps more surprisingly, biologists for his wide-ranging ideas. His influence stretches from scientific to cultural and even political impact. For all these reasons, he was a true polymath. This paper considers the genius of Turing from various angles, both scientific and artistic. The four authors provide position statements on how Turing has influenced and inspired their work, together with short biographies, as a starting point for a panel session and visual music performance.
Tsochatzidis, L. (1)

Stavronikita Monastery Greek handwritten document Collection no.53 [Data set]. Pratikakis, I.; Papazoglou, A.; Symeonidis, S.; and Tsochatzidis, L. October 2021.
@misc{pratikakis_stavronikita_2021,\n\ttitle = {Stavronikita {Monastery} {Greek} handwritten document {Collection} no.53 [{Data} set]},\n\turl = {https://zenodo.org/record/5595669},\n\tdoi = {10.5281/zenodo.5595669},\n\tabstract = {The collection is one of the oldest Stavronikita Monastery on Mount Athos. It is a parchment, four-gospel manuscript which has been written between 1301 and 1350. It comprises 54 pages with dimensions that are approximately 250x185 mm. The script is elegant minuscule and the use of majuscule letters is rare. Tachygraphical symbols and abbreviations are encountered in the manuscript as well. Furthermore, the manuscript is enriched with chrysography, elegant epititles and initials. The dataset of ΧΦ53 consists of 1038 lines of text containing 5592 words (2374 unique words) that are distributed over 54 scanned handwritten text pages.},\n\turldate = {2023-03-31},\n\tpublisher = {Zenodo},\n\tauthor = {Pratikakis, Ioannis and Papazoglou, Aleksandros and Symeonidis, Symeon and Tsochatzidis, Lazaros},\n\tmonth = oct,\n\tyear = {2021},\n\tkeywords = {greek, handwritten, miniscule, transcription},\n}\n\n
Usman, M. (2)

A Comprehensive Overview of Large Language Models.
Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A.
December 2023. arXiv:2307.06435 [cs]

@misc{naveed2023,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

A Comprehensive Overview of Large Language Models.
Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A.
December 2023. arXiv:2307.06435 [cs]

@misc{naveed2023a,
    title = {A {Comprehensive} {Overview} of {Large} {Language} {Models}},
    url = {http://arxiv.org/abs/2307.06435},
    doi = {10.48550/arXiv.2307.06435},
    abstract = {Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.},
    urldate = {2024-02-19},
    publisher = {arXiv},
    author = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
    month = dec,
    year = {2023},
    note = {arXiv:2307.06435 [cs]},
    keywords = {Computer Science - Computation and Language, LLM, Large Language Model, Overview, Überblick},
}

Uszkoreit, J. (1)

Attention Is All You Need.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; and Polosukhin, I.
December 2017. arXiv:1706.03762 [cs]

@misc{vaswani2017,
    title = {Attention {Is} {All} {You} {Need}},
    url = {http://arxiv.org/abs/1706.03762},
    doi = {10.48550/arXiv.1706.03762},
    abstract = {The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.},
    urldate = {2023-02-02},
    publisher = {arXiv},
    author = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},
    month = dec,
    year = {2017},
    note = {arXiv:1706.03762 [cs]},
    keywords = {Computer Science - Computation and Language, Computer Science - Machine Learning},
}

Valveny, E. (1)

Word Spotting and Recognition with Embedded Attributes.
Almazan, J.; Gordo, A.; Fornes, A.; and Valveny, E.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(12): 2552–2566. December 2014.

@article{almazanWordSpottingRecognition2014,
    title = {Word {Spotting} and {Recognition} with {Embedded} {Attributes}},
    volume = {36},
    issn = {0162-8828, 2160-9292},
    url = {http://ieeexplore.ieee.org/document/6857995/},
    doi = {10.1109/TPAMI.2014.2339814},
    number = {12},
    urldate = {2023-11-17},
    journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
    author = {Almazan, Jon and Gordo, Albert and Fornes, Alicia and Valveny, Ernest},
    month = dec,
    year = {2014},
    pages = {2552--2566},
}

Van Gelder, E. (1)

Automatic Writer Identification in Historical Documents: A Case Study.
Christlein, V.; Diem, M.; Kleber, F.; Mühlberger, G.; Schwägerl-Melchior, V.; Van Gelder, E.; and Maier, A.
Zeitschrift für digitale Geisteswissenschaften. 2016. Publisher: HAB - Herzog August Bibliothek

@article{christleinAutomaticWriterIdentification2016,
    title = {Automatic {Writer} {Identification} in {Historical} {Documents}: {A} {Case} {Study}},
    shorttitle = {Automatic {Writer} {Identification} in {Historical} {Documents}},
    url = {http://www.zfdg.de/2016_002},
    doi = {10.17175/2016_002},
    language = {en},
    urldate = {2023-11-17},
    journal = {Zeitschrift für digitale Geisteswissenschaften},
    author = {Christlein, Vincent and Diem, Markus and Kleber, Florian and Mühlberger, Günter and Schwägerl-Melchior, Verena and Van Gelder, Esther and Maier, Andreas},
    year = {2016},
    note = {Publisher: HAB - Herzog August Bibliothek},
}

Vanoost, E. (1)

LM Studio: Run a local AI on your desktop or server.
Vanoost, E.
January 2024.

@misc{team2024,
    title = {{LM} {Studio}: {Run} a local {AI} on your desktop or server},
    shorttitle = {{LM} {Studio}},
    url = {https://4sysops.com/archives/lm-studio-run-a-local-ai-on-your-desktop-or-server/},
    abstract = {LM Studio is a free tool that allows you to run an AI on your desktop using locally installed open-source Large Language Models (LLMs). It features a browser to search and download LLMs from Hugging Face, an in-app Chat UI, and a runtime for a local server compatible with the OpenAI API. You can use this server to set up a development environment before deploying a more extensive LLM system or even run your ChatGPT clone without sharing your corporate data with third parties.},
    language = {en-US},
    urldate = {2024-02-12},
    journal = {4sysops},
    author = {Vanoost, Evi},
    month = jan,
    year = {2024},
    keywords = {LLM; Language Model},
}

Vasserman, L. (1)

Model Cards for Model Reporting.
Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T.
Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. January 2019. arXiv: 1810.03993

@article{mitchell_model_2019,
    title = {Model {Cards} for {Model} {Reporting}},
    url = {http://arxiv.org/abs/1810.03993},
    doi = {10.1145/3287560.3287596},
    abstract = {Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.},
    urldate = {2022-01-24},
    journal = {Proceedings of the Conference on Fairness, Accountability, and Transparency},
    author = {Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit},
    month = jan,
    year = {2019},
    note = {arXiv: 1810.03993},
    keywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},
    pages = {220--229},
}

Vaswani, A. (1)

Attention Is All You Need.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; and Polosukhin, I.
December 2017. arXiv:1706.03762 [cs]

@misc{vaswani2017,
    title = {Attention {Is} {All} {You} {Need}},
    url = {http://arxiv.org/abs/1706.03762},
    doi = {10.48550/arXiv.1706.03762},
    abstract = {The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.},
    urldate = {2023-02-02},
    publisher = {arXiv},
    author = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},
    month = dec,
    year = {2017},
    note = {arXiv:1706.03762 [cs]},
    keywords = {Computer Science - Computation and Language, Computer Science - Machine Learning},
}

Vernet, M. (2)

Handling Heavily Abbreviated Manuscripts: HTR Engines vs Text Normalisation Approaches.
Camps, J.; Vidal-Gorène, C.; and Vernet, M.
In Barney Smith, E. H.; and Pal, U., editor(s), Document Analysis and Recognition – ICDAR 2021 Workshops, Lecture Notes in Computer Science, pages 306–316, Cham, 2021. Springer International Publishing

@inproceedings{campsHandlingHeavilyAbbreviated2021a,
    address = {Cham},
    series = {Lecture {Notes} in {Computer} {Science}},
    title = {Handling {Heavily} {Abbreviated} {Manuscripts}: {HTR} {Engines} vs {Text} {Normalisation} {Approaches}},
    isbn = {978-3-030-86159-9},
    shorttitle = {Handling {Heavily} {Abbreviated} {Manuscripts}},
    doi = {10.1007/978-3-030-86159-9_21},
    abstract = {Although abbreviations are fairly common in handwritten sources, particularly in medieval and modern Western manuscripts, previous research dealing with computational approaches to their expansion is scarce. Yet abbreviations present particular challenges to computational approaches such as handwritten text recognition and natural language processing tasks. Often, pre-processing ultimately aims to lead from a digitised image of the source to a normalised text, which includes expansion of the abbreviations. We explore different setups to obtain such a normalised text, either directly, by training HTR engines on normalised (i.e., expanded, disabbreviated) text, or by decomposing the process into discrete steps, each making use of specialist models for recognition, word segmentation and normalisation. The case studies considered here are drawn from the medieval Latin tradition.},
    language = {en},
    booktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2021 {Workshops}},
    publisher = {Springer International Publishing},
    author = {Camps, Jean-Baptiste and Vidal-Gorène, Chahan and Vernet, Marguerite},
    editor = {Barney Smith, Elisa H. and Pal, Umapada},
    year = {2021},
    keywords = {Abbreviations, Handwritten text recognition, Medieval western manuscripts},
    pages = {306--316},
}

Handling Heavily Abbreviated Manuscripts: HTR Engines vs Text Normalisation Approaches.
Camps, J.; Vidal-Gorène, C.; and Vernet, M.
In Barney Smith, E. H.; and Pal, U., editor(s), Document Analysis and Recognition – ICDAR 2021 Workshops, Lecture Notes in Computer Science, pages 306–316, Cham, 2021. Springer International Publishing

@inproceedings{camps_handling_2021,
    address = {Cham},
    series = {Lecture {Notes} in {Computer} {Science}},
    title = {Handling {Heavily} {Abbreviated} {Manuscripts}: {HTR} {Engines} vs {Text} {Normalisation} {Approaches}},
    isbn = {978-3-030-86159-9},
    shorttitle = {Handling {Heavily} {Abbreviated} {Manuscripts}},
    doi = {10.1007/978-3-030-86159-9_21},
    abstract = {Although abbreviations are fairly common in handwritten sources, particularly in medieval and modern Western manuscripts, previous research dealing with computational approaches to their expansion is scarce. Yet abbreviations present particular challenges to computational approaches such as handwritten text recognition and natural language processing tasks. Often, pre-processing ultimately aims to lead from a digitised image of the source to a normalised text, which includes expansion of the abbreviations. We explore different setups to obtain such a normalised text, either directly, by training HTR engines on normalised (i.e., expanded, disabbreviated) text, or by decomposing the process into discrete steps, each making use of specialist models for recognition, word segmentation and normalisation. The case studies considered here are drawn from the medieval Latin tradition.},
    language = {en},
    booktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2021 {Workshops}},
    publisher = {Springer International Publishing},
    author = {Camps, Jean-Baptiste and Vidal-Gorène, Chahan and Vernet, Marguerite},
    editor = {Barney Smith, Elisa H. and Pal, Umapada},
    year = {2021},
    keywords = {Abbreviation, Abbreviations, Handwritten text recognition, Medieval western manuscripts, Text Recognition},
    pages = {306--316},
}

Vidal-Gorène, C. (2)

Handling Heavily Abbreviated Manuscripts: HTR Engines vs Text Normalisation Approaches.
Camps, J.; Vidal-Gorène, C.; and Vernet, M.
In Barney Smith, E. H.; and Pal, U., editor(s), Document Analysis and Recognition – ICDAR 2021 Workshops, Lecture Notes in Computer Science, pages 306–316, Cham, 2021. Springer International Publishing

@inproceedings{campsHandlingHeavilyAbbreviated2021a,
    address = {Cham},
    series = {Lecture {Notes} in {Computer} {Science}},
    title = {Handling {Heavily} {Abbreviated} {Manuscripts}: {HTR} {Engines} vs {Text} {Normalisation} {Approaches}},
    isbn = {978-3-030-86159-9},
    shorttitle = {Handling {Heavily} {Abbreviated} {Manuscripts}},
    doi = {10.1007/978-3-030-86159-9_21},
    abstract = {Although abbreviations are fairly common in handwritten sources, particularly in medieval and modern Western manuscripts, previous research dealing with computational approaches to their expansion is scarce. Yet abbreviations present particular challenges to computational approaches such as handwritten text recognition and natural language processing tasks. Often, pre-processing ultimately aims to lead from a digitised image of the source to a normalised text, which includes expansion of the abbreviations. We explore different setups to obtain such a normalised text, either directly, by training HTR engines on normalised (i.e., expanded, disabbreviated) text, or by decomposing the process into discrete steps, each making use of specialist models for recognition, word segmentation and normalisation. The case studies considered here are drawn from the medieval Latin tradition.},
    language = {en},
    booktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2021 {Workshops}},
    publisher = {Springer International Publishing},
    author = {Camps, Jean-Baptiste and Vidal-Gorène, Chahan and Vernet, Marguerite},
    editor = {Barney Smith, Elisa H. and Pal, Umapada},
    year = {2021},
    keywords = {Abbreviations, Handwritten text recognition, Medieval western manuscripts},
    pages = {306--316},
}

Handling Heavily Abbreviated Manuscripts: HTR Engines vs Text Normalisation Approaches.
Camps, J.; Vidal-Gorène, C.; and Vernet, M.
In Barney Smith, E. H.; and Pal, U., editor(s), Document Analysis and Recognition – ICDAR 2021 Workshops, Lecture Notes in Computer Science, pages 306–316, Cham, 2021. Springer International Publishing

@inproceedings{camps_handling_2021,
    address = {Cham},
    series = {Lecture {Notes} in {Computer} {Science}},
    title = {Handling {Heavily} {Abbreviated} {Manuscripts}: {HTR} {Engines} vs {Text} {Normalisation} {Approaches}},
    isbn = {978-3-030-86159-9},
    shorttitle = {Handling {Heavily} {Abbreviated} {Manuscripts}},
    doi = {10.1007/978-3-030-86159-9_21},
    abstract = {Although abbreviations are fairly common in handwritten sources, particularly in medieval and modern Western manuscripts, previous research dealing with computational approaches to their expansion is scarce. Yet abbreviations present particular challenges to computational approaches such as handwritten text recognition and natural language processing tasks. Often, pre-processing ultimately aims to lead from a digitised image of the source to a normalised text, which includes expansion of the abbreviations. We explore different setups to obtain such a normalised text, either directly, by training HTR engines on normalised (i.e., expanded, disabbreviated) text, or by decomposing the process into discrete steps, each making use of specialist models for recognition, word segmentation and normalisation. The case studies considered here are drawn from the medieval Latin tradition.},
    language = {en},
    booktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2021 {Workshops}},
    publisher = {Springer International Publishing},
    author = {Camps, Jean-Baptiste and Vidal-Gorène, Chahan and Vernet, Marguerite},
    editor = {Barney Smith, Elisa H. and Pal, Umapada},
    year = {2021},
    keywords = {Abbreviation, Abbreviations, Handwritten text recognition, Medieval western manuscripts, Text Recognition},
    pages = {306--316},
}

Vincent, E. (1)

Unsupervised Layered Image Decomposition into Object Prototypes.
Monnier, T.; Vincent, E.; Ponce, J.; and Aubry, M.
August 2021. arXiv:2104.14575 [cs]

@misc{monnier_unsupervised_2021,
    title = {Unsupervised {Layered} {Image} {Decomposition} into {Object} {Prototypes}},
    url = {http://arxiv.org/abs/2104.14575},
    doi = {10.48550/arXiv.2104.14575},
    abstract = {We present an unsupervised learning framework for decomposing images into layers of automatically discovered object models. Contrary to recent approaches that model image layers with autoencoder networks, we represent them as explicit transformations of a small set of prototypical images. Our model has three main components: (i) a set of object prototypes in the form of learnable images with a transparency channel, which we refer to as sprites; (ii) differentiable parametric functions predicting occlusions and transformation parameters necessary to instantiate the sprites in a given image; (iii) a layered image formation model with occlusion for compositing these instances into complete images including background. By jointly learning the sprites and occlusion/transformation predictors to reconstruct images, our approach not only yields accurate layered image decompositions, but also identifies object categories and instance parameters. We first validate our approach by providing results on par with the state of the art on standard multi-object synthetic benchmarks (Tetrominoes, Multi-dSprites, CLEVR6). We then demonstrate the applicability of our model to real images in tasks that include clustering (SVHN, GTSRB), cosegmentation (Weizmann Horse) and object discovery from unfiltered social network images. To the best of our knowledge, our approach is the first layered image decomposition algorithm that learns an explicit and shared concept of object type, and is robust enough to be applied to real images.},
    urldate = {2022-09-30},
    publisher = {arXiv},
    author = {Monnier, Tom and Vincent, Elliot and Ponce, Jean and Aubry, Mathieu},
    month = aug,
    year = {2021},
    note = {arXiv:2104.14575 [cs]},
    keywords = {Computer Science - Computer Vision and Pattern Recognition},
}

Vitek, D. (1)

Inferring standard name form, gender and nobility from historical texts using stable model semantics.
Lauc, D.; and Vitek, D.
Digital Humanities Quarterly, 015(1). May 2021.

@article{lauc_inferring_2021,
    title = {Inferring standard name form, gender and nobility from historical texts using stable model semantics},
    volume = {015},
    issn = {1938-4122},
    number = {1},
    journal = {Digital Humanities Quarterly},
    author = {Lauc, Davor and Vitek, Darko},
    month = may,
    year = {2021},
    keywords = {nlp},
}

Vogeler, G. (1)

The ‘assertive edition’.
Vogeler, G.
International Journal of Digital Humanities, 1(2): 309–322. July 2019.

@article{vogeler_assertive_2019,
    title = {The ‘assertive edition’},
    volume = {1},
    issn = {2524-7840},
    url = {https://doi.org/10.1007/s42803-019-00025-5},
    doi = {10.1007/s42803-019-00025-5},
    abstract = {The paper describes the special interest among historians in scholarly editing and the resulting editorial practice in contrast to the methods applied by pure philological textual criticism. The interest in historical ‘facts’ suggests methods the goal of which is to create formal representations of the information conveyed by the text in structured databases. This can be achieved with RDF representations of statements extracted from the text, by automatic information extraction methods, or by hand. The paper suggests the use of embedded RDF representations in TEI markup, following the practice in several recent projects, and it concludes with a proposal for a definition of the ‘assertive edition’.},
    language = {en},
    number = {2},
    urldate = {2023-04-03},
    journal = {International Journal of Digital Humanities},
    author = {Vogeler, Georg},
    month = jul,
    year = {2019},
    keywords = {Critial edition, Digital scholarly edition, Historical documents, History, RDF (Resource Description Framework), Semantic web, TEI (Text Encoding Initiative)},
    pages = {309--322},
}

Volk, M. (3)

Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung.
Ströbel, P.; Hodel, T.; Fischer, A.; Scius, A.; Wolf, B.; Janka, A.; Widmer, J.; Scheurer, P.; and Volk, M.
March 2023.

@article{strobel2023a,
    title = {Bullingers {Briefwechsel} zugänglich machen: {Stand} der {Handschriftenerkennung}},
    copyright = {Creative Commons Attribution 4.0 International, Open Access},
    shorttitle = {Bullingers {Briefwechsel} zugänglich machen},
    url = {https://zenodo.org/record/7715357},
    doi = {10.5281/ZENODO.7715357},
    abstract = {"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt." Ein Beitrag zur 9. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.},
    urldate = {2024-04-22},
    author = {Ströbel, Phillip and Hodel, Tobias and Fischer, Andreas and Scius, Anna and Wolf, Beat and Janka, Anna and Widmer, Jonas and Scheurer, Patricia and Volk, Martin},
    collaborator = {Trilcke, Peer and Busch, Anna and Helling, Patrick and Plum, Alistair and Wolter, Vivien and Weis, Joëlle and Chudoba, Hendrik},
    month = mar,
    year = {2023},
    note = {Publisher: [object Object]},
    keywords = {Annotieren, Bewertung, DHd2023, Data augmentation, Daten, Handschriftenerkennung, Manuskript, Transkription, maschinelles Lernen},
}

The Adaptability of a Transformer-Based OCR Model for Historical Documents.
Ströbel, P. B.; Hodel, T.; Boente, W.; and Volk, M.
In Coustaty, M.; and Fornés, A., editor(s), Document Analysis and Recognition – ICDAR 2023 Workshops, volume 14193, pages 34–48. Springer Nature Switzerland, Cham, 2023. Series Title: Lecture Notes in Computer Science

@incollection{coustaty_adaptability_2023,
    address = {Cham},
    title = {The {Adaptability} of a {Transformer}-{Based} {OCR} {Model} for {Historical} {Documents}},
    volume = {14193},
    isbn = {978-3-031-41497-8 978-3-031-41498-5},
    url = {https://link.springer.com/10.1007/978-3-031-41498-5_3},
    language = {en},
    urldate = {2023-10-17},
    booktitle = {Document {Analysis} and {Recognition} – {ICDAR} 2023 {Workshops}},
    publisher = {Springer Nature Switzerland},
    author = {Ströbel, Phillip Benjamin and Hodel, Tobias and Boente, Walter and Volk, Martin},
    editor = {Coustaty, Mickael and Fornés, Alicia},
    year = {2023},
    doi = {10.1007/978-3-031-41498-5_3},
    note = {Series Title: Lecture Notes in Computer Science},
    pages = {34--48},
}

The Bullinger Dataset: A Writer Adaptation Challenge.
Scius-Bertrand, A.; Ströbel, P.; Volk, M.; Hodel, T.; and Fischer, A.
In Fink, G. A.; Jain, R.; Kise, K.; and Zanibbi, R., editor(s), Document Analysis and Recognition - ICDAR 2023, volume 14187, pages 397–410. Springer Nature Switzerland, Cham, 2023. Series Title: Lecture Notes in Computer Science

@incollection{fink_bullinger_2023,
    address = {Cham},
    title = {The {Bullinger} {Dataset}: {A} {Writer} {Adaptation} {Challenge}},
    volume = {14187},
    isbn = {978-3-031-41675-0 978-3-031-41676-7},
    shorttitle = {The {Bullinger} {Dataset}},
    url = {https://link.springer.com/10.1007/978-3-031-41676-7_23},
    language = {en},
    urldate = {2023-08-24},
    booktitle = {Document {Analysis} and {Recognition} - {ICDAR} 2023},
    publisher = {Springer Nature Switzerland},
    author = {Scius-Bertrand, Anna and Ströbel, Phillip and Volk, Martin and Hodel, Tobias and Fischer, Andreas},
    editor = {Fink, Gernot A. and Jain, Rajiv and Kise, Koichi and Zanibbi, Richard},
    year = {2023},
    doi = {10.1007/978-3-031-41676-7_23},
    note = {Series Title: Lecture Notes in Computer Science},
    pages = {397--410},
}

Weidemann, M. (1)

System Description of CITlab's Recognition & Retrieval Engine for ICDAR2017 Competition on Information Extraction in Historical Handwritten Records.
Strauss, T.; Weidemann, M.; Michael, J.; Leifert, G.; Grüning, T.; and Labahn, R.
CoRR, abs/1804.09943. 2018.

@article{strauss_system_2018,
    title = {System {Description} of {CITlab}'s {Recognition} \& {Retrieval} {Engine} for {ICDAR2017} {Competition} on {Information} {Extraction} in {Historical} {Handwritten} {Records}},
    volume = {abs/1804.09943},
    url = {http://arxiv.org/abs/1804.09943},
    urldate = {2018-06-29},
    journal = {CoRR},
    author = {Strauss, Tobias and Weidemann, Max and Michael, Johannes and Leifert, Gundram and Grüning, Tobias and Labahn, Roger},
    year = {2018},
}

Wick, C. (1)

Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition.
Wick, C.; Reul, C.; and Puppe, F.
Digital Humanities Quarterly, 14(1). 2020.

@article{wick2020,
    title = {Calamari - {A} {High}-{Performance} {Tensorflow}-based {Deep} {Learning} {Package} for {Optical} {Character} {Recognition}},
    volume = {14},
    number = {1},
    journal = {Digital Humanities Quarterly},
    author = {Wick, Christoph and Reul, Christian and Puppe, Frank},
    year = {2020},
}

Widmer, J. (1)

Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung.
Ströbel, P.; Hodel, T.; Fischer, A.; Scius, A.; Wolf, B.; Janka, A.; Widmer, J.; Scheurer, P.; and Volk, M.
March 2023.

@article{strobel2023a,
    title = {Bullingers {Briefwechsel} zugänglich machen: {Stand} der {Handschriftenerkennung}},
    copyright = {Creative Commons Attribution 4.0 International, Open Access},
    shorttitle = {Bullingers {Briefwechsel} zugänglich machen},
    url = {https://zenodo.org/record/7715357},
    doi = {10.5281/ZENODO.7715357},
    abstract = {"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt." Ein Beitrag zur 9. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.},
    urldate = {2024-04-22},
    author = {Ströbel, Phillip and Hodel, Tobias and Fischer, Andreas and Scius, Anna and Wolf, Beat and Janka, Anna and Widmer, Jonas and Scheurer, Patricia and Volk, Martin},
    collaborator = {Trilcke, Peer and Busch, Anna and Helling, Patrick and Plum, Alistair and Wolter, Vivien and Weis, Joëlle and Chudoba, Hendrik},
    month = mar,
    year = {2023},
    note = {Publisher: [object Object]},
    keywords = {Annotieren, Bewertung, DHd2023, Data augmentation, Daten, Handschriftenerkennung, Manuskript, Transkription, maschinelles Lernen},
}

\n  \n Wolf, B.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung.\n \n \n \n \n\n\n \n Ströbel, P.; Hodel, T.; Fischer, A.; Scius, A.; Wolf, B.; Janka, A.; Widmer, J.; Scheurer, P.; and Volk, M.\n\n\n \n\n\n\n March 2023.\n Publisher: Zenodo\n\n\n\n
\n\n\n\n \n \n \"BullingersPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{strobel2023a,\n\ttitle = {Bullingers {Briefwechsel} zugänglich machen: {Stand} der {Handschriftenerkennung}},\n\tcopyright = {Creative Commons Attribution 4.0 International, Open Access},\n\tshorttitle = {Bullingers {Briefwechsel} zugänglich machen},\n\turl = {https://zenodo.org/record/7715357},\n\tdoi = {10.5281/ZENODO.7715357},\n\tabstract = {"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt." Ein Beitrag zur 9. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" - DHd 2023 Open Humanities Open Culture.},\n\turldate = {2024-04-22},\n\tauthor = {Ströbel, Phillip and Hodel, Tobias and Fischer, Andreas and Scius, Anna and Wolf, Beat and Janka, Anna and Widmer, Jonas and Scheurer, Patricia and Volk, Martin},\n\tcollaborator = {Trilcke, Peer and Busch, Anna and Helling, Patrick and Plum, Alistair and Wolter, Vivien and Weis, Joëlle and Chudoba, Hendrik},\n\tmonth = mar,\n\tyear = {2023},\n\tnote = {Publisher: Zenodo},\n\tkeywords = {Annotieren, Bewertung, DHd2023, Data augmentation, Daten, Handschriftenerkennung, Manuskript, Transkription, maschinelles Lernen},\n}\n\n
\n
\n\n\n
\n \"Anhand des Briefwechsels Heinrich Bullingers (1504-1575), das rund 10'000 Briefe umfasst, demonstrieren wir den Stand der Forschung in automatisierter Handschriftenerkennung. Es finden sich mehr als hundert unterschiedliche Schreiberhände in den Briefen mit sehr unterschiedlicher Verteilung. Das Korpus ist zweisprachig (Latein/Deutsch) und teilweise findet der Sprachwechsel innerhalb von Abschnitten oder gar Sätzen statt. Auf Grund dieser Vielfalt eignet sich der Briefwechsel optimal als Testumgebung für entsprechende Algorithmen und ist aufschlussreiche für Forschungsprojekte und Erinnerungsinstitutionen mit ähnlichen Problemstellungen. Im Paper werden drei Verfahren gegeneinander gestellt und abgewogen. Im folgenden werde drei Ansätze an dem Korpus getestet, die Aufschlüsse zum Stand und möglichen Entwicklungen im Bereich der Handschriftenerkennung versprechen. Erstens wird mit Transkribus eine etablierte Plattform genutzt, die zwei Engines (HTR+ und PyLaia) anbietet. Zweitens wird mit Hilfe von Data Augmentation versucht die Erkennung mit der state-of-the-art Engine HTRFlor zu verbessern und drittens werden neue Transformer-basierte Modelle (TrOCR) eingesetzt.\" Ein Beitrag zur 9. Tagung des Verbands \"Digital Humanities im deutschsprachigen Raum\" - DHd 2023 Open Humanities Open Culture.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Wooldridge, M.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Artificial intelligence: a modern approach.\n \n \n \n\n\n \n Russell, S. J.; Norvig, P.; Chang, M.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; and Wooldridge, M. J.\n\n\n \n\n\n\n Pearson series in artificial intelligence. Pearson, Harlow, Fourth edition, global edition, 2022.\n \n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@book{russell_artificial_2022,\n\taddress = {Harlow},\n\tedition = {Fourth edition, global edition},\n\tseries = {Pearson series in artificial intelligence},\n\ttitle = {Artificial intelligence: a modern approach},\n\tisbn = {978-1-292-40113-3},\n\tshorttitle = {Artificial intelligence},\n\tabstract = {"Updated edition of popular textbook on Artificial Intelligence. This edition specifically looks at ways of keeping artificial intelligence under control"},\n\tlanguage = {eng},\n\tpublisher = {Pearson},\n\tauthor = {Russell, Stuart J. and Norvig, Peter and Chang, Ming-wei and Devlin, Jacob and Dragan, Anca and Forsyth, David and Goodfellow, Ian and Malik, Jitendra and Mansinghka, Vikas and Pearl, Judea and Wooldridge, Michael J.},\n\tyear = {2022},\n}\n\n
\n
\n\n\n
\n \"Updated edition of popular textbook on Artificial Intelligence. This edition specific looks at ways of keeping artificial intelligence under control\"\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Wu, J.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Language Models are Unsupervised Multitask Learners.\n \n \n \n \n\n\n \n Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; and Sutskever, I.\n\n\n \n\n\n\n In 2019. \n \n\n\n\n
\n\n\n\n \n \n \"LanguagePaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{radford_language_2019,\n\ttitle = {Language {Models} are {Unsupervised} {Multitask} {Learners}},\n\turl = {https://www.semanticscholar.org/paper/Language-Models-are-Unsupervised-Multitask-Learners-Radford-Wu/9405cc0d6169988371b2755e573cc28650d14dfe},\n\tabstract = {Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.},\n\turldate = {2023-02-02},\n\tauthor = {Radford, Alec and Wu, Jeff and Child, Rewon and Luan, D. and Amodei, Dario and Sutskever, Ilya},\n\tyear = {2019},\n}\n\n
\n
\n\n\n
\n Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Wu, S.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Model Cards for Model Reporting.\n \n \n \n \n\n\n \n Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T.\n\n\n \n\n\n\n Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. January 2019.\n arXiv: 1810.03993\n\n\n\n
\n\n\n\n \n \n \"ModelPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n\n\n\n
\n
@article{mitchell_model_2019,\n\ttitle = {Model {Cards} for {Model} {Reporting}},\n\turl = {http://arxiv.org/abs/1810.03993},\n\tdoi = {10.1145/3287560.3287596},\n\tabstract = {Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.},\n\turldate = {2022-01-24},\n\tjournal = {Proceedings of the Conference on Fairness, Accountability, and Transparency},\n\tauthor = {Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit},\n\tmonth = jan,\n\tyear = {2019},\n\tnote = {arXiv: 1810.03993},\n\tkeywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},\n\tpages = {220--229},\n}\n\n
\n
\n\n\n
\n Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Wu, Y.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Detectron2.\n \n \n \n \n\n\n \n Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.; and Girshick, R.\n\n\n \n\n\n\n 2019.\n \n\n\n\n
\n\n\n\n \n \n \"Detectron2Paper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@misc{wu2019,\n\ttitle = {Detectron2},\n\turl = {https://github.com/facebookresearch/detectron2},\n\tauthor = {Wu, Yuxin and Kirillov, Alexander and Massa, Francisco and Lo, Wan-Yen and Girshick, Ross},\n\tyear = {2019},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Wustlich, W.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Cells in Multidimensional Recurrent Neural Networks.\n \n \n \n \n\n\n \n Leifert, G.; Strauß, T.; Grüning, T.; Wustlich, W.; and Labahn, R.\n\n\n \n\n\n\n Journal of Machine Learning Research, 17: 97:1–97:37. 2016.\n \n\n\n\n
\n\n\n\n \n \n \"CellsPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{leifert_cells_2016,\n\ttitle = {Cells in {Multidimensional} {Recurrent} {Neural} {Networks}},\n\tvolume = {17},\n\turl = {http://jmlr.org/papers/v17/14-203.html},\n\turldate = {2018-06-29},\n\tjournal = {Journal of Machine Learning Research},\n\tauthor = {Leifert, Gundram and Strauß, Tobias and Grüning, Tobias and Wustlich, Welf and Labahn, Roger},\n\tyear = {2016},\n\tpages = {97:1--97:37},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Xu, S.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance.\n \n \n \n \n\n\n \n Tao, Y.; Jia, Z.; Ma, R.; and Xu, S.\n\n\n \n\n\n\n Electronics, 10(22): 2780. January 2021.\n Number: 22 Publisher: Multidisciplinary Digital Publishing Institute\n\n\n\n
\n\n\n\n \n \n \"TRIG:Paper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{tao_trig_2021,\n\ttitle = {{TRIG}: {Transformer}-{Based} {Text} {Recognizer} with {Initial} {Embedding} {Guidance}},\n\tvolume = {10},\n\tcopyright = {http://creativecommons.org/licenses/by/3.0/},\n\tissn = {2079-9292},\n\tshorttitle = {{TRIG}},\n\turl = {https://www.mdpi.com/2079-9292/10/22/2780},\n\tdoi = {10.3390/electronics10222780},\n\tabstract = {Scene text recognition (STR) is an important bridge between images and text, attracting abundant research attention. While convolutional neural networks (CNNS) have achieved remarkable progress in this task, most of the existing works need an extra module (context modeling module) to help CNN to capture global dependencies to solve the inductive bias and strengthen the relationship between text features. Recently, the transformer has been proposed as a promising network for global context modeling by self-attention mechanism, but one of the main short-comings, when applied to recognition, is the efficiency. We propose a 1-D split to address the challenges of complexity and replace the CNN with the transformer encoder to reduce the need for a context modeling module. Furthermore, recent methods use a frozen initial embedding to guide the decoder to decode the features to text, leading to a loss of accuracy. We propose to use a learnable initial embedding learned from the transformer encoder to make it adaptive to different input images. Above all, we introduce a novel architecture for text recognition, named TRansformer-based text recognizer with Initial embedding Guidance (TRIG), composed of three stages (transformation, feature extraction, and prediction). Extensive experiments show that our approach can achieve state-of-the-art on text recognition benchmarks.},\n\tlanguage = {en},\n\tnumber = {22},\n\turldate = {2023-09-29},\n\tjournal = {Electronics},\n\tauthor = {Tao, Yue and Jia, Zhiwei and Ma, Runze and Xu, Shugong},\n\tmonth = jan,\n\tyear = {2021},\n\tnote = {Number: 22\nPublisher: Multidisciplinary Digital Publishing Institute},\n\tkeywords = {1-D split, initial embedding, scene text recognition, self-attention, transformer},\n\tpages = {2780},\n}\n\n
\n
\n\n\n
\n Scene text recognition (STR) is an important bridge between images and text, attracting abundant research attention. While convolutional neural networks (CNNS) have achieved remarkable progress in this task, most of the existing works need an extra module (context modeling module) to help CNN to capture global dependencies to solve the inductive bias and strengthen the relationship between text features. Recently, the transformer has been proposed as a promising network for global context modeling by self-attention mechanism, but one of the main short-comings, when applied to recognition, is the efficiency. We propose a 1-D split to address the challenges of complexity and replace the CNN with the transformer encoder to reduce the need for a context modeling module. Furthermore, recent methods use a frozen initial embedding to guide the decoder to decode the features to text, leading to a loss of accuracy. We propose to use a learnable initial embedding learned from the transformer encoder to make it adaptive to different input images. Above all, we introduce a novel architecture for text recognition, named TRansformer-based text recognizer with Initial embedding Guidance (TRIG), composed of three stages (transformation, feature extraction, and prediction). Extensive experiments show that our approach can achieve state-of-the-art on text recognition benchmarks.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Zaldivar, A.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Model Cards for Model Reporting.\n \n \n \n \n\n\n \n Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T.\n\n\n \n\n\n\n Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. January 2019.\n arXiv: 1810.03993\n\n\n\n
\n\n\n\n \n \n \"ModelPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n\n\n\n
\n
@article{mitchell_model_2019,\n\ttitle = {Model {Cards} for {Model} {Reporting}},\n\turl = {http://arxiv.org/abs/1810.03993},\n\tdoi = {10.1145/3287560.3287596},\n\tabstract = {Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.},\n\turldate = {2022-01-24},\n\tjournal = {Proceedings of the Conference on Fairness, Accountability, and Transparency},\n\tauthor = {Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit},\n\tmonth = jan,\n\tyear = {2019},\n\tnote = {arXiv: 1810.03993},\n\tkeywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},\n\tpages = {220--229},\n}\n\n
\n
\n\n\n
\n Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n Zhou, L.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Neuro-symbolic approaches in artificial intelligence.\n \n \n \n \n\n\n \n Hitzler, P.; Eberhart, A.; Ebrahimi, M.; Sarker, M. K.; and Zhou, L.\n\n\n \n\n\n\n National Science Review, 9(6): nwac035. June 2022.\n \n\n\n\n
\n\n\n\n \n \n \"Neuro-symbolicPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{hitzler2022,\n\ttitle = {Neuro-symbolic approaches in artificial intelligence},\n\tvolume = {9},\n\tissn = {2095-5138},\n\turl = {https://doi.org/10.1093/nsr/nwac035},\n\tdoi = {10.1093/nsr/nwac035},\n\tnumber = {6},\n\turldate = {2024-01-23},\n\tjournal = {National Science Review},\n\tauthor = {Hitzler, Pascal and Eberhart, Aaron and Ebrahimi, Monireh and Sarker, Md Kamruzzaman and Zhou, Lu},\n\tmonth = jun,\n\tyear = {2022},\n\tpages = {nwac035},\n}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n de Freitas, N.\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Restoring and attributing ancient texts using deep neural networks.\n \n \n \n \n\n\n \n Assael, Y.; Sommerschield, T.; Shillingford, B.; Bordbar, M.; Pavlopoulos, J.; Chatzipanagiotou, M.; Androutsopoulos, I.; Prag, J.; and de Freitas, N.\n\n\n \n\n\n\n Nature, 603(7900): 280–283. March 2022.\n Number: 7900 Publisher: Nature Publishing Group\n\n\n\n
\n\n\n\n \n \n \"RestoringPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{assael_restoring_2022,\n\ttitle = {Restoring and attributing ancient texts using deep neural networks},\n\tvolume = {603},\n\tcopyright = {2022 The Author(s)},\n\tissn = {1476-4687},\n\turl = {https://www.nature.com/articles/s41586-022-04448-z/},\n\tdoi = {10.1038/s41586-022-04448-z},\n\tabstract = {Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62\\% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25\\% to 72\\%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71\\% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.},\n\tlanguage = {en},\n\tnumber = {7900},\n\turldate = {2022-09-28},\n\tjournal = {Nature},\n\tauthor = {Assael, Yannis and Sommerschield, Thea and Shillingford, Brendan and Bordbar, Mahyar and Pavlopoulos, John and Chatzipanagiotou, Marita and Androutsopoulos, Ion and Prag, Jonathan and de Freitas, Nando},\n\tmonth = mar,\n\tyear = {2022},\n\tnote = {Number: 7900\nPublisher: Nature Publishing Group},\n\tkeywords = {Archaeology, Computer science, History},\n\tpages = {280--283},\n}\n\n
\n
\n\n\n
\n Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25% to 72%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n\n\n\n
\n\n\n \n\n \n \n \n \n\n
\n"}; document.write(bibbase_data.data);