A Machine Learning Approach Using Open Databases to Support Drug Delivery Prediction.
Pestana, H.; Regino, A. G.; Dametto, M.; Zagatti, F. R.; and Bonacin, R.
In Costin, H.; Magjarevic, R.; and Petroiu, G., editor(s), Advances in Digital Health and Medical Bioengineering II, pages 484–490, Cham, 2026. Springer Nature Switzerland.

@InProceedings{10.1007/978-3-032-24724-7_49,
author="Pestana, Helder
and Regino, Andr{\'e} Gomes
and Dametto, Mariangela
and Zagatti, Fernando Rezende
and Bonacin, Rodrigo",
editor="Costin, Hariton-Nicolae
and Magjarevic, Ratko
and Petroiu, Gabriela-Gladiola",
title="A Machine Learning Approach Using Open Databases to Support Drug Delivery Prediction",
booktitle="Advances in Digital Health and Medical Bioengineering II",
year="2026",
publisher="Springer Nature Switzerland",
address="Cham",
pages="484--490",
abstract="The development of effective and safe drugs is a complex and resource-intensive process that often relies on uncertain trial-and-error methods. Predicting pharmacokinetic properties, such as drug delivery, is decisive for accelerating drug discovery and enhancing therapeutic outcomes. This paper presents a Machine Learning (ML) and Deep Learning (DL) based approach utilizing open pharmacological databases to predict properties associated with drug distribution, with a focus on bioavailability and the octanol-water partition coefficient (LogP). The study encompasses data preprocessing, molecular representation via SMILES encoding, and model evaluation utilizing regression and classification metrics. Results show promising predictive performance, suggesting that ML and DL techniques can optimize early drug discovery stages and support decision-making.",
isbn="978-3-032-24724-7",
doi="10.1007/978-3-032-24724-7_49"
}

LLM-Based Solution Applied to Explore Healthcare Datasets.
Zagatti, F. R.; Regino, A. G.; Lopes, F. L.; Shimizu, G. Y.; Bonacin, R.; Lucrédio, D.; and de Medeiros Caseli, H.
In Costin, H.; Magjarevic, R.; and Petroiu, G., editor(s), Advances in Digital Health and Medical Bioengineering II, pages 525–530, Cham, 2026. Springer Nature Switzerland.

@InProceedings{10.1007/978-3-032-24724-7_53,
author="Zagatti, Fernando Rezende
and Regino, Andr{\'e} Gomes
and Lopes, Filipe Loyola
and Shimizu, Gilson Yuuji
and Bonacin, Rodrigo
and Lucr{\'e}dio, Daniel
and de Medeiros Caseli, Helena",
editor="Costin, Hariton-Nicolae
and Magjarevic, Ratko
and Petroiu, Gabriela-Gladiola",
title="LLM-Based Solution Applied to Explore Healthcare Datasets",
booktitle="Advances in Digital Health and Medical Bioengineering II",
year="2026",
publisher="Springer Nature Switzerland",
address="Cham",
pages="525--530",
abstract="The growing availability of open health datasets has advanced medical research and healthcare innovation. This study proposes a Large Language Model (LLM) approach that enables Exploratory Data Analysis (EDA) through natural language queries by integrating Retrieval-Augmented Generation (RAG) and post-processing mechanisms. It was evaluated using five open health datasets, encompassing both structured and unstructured data. The results show that the approach effectively describes datasets, identifies outliers, and produces diverse visualizations, including histograms and correlation heatmaps. The method demonstrates the feasibility of using LLMs to automate and democratize EDA, enhancing accessibility and interpretability in health data exploration.",
isbn="978-3-032-24724-7",
doi="10.1007/978-3-032-24724-7_53"
}

Digital Technologies, Challenges, and Strategies in Distance Education: A Study with Students from a Brazilian Public University.
Regino, A. G.; Shimizu, G. Y.; Zagatti, F. R.; Lopes, F. L.; Caceffo, R. E.; Bonacin, R.; and dos Reis, J. C.
In Sprock, A. S.; Bezeira, A. M.; and Agredo-Delgado, V., editor(s), Proceedings of the 20th Latin American Conference on Learning Technologies (LACLO 2025), pages 258–272, Singapore, 2026. Springer Nature Singapore.

@inproceedings{10.1007/978-981-95-7580-0_19,
author="Regino, André Gomes
and Shimizu, Gilson Yuuji
and Zagatti, Fernando Rezende
and Lopes, Filipe Loyola
and Caceffo, Ricardo Edgard
and Bonacin, Rodrigo
and dos Reis, Julio Cesar",
editor="Sprock, Antonio Silva
and Bezeira, Ana Morales
and Agredo-Delgado, Vanessa",
title="Digital Technologies, Challenges, and Strategies in Distance Education: A Study with Students from a Brazilian Public University",
booktitle="Proceedings of the 20th Latin American Conference on Learning Technologies (LACLO 2025)",
year="2026",
publisher="Springer Nature Singapore",
address="Singapore",
pages="258--272",
abstract="Distance education at scale presents unique opportunities and challenges, especially in public systems designed for broad access. The goal of this study is to explore how students engage with educational technologies and navigate online learning, using data from a Brazilian statewide virtual university. We investigate usage patterns, learning difficulties, and the early adoption of emerging tools (e.g., ChatGPT). Our methodology combines both statistical analysis and topic modeling to uncover behavioral profiles and factors influencing engagement. The results reveal distinct trends across age groups and academic areas, including a counterintuitive finding: younger students, often seen as digital natives, reported more difficulty with focus and engagement than older peers. These patterns suggest the existence of at-risk profiles that can benefit from targeted support strategies. Our findings offer practical implications for improving online learning experiences and inform future research on how digital platforms and AI tools can support inclusive, flexible education in public settings.",
isbn="978-981-95-7580-0"
}

A Systematic Literature Review on RDF Triple Generation From Natural Language Texts.
Regino, A. G.; Rossanez, A.; da Silva Torres, R.; and dos Reis, J. C.
Semantic Web, 17(1): 31. 2026.

@article{doi:10.1177/22104968251398355,
author = {André Gomes Regino and Anderson Rossanez and Ricardo da Silva Torres and Julio Cesar dos Reis},
title = {A Systematic Literature Review on RDF Triple Generation From Natural Language Texts},
journal = {Semantic Web},
volume = {17},
number = {1},
pages = {31},
year = {2026},
doi = {10.1177/22104968251398355},
URL = {https://doi.org/10.1177/22104968251398355},
abstract = {We live in a big data era of unstructured data expressed as natural language (NL) texts. As the volume of text-based information grows, effective methods for encoding and extracting meaningful knowledge from this corpus are of paramount relevance. A challenging task concerns transforming NL texts into structured and semantically rich data. Semantic web technologies have revolutionized how we represent and access structured knowledge. Resource description framework (RDF) triples serve as a fundamental building block for this purpose, enabling the integration of diverse data sources. This investigation examines methods for RDF triple generation and knowledge graphs (KGs) enhancement from NL texts. This study area presents wide-ranging applications encompassing knowledge representation, data integration, NL understanding, and information retrieval. Our systematic literature review addresses the understanding, characterization, and identification of challenges and limitations in existing approaches to RDF triple generation from NL texts and their inclusion into an existing KG. We retrieved, categorized, and analyzed 150 articles from several scientific databases. We provide a comprehensive overview of the field, identify research gaps, and provide directions for future research. We found the most commonly available study categories, especially considering the domain, target language, the public availability of datasets, and real-world applications. Our results reveal a growing trend in this field in the last few years related to the use of transformer-based machine learning methods for triple generation. Our study also drives innovation by highlighting open research questions and providing a road map for future investigations.}
}

BENCH4T3: A Framework to Create Benchmarks for Text-to-Triples Alignment Generation.
Chico, V. J. S.; Regino, A. G.; and dos Reis, J. C.
Journal of the Brazilian Computer Society, 32(1): 85–101. Feb. 2026.

@article{Chico_Regino_dos_Reis_2026,
title={BENCH4T3: A Framework to Create Benchmarks for Text-to-Triples Alignment Generation},
volume={32},
url={https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5809},
doi={10.5753/jbcs.2026.5809},
abstract={Integrating Large Language Models (LLMs) with Knowledge Graphs (KGs) can significantly enhance their capabilities, leveraging LLMs’ text generation skills with KGs’ explanatory power. However, establishing this connection is challenging and demands proper alignment between unstructured texts and triples. Building benchmarks demands massive human effort in data curation and translation for non-English languages. The demand for adequate benchmarks for validation purposes negatively impacts research advancements. This study proposes an end-to-end framework to guide the automatic construction of text-to-triple alignment benchmarks for any language, using KGs as input. Our solution extracts relations from input triples and processes them to create accurately mapped texts. The proposed pipeline utilizes data curation through prompt engineering and data augmentation to enhance diversity in the generated examples. We experimentally evaluate our framework for creating a bimodal representation of RDF triples and natural language texts, assessing its ability to generate natural language from these triples. A key focus is on developing a benchmark for the underrepresented Portuguese language, facilitating the construction of models that connect structured data (triples) with text. Our solution is suited to creating a benchmark to improve alignment between KG triples and text data. The results indicate that the generated benchmark outperforms the results of existing solutions. The generative approach benefits from our Portuguese benchmark, achieving competitive results compared to established literature benchmarks. Our solution enables automatic generation of benchmarks for aligning triples and text.},
number={1},
journal={Journal of the Brazilian Computer Society},
author={Chico, Victor Jesus Sotelo and Regino, André Gomes and dos Reis, Julio Cesar},
year={2026},
month={Feb.},
pages={85--101}
}