Who Can Fix This? User Recommendation for Knowledge Graph Repair via Embedding-Based Clustering. Ferranti, N., Guiamarães, D., & de Souza, J. F. In Proceedings Of The 13th Knowledge Capture Conference 2025 (K-CAP 2025), December, 2025.
Maintaining the consistency of large-scale knowledge graphs (KGs) like Wikidata requires both automated methods and human expertise. In this paper, we address the task of recommending users best suited to repair a given inconsistency in a KG. Our approach leverages textual entity abstracts to compute sentence embeddings, which are clustered to identify semantically coherent regions of the KG. We introduce a framework that combines unsupervised clustering with 10-fold evaluation to test user recommendation strategies. Repair histories are linked to users, and test inconsistencies are assigned to clusters using approximate prediction. We evaluate two strategies: (i) frequency-based assignment, which recommends users based on how often they have edited entities in the predicted cluster, and (ii) embedding-based similarity, which compares the test inconsistency to past user-edited items via cosine similarity. Preliminary results show a cluster silhouette $\ge0.5$, membership hit rate of 80%, with the frequency-based approach achieving a Hits@3 of 60%. Our findings suggest that lightweight unsupervised methods can effectively recommend users, showing promise for semi-automated KG maintenance.
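The two recommendation strategies and the Hits@k check described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the embedding model, the clustering algorithm, or how "approximate prediction" works, so the sketch assumes toy 2-dimensional embeddings, precomputed cluster centroids, and nearest-centroid assignment by cosine similarity as a stand-in. All function names, user names, and data are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_cluster(vec, centroids):
    """Assumed stand-in for cluster prediction: pick the most similar centroid."""
    return max(centroids, key=lambda c: cosine(vec, centroids[c]))


def recommend_by_frequency(cluster, history, k=3):
    """Strategy (i): rank users by how often they edited items in the cluster."""
    counts = Counter(user for user, c, _ in history if c == cluster)
    return [user for user, _ in counts.most_common(k)]


def recommend_by_similarity(vec, history, k=3):
    """Strategy (ii): rank users by their most similar previously edited item."""
    best = {}
    for user, _, emb in history:
        score = cosine(vec, emb)
        if score > best.get(user, -2.0):  # cosine is never below -1
            best[user] = score
    return sorted(best, key=best.get, reverse=True)[:k]


def hits_at_k(recommended, actual_user):
    """1 if the user who actually made the repair is in the top-k list, else 0."""
    return int(actual_user in recommended)


# Hypothetical toy data: (user, cluster id, item embedding) repair records.
centroids = {0: [1.0, 0.0], 1: [0.0, 1.0]}
history = [
    ("alice", 0, [0.9, 0.1]),
    ("alice", 0, [1.0, 0.2]),
    ("bob", 0, [0.8, 0.3]),
    ("carol", 1, [0.1, 0.9]),
]
test_vec = [0.95, 0.05]  # embedding of the item with the test inconsistency

cluster = nearest_cluster(test_vec, centroids)
by_frequency = recommend_by_frequency(cluster, history)
by_similarity = recommend_by_similarity(test_vec, history)
```

In this toy setup the frequency-based list only contains users who edited items in the predicted cluster, while the similarity-based list can rank any user with a repair history. Averaging `hits_at_k` over held-out repairs in each fold would give a Hits@3 figure comparable in spirit to the one reported above.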
@inproceedings{ferr-etal-2025KCAP,
title={Who Can Fix This? User Recommendation for Knowledge Graph Repair via Embedding-Based Clustering},
author={Nicolas Ferranti and Dayane Guiamar\~aes and Jairo Francisco de Souza},
abstract={Maintaining the consistency of large-scale knowledge graphs (KGs) like Wikidata requires both automated methods and human expertise. In this paper, we address the task of recommending users best suited to repair a given inconsistency in a KG. Our approach leverages textual entity abstracts to compute sentence embeddings, which are clustered to identify semantically coherent regions of the KG. We introduce a framework that combines unsupervised clustering with 10-fold evaluation to test user recommendation strategies. Repair histories are linked to users, and test inconsistencies are assigned to clusters using approximate prediction. We evaluate two strategies: (i) frequency-based assignment, which recommends users based on how often they have edited entities in the predicted cluster, and (ii) embedding-based similarity, which compares the test inconsistency to past user-edited items via cosine similarity. Preliminary results show a cluster silhouette $\ge0.5$, membership hit rate of 80\%, with the frequency-based approach achieving a Hits@3 of 60\%. Our findings suggest that lightweight unsupervised methods can effectively recommend users, showing promise for semi-automated KG maintenance.},
booktitle={Proceedings Of The 13th Knowledge Capture Conference 2025 (K-CAP 2025)},
year=2025,
month = dec,
day={10--12},
doi={10.1145/3731443.3771363},
url={http://www.polleres.net/publications/ferr-etal-2025KCAP.pdf}
}