Enhancing K-Means Using Class Labels. Peralta, B., Espinace, P., & Soto, A. *Intelligent Data Analysis (IDA)*, 17(6):1023-1039, 2013.

Clustering is a relevant problem in machine learning where the main goal is to locate meaningful partitions of unlabeled data. In the case of labeled data, a related problem is supervised clustering, where the objective is to locate class-uniform clusters. Most current approaches to supervised clustering optimize a score related to cluster purity with respect to class labels. In particular, we present Labeled K-Means (LK-Means), an algorithm for supervised clustering based on a variant of K-Means that incorporates information about class labels. LK-Means replaces the classical cost function of K-Means by a convex combination of the joint cost associated to: (i) A discriminative score based on class labels, and (ii) A generative score based on a traditional metric for unsupervised clustering. We test the performance of LK-Means using standard real datasets and an application for object recognition. Moreover, we also compare its performance against classical K-Means and a popular K-Medoids-based supervised clustering method. Our experiments show that, in most cases, LK-Means outperforms the alternative techniques by a considerable margin. Furthermore, LK-Means presents execution times considerably lower than the alternative supervised clustering method under evaluation.

@Article{ peralta:etal:2013,
author = {B. Peralta and P. Espinace and A. Soto},
title = {Enhancing K-Means Using Class Labels},
journal = {Intelligent Data Analysis (IDA)},
volume = {17},
number = {6},
pages = {1023-1039},
year = {2013},
abstract = {Clustering is a relevant problem in machine learning where
the main goal is to locate meaningful partitions of
unlabeled data. In the case of labeled data, a related
problem is supervised clustering, where the objective is to
locate class-uniform clusters. Most current approaches to
supervised clustering optimize a score related to cluster
purity with respect to class labels. In particular, we
present Labeled K-Means (LK-Means), an algorithm for
supervised clustering based on a variant of K-Means that
incorporates information about class labels. LK-Means
replaces the classical cost function of K-Means by a convex
combination of the joint cost associated to: (i) A
discriminative score based on class labels, and (ii) A
generative score based on a traditional metric for
unsupervised clustering. We test the performance of
LK-Means using standard real datasets and an application
for object recognition. Moreover, we also compare its
performance against classical K-Means and a popular
K-Medoids-based supervised clustering method. Our
experiments show that, in most cases, LK-Means outperforms
the alternative techniques by a considerable margin.
Furthermore, LK-Means presents execution times considerably
lower than the alternative supervised clustering method
under evaluation.},
url = {http://saturno.ing.puc.cl/media/papers_alvaro/supClustering.pdf}
}
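
The abstract describes the LK-Means cost as a convex combination of a discriminative score based on class labels and a generative score from unsupervised clustering. The abstract does not give the exact form of either score, so the sketch below is only an illustration under stated assumptions: the generative term is taken to be the mean squared distance to the assigned centroid (the usual K-Means distortion), the discriminative term is taken to be cluster impurity, and the names `generative_cost`, `discriminative_cost`, `combined_cost`, and the mixing weight `alpha` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def generative_cost(X, assign, centroids):
    """Mean squared Euclidean distance from each point to its assigned centroid.

    Assumption: stands in for the 'generative score' named in the abstract.
    """
    diffs = X - centroids[assign]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))


def discriminative_cost(y, assign, k):
    """Fraction of points whose label differs from their cluster's majority label.

    Assumption: impurity stands in for the 'discriminative score'; the paper
    may define it differently. Labels must be non-negative integers.
    """
    mismatched = 0
    for c in range(k):
        labels = y[assign == c]
        if labels.size:
            mismatched += labels.size - np.bincount(labels).max()
    return float(mismatched / y.size)


def combined_cost(X, y, assign, centroids, alpha):
    """Convex combination of the two terms; alpha in [0, 1] is a hypothetical weight."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    k = centroids.shape[0]
    return (alpha * discriminative_cost(y, assign, k)
            + (1.0 - alpha) * generative_cost(X, assign, centroids))


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
    y = np.array([0, 1, 1, 1])
    assign = np.array([0, 0, 1, 1])
    centroids = np.array([[0.0, 1.0], [10.0, 1.0]])
    print(combined_cost(X, y, assign, centroids, alpha=0.5))
```

With `alpha = 0` the sketch reduces to the plain K-Means distortion, and with `alpha = 1` it scores only label purity. In practice the two terms live on different scales, so some normalization would likely be needed before mixing them; the sketch leaves that out.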

