SPARQ-SGD: Event-Triggered and Compressed Communication in Decentralized Optimization. Singh, N., Data, D., George, J., & Diggavi, S. IEEE Trans. Autom. Control., 68(2):721–736, 2023.
In this paper, we propose and analyze SPARQ-SGD, a communication-efficient algorithm for decentralized training of large-scale machine learning models over a graph with n nodes, where communication efficiency is achieved through compressed exchange of local model parameters among neighboring nodes, triggered only when an event (a locally computable condition) is satisfied. Specifically, in SPARQ-SGD, each node takes a fixed number of local gradient steps and then checks whether its model parameters have changed significantly since its last update; only when the change exceeds a certain threshold (specified by a design criterion) does the node compress its local model parameters using both quantization and sparsification and communicate them to its neighbors. We prove that SPARQ-SGD converges as O(1/nT) and O(1/sqrt(nT)) in the strongly convex and non-convex settings, respectively, matching the convergence rates of plain decentralized SGD. This demonstrates that the communication efficiency achieved by aggressive compression, local iterations, and event-triggered communication comes essentially for free.
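A minimal sketch of the event-triggered, compressed node update described in the abstract; this is not the paper's implementation, and names such as `top_k`, `quantize`, `sparq_node_step`, and `trigger_threshold` are illustrative assumptions.

```python
import numpy as np

def top_k(v, k):
    """Sparsification: keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def quantize(v, levels=16):
    """Simple uniform quantization of each entry to a fixed number of levels."""
    scale = np.max(np.abs(v)) + 1e-12
    return np.round(v / scale * levels) / levels * scale

def sparq_node_step(x, x_last_sent, grads, lr, k, trigger_threshold):
    """One round at a single node (illustrative only): a fixed number of local SGD
    steps, then an event-triggered, compressed broadcast of the parameter change."""
    # Fixed number of local gradient steps.
    for g in grads:
        x = x - lr * g
    # Event trigger: communicate only if the change since the last sent model is large enough.
    delta = x - x_last_sent
    if np.linalg.norm(delta) ** 2 > trigger_threshold:
        # Compress with sparsification followed by quantization, then "send" to neighbors.
        msg = quantize(top_k(delta, k))
        x_last_sent = x_last_sent + msg
        return x, x_last_sent, msg   # msg would be transmitted to neighbors
    return x, x_last_sent, None       # no communication this round
```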
@ARTICLE{9691792,
  author    = {Singh, Navjot and Data, Deepesh and George, Jemin and Diggavi, Suhas},
  title     = {SPARQ-SGD: Event-Triggered and Compressed Communication in Decentralized Optimization},
  journal   = {IEEE Transactions on Automatic Control},
  volume    = {68},
  number    = {2},
  pages     = {721--736},
  year      = {2023},
  issn      = {1558-2523},
  doi       = {10.1109/TAC.2022.3145576},
  url       = {https://doi.org/10.1109/TAC.2022.3145576},
  url_arxiv = {https://arxiv.org/abs/1910.14280},
  abstract  = {In this paper, we propose and analyze SPARQ-SGD, a communication-efficient algorithm for decentralized training of large-scale machine learning models over a graph with n nodes, where communication efficiency is achieved through compressed exchange of local model parameters among neighboring nodes, triggered only when an event (a locally computable condition) is satisfied. Specifically, in SPARQ-SGD, each node takes a fixed number of local gradient steps and then checks whether its model parameters have changed significantly since its last update; only when the change exceeds a certain threshold (specified by a design criterion) does the node compress its local model parameters using both quantization and sparsification and communicate them to its neighbors. We prove that SPARQ-SGD converges as O(1/nT) and O(1/sqrt(nT)) in the strongly convex and non-convex settings, respectively, matching the convergence rates of plain decentralized SGD. This demonstrates that the communication efficiency achieved by aggressive compression, local iterations, and event-triggered communication comes essentially for free.},
  type      = {2},
  tags      = {journal,DML,CEDL},
}
