SPARQ-SGD: Event-Triggered and Compressed Communication in Decentralized Stochastic Optimization. Singh, N., Data, D., George, J., & Diggavi, S. In 2020 59th IEEE Conference on Decision and Control (CDC), pp. 3449-3456, 2020.
In this paper, we propose and analyze SPARQ-SGD, an event-triggered and compressed algorithm for decentralized training of large-scale machine learning models. Each node can locally compute a condition (event) that triggers a communication, in which quantized and sparsified local model parameters are sent. In SPARQ-SGD, each node takes at least a fixed number (H) of local gradient steps and then checks whether its model parameters have changed significantly compared to its last update; it communicates further compressed model parameters only when there is a significant change, as specified by a (design) criterion. We prove that SPARQ-SGD converges as O(1/(nT)) and O(1/√(nT)) in the strongly convex and non-convex settings, respectively, demonstrating that such aggressive compression, including event-triggered communication, model sparsification, and quantization, does not affect the overall convergence rate compared to uncompressed decentralized training, thereby theoretically yielding communication efficiency for "free". We evaluate SPARQ-SGD on real datasets to demonstrate significant savings in communication over the state-of-the-art.
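To make the event-triggered, compressed update described in the abstract concrete, below is a minimal Python sketch of one node's communication round: at least H local SGD steps, then a broadcast of a top-k sparsified and quantized update only if the model has moved enough since the last transmission. The helper names (top_k_sparsify, stochastic_quantize, local_round) and the parameters threshold, k, and levels are illustrative assumptions, not the paper's exact operators or notation.

```python
# Illustrative sketch (not the authors' code) of one node's round in
# SPARQ-SGD-style training: H local SGD steps, then event-triggered,
# compressed communication of the model change.
import numpy as np

def top_k_sparsify(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def stochastic_quantize(v, levels=16):
    """Simple unbiased stochastic quantization to a fixed number of levels."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v
    scaled = np.abs(v) / norm * levels
    lower = np.floor(scaled)
    q = lower + (np.random.rand(*v.shape) < scaled - lower)
    return np.sign(v) * q * norm / levels

def local_round(x, x_last_sent, stochastic_grad, lr, H, threshold, k):
    """One communication round at a single node.

    Runs H local SGD steps, then returns the new model and a compressed
    update to broadcast, or None if the triggering condition is not met.
    """
    for _ in range(H):
        x = x - lr * stochastic_grad(x)
    # Event trigger: communicate only if the model changed significantly
    # since the last broadcast (threshold plays the role of the design
    # criterion in the paper).
    if np.linalg.norm(x - x_last_sent) ** 2 > threshold:
        delta = x - x_last_sent
        return x, stochastic_quantize(top_k_sparsify(delta, k))
    return x, None
```

In a full decentralized run, each node would apply received compressed updates from its neighbors to its local copy of their parameters before the next round; that consensus step is omitted here for brevity.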
@article{singh2019sparq,
 abstract = {In this paper, we propose and analyze SPARQ-SGD, an event-triggered and compressed algorithm for decentralized training of large-scale machine learning models. Each node can locally compute a condition (event) that triggers a communication, in which quantized and sparsified local model parameters are sent. In SPARQ-SGD, each node takes at least a fixed number (H) of local gradient steps and then checks whether its model parameters have changed significantly compared to its last update; it communicates further compressed model parameters only when there is a significant change, as specified by a (design) criterion. We prove that SPARQ-SGD converges as O(1/(nT)) and O(1/√(nT)) in the strongly convex and non-convex settings, respectively, demonstrating that such aggressive compression, including event-triggered communication, model sparsification, and quantization, does not affect the overall convergence rate compared to uncompressed decentralized training, thereby theoretically yielding communication efficiency for "free". We evaluate SPARQ-SGD on real datasets to demonstrate significant savings in communication over the state-of-the-art.},
 author = {Singh, Navjot and Data, Deepesh and George, Jemin and Diggavi, Suhas},
 journal = {2020 59th IEEE Conference on Decision and Control (CDC)},
 tags = {conf,CEDL,DML},
 title = {SPARQ-SGD: Event-Triggered and Compressed Communication in Decentralized Stochastic Optimization},
 type = {4},
 url_arxiv = {https://arxiv.org/abs/1910.14280},
 year = {2020},
 pages = {3449--3456},
 doi = {10.1109/CDC42340.2020.9303828},
 issn = {2576-2370},
}