Data encoding for Byzantine-resilient distributed gradient descent. Data, D., Song, L., & Diggavi, S. In 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 863–870, 2018. IEEE.
We consider distributed gradient computation, where both data and computation are distributed among m worker machines, t of which can be Byzantine adversaries, and a designated (master) node iteratively computes the model/parameter vector using gradient descent (GD). The Byzantine adversaries can (collaboratively) deviate arbitrarily from their gradient computation. To counter this, we propose a method based on data encoding and (real) error correction to combat the adversarial behavior. We can tolerate t ≤ ⌊(m-1)/2⌋ corrupt worker nodes, which is information-theoretically optimal. Our method does not assume any probability distribution on the data. We develop a sparse encoding scheme which enables computationally efficient data encoding. We demonstrate a trade-off between the number of adversaries tolerated and the resource requirement (storage and computational complexity). As an example, our scheme incurs only a constant overhead (in storage and computational complexity) over that of the distributed GD algorithm without adversaries, for t ≤ m/3.
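As a rough illustration of the redundancy-plus-robust-decoding idea behind the paper, here is a minimal runnable sketch; it is not the authors' scheme. The paper stores sparsely encoded data shards and decodes with real-number error correction, whereas this toy substitutes full data replication at every worker and a coordinate-wise median at the master, which likewise tolerates t ≤ ⌊(m-1)/2⌋ Byzantine workers on a least-squares instance. All parameters and names below are illustrative.

import numpy as np

# Toy sketch (not the paper's scheme): each worker holds a full replica of
# the data, whereas the paper stores encoded shards. The master "decodes"
# with a coordinate-wise median rather than real-number error correction.
rng = np.random.default_rng(0)
m, t = 7, 3                       # m workers, t = floor((m-1)/2) Byzantine
n, d = 200, 5                     # least-squares data: n samples, d features
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

byzantine = {0, 3, 5}             # adversarial workers, |byzantine| = t
w = np.zeros(d)
for _ in range(300):
    grads = []
    for i in range(m):
        if i in byzantine:
            g = rng.normal(scale=1e3, size=d)   # arbitrary adversarial reply
        else:
            g = X.T @ (X @ w - y)               # honest full-batch gradient
        grads.append(g)
    g_hat = np.median(np.stack(grads), axis=0)  # robust aggregation/"decoding"
    w -= 1e-3 * g_hat                           # ordinary GD on decoded gradient

print("recovery error:", np.linalg.norm(w - w_true))

Note the cost of this naive redundancy: full replication blows up storage by a factor of m, while the trade-off highlighted in the abstract is that the paper's sparse encoding reduces this to a constant overhead for t ≤ m/3.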
@inproceedings{data2018data,
 abstract = {We consider distributed gradient computation, where both data and computation are distributed among m worker machines, t of which can be Byzantine adversaries, and a designated (master) node iteratively computes the model/parameter vector using gradient descent (GD). The Byzantine adversaries can (collaboratively) deviate arbitrarily from their gradient computation. To counter this, we propose a method based on data encoding and (real) error correction to combat the adversarial behavior. We can tolerate t ≤ ⌊(m-1)/2⌋ corrupt worker nodes, which is information-theoretically optimal. Our method does not assume any probability distribution on the data. We develop a sparse encoding scheme which enables computationally efficient data encoding. We demonstrate a trade-off between the number of adversaries tolerated and the resource requirement (storage and computational complexity). As an example, our scheme incurs only a constant overhead (in storage and computational complexity) over that of the distributed GD algorithm without adversaries, for t ≤ m/3.},
 author = {Data, Deepesh and Song, Linqi and Diggavi, Suhas},
 booktitle = {2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)},
 organization = {IEEE},
 pages = {863--870},
 tags = {conf,SDL,DML},
 title = {Data encoding for Byzantine-resilient distributed gradient descent},
 type = {4},
 doi = {10.1109/ALLERTON.2018.8636017},
 year = {2018}
}
