Abstract:
An effective Deep Neural Network (DNN) optimization algorithm that can use decentralized data sets over a peer-to-peer (P2P) network is proposed. In applications such as medical data analysis, aggregating data in one location may not be possible due to privacy issues. Hence, we formulate an algorithm that reaches a global DNN model without requiring the transmission of data among nodes. An existing solution for this issue is gossip stochastic gradient descent (SGD), which updates models by averaging them across neighboring nodes over a P2P network. However, in practical situations where the data are statistically heterogeneous across the nodes and/or where communication is asynchronous, gossip SGD often gets trapped in poor local minima because the model gradients differ noticeably across nodes. To overcome this issue, we solve a linearly constrained DNN cost minimization problem, which yields variable update rules that restrict differences among all node models. Our approach can be based on the Primal-Dual Method of Multipliers (PDMM) or the Alternating Direction Method of Multipliers (ADMM), with the cost function linearized to make it suitable for deep learning; the resulting algorithm also facilitates asynchronous communication. The results of our numerical experiments using CIFAR-10 indicate that the proposed algorithms converge to a global recognition model even when statistically heterogeneous data sets are placed on the nodes.
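As a rough sketch of the kind of formulation described above (the notation $f_i$, $w_i$, $A_{i|j}$, $\lambda_{i|j}$, $\rho$, and $\eta$ is assumed here for illustration and may differ from the paper's own), the linearly constrained training problem over nodes $i$ connected by edges $\mathcal{E}$ can be written as
$$\min_{\{w_i\}} \; \sum_i f_i(w_i) \quad \text{s.t.} \quad A_{i|j} w_i + A_{j|i} w_j = 0 \;\; \forall (i,j) \in \mathcal{E},$$
where $f_i$ is the local DNN loss at node $i$ and the linear constraints restrict differences between neighboring models. A linearized update in the spirit of PDMM/ADMM replaces $f_i$ with its first-order approximation around the current iterate $w_i^k$, for example
$$w_i^{k+1} = \arg\min_{w_i} \; \nabla f_i(w_i^k)^{\top} w_i + \sum_{j \in \mathcal{N}(i)} \Big( \lambda_{i|j}^{\top} A_{i|j} w_i + \tfrac{\rho}{2} \| A_{i|j} w_i + A_{j|i} w_j^k \|^2 \Big) + \tfrac{1}{2\eta} \| w_i - w_i^k \|^2,$$
so each node exchanges only model and dual variables with its neighbors, never raw data.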