Naoya Yoshida,Takayuki Nishio,Masahiro Morikura,Koji Yamamoto,Ryo Yonetani
Abstract URL: https://arxiv.org/abs/1905.07210v3
This paper proposes a cooperative mechanism for mitigating the performance degradation due to non-independent-and-identically-distributed (non-IID) data in collaborative machine learning (ML), namely federated learning (FL), which trains an ML model using the rich data and computational resources of mobile clients without gathering their data to central systems. The data of mobile clients is typically non-IID owing to diversity among mobile clients' interests and usage, and FL with non-IID data could degrade the model performance. Therefore, to mitigate the degradation induced by non-IID data, we assume that a limited number (e.g., less than 1%) of clients allow their data to be uploaded to a server, and we propose a hybrid learning mechanism referred to as Hybrid-FL, wherein the server updates the model using the data gathered from the clients and aggregates the model with the models trained by clients. The Hybrid-FL solves both client- and data-selection problems via heuristic algorithms, which try to select the optimal sets of clients who train models with their own data, clients who upload their data to the server, and data uploaded to the server. The algorithms increase the number of clients participating in FL and make more data gather in the server IID, thereby improving the prediction accuracy of the aggregated model. Evaluations, which consist of network simulations and ML experiments, demonstrate that the proposed scheme achieves a 13.5% higher classification accuracy than those of the previously proposed schemes for the non-IID case.