

Dispersed Federated Learning: Vision, Taxonomy, and Future Directions

Latif U. Khan, Walid Saad, Fellow, IEEE, Zhu Han, Fellow, IEEE, and Choong Seon Hong, Senior Member, IEEE

Abstract—The ongoing deployment of Internet of Things (IoT)-based smart applications is spurring the adoption of machine learning as a key enabling technology. To overcome the privacy and overhead challenges of centralized machine learning, there has been significant recent interest in the concept of federated learning. Federated learning offers on-device, privacy-preserving machine learning without the need to transfer end-devices' data to a third-party location. However, federated learning still has privacy concerns because the aggregation server can infer sensitive information from the end-devices' local learning models. Furthermore, the federated learning process might fail due to a failure of the aggregation server (e.g., due to a malicious attack or physical defect). Beyond privacy and robustness issues, federated learning over IoT networks requires a significant amount of communication resources for training. To cope with these issues, we propose a novel concept of dispersed federated learning (DFL) that is based on true decentralization. We opine that DFL will serve as a practical implementation of federated learning for various IoT-based smart applications such as smart industries and intelligent transportation systems. First, the fundamentals of DFL are presented. Second, a taxonomy is devised with a qualitative analysis of various DFL schemes. Third, a DFL framework for IoT networks is proposed with a matching-theory-based solution. Finally, an outlook on future research directions is presented.

Index Terms—Federated learning, machine learning, resource optimization, matching theory.

I. INTRODUCTION

Recent years have witnessed a significant rise in the number of Internet of Things (IoT) devices that enable various applications such as smart healthcare, augmented reality, Industry 4.0, and autonomous driving cars, among others. These applications use emerging communication and computing technologies along with machine learning to offer smart services [1]. In order to deploy machine learning in large-scale, heterogeneous systems such as the IoT, it is necessary to preserve the privacy of the data and reduce the communication overhead. As a result, centralized machine learning techniques may not be suitable.

L. U. Khan and C. S. Hong are with the Department of Computer Science & Engineering, Kyung Hee University, Yongin-si 17104, South Korea.

Walid Saad is with the Wireless@VT, Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA 24061 USA.

Zhu Han is with the Electrical and Computer Engineering Department, University of Houston, Houston, TX 77004 USA, and also with the Computer Science Department, University of Houston, Houston, TX 77004 USA, and the Department of Computer Science and Engineering, Kyung Hee University, South Korea.

Corresponding author: Choong Seon Hong ([email protected])

Instead, federated learning (FL) [2] is a privacy-preserving, distributed machine learning solution that is amenable to deployment in IoT networks. Although FL preserves users' privacy by training a globally shared model without migrating the end-devices' data to an aggregation server, it still exhibits privacy concerns. For example, a malicious aggregation server can infer the end-devices' sensitive information from their learning model parameters [3]. Moreover, FL requires a significant amount of communication resources for the iterative exchange of learning model parameters between end-devices and the aggregation server [4], [5]. Furthermore, FL based on a centralized aggregation server might suffer from a malicious user attack or from failure due to physical damage, which significantly degrades the performance of FL. Therefore, to truly benefit from the deployment of FL in IoT networks, we must resolve the aforementioned challenges.

To tackle the privacy concerns of FL, user-level differential privacy protection, obtained by adding calibrated noise to local learning models, was proposed in [6]. In another work [7], the authors proposed a secure aggregation protocol for FL. Although the works in [6] and [7] can enhance privacy, they can introduce additional complexity to existing FL schemes (e.g., FedAvg or FedProx). On the other hand, a number of recent works [4], [8]–[10] considered resource optimization in FL. A review of resource optimization and incentive mechanisms was provided in [4]. In [8], the authors analyzed the performance of FL over wireless networks and derived a closed-form expression that characterizes FL convergence as a function of the wireless channel. The authors then used that analysis to develop a resource allocation scheme that improves the learning and communication performance. Meanwhile, the authors in [9] proposed an adaptive FL framework to offer resource optimization. To further improve resource optimization for FL, the work in [10] developed a hierarchical FL scheme for heterogeneous cellular networks. The main advantage of hierarchical FL is the reuse, within small cells, of frequency bands already occupied by other users. The works in [4], [8]–[10] considered communication resource allocation and spectrum reuse for federated learning separately; therefore, one can jointly consider optimal communication resource allocation and spectrum reuse to significantly reduce the FL convergence time.

In [5], a novel framework called collaborative FL (CFL) was proposed to enable the participation of devices with insufficient communication resources in the learning process for performance enhancement. In CFL, such devices send their local learning models to nearby devices with sufficient communication resources for local aggregation.



Fig. 1: Taxonomy of DFL. DFL is categorized into centralized-aggregation-enabled DFL (CDFL) and distributed-aggregation-enabled DFL (DDFL). In both types, devices in clusters perform local model computation and cluster heads compute sub-global models; in CDFL, the sub-global models are transferred (e.g., via SBSs) to a single point for global model computation and the updated global model is returned, whereas in DDFL, the sub-global models are exchanged among distributed nodes, each of which performs the aggregation of sub-global models and the updated global model computation.

Subsequently, the receiving devices send the locally aggregated models to the base station (BS) for global aggregation. After performing global aggregation, the BS sends the global model updates back to the end-devices for local model updates. In contrast to CFL [5], the main contribution of this paper is a novel framework, dubbed dispersed FL (DFL), that offers learning of a global FL model in a fully distributed manner. The DFL framework uses distributed learning to jointly offer communication resource reuse, robustness, and enhanced privacy for FL. In our proposed DFL, first, sub-global models are computed within different groups consisting of closely located end-devices. The sub-global models are then aggregated to yield a global model. Aggregation, as discussed in Section II, can be either centralized or distributed, depending on the type of DFL.

Finally, the global model updates are sent back to the end-devices. The two-stage aggregation of learning models in DFL can offer limited privacy protection: inferring an end-device's sensitive information from sub-global models (e.g., at the global server) is much more difficult than inferring it at the sub-global model computation server from local learning model updates [11]. Thus, DFL can offer a degree of privacy preservation. Furthermore, one can reuse the communication resources occupied by other cellular users within the small groups used for sub-global model computation, thereby offering efficient communication resource usage. The summary of our contributions is as follows:

• We present a vision of DFL and categorize it into two types: centralized-aggregation-enabled DFL (CDFL) and distributed-aggregation-enabled DFL (DDFL). Furthermore, a comparative analysis of different possible DFL schemes is provided.

• We present a DFL framework for IoT networks. We consider a cost function for DFL that captures the loss in global model accuracy due to the packet error rate and the local learning model accuracy. To minimize the cost of DFL, we propose an iterative scheme that performs joint association and resource allocation. For association, we use a one-sided one-to-many matching game, whereas a one-sided one-to-one matching game is used for resource allocation.

• To validate our proposed iterative-matching-game-enabled solution, we provide numerical results. We show the fast convergence of the proposed scheme within a small number of iterations. Two baseline schemes, baseline-1 (i.e., proposed association and random resource allocation) and baseline-2 (i.e., proposed resource allocation and random association), are considered for comparative analysis. Furthermore, we evaluate the performance of DFL using the MNIST dataset for image classification tasks, which shows promising results.

• Finally, we conclude the paper and provide an outlook on future research directions.

II. DFL: FUNDAMENTALS AND TAXONOMY

A. Fundamentals

In DFL, a sub-global model is first iteratively computed for different groups, similar to traditional FL. Next, the sub-global model updates are aggregated to yield a global model. The aggregation of the sub-global models can be performed either in a centralized or in a distributed way. Finally, the global model updates are sent back to all the devices involved in learning, as shown in Fig. 1. Here, learning takes place in an iterative manner until the global FL model converges to a desirable value. For a fixed global FL model accuracy, there is a tradeoff between the number of sub-global iterations and the number of global rounds. An increase in sub-global model iterations requires fewer global rounds to train the FL model, and vice versa. The selection of sub-global model iterations and global rounds strongly depends on the application and settings used. For instance, consider an intelligent transportation setting in which the sub-global model computation takes place at the level of an autonomous vehicle and the global model aggregation at the core network. For such scenarios, it is preferable to use more sub-global iterations than global iterations due to the high mobility of vehicles and the potentially high number of handovers between roadside units (RSUs). On the other hand, consider a scenario in which different smart industries seek to train a DFL model for a certain application. First, we can train a sub-global model within every industry. Next, the sub-global models are shared between the industries. Finally, global aggregation takes place within every industry, and the global model updates are sent back to the end-devices. In such a smart-industries scenario, we can use fewer sub-global iterations and more global rounds than in the earlier scenario of autonomous vehicles. Next, we discuss the taxonomy of DFL.
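To make this two-level procedure concrete, the following minimal sketch trains each cluster's sub-global model for a fixed number of sub-global iterations and then aggregates the sub-global models into a global model. FedAvg-style weighted averaging is assumed, and names such as `dfl_round` and `local_update` are illustrative rather than taken from the paper:

```python
import numpy as np

def fedavg(models, sizes):
    """FedAvg-style weighted average of model parameter vectors."""
    sizes = np.asarray(sizes, dtype=float)
    weights = sizes / sizes.sum()
    return sum(w * m for w, m in zip(weights, models))

def dfl_round(clusters, global_model, sub_iters, local_update):
    """One global DFL round: `sub_iters` sub-global iterations per cluster,
    followed by aggregation of the sub-global models into a new global
    model. `clusters` maps a cluster head to its member devices, and
    `local_update(device, model)` returns (updated_params, num_samples)."""
    sub_models, sub_sizes = [], []
    for devices in clusters.values():
        model = global_model.copy()
        size = 0
        for _ in range(sub_iters):                 # sub-global iterations
            updates = [local_update(d, model) for d in devices]
            params, counts = zip(*updates)
            model = fedavg(params, counts)         # cluster-head aggregation
            size = sum(counts)
        sub_models.append(model)
        sub_sizes.append(size)
    # Global aggregation: at one server in CDFL, or replicated at every
    # distributed node in DDFL; the arithmetic is identical either way.
    return fedavg(sub_models, sub_sizes)
```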

B. Taxonomy

We can categorize DFL into two main types (shown in Fig. 1), depending on the manner of global model aggregation: CDFL and DDFL, as discussed next.

1) Centralized-Aggregation-Enabled DFL: In CDFL, a global FL model is obtained by aggregating the sub-global models at a centralized server. In contrast to CFL [5], CDFL iteratively computes sub-global models for all groups of closely located devices prior to global aggregation; in CFL, only a few single-iteration local aggregations at a few devices take place prior to global aggregation at the BS. CDFL can be either edge-based or cloud-based depending on the context of the machine learning model. Edge-based CDFL is more suitable for local context-aware, specialized machine learning models [4] for devices located within a small geographical region. For instance, we can train a local context-aware, specialized machine learning model (e.g., smart keyboard search suggestions for a regional language) using edge-based CDFL. To do so, we can consider an edge-based small cell base station (SBS) and a device-to-device (D2D) communication network reusing frequency bands already occupied by other cellular users. Multiple clusters, each with its own cluster head, can be formed using some criterion (e.g., throughput enhancement). First, sub-global models are trained for the different clusters in a fashion similar to traditional FL, with the cluster heads performing sub-global model aggregation. Second, the cluster heads transfer the sub-global models to the edge server for global model aggregation. Finally, the global model updates are transmitted back to the cluster heads, which disseminate them to all the devices. On the other hand, cloud-based CDFL can train global context-aware, generalized machine learning models for devices located within several geographically distributed regions. The process of cloud-based CDFL is similar to that of edge-based CDFL, but with the aggregation of sub-global models taking place at a remote cloud. Although cloud-based CDFL enables more generalized learning of a model, it suffers from high latency.

2) Distributed-Aggregation-Enabled DFL: In DDFL, in contrast to CDFL, the aggregation of the sub-global model updates is performed at distributed nodes. CDFL is based on centralized aggregation of sub-global models, which can suffer from a centralized server crash or a malicious attack. DDFL overcomes these issues by using distributed-node-based aggregation. However, it suffers from higher latency and requires extra communication resources for the transfer of sub-global model updates among the distributed nodes, as summarized in Table I. DDFL can be implemented in two possible ways, depending on how the sub-global model updates are exchanged: via blockchain-based miners or via direct exchange between edge servers. Blockchain-based DFL can offer the most trustworthy transfer of sub-global model updates. However, it faces the inherent challenge of the high latency associated with blockchain consensus algorithms (e.g., proof of work), which is not desirable in FL model computation [12]. Therefore, we must design novel low-latency consensus algorithms for blockchain-based DFL.


TABLE I: Comparison of various DFL schemes.

Parameter | Description | Edge-based CDFL | Cloud-based CDFL | Edge-based DDFL | Blockchain-based DDFL
Global model computation delay | Delay in computing the global model during one global round. | Lowest | Low | High | Highest
Robustness | Successful operation of DFL in case of physical damage to the edge server, cloud server, or miners; also covers interruption of DFL due to a malicious attack on the aggregation server. | Low | Low | High | Highest
Communication resources usage | Communication resources used for computing the global DFL model. | Low | High | High | Highest
Implementation complexity | Measure of the complexity of computing a global DFL model. | Lowest | Low | High | Highest
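As a rough illustration of the DDFL entries above, the sketch below shows distributed aggregation in its simplest form: every SBS exchanges its sub-global model with all peers over the backhaul and computes the same weighted average locally, so no single aggregation server exists. The function name and the all-to-all exchange pattern are assumptions for illustration:

```python
import numpy as np

def ddfl_global_aggregation(sub_models, sizes):
    """Sketch of DDFL-style distributed aggregation: every SBS holds one
    sub-global model, exchanges it with all peers over the backhaul, and
    then computes the same size-weighted average locally. Each SBS thus
    obtains an identical global model without any central aggregation
    server, so the crash of any single node does not halt learning."""
    total = float(sum(sizes))
    global_model = sum((n / total) * m for n, m in zip(sizes, sub_models))
    return [global_model.copy() for _ in sub_models]   # one copy per SBS
```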

III. DDFL FRAMEWORK FOR IOT SYSTEMS

For an FL-enabled IoT system with a massive number of end-devices, a significant amount of communication resources is needed for training. To enable communication-resource-efficient and robust FL for such an IoT network, we can use either CDFL or DDFL. In contrast to CDFL, DDFL performs global model aggregation at multiple distributed nodes and thus offers more robust operation.

A. System Model

Consider an IoT network that consists of IoT devices and edge-server-enabled SBSs. The IoT devices, each having its own local dataset, seek to train an FL model. The IoT devices show significant heterogeneity in terms of computational power (CPU cycles/sec), local dataset size, and data distribution. To account for this heterogeneity, we use the notion of relative local accuracy: lower values of relative local accuracy reflect better local learning model accuracy, and thus fewer global communication rounds are needed to achieve a desirable global FL accuracy. To enable edge-based DDFL in the IoT network, first, a sub-global model is computed for every SBS in an iterative fashion via interaction with its associated IoT devices. After the computation of the sub-global model updates by all the SBSs, the sub-global model updates are transferred between all the SBSs via fast backhaul links. Finally, all the SBSs perform global model aggregation and send the global model updates to their associated IoT devices. The edge-based DDFL process uses wireless channel resources, and the packet error rate due to channel uncertainties might severely degrade the performance of the FL process [8]. We can define a cost function that jointly accounts for the relative local accuracy and the degradation caused by the packet error rate (an illustrative sketch of such a cost is given after the list below). The packet error rate strongly depends on the devices' signal-to-interference-plus-noise ratio (SINR). Therefore, to enable edge-based DDFL for an IoT network while minimizing the cost function, the two following challenges must be addressed.

• An IoT network is characterized by a dynamic topology due to the free mobility of IoT devices. Furthermore, seamless connectivity of the IoT devices with the SBSs is desirable during the exchange of learning model parameters. Considering these challenges, we must appropriately associate IoT devices with the corresponding SBSs so as to optimize FL performance by increasing the overall throughput, which consequently minimizes the cost.

• To effectively use the limited available communication resources, we can reuse the resource blocks already occupied by cellular users for the IoT network. Therefore, we must propose an effective resource allocation strategy for federated-learning-based IoT networks.
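Since the closed form of the cost function is not given here, the following sketch is only an illustrative stand-in that reproduces the stated trends: the cost falls as the SINR rises (lower packet error rate) and as the relative local accuracy falls (better local models). The waterfall-style PER approximation loosely follows the analysis in [8]; the weighted-sum form and `alpha` are assumptions:

```python
import numpy as np

def packet_error_rate(sinr, waterfall=0.05):
    """Illustrative waterfall-style PER approximation (cf. [8]): the PER
    decays toward zero as the SINR grows."""
    return 1.0 - np.exp(-waterfall / np.maximum(sinr, 1e-9))

def ddfl_cost(relative_accuracy, sinr, alpha=0.5):
    """Hypothetical per-device cost combining the two effects named in the
    text; the weighted-sum form and `alpha` are assumptions, not the
    paper's formula. Lower relative local accuracy (a better local model)
    and higher SINR (a lower PER) both reduce the cost."""
    return alpha * relative_accuracy + (1.0 - alpha) * packet_error_rate(sinr)
```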

B. Iterative-Matching-Based Solution

In this subsection, we present an iterative scheme for the cost optimization of the proposed edge-based DDFL. First, we consider several constraints for our DDFL framework:

• Constraint 1: Every device can be allocated a maximum of one resource block.
• Constraint 2: Every resource block must not be allocated to more than one device.
• Constraint 3: The total number of resource blocks allocated to all devices must not be greater than the maximum number of available resource blocks.
• Constraint 4: Every device can be associated with a maximum of one SBS.
• Constraint 5: Every SBS can be associated with a maximum number of devices determined by a threshold (fixed in this paper for all SBSs).

We consider a cost function that jointly captures the devices' relative local accuracy and the effect of the packet error rate on the global DDFL model accuracy. The cost function depends on both resource allocation and device association. Due to the NP-hard nature of the joint optimization problem, we decompose the main problem into two sub-problems: a resource allocation sub-problem and an association sub-problem.


Fig. 2: (a) Variations in DDFL cost versus normalized SINR and relative local accuracy; (b) proposed DDFL cost vs. iterations, compared with baseline-1 and baseline-2; (c) accuracy vs. communication rounds for DDFL with 2, 8, and 16 sub-global iterations.

Our proposed algorithm solves one sub-problem (i.e., resource allocation) while fixing the other (i.e., association), and repeats this in an iterative manner until convergence.
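A minimal sketch of this alternating loop, with the two matching games passed in as callables (all names here are illustrative, not from the paper):

```python
def iterative_matching(devices, cost, allocate_rbs, associate_devices,
                       init_alloc, init_assoc, max_iters=30, tol=1e-9):
    """Sketch of the iterative decomposition: solve resource allocation
    with the association fixed, then association with the allocation
    fixed, until the total cost stops improving. `allocate_rbs` and
    `associate_devices` correspond to the matching games sketched in the
    following subsections."""
    alloc, assoc = init_alloc, init_assoc
    prev_cost = float("inf")
    for _ in range(max_iters):
        alloc = allocate_rbs(devices, assoc)       # one-to-one matching
        assoc = associate_devices(devices, alloc)  # one-to-many matching
        total = sum(cost(d, assoc.get(d), alloc.get(d)) for d in devices)
        if prev_cost - total < tol:                # cost converged
            break
        prev_cost = total
    return assoc, alloc
```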

1) One-to-One Matching-Based Resource Allocation: To minimize the DDFL cost (i.e., to improve FL accuracy) for IoT networks by improving the SINR, we can use a low-complexity matching-theory-based wireless resource allocation scheme. Our resource allocation problem is similar to the house allocation problem, which can be represented by a tuple (A, H, P), where A, H, and P represent the set of agents, the set of houses, and the preference profile of agents over houses, respectively [13]. Similarly, we can define a preference profile Pr for the set of resource blocks over devices. To allocate resource blocks to devices, we use a one-sided one-to-one matching game. All the resource blocks rank devices based on the values of the cost function: for a particular resource block, a device with a low cost is preferred over a device with a high cost. Using the preference profile Pr, the resource blocks are allocated to devices iteratively until no blocking pair is left, which yields a stable matching [14].
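One plausible realization of this game is sketched below: each resource block claims its lowest-cost still-unmatched device in a serial pass. The dictionary-based cost lookup and the greedy pass are assumptions; the paper's exact blocking-pair elimination may differ:

```python
def allocate_rbs(devices, resource_blocks, cost):
    """Sketch of the one-sided one-to-one resource allocation game: each
    resource block ranks devices by the cost function (lower cost
    preferred, as in the preference profile Pr) and, in a serial pass,
    claims its lowest-cost still-unmatched device. `cost[(device, rb)]`
    is assumed precomputed for all pairs."""
    allocation = {}                      # device -> resource block
    for rb in resource_blocks:
        for dev in sorted(devices, key=lambda d: cost[(d, rb)]):
            if dev not in allocation:    # lowest-cost unmatched device wins
                allocation[dev] = rb
                break
    return allocation
```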

2) One-to-Many Matching-Based Device Association: For a fixed resource block allocation, the DDFL cost strongly depends on the association of devices with SBSs. Therefore, we must associate the devices with SBSs so as to minimize the DDFL cost by increasing the throughput; this will also decrease the DDFL convergence time. We could associate devices with SBSs using a brute-force algorithm, but it suffers from high computational complexity. In contrast to the house allocation problem, our device association problem involves the assignment of a single SBS to multiple devices. Therefore, we use one-sided one-to-many matching for the association of SBSs with devices. First, we define a preference profile Pa for the SBSs. The preference profile Pa ranks all the devices based on the values of the cost function; similar to Pr, a device with a high cost is given less preference than a device with a low cost. Once the preference profile is obtained for all devices, the association is performed in an iterative manner until no blocking pair is left, which yields a stable matching [14].
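The association game can be sketched in a deferred-acceptance style with a per-SBS quota (Constraint 5). The proposal order, eviction rule, and names below are illustrative assumptions; the paper's exact procedure may differ:

```python
def associate_devices(devices, sbss, cost, quota):
    """Sketch of one-sided one-to-many association with a per-SBS quota.
    Devices propose in order of increasing cost; an SBS keeps at most
    `quota` devices, evicting its worst-ranked (highest-cost) device when
    over quota. The loop ends when no device can improve its match."""
    assoc = {}                                   # device -> SBS
    admitted = {s: [] for s in sbss}
    prefs = {d: sorted(sbss, key=lambda s: cost[(d, s)]) for d in devices}
    next_idx = {d: 0 for d in devices}
    free = list(devices)
    while free:
        d = free.pop()
        if next_idx[d] >= len(sbss):
            continue                             # d has exhausted all SBSs
        s = prefs[d][next_idx[d]]
        next_idx[d] += 1
        admitted[s].append(d)
        admitted[s].sort(key=lambda x: cost[(x, s)])
        assoc[d] = s
        if len(admitted[s]) > quota:             # over quota: evict worst
            worst = admitted[s].pop()
            assoc.pop(worst, None)
            free.append(worst)
    return assoc
```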

C. Performance Evaluation

We present numerical results to evaluate the performance of the proposed framework for IoT networks. Moreover, we evaluate the performance of the proposed DDFL framework for an image classification task using the MNIST dataset [2]. Our simulation scenario consists of 54 devices and 6 SBSs deployed in an area of 1000 × 1000 m². The positions of the SBSs are fixed, whereas the devices are positioned randomly. All values are averaged over 500 different runs. We compare the performance of our proposed iterative-matching-based scheme with two baseline schemes: baseline-1 and baseline-2. Baseline-1 uses one-to-many matching-based association and random resource allocation, whereas baseline-2 uses one-to-one matching-based resource allocation and random association. We use the term "iterations" to refer to one execution of both one-to-one matching-based resource allocation and one-to-many matching-based association, whereas "communication round" denotes a global iteration of DDFL.

Fig. 2a shows the variation of the DDFL cost with the SINR and the relative local accuracy. For a fixed relative local accuracy, the DDFL cost decreases exponentially with an increase in SINR: for lower values of SINR, the decrease in DDFL cost is small, whereas for higher values of SINR, the cost decreases significantly. On the other hand, a decrease in relative local accuracy also decreases the DDFL cost. The DDFL cost is lowest for high SINR and low relative local accuracy, which shows that the cost of training the global DDFL model is low for devices having high local learning model accuracy (i.e., low relative local accuracy) and high throughput, and vice versa.

Fig. 2b shows the DDFL cost vs. iterations for the proposed, baseline-1, and baseline-2 schemes. The proposed scheme outperforms both baselines. Moreover, all three schemes converge within a small number of iterations. Baseline-1 performs better than baseline-2, which shows that the DDFL cost depends more on device association than on resource block allocation.


Finally, we evaluate the performance of DDFL using the MNIST dataset for image classification tasks in Fig. 2c, which clearly reveals the fast convergence of the DDFL scheme for different numbers of sub-global iterations. Moreover, a larger number of sub-global iterations generally yields faster convergence than a smaller number. However, increasing the number of sub-global iterations beyond a certain value might not yield a significant improvement in convergence. For 2 sub-global iterations, more communication rounds are needed to reach convergence than for 8 sub-global iterations. Furthermore, DDFL with 16 sub-global iterations performs better in the initial communication rounds but performs similarly to 2 sub-global iterations in later rounds. From the above discussion, it is clear that we must strike a trade-off between the number of sub-global iterations and the number of global communication rounds to attain a certain global FL accuracy.

IV. CONCLUSIONS AND FUTURE DIRECTIONS

In this paper, we have presented the novel idea of DFL. A taxonomy was devised based on the manner of global model aggregation in DFL. We have found that DFL can be widely adopted to achieve a resource-efficient and privacy-aware implementation of FL for several IoT scenarios. Furthermore, DFL offers a tradeoff between the convergence rate and the number of sub-global iterations. Finally, we present several future research directions for DFL.

A. Data Heterogeneity-Aware DFL

How do we enable FL for a massive number of devices with heterogeneous datasets of a non-independent and identically distributed (non-IID) nature? In contrast to FedAvg, FedProx was developed to account for data heterogeneity in the training of a global FL model. FedProx adds a weighted proximal term to each end-device's loss function to handle data heterogeneity. However, choosing the weight of the proximal term might be challenging. Furthermore, it is unclear whether FedProx can provably improve the convergence rate [15]. To cope with these challenges, we can propose a novel heterogeneity-aware DFL for devices with heterogeneous datasets. A set of devices can be divided into different clusters based on dataset homogeneity, with each cluster head performing sub-global aggregation. The sub-global models are obtained in an iterative manner via the exchange of learning model parameters between the end-devices and their cluster heads. Then, the sub-global models are aggregated to yield a global model, which is then sent back to all the cluster heads. Finally, the cluster heads disseminate the global model updates to all the devices for updating their local learning models. This process of global model computation takes place iteratively until convergence.
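For reference, the FedProx local objective on device $k$ takes the following standard form, where $F_k$ is the local loss, $w^{t}$ is the current global (or, in DFL, sub-global) model, and $\mu$ is the proximal weight whose tuning is flagged above as challenging:

```latex
\min_{w}\; h_k\!\left(w;\, w^{t}\right) \;=\; F_k(w) \;+\; \frac{\mu}{2}\,\bigl\lVert w - w^{t} \bigr\rVert^{2}
```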

B. Enhanced Distributed Privacy-Aware DFL

How do we enable DFL with enhanced privacy preservation? In contrast to the global aggregation server in DFL, the server used for sub-global aggregation can infer the end-devices' sensitive information from their local learning models.

Therefore, new effective privacy-preserving schemes must be designed within the groups used for sub-global model computation. One way is to add artificial noise to the local learning models before sending them for sub-global aggregation; the noise variances must be properly chosen to enable effective differential privacy. Another challenge is how to achieve a certain privacy level while keeping the convergence time low: a high privacy protection level and a low convergence time are generally contradictory. Therefore, we must propose novel distributed privacy protection schemes that enable privacy preservation for the various DFL groups while offering a trade-off between the privacy protection level and the convergence time.
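A minimal sketch of this idea, assuming a Gaussian-mechanism-style perturbation with illustrative clipping and noise parameters (calibrating them to a formal (ε, δ) guarantee is the open problem noted above):

```python
import numpy as np

def privatize_update(local_model, clip_norm=1.0, noise_std=0.1, rng=None):
    """Sketch of Gaussian-mechanism-style perturbation applied before a
    local model is sent for sub-global aggregation: clip the update to
    bound its sensitivity, then add zero-mean Gaussian noise. The values
    of `clip_norm` and `noise_std` are illustrative; calibrating the
    variance to a target privacy level while keeping convergence time low
    is the open problem discussed above."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(local_model)
    clipped = local_model * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(0.0, noise_std, size=clipped.shape)
```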

C. Distributed Fairness-Aware DFL

How does one enable DFL that accounts for devices with degraded performance within a group used for sub-global model computation? A significant amount of heterogeneity exists within a group of devices in terms of local learning model accuracy and communication resources. This device heterogeneity results in significant variations in local learning model accuracy, and the devices with better performance will influence the sub-global FL model more than the other devices. Therefore, we must assign different weights in the sub-global aggregation to enable fairness among devices. To this end, we must propose a distributed fairness-aware DFL scheme (different from the clustering-based approach of Section IV-A) that adaptively assigns weights to end-devices within the different groups used for sub-global model computation.
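As one illustrative possibility (not a scheme from the paper), the sub-global aggregation weights could decay exponentially with local accuracy so that weaker devices retain influence:

```python
import numpy as np

def fairness_aware_aggregate(models, local_accuracies, beta=1.0):
    """Sketch of adaptive weighting for sub-global aggregation: devices
    with lower local learning accuracy receive larger weights so that the
    best-performing devices do not dominate the sub-global model. The
    exponential weighting and `beta` are illustrative assumptions."""
    acc = np.asarray(local_accuracies, dtype=float)
    weights = np.exp(-beta * acc)          # lower accuracy -> larger weight
    weights /= weights.sum()
    return sum(w * m for w, m in zip(weights, models))
```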

D. Mobility-Aware DFL

How do we enable efficient DFL for mobile nodes? An end-device involved in the DFL process requires seamless connectivity with the sub-global aggregation server during sub-global model computation. Therefore, it is necessary to propose novel protocols that account for end-device mobility. Several approaches can be used for mobility management, such as deep-learning-enabled mobility prediction, hidden Markov models, Bayesian networks, and data mining.
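As a toy illustration of the prediction-based direction, a first-order Markov predictor can estimate the most likely next SBS from observed handover traces; the class below and its use for pre-staging sub-global models are assumptions for illustration:

```python
from collections import Counter, defaultdict

class MarkovMobilityPredictor:
    """Sketch of first-order Markov handover prediction: estimate
    P(next SBS | current SBS) from observed handover traces, so that the
    sub-global model could be pre-staged at the most likely next SBS."""
    def __init__(self):
        self.counts = defaultdict(Counter)   # current SBS -> next-SBS counts

    def observe(self, current_sbs, next_sbs):
        self.counts[current_sbs][next_sbs] += 1

    def predict(self, current_sbs):
        history = self.counts.get(current_sbs)
        if not history:
            return None                      # no trace yet for this SBS
        return history.most_common(1)[0][0]  # most frequent next SBS
```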

REFERENCES

[1] L. U. Khan, I. Yaqoob, M. Imran, Z. Han, and C. S. Hong, "6G wireless systems: A vision, architectural elements, and future directions," IEEE Access, to appear, 2020.

[2] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, "Communication-efficient learning of deep networks from decentralized data," arXiv preprint arXiv:1602.05629, 2016.

[3] W. Y. B. Lim, N. C. Luong, D. T. Hoang, Y. Jiao, Y.-C. Liang, Q. Yang, D. Niyato, and C. Miao, "Federated learning in mobile edge networks: A comprehensive survey," IEEE Communications Surveys & Tutorials, to appear, 2020.

[4] L. U. Khan, N. H. Tran, S. R. Pandey, W. Saad, Z. Han, M. N. Nguyen, and C. S. Hong, "Federated learning for edge networks: Resource optimization and incentive mechanism," arXiv preprint arXiv:1911.05642, 2019.

[5] M. Chen, H. V. Poor, W. Saad, and S. Cui, "Wireless communications for collaborative federated learning in the Internet of Things," arXiv preprint arXiv:2006.02499, 2020.

[6] H. B. McMahan, D. Ramage, K. Talwar, and L. Zhang, "Learning differentially private recurrent language models," arXiv preprint arXiv:1710.06963, 2017.


[7] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, "Practical secure aggregation for privacy-preserving machine learning," in ACM Conference on Computer and Communications Security, Dallas, TX, USA, October 2017, pp. 1175–1191.

[8] M. Chen, Z. Yang, W. Saad, C. Yin, H. V. Poor, and S. Cui, "A joint learning and communications framework for federated learning over wireless networks," arXiv preprint arXiv:1909.07972, 2019.

[9] S. Wang, T. Tuor, T. Salonidis, K. K. Leung, C. Makaya, T. He, and K. Chan, "Adaptive federated learning in resource constrained edge computing systems," IEEE Journal on Selected Areas in Communications, vol. 37, no. 6, pp. 1205–1221, June 2019.

[10] M. S. H. Abad, E. Ozfatura, D. Gunduz, and O. Ercetin, "Hierarchical federated learning across heterogeneous cellular networks," in IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, Spain, May 2020, pp. 8866–8870.

[11] M. Nasr, R. Shokri, and A. Houmansadr, "Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning," in IEEE Symposium on Security and Privacy, USA, May 2019, pp. 739–753.

[12] G. Lee, J. Park, W. Saad, and M. Bennis, "Performance analysis of blockchain systems with wireless mobile miners," IEEE Networking Letters, to appear, 2020.

[13] L. Zhou, "On a conjecture by Gale about one-sided matching problems," Journal of Economic Theory, vol. 52, no. 1, pp. 123–135, October 1990.

[14] Z. Han, D. Niyato, W. Saad, T. Basar, and A. Hjørungnes, Game Theory in Wireless and Communication Networks: Theory, Models, and Applications. Cambridge University Press, 2012.

[15] P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings et al., "Advances and open problems in federated learning," arXiv preprint arXiv:1912.04977, 2019.

Latif U. Khan is currently pursuing his Ph.D. degree in Computer Engineering at Kyung Hee University (KHU), South Korea. He received his M.S. (Electrical Engineering) degree with distinction from the University of Engineering and Technology, Peshawar, Pakistan, in 2017. His research interests include analytical techniques of optimization and game theory applied to edge computing and end-to-end network slicing.

Walid Saad (S'07, M'10, SM'15, F'19) received his Ph.D. degree from the University of Oslo in 2010. Currently, he is a professor in the Department of Electrical and Computer Engineering at Virginia Tech. His research interests include wireless networks, machine learning, game theory, cybersecurity, unmanned aerial vehicles, and cyber-physical systems. He is the author/co-author of eight conference best paper awards and of the 2015 IEEE ComSoc Fred W. Ellersick Prize.

Zhu Han (S'01, M'04, SM'09, F'14) received his Ph.D. degree in electrical and computer engineering from the University of Maryland, College Park. Currently, he is a professor in the Electrical and Computer Engineering Department as well as in the Computer Science Department at the University of Houston, Texas. Dr. Han has been an AAAS fellow since 2019 and a 1% highly cited researcher since 2017 according to Web of Science.

Choong Seon Hong (S'95, M'97, SM'11) is working as a professor with the Department of Computer Science and Engineering, Kyung Hee University. His research interests include the future Internet, ad hoc networks, network management, and network security. He is currently an Associate Editor of the IEEE Transactions on Network and Service Management, the International Journal of Network Management, and the Journal of Communications and Networks, and an Associate Technical Editor of the IEEE Communications Magazine.