2020.02.06 Workshop on Federated Learning and Analytics


Page 1: 2020.02.06 Workshop on Federated Learning and Analytics


Federated learning at Google: systems, algorithms, and applications. K. Bonawitz ([email protected]). Presenting the work of many.

Workshop on Federated Learning and Analytics (FL-IBM’20), 2020.02.06

Page 2: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

Federated learning is a machine learning setting where multiple entities (clients) collaborate in solving a machine learning problem, under the coordination of a central server or service provider. Each client's raw data is stored locally and not exchanged or transferred; instead, focused updates intended for immediate aggregation are used to achieve the learning objective.

Working definition proposed in Advances and Open Problems in Federated Learning (arXiv:1912.04977).

Page 3: 2020.02.06 Workshop on Federated Learning and Analytics

Why federated learning?

Page 4: 2020.02.06 Workshop on Federated Learning and Analytics

Data is born at the edge

Billions of phones & IoT devices constantly generate data

Data enables better products and smarter models

Page 5: 2020.02.06 Workshop on Federated Learning and Analytics

Data processing is moving on device:
● Improved latency
● Works offline
● Better battery life
● Privacy advantages

E.g., on-device inference for mobile keyboards and cameras.

Can data live at the edge?

Page 6: 2020.02.06 Workshop on Federated Learning and Analytics

Data processing is moving on device:
● Improved latency
● Works offline
● Better battery life
● Privacy advantages

E.g., on-device inference for mobile keyboards and cameras.

Can data live at the edge?

What about analytics?What about learning?

Page 7: 2020.02.06 Workshop on Federated Learning and Analytics

ML on Sensitive Data: Privacy versus Utility

[Chart: Privacy (vertical axis) versus Utility (horizontal axis)]

Page 8: 2020.02.06 Workshop on Federated Learning and Analytics

ML on Sensitive Data: Privacy versus Utility

[Chart: Privacy versus Utility, with a curve marking the perceived trade-off (“perception”)]

Page 9: 2020.02.06 Workshop on Federated Learning and Analytics

ML on Sensitive Data: Privacy versus Utility

[Chart: Privacy versus Utility, with the perceived trade-off curve and a marker for where we are today]

1. Policy
2. Technology

Page 10: 2020.02.06 Workshop on Federated Learning and Analytics

ML on Sensitive Data: Privacy versus Utility (?)

[Chart: Privacy versus Utility, with “today” on the current frontier and a “goal” beyond it]

1. Policy
2. New Technology

Push the Pareto frontier with better technology.

Make achieving high privacy and utility possible with less work.

Page 11: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: APPLICATIONS, INFRASTRUCTURE, and RESEARCH, linked by arrows]

→ New capabilities →
← Real-world grounding ←
→ Real-world problems →
← Novel solutions ←
→ Requirements →
← Practical solutions ←

Page 12: 2020.02.06 Workshop on Federated Learning and Analytics

Early days (2017) https://ai.googleblog.com/2017/04/federated-learning-collaborative.html

Page 13: 2020.02.06 Workshop on Federated Learning and Analytics

The nascent field of federated learning

Page 14: 2020.02.06 Workshop on Federated Learning and Analytics

The nascent field of federated learning

And workshops like this one...

Page 15: 2020.02.06 Workshop on Federated Learning and Analytics

Advances and Open Problems in FL

58 authors from 25 institutions

arxiv.org/abs/1912.04977

Page 16: 2020.02.06 Workshop on Federated Learning and Analytics

The nascent field of federated learning

● Cross-device vs Cross-silo

● Data partitioning: Horizontal, Vertical, other

Page 17: 2020.02.06 Workshop on Federated Learning and Analytics

Characteristics of the federated learning setting (I)

Setting
● Datacenter distributed learning: Training a model on a large but "flat" dataset. Clients are compute nodes in a single cluster or datacenter.
● Cross-silo federated learning: Training a model on siloed data. Clients are different organizations (e.g., medical or financial) or datacenters in different geographical regions.
● Cross-device federated learning: The clients are a very large number of mobile or IoT devices.

Data distribution
● Datacenter: Data is centrally stored, so it can be shuffled and balanced across clients. Any client can read any part of the dataset.
● Cross-silo and cross-device: Data is generated locally and remains decentralized. Each client stores its own data and cannot read the data of other clients. Data is not independently or identically distributed.

Orchestration
● Datacenter: Centrally orchestrated.
● Cross-silo and cross-device: A central orchestration server/service organizes the training, but never sees raw data.

Wide-area communication
● Datacenter: None (fully connected clients in one datacenter/cluster).
● Cross-silo and cross-device: Hub-and-spoke topology, with the hub representing a coordinating service provider (typically without data) and the spokes connecting to clients.

Data availability
● Datacenter and cross-silo: All clients are almost always available.
● Cross-device: Only a fraction of clients are available at any one time, often with diurnal and other variations.

Distribution scale
● Datacenter: Typically 1-1000 clients.
● Cross-silo: Typically 2-100 clients.
● Cross-device: Massively parallel, up to 10^10 clients.

Page 18: 2020.02.06 Workshop on Federated Learning and Analytics

Characteristics of the federated learning setting (II)

Addressability
● Datacenter and cross-silo: Each client has an identity or name that allows the system to access it specifically.
● Cross-device: Clients cannot be indexed directly (i.e., no use of client identifiers).

Client statefulness
● Datacenter and cross-silo: Stateful: each client may participate in each round of the computation, carrying state from round to round.
● Cross-device: Generally stateless: each client will likely participate only once in a task, so generally we assume a fresh sample of never-before-seen clients in each round of computation.

Primary bottleneck
● Datacenter: Computation is more often the bottleneck in the datacenter, where very fast networks can be assumed.
● Cross-silo: Might be computation or communication.
● Cross-device: Communication is often the primary bottleneck, though it depends on the task. Generally, federated computations use Wi-Fi or slower connections.

Reliability of clients
● Datacenter and cross-silo: Relatively few failures.
● Cross-device: Highly unreliable: 5% or more of the clients participating in a round of computation are expected to fail or drop out (e.g., because the device becomes ineligible when battery, network, or idleness requirements for training/computation are violated).

Data partition axis
● Datacenter: Data can be partitioned / re-partitioned arbitrarily across clients.
● Cross-silo: Partition is fixed. Could be example-partitioned (horizontal) or feature-partitioned (vertical).
● Cross-device: Fixed partitioning by example (horizontal).

Page 19: 2020.02.06 Workshop on Federated Learning and Analytics

ML Engineer's Workflow

Page 20: 2020.02.06 Workshop on Federated Learning and Analytics

Model engineer workflow

[Diagram: engineer, server, cloud data. Train & evaluate on cloud data.]

Page 21: 2020.02.06 Workshop on Federated Learning and Analytics

Model deployment workflow

[Diagram: engineer, server, clients. Final model validation steps.]

Page 22: 2020.02.06 Workshop on Federated Learning and Analytics

Model deployment workflow

[Diagram: engineer, server, clients. Deploy model to devices for on-device inference.]

Page 23: 2020.02.06 Workshop on Federated Learning and Analytics

Federated training

[Diagram: engineer, server, clients. Train & evaluate on decentralized data.]

Page 24: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: an opted-in device with local data asks the server: “Need me?”]

Page 25: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: a device asks “Need me?”; the server answers “Not now”]

Page 26: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: a device asks “Need me?”; the server answers “Yes!”]

Page 27: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: the engineer's initial model is sent to a device, which computes an updated model on its local data]

Page 28: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: the device computes an (ephemeral) updated model on its local data]

Page 29: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: initial model out, updated model reported back]

Privacy principle: Focused collection. Devices report only what is needed for this computation.

Page 30: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: initial model out, updated model reported back]

Privacy principle: Ephemeral reports. Server never persists per-device reports.

Page 31: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: updated models from many devices are aggregated into a combined model]

Page 32: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: updated models aggregated into a combined model]

Privacy principle: Only-in-aggregate. Engineer may only access combined device reports.

Page 33: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: the combined model becomes (another) initial model for the next round]

Page 34: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: the next round produces (another) combined model]

Page 35: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning

[Diagram: rounds of broadcasting (another) initial model and aggregating (another) combined model]

Typical orders of magnitude:
● 100-1000s of users per round
● 100-1000s of rounds to convergence
● 1-10 minutes per round
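To make the round structure concrete, here is a minimal, hypothetical sketch of one federated averaging round in NumPy; the names (client_update, server_round) and the single plain gradient step per client are illustrative assumptions, not the production algorithm.

import numpy as np

def client_update(model, data, lr=0.1):
    # One local gradient step on this client's data (linear model, MSE loss).
    x, y = data
    grad = 2 * x.T @ (x @ model - y) / len(y)
    return model - lr * grad

def server_round(model, client_datasets):
    # Broadcast the model, collect updates, keep only their average.
    updates = [client_update(model, d) for d in client_datasets]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20))
           for _ in range(5)]  # 100-1000s of users per round in practice
model = np.zeros(3)
for _ in range(10):  # 100-1000s of rounds in practice
    model = server_round(model, clients)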

Page 36: 2020.02.06 Workshop on Federated Learning and Analytics

Federated learning at Google

500M+ installs. Daily use by multiple teams. Powering features on Pixel devices and in Gboard and Android Messages.

Page 37: 2020.02.06 Workshop on Federated Learning and Analytics

Federated Learning on Pixel Phones

Page 38: 2020.02.06 Workshop on Federated Learning and Analytics

Gboard: next-word prediction

Federated RNN (compared to prior n-gram model):
● Better next-word prediction accuracy: +24%
● More useful prediction strip: +10% more clicks

[Chart: federated model compared to baseline]

A. Hard, et al. Federated Learning for Mobile Keyboard Prediction. arXiv:1811.03604.

Page 39: 2020.02.06 Workshop on Federated Learning and Analytics

Other federated models in Gboard

Emoji prediction
● 7% more accurate emoji predictions
● +4% more prediction strip clicks
● 11% more users share emojis!

Action prediction (when is it useful to suggest a gif, sticker, or search query?)
● 47% reduction in unhelpful suggestions
● increased overall emoji, gif, and sticker shares

Discovering new words
● Federated discovery of what words people are typing that Gboard doesn’t know.

T. Yang, et al. Applied Federated Learning: Improving Google Keyboard Query Suggestions. arXiv:1812.02903.
M. Chen, et al. Federated Learning of Out-of-Vocabulary Words. arXiv:1903.10635.
Ramaswamy, et al. Federated Learning for Emoji Prediction in a Mobile Keyboard. arXiv:1906.04329.

Page 40: 2020.02.06 Workshop on Federated Learning and Analytics

Privacy-in-depth

We advocate for building federated systems wherein the privacy properties degrade as gracefully as possible in cases where one technique or another fails to provide its intended privacy contribution.

Page 41: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: a federated learning round: initial model, engineer, updated model]

Privacy principle: Only-in-aggregate. Engineer may only access combined device reports.

Privacy principle: Ephemeral reports. Server never persists per-device reports.

Privacy principle: Focused collection. Devices report only what is needed for this computation.

Page 42: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: initial model, engineer]

Privacy principle: Only-in-aggregate. Engineer may only access combined device reports.

Wouldn't it be great if...

Page 43: 2020.02.06 Workshop on Federated Learning and Analytics

Secure Aggregation. Existing protocols either:
● transmit a lot of data, or
● fail when users drop out
(or both)

A novel protocol: K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, K. Seth. Practical Secure Aggregation for Privacy-Preserving Machine Learning. CCS 2017.

Page 44: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: Alice, Bob, Carol]

Random positive/negative pairs, aka antiparticles: devices cooperate to sample random pairs of 0-sum perturbation vectors. A matched pair sums to 0.

Page 45: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: Alice, Bob, and Carol each hold their sampled antiparticle masks]

Random positive/negative pairs, aka antiparticles: devices cooperate to sample random pairs of 0-sum perturbation vectors.

Page 46: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: Alice, Bob, and Carol add their antiparticles to their values before sending to the server]

Add antiparticles before sending to the server. Each contribution looks random on its own...

Page 47: 2020.02.06 Workshop on Federated Learning and Analytics

The antiparticles cancel when summing contributions.

[Diagram: the server sums the masked contributions from Alice, Bob, and Carol]

Each contribution looks random on its own... but paired antiparticles cancel out when summed.

Page 48: 2020.02.06 Workshop on Federated Learning and Analytics

Revealing the sum.

[Diagram: the ∑ of the masked contributions from Alice, Bob, and Carol equals the sum of their true values]

Each contribution looks random on its own... but paired antiparticles cancel out when summed.
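The cancellation property is easy to see in a toy sketch (illustrative only; the actual protocol cited on the next slide derives the pairwise masks from key agreement and adds recovery for dropped-out users, which this toy version omits):

import numpy as np

rng = np.random.default_rng(42)
users = ["Alice", "Bob", "Carol"]
values = {u: rng.normal(size=4) for u in users}  # each user's update

# Each pair (u, v) samples a shared random vector; u adds it and v
# subtracts it, so each matched pair of masks sums to 0.
masks = {u: np.zeros(4) for u in users}
for i, u in enumerate(users):
    for v in users[i + 1:]:
        pair_mask = rng.normal(size=4)
        masks[u] += pair_mask
        masks[v] -= pair_mask

# Each masked contribution looks random on its own...
masked = {u: values[u] + masks[u] for u in users}

# ...but the antiparticles cancel in the sum, revealing only the total.
assert np.allclose(sum(masked.values()), sum(values.values()))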

Page 49: 2020.02.06 Workshop on Federated Learning and Analytics

Secure Aggregation: Google aggregates users' updates, but cannot inspect the individual updates.

Communication efficient:
# Params: 2^20 = 1M, Bits/Param: 16, # Users: 2^10 = 1k, Expansion: 1.73x
# Params: 2^24 = 16M, Bits/Param: 16, # Users: 2^14 = 16k, Expansion: 1.98x

Secure: up to ⅓ malicious clients plus a fully observed server.

Robust: ⅓ of clients can drop out.

Interactive cryptographic protocol: in each phase, 1000 clients + server exchange messages over 4 rounds of communication.

K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, K. Seth. Practical Secure Aggregation for Privacy-Preserving Machine Learning. CCS 2017.

Page 50: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: devices send clipped, noised updates; the server computes the ∑]

1. Devices “clip” their updates, limiting any one user's contribution.
2. Server adds noise when combining updates.

DP-SGD plus Federated Averaging:
M. Abadi, et al. Deep Learning with Differential Privacy. CCS 2016.
H. B. McMahan, et al. Learning Differentially Private Recurrent Language Models. ICLR 2018.
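A toy sketch of these two steps (the clipping bound and noise scale are illustrative; real deployments calibrate the noise to a target (ε, δ) guarantee):

import numpy as np

CLIP_NORM = 1.0       # illustrative L2 bound on any one user's update
NOISE_STDDEV = 0.01   # illustrative; calibrated to (ε, δ) in practice

def clip_update(update):
    # Step 1: scale the update down so its L2 norm is at most CLIP_NORM.
    norm = np.linalg.norm(update)
    return update if norm <= CLIP_NORM else update * (CLIP_NORM / norm)

def dp_average(updates, rng):
    # Step 2: average the clipped updates, then add Gaussian noise.
    clipped = [clip_update(u) for u in updates]
    noise = rng.normal(scale=NOISE_STDDEV, size=clipped[0].shape)
    return np.mean(clipped, axis=0) + noise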

Page 51: 2020.02.06 Workshop on Federated Learning and Analytics

Secure Aggregation + DP → Distributed DP

[Diagram: masked, noised device updates are combined via secure aggregation]

Page 52: 2020.02.06 Workshop on Federated Learning and Analytics

Secure Aggregation + DP → Distributed DP

[Diagram: the same pipeline, annotated with the privacy spectrum from Local DP (lower privacy / higher ε) to Central DP (high privacy / small ε)]
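A sketch of why the combination moves along this spectrum (parameters illustrative): each device adds only a small share of the noise locally, and secure aggregation hides the individually under-noised updates, so the only value ever revealed, the sum, carries central-DP-level noise.

import numpy as np

N_CLIENTS = 100
TOTAL_STDDEV = 0.1  # noise wanted on the aggregate (central-DP level)

# Gaussian noise adds in variance, so each client contributes only
# total / sqrt(n) worth of noise.
per_client_stddev = TOTAL_STDDEV / np.sqrt(N_CLIENTS)

rng = np.random.default_rng(0)
updates = [rng.normal(size=3) for _ in range(N_CLIENTS)]
noised = [u + rng.normal(scale=per_client_stddev, size=3) for u in updates]

# Secure aggregation reveals only this sum, whose noise has stddev
# sqrt(n) * per_client_stddev = TOTAL_STDDEV.
aggregate = np.sum(noised, axis=0)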

Page 53: 2020.02.06 Workshop on Federated Learning and Analytics

ML Engineer's Workflow

Page 54: 2020.02.06 Workshop on Federated Learning and Analytics

Federated training

[Diagram: engineer, server, clients. Train & evaluate on decentralized data.]

Page 55: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: engineer, server, cloud data. Train & evaluate on cloud data.]

This is what we like to see ...

Page 56: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: engineer, server, cloud data. Train & evaluate on cloud data.]

… but sometimes we see this: Gahhh!

Page 57: 2020.02.06 Workshop on Federated Learning and Analytics

Typical ML Tasks requiring data inspection

Augenstein, et al. Generative Models for Effective ML on Private, Decentralized Datasets. arXiv, 2019.

Page 58: 2020.02.06 Workshop on Federated Learning and Analytics

Typical ML Tasks requiring representative examples

Augenstein, et al. Generative Models for Effective ML on Private, Decentralized Datasets. arXiv, 2019.

Page 59: 2020.02.06 Workshop on Federated Learning and Analytics

Example Federated GAN Problem

Page 60: 2020.02.06 Workshop on Federated Learning and Analytics

Example Federated GAN Problem

After an application update, the classification accuracy drops.

Page 61: 2020.02.06 Workshop on Federated Learning and Analytics

Example Federated GAN Problem

Train 2 GANs: one on the subset of data exhibiting high classification accuracy, and another on the subset exhibiting low classification accuracy.

Page 62: 2020.02.06 Workshop on Federated Learning and Analytics

Example Federated GAN Results

[Figure: examples of real data on devices in each sub-population, alongside GAN samples after 0 and after 1000 rounds]

Population: EMNIST dataset; 50% of devices have their images ‘flipped’ (black <-> white).
Sub-populations: devices where data classifies with ‘low’ accuracy; devices where data classifies with ‘high’ accuracy.

Page 63: 2020.02.06 Workshop on Federated Learning and Analytics

Example Federated GAN Results

[Figure: real data per sub-population, with GAN samples after 1 round and after 1000 rounds]

Population: EMNIST dataset; 50% of devices have their images ‘flipped’ (black <-> white).
Sub-populations: devices where data classifies with ‘low’ accuracy; devices where data classifies with ‘high’ accuracy.

Page 64: 2020.02.06 Workshop on Federated Learning and Analytics

Example Federated GAN Results

[Figure: the same comparison of real data per sub-population with GAN samples after 1 round and after 1000 rounds]

Page 65: 2020.02.06 Workshop on Federated Learning and Analytics

Example Federated GAN Results

[Figure: GAN samples after 1000 rounds next to real data from the sub-population]

Now the modeler can discern this difference... indicating that this (the flipped images) is the problem.

Page 66: 2020.02.06 Workshop on Federated Learning and Analytics

Development environment designed specifically for FL:
● Language that combines TF and communication (embedded in Python)
● Libraries of FL algorithms (tff.learning) expressed in this language
● Runtimes, datasets, examples, etc., for (simulation-based) experiments

Part of the TensorFlow ecosystem: tensorflow.org/federated

OSS project on GitHub: github.com/tensorflow/federated

Page 67: 2020.02.06 Workshop on Federated Learning and Analytics
Page 68: 2020.02.06 Workshop on Federated Learning and Analytics

Easy way to get started exploring the FL space on your own:
● Pseudocode-like style of programming, high-level and compact
● Reference implementations of core FL algorithms such as federated averaging that you can fork/modify
● Preprocessed datasets and some standard models (more coming, contribute your own)
● Modular and configurable simulation environment (Python notebooks)
● Repro of research (emerging), including simulation scripts, models, hyperparameters, to fork, modify, and experiment with

Page 69: 2020.02.06 Workshop on Federated Learning and Analytics

A way to leverage the latest FL research in your application:
● Designed from day 1 to facilitate deployment to physical devices
● Designed for smooth transition from simulations into production
○ Federated learning logic is expressed in a platform- and language-independent manner, so your code does not have to change during this transition
● Designed for composability and hackability
○ Explicit mechanisms for expressing FL code as reusable and stackable modules
○ Code structure that's easy to read and modify
● Actively used at Google, integrated with our production infrastructure
● Deployment options (emerging), interfaces and tools

Page 70: 2020.02.06 Workshop on Federated Learning and Analytics

* Embedded in Python.

Page 71: 2020.02.06 Workshop on Federated Learning and Analytics

Communication is an integral part of your application logic!
● Canned algorithms don't always work out of the box
○ You may have to try different algorithms
○ Your specific use case may call for engineering a custom communication pattern
● A given deployment scenario may call for additional ingredients
○ Compression, differential privacy, adaptive, stateful, multi-round algorithms, etc.

Existing tools offer inadequate communication abstractions:
● Point-to-point messaging, checkpoints, etc. are much too low-level
● Allreduce-like abstractions are not a good fit for mobile device deployment
● No first-class support from the type system, etc.

Page 72: 2020.02.06 Workshop on Federated Learning and Analytics

Portability between research and production is essential:
● Effective development may only be feasible on a live deployed system
○ E.g., by evaluating ideas by training and evaluating in “dry mode”
● Reduced friction for deploying new research algorithms in production
○ Plus ability to use the simulation framework to test production code

Consequences:
● For maximum portability, code should be platform/language-agnostic
● Program logic should be expressed declaratively to support:
○ Ability to compile to diverse platforms
○ Ability to statically analyze all code to verify that it has the properties we want

Page 73: 2020.02.06 Workshop on Federated Learning and Analytics
Page 74: 2020.02.06 Workshop on Federated Learning and Analytics

CLIENTS

Page 75: 2020.02.06 Workshop on Federated Learning and Analytics

CLIENTS

[Diagram: client devices holding the values 68.0, 70.5, 69.8, 70.1]

a local item of data of type float32 (e.g., a sensor reading or a model weight)

Page 76: 2020.02.06 Workshop on Federated Learning and Analytics

CLIENTS

[Diagram: the values 68.0, 70.5, 69.8, 70.1 across client devices]

a local item of data of type float32 (e.g., a sensor reading or a model weight); together, a “federated value” (a multi-set)

Page 77: 2020.02.06 Workshop on Federated Learning and Analytics

CLIENTS

[Diagram: the values 68.0, 70.5, 69.8, 70.1 across client devices]

a local item of data of type float32 (e.g., a sensor reading or a model weight); together, a “federated value” (a multi-set), which has type {float32}@CLIENTS

Page 78: 2020.02.06 Workshop on Federated Learning and Analytics

CLIENTS

[Diagram: the values 68.0, 70.5, 69.8, 70.1 across client devices]

a local item of data of type float32 (e.g., a sensor reading or a model weight); together, a “federated value” (a multi-set), which has type {float32}@CLIENTS, where float32 is the type of local items on each client and @CLIENTS is the “placement”

Page 79: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: the client values 68.0, 70.5, 69.8, 70.1, and a SERVER]

a “federated value” (a multi-set) of type {float32}@CLIENTS: float32 is the type of local items on each client; @CLIENTS is the “placement”

Page 80: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: the client values 68.0, 70.5, 69.8, 70.1 of type {float32}@CLIENTS flow toward the SERVER, which holds “?”]

Page 81: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: {float32}@CLIENTS on the clients; a yet-unknown value of type float32@SERVER on the server]

Page 82: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: distributed aggregation turns 68.0, 70.5, 69.8, 70.1 ({float32}@CLIENTS) into 69.5 (float32@SERVER)]

Page 83: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: distributed aggregation from {float32}@CLIENTS to 69.5 as float32@SERVER]

A federated “op” can be interpreted as a function, even though its inputs and outputs are in different places.

Page 84: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: distributed aggregation from the client values to 69.5 on the SERVER]

A federated “op” can be interpreted as a function, even though its inputs and outputs are in different places:
{float32}@CLIENTS → float32@SERVER

Page 85: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: tff.federated_mean turns the client values into 69.5 on the SERVER]

tff.federated_mean can be interpreted as a function even though its inputs and outputs are in different places, {float32}@CLIENTS → float32@SERVER; it represents an abstract specification of a distributed communication protocol.

Page 86: 2020.02.06 Workshop on Federated Learning and Analytics

READINGS_TYPE = tff.FederatedType(tf.float32, tff.CLIENTS)

# An abstract specification of a simple distributed system
@tff.federated_computation(READINGS_TYPE)
def get_average_temperature(sensor_readings):
  return tff.federated_mean(sensor_readings)


Page 89: 2020.02.06 Workshop on Federated Learning and Analytics

@tff.federated_computation(READINGS_TYPE)
def get_average_temperature(sensor_readings):
  return tff.federated_mean(sensor_readings)

What does “get_average_temperature” represent?
● The body of the Python function was traced once, disposed of, and replaced by a serialized abstract representation in TFF's language: an instance of TFF's computation.proto (no longer Python code).
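For example (a small sketch, assuming the definitions above have been run in a session with TensorFlow Federated imported as tff), the serialized computation still carries its federated type signature:

print(get_average_temperature.type_signature)
# ({float32}@CLIENTS -> float32@SERVER)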

Page 90: 2020.02.06 Workshop on Federated Learning and Analytics
Page 91: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: temperature sensor readings (an input) on the clients; a temperature threshold (an input) on the server; output: % of sensor readings > threshold]

Page 92: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: the temperature threshold (an input) is sent to the clients via federated broadcast; output: % of sensor readings > threshold]

Page 93: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: after the federated broadcast, a federated map applies “> threshold, to_float” on each client, yielding 1, 0, 1, ...; output: % of sensor readings > threshold]

Page 94: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: federated broadcast of the threshold; federated map of “> threshold, to_float” on each client (1, 0, 1, ...); federated mean of the results; output: % of sensor readings > threshold]

Page 95: 2020.02.06 Workshop on Federated Learning and Analytics

@tff.federated_computation
def get_fraction_over_threshold(readings, threshold):
  return ...

(readings is the client-side input; threshold is the server-side input)

Page 96: 2020.02.06 Workshop on Federated Learning and Analytics

@tff.federated_computation
def get_fraction_over_threshold(readings, threshold):
  return tff.federated_mean(
      tff.federated_map(
          exceeds_threshold_fn,
          [readings, tff.federated_broadcast(threshold)]))

(collective operations and communication)

Page 97: 2020.02.06 Workshop on Federated Learning and Analytics

@tff.tf_computation
def exceeds_threshold_fn(reading, threshold):
  return tf.to_float(reading > threshold)

@tff.federated_computation
def get_fraction_over_threshold(readings, threshold):
  return tff.federated_mean(
      tff.federated_map(
          exceeds_threshold_fn,
          [readings, tff.federated_broadcast(threshold)]))

(local on-device processing)

Page 98: 2020.02.06 Workshop on Federated Learning and Analytics

READINGS_TYPE = tff.FederatedType(tf.float32, tff.CLIENTS)
THRESHOLD_TYPE = tff.FederatedType(tf.float32, tff.SERVER)

@tff.tf_computation(tf.float32, tf.float32)
def exceeds_threshold_fn(reading, threshold):
  return tf.to_float(reading > threshold)

@tff.federated_computation(READINGS_TYPE, THRESHOLD_TYPE)
def get_fraction_over_threshold(readings, threshold):
  return tff.federated_mean(
      tff.federated_map(
          exceeds_threshold_fn,
          [readings, tff.federated_broadcast(threshold)]))

(types)
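Invoked in the default simulation runtime, the computation then behaves like an ordinary function (a sketch with illustrative readings; three of the four exceed the threshold):

get_fraction_over_threshold([68.0, 70.5, 69.8, 70.1], 69.0)
# 0.75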

Page 99: 2020.02.06 Workshop on Federated Learning and Analytics
Page 100: 2020.02.06 Workshop on Federated Learning and Analytics
Page 101: 2020.02.06 Workshop on Federated Learning and Analytics

@tff.federated_computation(
    SERVER_MODEL_TYPE, SERVER_FLOAT_TYPE, CLIENT_DATA_TYPE)
def federated_train(model, learning_rate, data):
  return ...

(model: the initial model on the server; learning_rate: a server-supplied learning rate; data: on-device data)

Page 102: 2020.02.06 Workshop on Federated Learning and Analytics

@tff.federated_computation(
    SERVER_MODEL_TYPE, SERVER_FLOAT_TYPE, CLIENT_DATA_TYPE)
def federated_train(model, learning_rate, data):
  return tff.federated_mean(
      tff.federated_map(
          local_train,
          [tff.federated_broadcast(model),
           tff.federated_broadcast(learning_rate),
           data]))

(server-to-client communication)

Page 103: 2020.02.06 Workshop on Federated Learning and Analytics

(the same computation as on Page 102)

Everything needed for local training is now on the clients.

Page 104: 2020.02.06 Workshop on Federated Learning and Analytics

(the same computation as on Page 102)

Clients train locally: local_train is another computation.

Page 105: 2020.02.06 Workshop on Federated Learning and Analytics

(the same computation as on Page 102)

Averaging the locally trained models; a sketch of local_train follows.
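The slides do not show local_train itself; in the spirit of the TFF custom-algorithms tutorials, a minimal sketch might fold a per-batch training step over the client's sequence of batches (MODEL_TYPE, BATCH_TYPE, and batch_train are assumed to be defined elsewhere; all names here are illustrative):

LOCAL_DATA_TYPE = tff.SequenceType(BATCH_TYPE)

@tff.federated_computation(MODEL_TYPE, tf.float32, LOCAL_DATA_TYPE)
def local_train(initial_model, learning_rate, all_batches):

  @tff.federated_computation(MODEL_TYPE, BATCH_TYPE)
  def batch_fn(model, batch):
    # One local training step on a single batch.
    return batch_train(model, batch, learning_rate)

  # Fold the per-batch step over this client's batches.
  return tff.sequence_reduce(all_batches, initial_model, batch_fn)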

Page 106: 2020.02.06 Workshop on Federated Learning and Analytics
Page 107: 2020.02.06 Workshop on Federated Learning and Analytics

How to inject compression when broadcasting data:

tff.federated_map(decode,
    tff.federated_broadcast(
        tff.federated_apply(encode, x)))

How to inject differential privacy when aggregating:

tff.federated_mean(
    tff.federated_map(y → clip(y) + noise, x))

NOTE: Showing a lambda expression here in a simplified form for the sake of clarity; you would define a TFF computation (it's shown in the tutorial).

[Diagram: the initial model flows down to clients; locally trained models flow back up]
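Spelled out, the simplified lambda might become a proper TFF computation along these lines (a sketch; the clipping bound and the noise scale are illustrative):

@tff.tf_computation(tf.float32)
def clip_and_noise(y):
  # Clip to an illustrative bound, then add Gaussian noise.
  clipped = tf.clip_by_value(y, -1.0, 1.0)
  return clipped + tf.random.normal([], stddev=0.1)

result = tff.federated_mean(tff.federated_map(clip_and_noise, x))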

Page 108: 2020.02.06 Workshop on Federated Learning and Analytics
Page 109: 2020.02.06 Workshop on Federated Learning and Analytics

Calling a TFF computation like a Python function:
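For example (a sketch reusing the temperature computation from earlier; the readings are illustrative):

get_average_temperature([68.0, 70.5, 69.8, 70.1])
# 69.6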

Page 110: 2020.02.06 Workshop on Federated Learning and Analytics
Page 111: 2020.02.06 Workshop on Federated Learning and Analytics
Page 112: 2020.02.06 Workshop on Federated Learning and Analytics

train_data, test_data = tff.simulation.datasets.emnist.load_data()

(create an object that represents training and test data)

Page 113: 2020.02.06 Workshop on Federated Learning and Analytics

train_data, test_data = tff.simulation.datasets.emnist.load_data()
all_clients = train_data.client_ids

(obtain the list of client ids; only accessible to the code that sets up the experiment loop in Python)

Page 114: 2020.02.06 Workshop on Federated Learning and Analytics

train_data, test_data = tff.simulation.datasets.emnist.load_data()
all_clients = train_data.client_ids
... = train_data.create_tf_dataset_for_client(...)

(construct an eager tf.data.Dataset for a given client)

Page 115: 2020.02.06 Workshop on Federated Learning and Analytics

train_data, test_data = tff.simulation.datasets.emnist.load_data()
all_clients = train_data.client_ids
... = train_data.create_tf_dataset_for_client(...)

for round_num in range(5):
  clients_selected_in_this_round = random.sample(all_clients, 10)

(in each round, simulate client selection, in Python)

Page 116: 2020.02.06 Workshop on Federated Learning and Analytics

train_data, test_data = tff.simulation.datasets.emnist.load_data()
all_clients = train_data.client_ids
... = train_data.create_tf_dataset_for_client(...)

for round_num in range(5):
  clients_selected_in_this_round = random.sample(all_clients, 10)
  federated_train_data = [
      train_data.create_tf_dataset_for_client(c).repeat(10)
      for c in clients_selected_in_this_round]
  # Run the computation...

(construct and post-process eager tf.data.Datasets for these clients)

Page 117: 2020.02.06 Workshop on Federated Learning and Analytics
Page 118: 2020.02.06 Workshop on Federated Learning and Analytics

tff.learning

Page 119: 2020.02.06 Workshop on Federated Learning and Analytics

tff.learning

model_fn = lambda: tff.learning.from_keras_model( … )

(absorb an existing Keras model for use in TFF)

Page 120: 2020.02.06 Workshop on Federated Learning and Analytics

tff.learning

model_fn = lambda: tff.learning.from_keras_model( … )

train = tff.learning.build_federated_averaging_process(model_fn)
eval = tff.learning.build_federated_evaluation(model_fn)

(TFF computations for training and evaluation)

Page 121: 2020.02.06 Workshop on Federated Learning and Analytics

tff.learning

model_fn = lambda: tff.learning.from_keras_model( … )

train = tff.learning.build_federated_averaging_process(model_fn)
eval = tff.learning.build_federated_evaluation(model_fn)

state = train.initialize()

(create server state for the first round)

Page 122: 2020.02.06 Workshop on Federated Learning and Analytics

tff.learning

model_fn = lambda: tff.learning.from_keras_model( … )

train = tff.learning.build_federated_averaging_process(model_fn)
eval = tff.learning.build_federated_evaluation(model_fn)

state = train.initialize()
for _ in range(5):
  client_data = …

(loop over rounds, pick a slice of client data in each, as shown a few slides ago)

Page 123: 2020.02.06 Workshop on Federated Learning and Analytics

tff.learning

model_fn = lambda: tff.learning.from_keras_model( … )

train = tff.learning.build_federated_averaging_process(model_fn)
eval = tff.learning.build_federated_evaluation(model_fn)

state = train.initialize()
for _ in range(5):
  client_data = …
  state, metrics = train.next(state, client_data)

(run a single round of training, producing new server state and metrics)

Page 124: 2020.02.06 Workshop on Federated Learning and Analytics

tff.learning

model_fn = lambda: tff.learning.from_keras_model( … )

train = tff.learning.build_federated_averaging_process(model_fn)
eval = tff.learning.build_federated_evaluation(model_fn)

state = train.initialize()
for _ in range(5):
  client_data = …
  state, metrics = train.next(state, client_data)

metrics = eval(state.model, ...)

(extract the trained model and evaluate it)

Page 125: 2020.02.06 Workshop on Federated Learning and Analytics
Page 126: 2020.02.06 Workshop on Federated Learning and Analytics

Modular framework for runtimes (in tff.framework):
● Provided single-machine multi-threaded executor (shown in tutorials)

tff.framework.set_default_executor(
    tff.framework.create_local_executor())

● More ready-to-use setups emerging
○ GCP/GKE
● Can set up custom executor stacks from building blocks
○ Multi-machine, multi-tier, GPU-enabled, etc.
● Can contribute executor components to the framework
○ Abstract interface tff.framework.Executor
○ Alternatively, the gRPC variant of this interface

Page 127: 2020.02.06 Workshop on Federated Learning and Analytics
Page 128: 2020.02.06 Workshop on Federated Learning and Analytics

Two kinds of approaches viable today:
● Plug devices as components into TFF's simulation runtime framework
○ e.g., as custom executors, via gRPC
● Use the (emerging) compiler toolset to generate executable artifacts
○ e.g., see tff.backends.mapreduce

More deployment options on the way!

Page 129: 2020.02.06 Workshop on Federated Learning and Analytics
Page 130: 2020.02.06 Workshop on Federated Learning and Analytics

All that you've seen is open-source, available on GitHub:
● github.com/tensorflow/federated
● tensorflow.org/federated

Many ways to contribute to the emerging TFF ecosystem:
● Apply the tff.learning API to existing ML models and data
● Develop new federated algorithms using TFF abstractions
● Help evolve core abstractions to make TFF more expressive
● Help improve usability and evolve libraries built around TFF
● Integrate with new backends to expand deployment options

Page 131: 2020.02.06 Workshop on Federated Learning and Analytics

[Diagram: federated training across clients, server, engineer, and admin, followed by model deployment to devices]

What can the device see? What can the network see? What can the server see? What can the server admin see? What can the engineer see? What can the world see?

Improving privacy:

Privacy principles: Focused collection · Minimize data exposure · Anonymous / ephemeral collection · Only-in-aggregate release · No memorization of individuals' data

Technologies: Federated Learning · Federated Analytics · Secure Aggregation · Private Retrieval · Differential Privacy · Encryption at rest and on the wire · Limited retention time · Computing on encrypted values