
Knockoff Nets: Stealing Functionality of Black-Box Models

Tribhuvanesh Orekondy 1   Bernt Schiele 1   Mario Fritz 2

1 Max Planck Institute for Informatics, Saarland Informatics Campus
2 CISPA Helmholtz Center i.G., Saarland Informatics Campus

Abstract

Machine Learning (ML) models are increasingly deployed in the wild to perform a wide range of tasks. In this work, we ask to what extent can an adversary steal functionality of such “victim” models based solely on blackbox interactions: image in, predictions out. In contrast to prior work, we present an adversary lacking knowledge of train/test data used by the model, its internals, and semantics over model outputs. We formulate model functionality stealing as a two-step approach: (i) querying a set of input images to the blackbox model to obtain predictions; and (ii) training a “knockoff” with queried image-prediction pairs. We make multiple remarkable observations: (a) querying random images from a different distribution than that of the blackbox training data results in a well-performing knockoff; (b) this is possible even when the knockoff is represented using a different architecture; and (c) our reinforcement learning approach additionally improves query sample efficiency in certain settings and provides performance gains. We validate model functionality stealing on a range of datasets and tasks, as well as on a popular image analysis API where we create a reasonable knockoff for as little as $30.

1. Introduction

Machine Learning (ML) models and especially deep neural networks are deployed to improve productivity or experience e.g., photo assistants in smartphones, image recognition APIs in cloud-based internet services, and for navigation and control in autonomous vehicles. Developing and engineering such models for commercial use is a product of intense time, money, and human effort – ranging from collecting a massive annotated dataset to tuning the right model for the task. The details of the dataset, exact model architecture, and hyperparameters are naturally kept confidential to protect the models’ value. However, in order to be monetized or simply serve a purpose, they are deployed in various applications (e.g., home assistants) to function as blackboxes: input in, predictions out.


Figure 1: An adversary can create a “knockoff” of a blackbox model solely by interacting with its API: image in, prediction out. The knockoff bypasses the monetary costs and intellectual effort involved in creating the blackbox model.

Large-scale deployment of deep learning models in the wild has motivated the community to ask: can someone abuse the model solely based on blackbox access? There has been a series of “inference attacks” [10, 27, 38, 40] which try to infer properties (e.g., training data [40], architecture [27]) of the model within the blackbox. In this work, we focus on model functionality stealing: can one create a “knockoff” of the blackbox model solely based on observed input-output pairs? In contrast to previous works [27, 32, 45], we work with minimal assumptions on the blackbox and intend to purely steal its functionality.

We formulate model functionality stealing as follows (shown in Figure 1). The adversary interacts with a blackbox “victim” CNN by providing it input images and obtaining respective predictions. The resulting image-prediction pairs are used to train a “knockoff” model. The adversary’s intention is for the knockoff to compete with the victim model at the victim’s task. Note that knowledge transfer [5, 15] approaches are a special case within our formulation, where the task, train/test data, and white-box teacher (victim) model are known to the adversary.

Within this formulation, we spell out the questions answered in our paper with an end-goal of model functionality stealing:

1. Can we train a knockoff on a random set of query images and corresponding blackbox predictions?

2. What makes for a good set of images to query?



3. How can we improve the sample efficiency of queries?

4. What makes for a good knockoff architecture?

2. Related Work

Privacy, Security and Computer Vision. Privacy has been largely addressed within the computer vision community by proposing models [28, 30, 31, 42, 49, 50] which recognize and control privacy-sensitive information in visual content. The community has also recently studied security concerns entailing real-world usage of models e.g., adversarial perturbations [2, 20, 25, 26, 29, 34] in black- and white-box attack scenarios. In this work, we focus on functionality stealing of CNNs in a blackbox attack scenario.

Model Stealing. Stealing various attributes of a blackbox ML model has been recently gaining popularity: parameters [45], hyperparameters [48], architecture [27], information on training data [40] and decision boundaries [32]. These works lay the groundwork to precisely reproduce the blackbox model. In contrast, we investigate stealing functionality of the blackbox independent of its internals. Although two works [32, 45] are related to our task, they make relatively stronger assumptions (e.g., the model family is known, the victim’s data is partly available). In contrast, we present a weaker adversary.

Knowledge Distillation. Distillation [15] and related approaches [5, 6, 11, 51] transfer the knowledge from a complex “teacher” to a simpler “student” model. Within our problem formulation, this is a special case where the adversary has strong knowledge of the victim’s blackbox model e.g., its architecture and train/test data are known. Although we discuss this case, the majority of the paper makes weak assumptions about the blackbox.

Active Learning. Active Learning [7, 44] (AL) aims to reduce labeling effort while gathering data to train a model. Ours is a special case of pool-based AL [39], where the learner (adversary) chooses from a pool of unlabeled data. However, unlike AL, the learner’s image pool in our case is chosen without any knowledge of the data used by the original model. Moreover, while AL considers the image to be annotated by a human expert, ours is annotated with pseudo-labels by the blackbox.

3. Problem Statement

We now formalize the task of functionality stealing (see also Figure 2).

Functionality Stealing. In this paper, we introduce the task as: given blackbox query access to a “victim” model FV : X → Y, replicate its functionality using a “knockoff” model FA of the adversary. As shown in Figure 2, we set it up as a two-player game between a victim V and an adversary A. We now discuss the assumptions under which the players operate and their corresponding moves in this game.

Figure 2: Problem Statement. Laying out the task of model functionality stealing in the view of two players - victim V and adversary A. The victim selects images, annotates them, selects and trains a model, and deploys the blackbox FV; the adversary selects images within a budget of B queries to build a transfer set, selects an architecture, and trains the knockoff FA. We group the adversary’s moves into (a) Transfer Set Construction and (b) Training Knockoff FA. The figure asks: can A steal the functionality of FV (1) when PV and FV are unknown, and (2) using a minimum number of queries B?

Victim’s Move. The victim’s end-goal is to deploy a trained CNN model FV in the wild for a particular task (e.g., fine-grained bird classification). To train this particular model, the victim: (i) collects task-specific images x ∼ PV(X) and obtains expert annotations resulting in a dataset DV = {(x_i, y_i)}; (ii) selects the model FV that achieves the best performance (accuracy) on a held-out test set of images D^test_V. The resulting model is deployed as a blackbox which predicts output probabilities y = FV(x) given an image x. Furthermore, we assume each prediction incurs a cost (e.g., monetary, latency).

Adversary’s Unknowns. The adversary is presented with a blackbox CNN image classifier, which given any image x ∈ X returns a K-dim posterior probability vector y ∈ [0, 1]^K with Σ_k y_k = 1. We relax this later by considering truncated versions of y. We assume the remaining aspects to be unknown: (i) the internals of FV e.g., hyperparameters or architecture; (ii) the data used to train and evaluate the model; and (iii) semantics over the K classes.

Adversary’s Attack. To train a knockoff, the adversary: (i) interactively queries images {x_i ∼ PA(X)} using a strategy π to obtain a “transfer set” of images and pseudo-labels {(x_i, FV(x_i))}, i = 1, …, B; and (ii) selects an architecture FA for the knockoff and trains it to mimic the behaviour of FV on the transfer set.
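To make the two moves concrete, a minimal sketch of the attack loop is shown below; it is illustrative only, and the helper names (sample_images, query_blackbox, train_knockoff) and the default budget are hypothetical placeholders, not part of the paper.

```python
# Minimal sketch of the adversary's two-step attack (names are hypothetical).
def construct_transfer_set(sample_images, query_blackbox, budget):
    """sample_images(n): draw n images x ~ P_A(X); query_blackbox(x): K-dim posterior F_V(x)."""
    transfer_set = []
    for x in sample_images(budget):
        y = query_blackbox(x)            # image in, prediction out
        transfer_set.append((x, y))      # (image, pseudo-label) pair
    return transfer_set

def run_attack(sample_images, query_blackbox, train_knockoff, budget=60_000):
    transfer_set = construct_transfer_set(sample_images, query_blackbox, budget)
    return train_knockoff(transfer_set)  # Section 4.2: fit F_A to imitate F_V
```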

Objective. We focus on the adversary, whose primary objective is training a knockoff that performs well on the task for which FV was designed i.e., on an unknown D^test_V. In addition, we address two secondary objectives: (i) sample-efficiency: maximizing performance within a budget of B blackbox queries; and (ii) understanding what makes for good images to query the blackbox.

Victim’s Defense. Although we primarily address the adversary’s strategy in the paper, we briefly discuss the victim’s counter-strategies (in Section 6) of reducing the informativeness of predictions by truncation e.g., rounding-off.



Figure 3: Comparison to KD. (a) The adversary has access only to the image distribution PA(X). (b) Training in a KD manner requires stronger knowledge of the victim. Both S and FA are trained to classify images x ∈ PV(X).

Remarks: Comparison to Knowledge Distillation (KD). Training the knockoff model is reminiscent of KD approaches [15, 36], whose goal is to transfer the knowledge from a larger teacher network T (white-box) to a compact student network S (knockoff) via the transfer set. We illustrate key differences between KD and our setting in Figure 3: (a) Independent distribution PA: FA is trained on images x ∼ PA(X) independent of the distribution PV used for training FV; (b) Data for supervision: the student network S minimizes variants of the KD loss:

L_KD = λ1 · L_CE(y_true, y_S) + λ2 · L_CE(y_S^τ, y_T^τ)

where y_T^τ = softmax(a_T / τ) is the softened posterior distribution of the logits a_T, controlled by a temperature τ. In contrast, the knockoff (student) in our case lacks the logits a_T and the true labels y_true to supervise training.
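For reference, a PyTorch-style sketch of this KD loss is given below; it assumes access to the teacher logits a_T and the ground-truth labels y_true, precisely the quantities our adversary lacks. Function names and default values are illustrative.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, true_labels, tau=4.0, lam1=0.5, lam2=0.5):
    """KD objective: CE on hard labels plus CE between temperature-softened
    student/teacher posteriors. Requires teacher logits (unavailable in our setting)."""
    hard_term = F.cross_entropy(student_logits, true_labels)
    soft_student = F.log_softmax(student_logits / tau, dim=1)
    soft_teacher = F.softmax(teacher_logits / tau, dim=1)
    soft_term = -(soft_teacher * soft_student).sum(dim=1).mean()
    return lam1 * hard_term + lam2 * soft_term
```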

4. Generating Knockoffs

In this section, we elaborate on the adversary’s approach in two steps: transfer set construction (Section 4.1) and training the knockoff FA (Section 4.2).

4.1. Transfer Set Construction

The goal is to obtain a transfer set i.e., image-prediction pairs, on which the knockoff will be trained to imitate the victim’s blackbox model FV.

Selecting PA(X). The adversary first selects an image distribution from which to sample images. We consider this to be a large discrete set of images. For instance, one of the distributions PA we consider is the 1.2M images of the ILSVRC dataset [8].

Sampling Strategy π. Once the image distribution PA(X) is chosen, the adversary samples images x ∼ PA(X) using a strategy π. We consider two strategies.


Figure 4: Strategy: adaptive. (a) Approach overview: the policy samples images to query the blackbox, a reward signal is computed, and the policy is updated. (b) Sampling an action as a path through the coarse-to-fine label hierarchy.

4.1.1 Random Strategy

In this strategy, we randomly sample images (without replacement) x ∼ PA(X) to query FV. This is an extreme case where the adversary performs pure exploration. However, there is a risk that the adversary samples images irrelevant to learning the task (e.g., over-querying dog images to a birds classifier).
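A minimal sketch of the random strategy follows, assuming the adversary’s distribution PA is given as a plain list (pool) of images; the helper names are illustrative.

```python
import random

def random_transfer_set(pool, query_blackbox, budget, seed=0):
    """Sample `budget` images without replacement from the adversary's pool
    and label each with the victim's posterior prediction."""
    rng = random.Random(seed)
    queries = rng.sample(pool, budget)             # without replacement
    return [(x, query_blackbox(x)) for x in queries]
```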

4.1.2 Adaptive Strategy

We now incorporate a feedback signal resulting from each image queried to the blackbox. A policy π is learnt:

x_t ∼ P_π( {(x_i, y_i)}_{i=1}^{t−1} )

to achieve two goals: (i) improving the sample-efficiency of queries; and (ii) aiding interpretability of the blackbox FV. The approach is outlined in Figure 4a. At each time-step t, the policy module P_π produces a sample of images to query the blackbox. A reward signal r_t is shaped based on multiple criteria and is used to update the policy with an end-goal of maximizing the expected reward.

Supplementing PA. To encourage relevant queries, we enrich images in the adversary’s distribution by associating each image x_i with a label z_i ∈ Z. No semantic relation of these labels with the blackbox’s output classes is assumed or exploited. As an example, when PA corresponds to the 1.2M images of the ILSVRC [8] dataset, we use labels defined over 1000 classes. These labels can alternatively be obtained by unsupervised measures e.g., clustering or estimating graph-density [3, 9]. We find that using labels aids understanding of the blackbox functionality. Furthermore, since we expect the labels {z_i ∈ Z} to be correlated or inter-dependent, we represent them within a coarse-to-fine hierarchy, as nodes of a tree as shown in Figure 4b.

Actions. At each time-step t, we sample actions from a discrete action space z_t ∈ Z i.e., the adversary’s independent label space. Drawing an action is a forward-pass (denoted by a blue line in Figure 4b) through the tree.


At each level, we sample a node with probability π_t(z). The probabilities are determined by a softmax distribution over the node potentials:

π_t(z) = exp(H_t(z)) / Σ_z' exp(H_t(z'))

Upon reaching a leaf-node, a sample of images is returned corresponding to the label z_t.

Learning the Policy. We use the received reward r_t for an action z_t to update the policy π using the gradient bandit algorithm [43]. This update is equivalent to a backward-pass through the tree (denoted by a green line in Figure 4b), where the node potentials are updated as:

H_{t+1}(z_t) = H_t(z_t) + α (r_t − r̄_t)(1 − π_t(z_t))   and
H_{t+1}(z') = H_t(z') − α (r_t − r̄_t) π_t(z')   for all z' ≠ z_t

where α = 1/N(z) is the learning rate, N(z) is the number of times action z has been drawn, and r̄_t is the mean reward over the past ∆ time-steps.

Rewards. To evaluate the quality of sampled images x_t, we study three rewards. We use a margin-based certainty measure [19, 39] to encourage images where the victim is confident (hence indicating the domain FV was trained on):

R_cert(y_t) = P(y_{t,k1} | x_t) − P(y_{t,k2} | x_t)   (“cert”)

where k1 and k2 denote the two most-confident classes. To prevent the degenerate case of image exploitation over a single label, we introduce a diversity reward:

R_div(y_{1:t}) = Σ_k max(0, y_{t,k} − y_{t−∆,k})   (“div”)

To encourage images where the knockoff prediction ŷ_t = FA(x_t) does not imitate FV, we reward high loss:

R_L(y_t, ŷ_t) = L(y_t, ŷ_t)   (“L”)

We sum up the individual rewards when multiple measures are used. To maintain equal weighting, each reward is individually rescaled to [0, 1] and a baseline computed over the past ∆ time-steps is subtracted.
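The sketch below illustrates the adaptive strategy with a flat (non-hierarchical) action space and one image drawn per step; the coarse-to-fine tree, the per-reward rescaling to [0, 1], and all names (pool_by_label, reward_fn) are simplifications or hypothetical placeholders, so this is an approximation of the approach rather than the exact procedure.

```python
import numpy as np

def certainty_reward(y):
    """Margin-based certainty: difference between the two largest probabilities."""
    top2 = np.sort(np.asarray(y))[-2:]
    return float(top2[1] - top2[0])

def adaptive_transfer_set(pool_by_label, query_blackbox, reward_fn, budget, delta=25):
    """Gradient-bandit sampler over the adversary's labels z (flat action space).
    pool_by_label: dict mapping each label z to a list of candidate images.
    reward_fn(y):  scalar reward for a victim posterior y, e.g. certainty_reward."""
    labels = list(pool_by_label)
    H = np.zeros(len(labels))                      # action potentials H_t(z)
    counts = np.zeros(len(labels))                 # N(z), for the 1/N(z) learning rate
    rewards, transfer_set = [], []

    for t in range(budget):
        pi = np.exp(H - H.max())
        pi /= pi.sum()                             # pi_t(z) = softmax over potentials
        a = np.random.choice(len(labels), p=pi)    # draw action z_t
        counts[a] += 1

        candidates = pool_by_label[labels[a]]
        x = candidates[np.random.randint(len(candidates))]  # an image labelled z_t
        y = query_blackbox(x)                      # victim posterior F_V(x)
        transfer_set.append((x, y))

        r = reward_fn(y)
        rewards.append(r)
        r_bar = float(np.mean(rewards[-delta:]))   # baseline: mean reward over recent steps
        alpha = 1.0 / counts[a]                    # alpha = 1 / N(z)
        H -= alpha * (r - r_bar) * pi              # all z': H_t(z') - alpha (r - r_bar) pi_t(z')
        H[a] += alpha * (r - r_bar)                # chosen z_t: net + alpha (r - r_bar)(1 - pi_t(z_t))
    return transfer_set
```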

4.2. Training Knockoff FA

As a product of the previous step of interactively querying the blackbox model, we have a transfer set {(x_t, FV(x_t))}, t = 1, …, B, with x_t ∼ PA(X) drawn using strategy π. Now we address how this is used to train a knockoff FA.

Selecting Architecture FA. Few works [27, 48] have recently explored reverse-engineering the blackbox i.e., identifying the architecture, hyperparameters, etc. We however argue this is orthogonal to our requirement of simply stealing the functionality. Instead, we represent FA with a reasonably complex architecture e.g., VGG [41] or ResNet [14]. Existing findings in KD [11, 15] and model compression [5, 13, 17] indicate robustness to the choice of reasonably complex student models. We investigate this choice under weaker knowledge of the teacher (FV) e.g., when the training data and architecture are unknown.

Blackbox (FV)      |D^train_V| + |D^test_V|   Output classes K
Caltech256 [12]    23.3k + 6.4k               256 general object categories
CUBS200 [47]       6k + 5.8k                  200 bird species
Indoor67 [35]      14.3k + 1.3k               67 indoor scenes
Diabetic5 [1]      34.1k + 1k                 5 diabetic retinopathy scales

Table 1: Four victim blackboxes FV. Each blackbox is named in the format: [dataset][# output classes].

Training to Imitate. To bootstrap learning, we begin with a pretrained Imagenet network FA. We train the knockoff FA to imitate FV on the transfer set by minimizing the cross-entropy (CE) loss:

L_CE(y, ŷ) = − Σ_k p(y_k) · log p(ŷ_k)

This is a standard CE loss, albeit weighted with the confidence p(y_k) of the victim’s label. This formulation is equivalent to minimizing the KL-divergence between the victim’s and the knockoff’s predictions over the transfer set.
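A PyTorch-style sketch of this step is shown below: an ImageNet-pretrained network is re-headed for K classes and fitted to the victim’s soft posteriors with a soft-target cross-entropy. The data loader, optimizer settings, and helper names are illustrative rather than the paper’s exact recipe.

```python
import torch
import torch.nn.functional as F
from torchvision import models

def build_knockoff(num_classes):
    """Reasonably complex student: ImageNet-pretrained ResNet-34 with a new K-way head."""
    net = models.resnet34(pretrained=True)
    net.fc = torch.nn.Linear(net.fc.in_features, num_classes)
    return net

def soft_cross_entropy(logits, victim_probs):
    """CE against the victim posterior: -sum_k p(y_k) log p(yhat_k), averaged over the batch."""
    log_probs = F.log_softmax(logits, dim=1)
    return -(victim_probs * log_probs).sum(dim=1).mean()

def train_knockoff(knockoff, transfer_loader, epochs=100, lr=0.01):
    """transfer_loader yields (images, victim_posteriors) batches from the transfer set."""
    opt = torch.optim.SGD(knockoff.parameters(), lr=lr, momentum=0.5)
    for _ in range(epochs):
        for images, victim_probs in transfer_loader:
            opt.zero_grad()
            loss = soft_cross_entropy(knockoff(images), victim_probs)
            loss.backward()
            opt.step()
    return knockoff
```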

5. Experimental Setup

We now discuss the experimental setup of multiple victim blackboxes (Section 5.1), followed by details on the adversary’s approach (Section 5.2).

5.1. Black-box Victim Models FV

We choose four diverse image classification CNNs, addressing multiple challenges in image classification e.g., fine-grained recognition. Each CNN performs a task specific to a dataset. A summary of the blackboxes is presented in Table 1 (extended descriptions in the appendix).

Training the Black-boxes. All models are trained using a ResNet-34 architecture (with ImageNet [8] pretrained weights) on the training split of the respective datasets. We find this architecture choice achieves strong performance on all datasets at a reasonable computational cost. Models are trained using an SGD with momentum (of 0.5) optimizer for 200 epochs with a base learning rate of 0.1, decayed by a factor of 0.1 every 60 epochs. We follow the train-test splits suggested by the respective authors for Caltech-256 [12], CUBS-200-2011 [47], and Indoor-Scenes [35]. Since GT annotations for Diabetic-Retinopathy [1] test images are not provided, we reserve 200 training images from each of the five classes for testing. The number of test images per class for all datasets is roughly balanced. The test images D^test_V of these datasets are used to evaluate both the victim and knockoff models.

After these four victim models are trained, we use them as blackboxes for the remainder of the paper: images in, posterior probabilities out.

5.2. Representing PA

In this section, we elaborate on the setup of two aspects relevant to transfer set construction (Section 4.1).


5.2.1 Choice of PA

Our approach for transfer set construction involves the adversary querying images from a large discrete image distribution PA. In this section, we present four choices considered in our experiments. Any information apart from the images of the respective datasets is unused in the random strategy. For the adaptive strategy, we use image-level labels (chosen independently of the blackbox models) to guide sampling.

PA = PV. For reference, we sample from the exact set of images used to train the blackboxes. This is a special case of knowledge-distillation [15] with unlabeled data at temperature τ = 1.

PA = ILSVRC [8, 37]. We use the collection of 1.2M images over 1000 categories presented in the ILSVRC-2012 [37] challenge.

PA = OpenImages [22]. OpenImages v4 is a large-scale dataset of 9.2M images gathered from Flickr. We use a subset of 550K unique images, gathered by sampling 2k images from each of 600 categories.

PA = D2. We construct a dataset wherein the adversary has access to all images in the universe. In our case, we create the dataset by pooling training data from: (i) all four datasets listed in Section 5.1; and (ii) both datasets presented in this section. This results in a “dataset of datasets” D2 of 2.2M images and 2129 classes.

Overlap between PA and PV. We compute the overlap between labels of the blackbox (K, e.g., 256 Caltech classes) and the adversary’s dataset (Z, e.g., 1k ILSVRC classes) as: 100 × |K ∩ Z| / |K|. Based on the overlap between the two image distributions, we categorize PA as:

1. PA = PV: Images queried are identical to the ones used for training FV. There is a 100% overlap.

2. Closed-world (PA = D2): The blackbox train data PV is a subset of the image universe PA. There is a 100% overlap.

3. Open-world (PA ∈ {ILSVRC, OpenImages}): Any overlap between PV and PA is purely coincidental. Overlaps are: Caltech256 (42% ILSVRC, 44% OpenImages), CUBS200 (1%, 0.5%), Indoor67 (15%, 6%), and Diabetic5 (0%, 0%).

5.2.2 Adaptive Strategy

In the adaptive strategy (Section 4.1.2), we make use of auxiliary information (labels) in the adversary’s data PA to guide the construction of the transfer set. We represent these labels as the leaf nodes of a coarse-to-fine concept hierarchy tree. The root node in all cases is a single concept “entity”. We obtain the rest of the hierarchy as follows: (i) D2: we add as parents the dataset the images belong to; (ii) ILSVRC: for each of the 1K labels, we obtain 30 coarse labels by clustering the mean visual features of each label, obtained using the 2048-dim pool features of an ILSVRC-pretrained Resnet model; (iii) OpenImages: we use the exact hierarchy provided by the authors.
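A minimal sketch of step (ii), deriving coarse parent labels by clustering per-class mean features, is given below; it assumes precomputed 2048-dim pooled features per image, and the function and variable names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def coarse_labels_by_clustering(features, labels, num_coarse=30, seed=0):
    """features: (N, 2048) array of pooled CNN features; labels: (N,) fine label ids.
    Returns a dict mapping each fine label to one of `num_coarse` coarse parent labels."""
    fine_labels = np.unique(labels)
    # Mean visual feature per fine label (e.g., per ILSVRC class).
    means = np.stack([features[labels == z].mean(axis=0) for z in fine_labels])
    coarse = KMeans(n_clusters=num_coarse, random_state=seed).fit_predict(means)
    return {int(z): int(c) for z, c in zip(fine_labels, coarse)}
```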

6. Results

We now discuss the experimental results.

Training Phases. The knockoff models are trained in two phases: (a) Online: during transfer set construction (Section 4.1); followed by (b) Offline: the model is retrained using the transfer set obtained thus far (Section 4.2). All results on the knockoff are reported after step (b).

Evaluation Metric. We evaluate two aspects of the knockoff: (a) Top-1 accuracy: computed on the victim’s held-out test data D^test_V; and (b) sample-efficiency: the best performance achieved after a budget of B queries. Accuracy is reported in two forms: absolute (x%) or relative to the blackbox FV (x×).

In each of the following experiments, we evaluate our approach with identical hyperparameters across all blackboxes, highlighting the generalizability of model functionality stealing.

6.1. Transfer Set Construction

In this section, we analyze the influence of the transfer set {(x_i, FV(x_i))} on the knockoff. For simplicity, for the remainder of this section we fix the architecture of the victim and knockoff to a Resnet-34 [14].

Reference: PA = PV (KD). From Table 2 (second row), we observe: (i) all knockoff models recover 0.92-1.05× the performance of FV; (ii) a better performance than FV itself (e.g., a 3.8% improvement on Caltech256) due to the regularizing effect of training on soft-labels [15].

Can we learn by querying randomly from an independent distribution? Unlike KD, the knockoff is now trained and evaluated on different image distributions (PA and PV respectively). We first focus on the random strategy, which does not use any auxiliary information.

We make the following observations from Table 2(random): (i) closed-world: the knockoff is able to rea-sonably imitate all the blackbox models, recovering 0.84-0.97× blackbox performance; (ii) open-world: in this chal-lenging scenario, the knockoff model has never encounteredimages of numerous classes at test-time e.g., >90% of thebird classes in CUBS200. Yet remarkably, the knockoff isable to obtain 0.81-0.96× performance of the blackbox.Moreover, results marginally vary (at most 0.04×) betweenILSVRC and OpenImages, indicating any large diverse setof images makes for a good transfer set.

Upon qualitative analysis, we find the image and pseudo-label pairs in the transfer set are semantically incoherent (Fig. 6a) for output classes whose objects do not occur in the adversary's image distribution PA.


random strategy:

PA             | Caltech256   | CUBS200      | Indoor67     | Diabetic5
PV (FV)        | 78.8 (1×)    | 76.5 (1×)    | 74.9 (1×)    | 58.1 (1×)
PV (KD)        | 82.6 (1.05×) | 70.3 (0.92×) | 74.4 (0.99×) | 54.3 (0.93×)
Closed: D2     | 76.6 (0.97×) | 68.3 (0.89×) | 68.3 (0.91×) | 48.9 (0.84×)
Open: ILSVRC   | 75.4 (0.96×) | 68.0 (0.89×) | 66.5 (0.89×) | 47.7 (0.82×)
Open: OpenImg  | 73.6 (0.93×) | 65.6 (0.86×) | 69.9 (0.93×) | 47.0 (0.81×)

adaptive strategy:

PA             | Caltech256   | CUBS200      | Indoor67     | Diabetic5
Closed: D2     | 82.7 (1.05×) | 74.7 (0.98×) | 76.3 (1.02×) | 48.3 (0.83×)
Open: ILSVRC   | 76.2 (0.97×) | 69.7 (0.91×) | 69.9 (0.93×) | 44.6 (0.77×)
Open: OpenImg  | 74.2 (0.94×) | 70.1 (0.92×) | 70.2 (0.94×) | 47.7 (0.82×)

Table 2: Accuracy on test sets. The first row ("PV (FV)") gives the accuracy of the blackbox FV; all remaining rows are knockoffs FA. KD = Knowledge Distillation. Closed- and open-world accuracies are reported at B = 60k.

Figure 5: Performance of the knockoff at various budgets B (accuracy vs. budget; panels: Caltech256, CUBS200, Indoor67, Diabetic5). Presented for various choices of the adversary's image distribution PA ∈ {D2, ILSVRC, OpenImg, PV} and sampling strategy π ∈ {adaptive, random}. Horizontal markers indicate the accuracy of the blackbox FV and chance-level performance.

However, when relevant images are presented at test-time (Fig. 6b), the knockoff displays strong performance. Furthermore, we find the top predictions by the knockoff relevant to the image, e.g., predicting one comic character (Superman) for another.

How sample-efficient can we get? Now we evaluate the adaptive strategy (discussed in Section 4.1.2). Note that we make use of auxiliary information about the images in these tasks (labels of images in PA). We use the reward set which obtained the best performance in each scenario: {certainty} in the closed-world and {certainty, diversity, loss} in the open-world.

From Figure 5, we observe: (i) closed-world: adaptive is extremely sample-efficient in all but one case. Its performance is comparable to KD in spite of samples being drawn from a 36-188× larger image distribution. We find significant sample-efficiency improvements, e.g., while CUBS200-random reaches 68.3% at B=60k, adaptive achieves this 6× quicker at B=10k. We find comparably low performance on Diabetic5, as the blackbox exhibits confident predictions for all images, resulting in a poor feedback signal to guide the policy; (ii) open-world: although we find only marginal improvements over random in this challenging scenario, they are pronounced in a few cases, e.g., 1.5× quicker to reach an accuracy of 57% on CUBS200 with OpenImages; (iii) as an added benefit apart from sample-efficiency, from Table 2 we find adaptive displays improved performance (up to 4.5%) consistently across all choices of FV.

What can we learn by inspecting the policy? From previous experiments, we observed two benefits of the adaptive strategy: sample-efficiency (more prominent in the closed-world) and improved performance. The policy πt learnt by adaptive (Section 4.1.2) additionally allows us to understand what makes for good images to query. πt(z) is a discrete probability distribution indicating preference over actions z; each action z in our case corresponds to a label in the adversary's image distribution.
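Such a policy can be represented as a softmax over per-action preferences. The sketch below is one standard gradient-bandit formulation of this idea [43]; it is illustrative only and not necessarily the exact update used in Section 4.1.2.

```python
import numpy as np

class GradientBanditPolicy:
    """Softmax policy over |Z| actions (labels), updated with a textbook
    gradient-bandit rule [43]; a sketch, not the paper's exact algorithm."""
    def __init__(self, num_actions, lr=0.01):
        self.H = np.zeros(num_actions)        # per-action preferences
        self.lr = lr
        self.mean_reward, self.t = 0.0, 0

    def probs(self):                          # pi_t(z)
        e = np.exp(self.H - self.H.max())
        return e / e.sum()

    def sample(self):                         # draw a label z to query next
        return np.random.choice(len(self.H), p=self.probs())

    def update(self, action, reward):
        self.t += 1
        self.mean_reward += (reward - self.mean_reward) / self.t  # baseline
        adv = reward - self.mean_reward
        pi = self.probs()
        self.H -= self.lr * adv * pi          # push down all preferences ...
        self.H[action] += self.lr * adv       # ... except the chosen action's
```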

We visualize πt(z) in Figure 7, where each bar represents an action and its color the parent in the hierarchy. We observe: (i) closed-world (Fig. 7 top): actions sampled with higher probabilities consistently correspond to output classes of FV. Upon analyzing the parents of these actions (the dataset source), the policy also learns to sample images for the output classes from an alternative, richer image source, e.g., "ladder" images in Caltech256 are instead sampled from OpenImages; (ii) open-world (Fig. 7 bottom): unlike the closed-world, the optimal mapping between the adversary's actions and the blackbox's output classes is non-trivial and unclear. However, we find top actions typically correspond to output classes of FV, e.g., indigo bunting. The policy, in addition, learns to sample coarser actions related to FV's task, e.g., predominantly drawing from bird and animal images to knock off CUBS200.

What makes for a good reward? Using the adaptive sampling strategy, we now address the influence of the three rewards (discussed in Section 4.1.2). We observe: (i) closed-world (Fig. 8 left): all reward signals in adaptive help with sample efficiency over random. Reward cert (which encourages exploitation) provides the best feedback signal.


Figure 6: Qualitative Results. (a) Samples from the transfer set ({(xi, FV(xi))}, xi ∼ PA(X)) displayed for four output classes (one from each blackbox): 'Homer Simpson', 'Harris Sparrow', 'Gym', and 'Proliferative DR'. (b) With the knockoff FA trained on the transfer set, we visualize its predictions on the victim's test set ({(xi, FA(xi))}, xi ∼ D^test_V). Ground-truth labels are underlined. Objects from these classes, among numerous others, were never encountered while training FA.

Figure 7: Policy π learnt by the adaptive approach. Each bar represents the preference for an action z; the top 30 actions (out of 2.1k and 1k) are displayed, and colors indicate the parent of the action in the hierarchy. Panels: Caltech256 · PA = D2 (top) and CUBS200 · PA = ILSVRC (bottom).

Figure 8: Reward Ablation. Knockoff accuracy vs. budget B for Caltech256 and CUBS200 with PA ∈ {D2, ILSVRC}. cert: certainty, uncert: uncertainty, div: diversity, L: loss, none: no reward (random strategy).

Including other rewards (cert+div+L) slightly deteriorates performance, as they encourage exploration over related or unseen actions, which is not ideal in a closed-world. Reward uncert, a popular measure in the AL literature [3, 9, 39], underperforms in our setting since it encourages uncertain (in our case, irrelevant) images. (ii) open-world (Fig. 8 right): all rewards display only none-to-marginal improvements for all choices of FV, with the highest improvement in CUBS200 using cert+div+L.

Figure 9: Truncated Posteriors. Influence of training the knockoff with truncated posteriors: top-k (left) and rounding r (right), shown for Caltech256 with PA = ILSVRC (accuracy vs. budget B).

However, we notice an influence on the learnt policies, where combining exploration (div + L) with exploitation (cert) goals results in a softer probability distribution π over the action space, in turn encouraging related images.

Can we train knockoffs with truncated blackbox outputs? So far, we found the adversary's objective of knocking off blackbox models can be effectively carried out with minimal assumptions. Now we explore the influence of a victim's defense strategy of reducing the informativeness of blackbox predictions to counter the model stealing attack. We consider two truncation strategies: (a) top-k: the top-k (out of K) unnormalized posterior probabilities are retained, while the rest are zeroed-out; (b) rounding r: posteriors are rounded to r decimals, e.g., round(0.127, r=2) = 0.13. In addition, we consider the extreme case "argmax", where only the index k = arg max_k y_k is returned.
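A minimal numpy sketch of the two truncation strategies and the argmax extreme, applied to a single posterior vector y:

```python
import numpy as np

def truncate_topk(y, k):
    """Keep the k largest (unnormalized) posteriors, zero out the rest."""
    out = np.zeros_like(y)
    idx = np.argsort(y)[-k:]
    out[idx] = y[idx]
    return out

def truncate_round(y, r):
    """Round posteriors to r decimals, e.g. round(0.127, r=2) = 0.13."""
    return np.round(y, r)

def truncate_argmax(y):
    """Return only the index of the top class."""
    return int(np.argmax(y))

y = np.array([0.127, 0.42, 0.03, 0.423])
print(truncate_topk(y, 2))    # [0.    0.42  0.    0.423]
print(truncate_round(y, 2))   # [0.13 0.42 0.03 0.42]
print(truncate_argmax(y))     # 3
```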

From Figure 9 (with K = 256), we observe: (i) truncating yi, either using top-k or rounding, only slightly impacts the knockoff performance, with argmax achieving 0.76-0.84× of the original accuracy for any budget B; (ii) top-k: even small increments of k significantly recover the original performance, 0.91× at k = 2 and 0.96× at k = 5; (iii) rounding: recovery is more pronounced, with 0.99× of the original accuracy achieved at just r = 2.


Figure 10: Architecture choices. Knockoff accuracy vs. budget B on Caltech256 for two blackbox architectures FV (left: Resnet-34, right: VGG-16) and knockoff architectures FA ∈ {Alexnet, VGG-16, Resnet-18/34/50/101, Densenet-161} (lines in each plot).

We find model functionality stealing to be minimally impacted by reducing the informativeness of blackbox predictions.

6.2. Architecture choice

In the previous section, we found model functionality stealing to be consistently effective while keeping the architectures of the blackbox and knockoff fixed. Now we study the influence of the architectural choice FA vs. FV.

How does the architecture of FA influence knockoff performance? We study the influence using two choices of the blackbox FV architecture: Resnet-34 [14] and VGG-16 [41]. Keeping these fixed, we vary the architecture of the knockoff FA by choosing from: Alexnet [21], VGG-16 [41], Resnet-{18, 34, 50, 101} [14], and Densenet-161 [16].

From Figure 10, we observe: (i) the performance of the knockoff is ordered by model complexity: Alexnet (lowest performance) is at one end of the spectrum, while the significantly more complex Resnet-101/Densenet-161 are at the other; (ii) performance transfers across model families: Resnet-34 achieves similar performance when stealing VGG-16 and vice versa; (iii) complexity helps: selecting a more complex model architecture for the knockoff is beneficial. This contrasts with KD settings, where the objective is to have a more compact student (knockoff) model.

6.3. Stealing Functionality of a Real-world Black-box Model

Now we validate model functionality stealing on a popular image analysis API. Such image recognition services are gaining popularity, allowing users to obtain image predictions for a variety of tasks at low costs ($1-2 per 1k queries). These image recognition APIs have also been used to evaluate other attacks, e.g., adversarial examples [4, 18, 23]. We focus on a facial characteristics API which, given an image, returns attributes and confidences per face. Note that in this experiment, we have semantic information of the blackbox output classes.

Collecting PA. The API returns probability vectors per face in the image and thus querying irrelevant images leads to a wasted result with no output information. Hence, we use two face image sets PA for this experiment: CelebA (220k images) [24] and OpenImages-Faces (98k images).

Figure 11: Knocking-off a real-world API. Knockoff accuracy vs. budget B for two choices of PA (CelebA, OpenImg-Faces), evaluated on both test sets and with two knockoff architectures FA (Resnet-34, Resnet-101).

We create the latter by cropping faces (plus a margin) from images in the OpenImages dataset [22].

Evaluation. Unlike previous experiments, we cannot access the victim's test data. Hence, we create test sets for each image set by collecting and manually screening seed annotations from the API on ∼5K images.

How does this translate to the real-world? We model two variants of the knockoff using the random strategy (adaptive is not used since no relevant auxiliary information about the images is available). We present each variant using two choices of architecture FA: a compact Resnet-34 and a complex Resnet-101. From Figure 11, we observe: (i) strong performance of the knockoffs, achieving 0.76-0.82× the performance of the API on the test sets; (ii) the diverse nature of OpenImages-Faces helps improve generalization, resulting in 0.82× the accuracy of the API on both test sets; (iii) the complexity of FA does not play a significant role: both Resnet-34 and Resnet-101 show similar performance, indicating a compact architecture is sufficient to capture discriminative features for this particular task.

We find model functionality stealing translates well to the real-world, with knockoffs exhibiting strong performance. The knockoff circumvents the monetary and labour costs of: (a) collecting images for the task; (b) obtaining expert annotations; and (c) tuning a model. As a result, an inexpensive knockoff is trained which exhibits strong performance, using victim API queries amounting to only $30.
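For a rough sense of scale: assuming the ~$1 per 1k queries pricing quoted above and the ~30k-query budgets shown in Figure 11, the query bill works out to about 30k × $1/1k ≈ $30, consistent with the total cost reported here.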

7. Conclusion

We investigated the problem of model functionality stealing, where an adversary transfers the functionality of a victim model into a knockoff via blackbox access. In spite of minimal assumptions on the blackbox, we demonstrated the surprising effectiveness of our approach. Finally, we validated our approach on a popular image recognition API and found strong performance of the knockoffs. We find functionality stealing poses a real-world threat that potentially undercuts an increasing number of deployed ML models.

Acknowledgement. This research was partially supported by the German Research Foundation (DFG CRC 1223). We thank Yang Zhang for helpful discussions.


References

[1] Eyepacs. https://www.kaggle.com/c/diabetic-retinopathy-detection. Accessed: 2018-11-08.
[2] N. Akhtar, J. Liu, and A. Mian. Defense against universal adversarial perturbations. In CVPR, 2018.
[3] W. H. Beluch, T. Genewein, A. Nurnberger, and J. M. Kohler. The power of ensembles for active learning in image classification. In CVPR, 2018.
[4] A. N. Bhagoji, W. He, B. Li, and D. Song. Exploring the space of black-box attacks on deep neural networks. arXiv preprint arXiv:1712.09491, 2017.
[5] C. Bucilua, R. Caruana, and A. Niculescu-Mizil. Model compression. In KDD, 2006.
[6] G. Chen, W. Choi, X. Yu, T. Han, and M. Chandraker. Learning efficient object detection models with knowledge distillation. In NIPS, 2017.
[7] D. A. Cohn, Z. Ghahramani, and M. I. Jordan. Active learning with statistical models. JAIR, 1996.
[8] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Imagenet: A large-scale hierarchical image database. In CVPR, 2009.
[9] S. Ebert, M. Fritz, and B. Schiele. RALF: A reinforced active learning formulation for object class recognition. In CVPR, 2012.
[10] M. Fredrikson, S. Jha, and T. Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. In CCS, 2015.
[11] T. Furlanello, Z. C. Lipton, M. Tschannen, L. Itti, and A. Anandkumar. Born again neural networks. In ICML, 2018.
[12] G. Griffin, A. Holub, and P. Perona. Caltech-256 object category dataset. 2007.
[13] S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. In ICLR, 2016.
[14] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, pages 770-778, 2016.
[15] G. Hinton, O. Vinyals, and J. Dean. Distilling the knowledge in a neural network. arXiv:1503.02531, 2015.
[16] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In CVPR, 2017.
[17] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer. Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv:1602.07360, 2016.
[18] A. Ilyas, L. Engstrom, A. Athalye, and J. Lin. Black-box adversarial attacks with limited queries and information. In ICML, 2018.
[19] A. J. Joshi, F. Porikli, and N. Papanikolopoulos. Multi-class active learning for image classification. In CVPR, 2009.
[20] V. Khrulkov and I. Oseledets. Art of singular vectors and universal adversarial perturbations. In CVPR, 2018.
[21] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
[22] A. Kuznetsova, H. Rom, N. Alldrin, J. Uijlings, I. Krasin, J. Pont-Tuset, S. Kamali, S. Popov, M. Malloci, T. Duerig, et al. The Open Images dataset v4: Unified image classification, object detection, and visual relationship detection at scale. arXiv:1811.00982, 2018.
[23] Y. Liu, X. Chen, C. Liu, and D. Song. Delving into transferable adversarial examples and black-box attacks. In ICLR, 2017.
[24] Z. Liu, P. Luo, X. Wang, and X. Tang. Deep learning face attributes in the wild. In ICCV, 2015.
[25] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard. Universal adversarial perturbations. In CVPR, 2017.
[26] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard. Deepfool: A simple and accurate method to fool deep neural networks. In CVPR, 2016.
[27] S. J. Oh, M. Augustin, B. Schiele, and M. Fritz. Towards reverse-engineering black-box neural networks. In ICLR, 2018.
[28] S. J. Oh, R. Benenson, M. Fritz, and B. Schiele. Faceless person recognition; privacy implications in social media. In ECCV, 2016.
[29] S. J. Oh, M. Fritz, and B. Schiele. Adversarial image perturbation for privacy protection – a game theory perspective. In ICCV, 2017.
[30] T. Orekondy, M. Fritz, and B. Schiele. Connecting pixels to privacy and utility: Automatic redaction of private information in images. In CVPR, 2018.
[31] T. Orekondy, B. Schiele, and M. Fritz. Towards a visual privacy advisor: Understanding and predicting privacy risks in images. In ICCV, 2017.
[32] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami. Practical black-box attacks against machine learning. In Asia CCS, 2017.
[33] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825-2830, 2011.
[34] O. Poursaeed, I. Katsman, B. Gao, and S. Belongie. Generative adversarial perturbations. In CVPR, 2018.
[35] A. Quattoni and A. Torralba. Recognizing indoor scenes. In CVPR, 2009.
[36] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio. Fitnets: Hints for thin deep nets. In ICLR, 2015.
[37] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. Imagenet large scale visual recognition challenge. IJCV, 2015.
[38] A. Salem, Y. Zhang, M. Humbert, M. Fritz, and M. Backes. ML-Leaks: Model and data independent membership inference attacks and defenses on machine learning models. In NDSS, 2019.
[39] B. Settles and M. Craven. An analysis of active learning strategies for sequence labeling tasks. In EMNLP, 2008.
[40] R. Shokri, M. Stronati, C. Song, and V. Shmatikov. Membership inference attacks against machine learning models. In Security and Privacy (S&P), 2017.
[41] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014.
[42] Q. Sun, L. Ma, S. J. Oh, L. van Gool, B. Schiele, and M. Fritz. Natural and effective obfuscation by head inpainting. In CVPR, 2018.
[43] R. S. Sutton and A. G. Barto. Introduction to Reinforcement Learning, volume 135. MIT Press, Cambridge, 1998.
[44] S. Tong and D. Koller. Support vector machine active learning with applications to text classification. JMLR, 2001.
[45] F. Tramer, F. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart. Stealing machine learning models via prediction APIs. In USENIX Security, 2016.
[46] A. Vittorio. Toolkit to download and visualize single or multiple classes from the huge Open Images v4 dataset. https://github.com/EscVM/OIDv4_ToolKit, 2018.
[47] C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie. The Caltech-UCSD Birds-200-2011 Dataset. Technical Report CNS-TR-2011-001, California Institute of Technology, 2011.
[48] B. Wang and N. Z. Gong. Stealing hyperparameters in machine learning. In Security and Privacy (S&P), 2018.
[49] Z. Wu, Z. Wang, Z. Wang, and H. Jin. Towards privacy-preserving visual recognition via adversarial training: A pilot study. In ECCV, 2018.
[50] R. Yonetani, V. N. Boddeti, K. M. Kitani, and Y. Sato. Privacy-preserving visual learning using doubly permuted homomorphic encryption. In ICCV, 2017.
[51] Y. Zhang, T. Xiang, T. M. Hospedales, and H. Lu. Deep mutual learning. In CVPR, 2018.


Appendices

A. Contents

The appendix contains:

A. Contents (this section)

B. Extended descriptions

1. Blackbox models

2. Overlap between PV and PA

3. Aggregating OpenImages and OpenImages-Faces

4. Additional implementation details

C. Extensions of existing results

1. Qualitative Results

2. Sample-efficiency of GT

3. Policies learnt by adaptive strategy

4. Reward Ablation

D. Auxiliary experiments

1. Seen and unseen classes

2. Adaptive strategy: With/without hierarchy

3. Semi-open world: τd, τk

B. Extended Descriptions

In this section, we provide additional detailed descriptions and implementation details.

B.1. Black-box models

We supplement Section 5.1 by providing extended descriptions of the blackboxes listed in Table 1. Each blackbox FV is trained on one particular image classification dataset.

Black-box 1: Caltech256 [12]. Caltech-256 is a popular dataset for general object recognition, gathered by downloading relevant examples from Google Images and manually screening for quality and errors. The dataset contains 30k images covering 256 common object categories.

Black-box 2: CUBS200 [47]. A fine-grained bird classifier is trained on the CUBS-200-2011 dataset. This dataset contains roughly 30 train and 30 test images for each of 200 species of birds. Due to the low intra-class variance, collecting and annotating images is challenging even for expert bird-watchers.

Black-box 3: Indoor67 [35]. We introduce another fine-grained task of recognizing 67 types of indoor scenes. This dataset consists of 15.6k images collected from Google Images, Flickr, and LabelMe.

PA                 | Caltech256 (K=256) | CUBS200 (K=200) | Indoor67 (K=67) | Diabetic5 (K=5)
ILSVRC (Z=1000)    | 108 (42%)          | 2 (1%)          | 10 (15%)        | 0 (0%)
OpenImages (Z=601) | 114 (44%)          | 1 (0.5%)        | 4 (6%)          | 0 (0%)

Table 3: Overlap between PA and PV.

Black-box 4: Diabetic5 [1]. Diabetic Retinopathy (DR) is a medical eye condition characterized by retinal damage due to diabetes. Cases are typically determined by trained clinicians, who look for the presence of lesions and vascular abnormalities in digital color photographs of the retina captured using specialized cameras. Recently, a dataset of 35k such retinal image scans was made available as a part of a Kaggle competition [1]. Each image is annotated by a clinician on a scale of 0 (no DR) to 4 (proliferative DR). This highly-specialized biomedical dataset also presents challenges in the form of extreme imbalance (the largest class contains 30× as many images as the smallest one).

B.2. Overlap: Open-world

In this section, we supplement Section 5.2.1 of the main paper by providing more details on how overlap was calculated in the open-world scenarios. We manually compute the overlap between labels of the blackbox (K, e.g., 256 Caltech classes) and the adversary's dataset (Z, e.g., 1k ILSVRC classes) as 100 × |K ∩ Z| / |K|. We consider two labels k ∈ K and z ∈ Z to overlap if: (a) they have the same semantic meaning; or (b) z is a type of k, e.g., z = "maltese dog" and k = "dog". The exact numbers are provided in Table 3. We remark that this is a soft lower-bound. For instance, while ILSVRC contains "Hummingbird" and CUBS-200-2011 contains three distinct species of hummingbirds, this is not counted towards the overlap, as the adversary lacks the annotated data necessary to discriminate among the three species.
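To make the metric concrete, the sketch below computes 100 × |K ∩ Z| / |K| on toy label sets; the label names are illustrative placeholders rather than the actual dataset vocabularies, and the semantic "is-a-type-of" matching described above is done manually, not by this snippet.

```python
def label_overlap(victim_labels, adversary_labels):
    """Percentage of blackbox labels K also present in the adversary's
    label set Z, i.e. 100 * |K ∩ Z| / |K|."""
    K, Z = set(victim_labels), set(adversary_labels)
    return 100.0 * len(K & Z) / len(K)

# Toy example (hypothetical label names, for illustration only):
K = {"dog", "ladder", "gorilla", "penguin"}   # blackbox output classes
Z = {"dog", "gorilla", "goldfinch", "zebra"}  # adversary's dataset labels
print(label_overlap(K, Z))                    # -> 50.0
```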

B.3. Dataset Aggregation

All datasets used in the paper (except OpenImages) have been used in the form made publicly available by the authors. We use a subset of OpenImages due to storage constraints imposed by its massive size (9M images). The descriptions to obtain these subsets are provided below.

OpenImages. We retrieve 2k images for each of the 600 OpenImages [22] "boxable" categories, resulting in 554k unique images. ∼19k images are removed for either being corrupt or representing Flickr's placeholder for unavailable images. This results in a total of 535k unique images.

OpenImages-Faces. We download all images (422k) from OpenImages [22] with the label "/m/0dzct: Human face" using the OID tool [46].


Figure 12: Training on GT vs. KD. Extension of Figure 5 (accuracy vs. budget B on Caltech256, CUBS200, Indoor67, and Diabetic5). We compare the sample efficiency of the first two rows in Table 2: "PV (FV)" (training with GT data) and "PV (KD)" (training with soft-labels of GT images produced by FV).

The bounding box annotations are used to crop faces (plus a margin of 25%) containing at least 180×180 pixels. We restrict to at most 5 faces per image to maintain diversity between train/test splits. This results in a total of 98k face images.

B.4. Additional Implementation Details

In this section, we provide implementation details to supplement discussions in the main paper.

Training FV = Diabetic5. Training this victim model is identical to the other blackboxes except for one aspect: a weighted loss. Due to the extreme imbalance between classes of the dataset, we weigh each class as follows. Let nk denote the number of images belonging to class k and let nmin = mink nk. We weigh the loss for class k as nmin/nk. From our experiments with the weighted loss, we found approximately an 8% absolute improvement in overall accuracy on the test set. However, the training of knockoffs of all blackboxes is identical in all aspects, including a non-weighted loss irrespective of the victim blackbox targeted.
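A minimal sketch of this class weighting with a PyTorch cross-entropy loss (the class counts below are illustrative stand-ins, not the actual Diabetic5 statistics):

```python
import torch
import torch.nn as nn

# Illustrative per-class image counts n_k for the five DR grades.
counts = torch.tensor([25000., 2400., 5300., 870., 700.])
weights = counts.min() / counts               # weight for class k: n_min / n_k

# Used only when training the Diabetic5 victim F_V; knockoffs use no weights.
criterion = nn.CrossEntropyLoss(weight=weights)
```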

Creating ILSVRC Hierarchy. We represent the 1k labels of ILSVRC as a hierarchy (Figure 4b) of the form: root node "entity" → N coarse nodes → 1k leaf nodes. We obtain the N (30 in our case) coarse labels as follows: (i) a 2048-dim mean feature vector representation per label is obtained using an Imagenet-pretrained ResNet; (ii) we cluster the 1k features into N clusters using scikit-learn's [33] implementation of agglomerative clustering; (iii) we obtain semantic labels per cluster (i.e., coarse node) by finding the common parent in the Imagenet semantic hierarchy.
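A sketch of the clustering step using scikit-learn [33]; the feature matrix is stubbed with random vectors here, whereas the paper uses the mean 2048-dim pool features of an ILSVRC-pretrained Resnet per label.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Stand-in for the per-label mean features (one 2048-dim vector per ILSVRC label).
label_features = np.random.randn(1000, 2048)

# Group the 1k leaf labels into N = 30 coarse parent nodes.
clustering = AgglomerativeClustering(n_clusters=30).fit(label_features)
coarse_parent = clustering.labels_   # coarse_parent[z] in {0, ..., 29}
```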

Adaptive Strategy. Recall from Section 6 that we train the knockoff in two phases: (a) Online: during transfer set construction; followed by (b) Offline: the model is retrained using the transfer set obtained thus far. In phase (a), we train FA with SGD (with 0.5 momentum), a learning rate of 0.0005, and a batch size of 4 (i.e., 4 images sampled at each t). In phase (b), we train the knockoff FA from scratch on the transfer set using SGD (with 0.5 momentum) for 100 epochs with a learning rate of 0.01, decayed by a factor of 0.1 every 60 epochs.
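In PyTorch terms, the two optimizer settings might look roughly as follows (a tiny linear model stands in for the Resnet knockoff; the training passes themselves are omitted):

```python
import torch
import torch.nn as nn

knockoff = nn.Linear(10, 5)   # stand-in for the Resnet knockoff F_A

# Phase (a), online: small SGD steps on batches of 4 images as they arrive.
online_opt = torch.optim.SGD(knockoff.parameters(), lr=0.0005, momentum=0.5)

# Phase (b), offline: retrain from scratch for 100 epochs,
# lr = 0.01 decayed by 0.1 every 60 epochs.
offline_opt = torch.optim.SGD(knockoff.parameters(), lr=0.01, momentum=0.5)
scheduler = torch.optim.lr_scheduler.StepLR(offline_opt, step_size=60, gamma=0.1)
for epoch in range(100):
    # ... one pass over the transfer set would go here ...
    offline_opt.step()
    scheduler.step()
```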

C. Extensions of Existing Results

In this section, we present extensions of existing results discussed in the main paper.

C.1. Qualitative Results

Qualitative results to supplement Figure 6 are provided in Figures 13-16. Each row in the figures corresponds to an output class of the blackbox whose images the knockoff has never encountered before. Images in the "transfer set" column were randomly sampled from ILSVRC [8, 37]. In contrast, images in the "test set" belong to the victim's test set (Caltech256, CUBS-200-2011, etc.).

C.2. Sample Efficiency: Training Knockoffs on GT

We extend Figure 5 in the main paper to include training on the same ground-truth data used to train the blackboxes. This extension, "PV (FV)", is illustrated in Figure 12, displayed alongside the KD approach. The figure represents the sample-efficiency of the first two rows of Table 2. Here we observe: (i) comparable performance in all but one case (Diabetic5, discussed shortly), indicating KD is an effective approach to train knockoffs; (ii) KD achieves better performance on Caltech256 and Diabetic5 due to the regularizing effect of training on soft-labels [15] on an imbalanced dataset.

C.3. Policies learnt by Adaptive

We inspected the policy π learnt by the adaptive strategy in Section 6.1. In this section, we provide policies over all blackboxes in the closed- and open-world settings. Figures 17a and 17c display probabilities of each action z ∈ Z at t = 2500.

Since the distribution of rewards is non-stationary, we visualize the policy over time in Figure 17b for CUBS200 in a closed-world setup. From this figure, we observe an evolution:


Figure 13: Qualitative results: Caltech256 (transfer set vs. test set). Extends Figure 6 in the main paper. GT labels are underlined, correct knockoff top-1 predictions in green and incorrect in red.

Figure 14: Qualitative results: CUBS200 (transfer set vs. test set). Extends Figure 6 in the main paper.


Figure 15: Qualitative results: Indoor67 (transfer set vs. test set). Extends Figure 6 in the main paper. GT labels are underlined, correct top-1 knockoff predictions in green and incorrect in red.

Figure 16: Qualitative results: Diabetic5 (transfer set vs. test set). Extends Figure 6 in the main paper.


Figure 17: Policies learnt by the adaptive strategy. Supplements Figure 7 in the main paper. (a) Closed world: policies over actions z for Caltech256, Indoor67, CUBS200, and Diabetic5 with PA = D2 at t ∈ [0, 2500] (T = 7401). (b) Closed world: analyzing the policy over time t for CUBS200 with PA = D2 (t ∈ [0, 1000], [1000, 2000], [2000, 3000], [3000, 4000]). (c) Open world: policies for Indoor67, Diabetic5, Caltech256, and CUBS200 with PA = ILSVRC.

(i) at early stages (t ∈ [0, 2000]), the approach samples (without replacement) images that overlap with the victim's train data; and (ii) at later stages (t ∈ [2000, 4000]), since the overlapping images have been exhausted, the approach explores related images from other datasets, e.g., "ostrich", "jaguar".

C.4. Reward Ablation

The reward ablation experiment (Figure 8) for the remaining datasets is provided in Figure 18. We make similar observations as before for Indoor67. However, since FV = Diabetic5 exhibits confident predictions for all images (discussed in Section 6.1), we find little-to-no improvement for knockoffs of this victim model.


Figure 18: Reward Ablation. Supplements Figure 8 in the main paper: knockoff accuracy vs. budget B for Indoor67 and Diabetic5 with PA ∈ {D2, ILSVRC}, comparing rewards cert, cert+div, cert+div+L, uncert, and none.

Figure 19: Per-class evaluation. Mean per-class accuracy vs. budget B on Caltech256 and Indoor67, split into all, seen, and unseen classes.


D. Auxiliary Experiments

In this section, we present experiments to supplement existing results in the main paper.

D.1. Seen and Unseen classes

We now discuss an evaluation to supplement Section 5.2.1 and Section 6.

In Section 6.1 we highlighted the strong performance of the knockoff even among classes that were never encountered during training (see Table 3 for exact numbers). To elaborate, we split the blackbox output classes into "seen" and "unseen" categories and present mean per-class accuracies in Figure 19. Although we find better performance on classes seen while training the knockoff, the performance on unseen classes is remarkably high, with the knockoff achieving >70% performance in both cases.

D.2. Adaptive: With and without hierarchy

The adaptive strategy presented in Section 4.1.2 uses the hierarchy discussed in Section 5.2.2. As a result, we approached this as a hierarchical multi-armed bandit problem. Now, we present an alternate approach, adaptive-flat, without the hierarchy. This is simply a multi-armed bandit problem with |Z| arms (actions).

Figure 20 illustrates the performance of these approaches using PA = D2 (|Z| = 2129) and rewards {certainty, diversity, loss}. We observe adaptive consistently outperforms adaptive-flat. For instance, on CUBS200, adaptive is 2× more sample-efficient in reaching an accuracy of 50%. We find the hierarchy helps the adversary (agent) better navigate the large action space.

D.3. Semi-open World

The closed-world experiments (PA = D2) presented in Section 6.1 and discussed in Section 5.2.1 assumed access to the image universe; thereby, the overlap between PA and PV was 100%. Now, we present an intermediate-overlap scenario, semi-open world, by parameterizing the overlap as: (i) τd: the overlap between the images of PA and PV is 100 × τd; and (ii) τk: the overlap between the labels K and Z is 100 × τk. In both cases, τd, τk ∈ (0, 1] represents the fraction of PA used. τd = τk = 1 recovers the closed-world scenario discussed in Section 6.1.
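This parameterization amounts to randomly retaining a fraction of the overlapping images/labels when assembling PA; a minimal helper under that assumption (the commented usage and variable names are illustrative):

```python
import random

def subsample(items, frac, seed=0):
    """Keep a random fraction `frac` of `items` (usable for both tau_d and tau_k)."""
    rng = random.Random(seed)
    k = max(1, int(frac * len(items)))
    return rng.sample(list(items), k)

# e.g., retain tau_d = 0.5 of the victim's train images and tau_k = 0.1 of the
# victim-overlapping labels when assembling P_A (names are placeholders):
# kept_images = subsample(victim_train_images, 0.5)
# kept_labels = subsample(overlapping_labels, 0.1)
```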

From Figure 21, we observe: (i) the random strategy is unaffected in the semi-open world scenario, displaying comparable performance for all values of τd and τk; (ii) τd: the knockoff obtained using adaptive shows strong performance even with low overlap, e.g., a difference of at most 3% on Caltech256 even at τd = 0.1; (iii) τk: although the adaptive strategy is minimally affected in a few cases (e.g., CUBS200), we find a performance drop due to the purely exploitative (certainty) reward used. We observed a recovery in performance when using all rewards, indicating exploration goals (diversity, loss) are necessary when transitioning to an open-world scenario.


Figure 20: Hierarchy. Evaluating adaptive with and without hierarchy (adaptive-flat) using PA = D2, compared against random and PV (KD), on Caltech256, CUBS200, Indoor67, and Diabetic5 (accuracy vs. budget B). Horizontal markers indicate the accuracy of the blackbox FV and chance-level performance.

Figure 21: Semi-open world: τd and τk. Knockoff accuracy vs. budget B on Caltech256, CUBS200, Indoor67, and Diabetic5 for π ∈ {adaptive, random}, varying τd ∈ {0.01, 0.1, 0.5, 1.0} (top) and τk ∈ {0.01, 0.1, 0.5, 1.0} (bottom).