

Leaking Sensitive Financial Accounting Data in Plain Sight using Deep Autoencoder Neural Networks

Marco Schreyer1, Christian Schulze2, Damian Borth1

1 University of St. Gallen (HSG), St. Gallen, Switzerland
2 German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany

{marco.schreyer, damian.borth}@unisg.ch, [email protected]

Abstract

Nowadays, organizations collect vast quantities of sensitive information in ‘Enterprise Resource Planning’ (ERP) systems, such as accounting relevant transactions, customer master data, or strategic sales price information. The leakage of such information poses a severe threat for companies as the number of incidents and the reputational damage to those experiencing them continue to increase. At the same time, discoveries in deep learning research revealed that machine learning models could be maliciously misused to create new attack vectors. Understanding the nature of such attacks becomes increasingly important for the (internal) audit and fraud examination practice. The creation of such an awareness holds in particular for fraudulent data leakage using deep learning-based steganographic techniques that might remain undetected by state-of-the-art ‘Computer Assisted Audit Techniques’ (CAATs). In this work, we introduce a real-world ‘threat model’ designed to leak sensitive accounting data. In addition, we show that a deep steganographic process, constituted by three neural networks, can be trained to hide such data in unobtrusive ‘day-to-day’ images. Finally, we provide qualitative and quantitative evaluations on two publicly available real-world payment datasets.

Introduction

In recent years data became one of an organization’s most valuable assets (‘the new oil’ (The Economist 2017)). As a result, data protection, preventing it from being stolen or leaked to the outside world, is often considered of paramount importance. This observation holds in particular for data processed and stored in Enterprise Resource Planning (ERP) systems. Steadily, these systems collect vast quantities of business process and accounting data at a granular level. Often the data recorded by such systems encompasses sensitive information such as (i) the master data of customers and vendors, (ii) the volumes of sales and revenue, and (iii) strategic information about purchase and sales prices. Nowadays, the leakage of such information poses a severe issue for companies as the number of incidents and the costs to those experiencing them continue to increase1. The potential damage and adverse consequences

Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

1 An overview of leakage incidents: https://selfkey.org/data-breaches-in-2019/.

of a leakage incident may result in direct losses, such as violations of (privacy) regulations or settlement and compensation fees. Furthermore, it can also result in indirect losses, such as damaging a company’s goodwill and reputation or the exposure of intellectual property. Formally, data leakage can be defined as the ‘unauthorized transmission of information from inside an organization to an external recipient’ (Shabtai, Elovici, and Rokach 2012). The key characteristic of a data leak incident is that it is carried out by ‘insiders’ of an organization, e.g., employees or subcontractors. The detection of such insider attacks remains difficult since it often involves the misuse of legitimate credentials or authorizations to perform such an attack.

Recent breakthroughs in artificial intelligence, and in particular deep neural networks (LeCun, Bengio, and Hinton 2015), created advances across a diverse range of application domains such as image classification (Krizhevsky, Sutskever, and Hinton 2012), speech recognition (Saon et al. 2015), language translation (Sutskever, Vinyals, and Le 2014) and game-play (Silver et al. 2017). Due to their broad application, these developments also raised awareness of the unsafe aspects of deep learning (Szegedy et al. 2013; Goodfellow, Shlens, and Szegedy 2014; Papernot et al. 2017), which becomes increasingly important for the (internal-) audit practice. The research of the potential impact of deep learning techniques in accounting and audit is still in an early stage (Sun 2019; Schreyer et al. 2017, 2019b). However, we believe that it is of vital relevance to understand how such techniques can be maliciously misused in this sphere to create new attack vectors (Ballet et al. 2019; Schreyer et al. 2019a). This holds in particular for fraudulent2 data leakage using deep learning-based steganographic techniques that might remain undetected by state-of-the-art Computer Assisted Audit Techniques (CAATs).

In general, the objective of steganography denotes a secret communication (Shabtai, Elovici, and Rokach 2012): a sender encodes a message into a cover medium, such as an image or audio file, such that the recipient can decode the message. At the same time, a potential (internal-) auditor or

2 According to (Garner 2014), the term fraud refers to the use of one’s occupation for personal enrichment through the deliberate misuse or misapplication of the employing organization’s resources or assets, e.g., the booking of fictitious sales, fraudulent invoices, or the wrong recording of expenses.

Figure 1: Exemplary data leakage threat model designed to leak sensitive accounting data such as (i) detailed journal entries or (ii) sensitive customer and vendor master data. The sensitive data is encoded into non-suspicious appearing ‘container’ images that are transmitted to a location outside of the organization.

fraud examiner cannot judge whether the cover medium contains a secret message or not. In this paper, we present a deep learning-based steganographic process designed to leak sensitive accounting data. We regard this work to be an initial step towards the investigation of such future challenges of audits or fraud examinations. In summary, we present the following contributions: First, we describe a real-world ‘threat model’ designed to leak sensitive accounting data using deep steganographic techniques. Second, we show that deep neural networks are capable of learning to hide the accounting data in everyday data, such as simple ‘cover’ images. The created cover images are deliberately designed not to raise awareness and to misguide auditors or fraud examiners. Third, to complete the attack scenario, we demonstrate how the hidden information can be revealed from the cover images upon the successful leak.

Related Work

A wide variety of steganography settings and methods have been proposed in the literature; most relevant to our work are methods for blind image steganography, where the message is encoded in an image and the decoder does not have access to the original cover image.

Least Significant Bit (LSB) Methods: The Least Significant Bit (Chandramouli and Memon 2001; Fridrich, Goljan, and Du 2001; Mielikainen 2006; Viji and Balamurugan 2011; Qazanfari and Safabakhsh 2017) describes a classic steganographic algorithm. In general, each pixel of a digital image is comprised of three bytes (i.e., 8 binary bits each) that represent the RGB chromatic values respectively. The n-bit LSB algorithm replaces the n least significant bits of the cover image by the n most significant bits of the secret image. For each byte, the most significant bits dominate the colour values. This way, the chromatic variation of the container image (altered cover) is minimized. Revealing the concealed secret image can be accomplished by reading the n least significant bits and performing a bit shift. Although the resulting distortion is often not visually observable, the LSB algorithm is highly vulnerable to steganalysis. Often statistical analyses can easily detect the pattern of the altered pixels as shown in (Fridrich, Goljan, and Du 2001; Lerch-Hostalot and Megías 2016; Luo, Huang, and Huang 2010). Recent works have therefore focused on methods that preserve the original image statistics or the design of sophisticated distortion functions (Holub and Fridrich 2012; Holub, Fridrich, and Denemark 2014; Long and Li 2018; Pevny, Filler, and Bas 2010; Swain 2018; Tamimi, Abdalla, and Al-Allaf 2013).
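The n-bit replacement described above can be made concrete in a few lines of Python. This is our own minimal sketch over flat lists of byte values, not code from the paper:

```python
def lsb_embed(cover, secret, n=2):
    """Replace the n least significant bits of each cover byte with
    the n most significant bits of the corresponding secret byte."""
    mask = (1 << n) - 1
    return [(c & ~mask & 0xFF) | (s >> (8 - n)) for c, s in zip(cover, secret)]

def lsb_reveal(container, n=2):
    """Read the n least significant bits back and shift them into the
    high-order positions; the lower (8 - n) secret bits are lost."""
    mask = (1 << n) - 1
    return [(c & mask) << (8 - n) for c in container]

cover = [200, 13, 255, 96]    # cover-image byte values (one colour channel)
secret = [240, 17, 129, 64]   # secret-image byte values to hide
container = lsb_embed(cover, secret)
revealed = lsb_reveal(container)
```

Since only the n low-order bits change, each container byte differs from its cover byte by at most 2^n − 1, which is why the distortion is hard to see yet easy to detect statistically.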

Discrete Cosine Transform (DCT) Methods: To overcome the drawbacks of the LSB algorithm, the variant HRVSS (Eltahir et al. 2009) and (Muhammad et al. 2018) focus, for example, on the unique biological traits of human eyes when hiding grey images in colour images. Other approaches utilize the bit-plane complexity segmentation in either the spatial or the transform domain (Kawaguchi and Eason 1999; Spaulding et al. 2002; Ramani et al. 2007). Other algorithms embed secret information in the Discrete Cosine Transformation ‘DCT’ frequency domain of an image. This is achieved by deliberately changing the DCT coefficients (Chae and Manjunath 1999; Kaur, Kaur, and Singh 2011; Nag et al. 2011; Zhang et al. 2018). As many of those coefficients are zero in general, they can be exploited to embed information in the image’s frequency domain rather than the spatial domain (El Rahman 2018; Cheddad et al. 2010; Sadek, Khalifa, and Mostafa 2015).
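As a toy illustration of coefficient-domain embedding (our own sketch, not one of the cited algorithms), the following Python hides bits in the parity of rounded DCT-II coefficients of a single 8-sample block. Real schemes additionally quantize the coefficients (JPEG-style) so the payload survives re-encoding; here the bits are read back from the coefficients directly:

```python
import math

def dct2(block):
    """Naive (unnormalized) DCT-II of a 1-D sample block."""
    n = len(block)
    return [sum(block[i] * math.cos(math.pi / n * (i + 0.5) * k)
                for i in range(n)) for k in range(n)]

def embed_bit(coeffs, k, bit):
    """Encode one bit as the parity of the rounded k-th coefficient."""
    c = round(coeffs[k])
    if c % 2 != bit:
        c += 1
    out = list(coeffs)
    out[k] = float(c)
    return out

def extract_bit(coeffs, k):
    return round(coeffs[k]) % 2

block = [52, 55, 61, 66, 70, 61, 64, 73]    # one row of pixel values
coeffs = dct2(block)
for k, bit in zip((3, 4, 5), (1, 0, 1)):    # hide bits in mid frequencies
    coeffs = embed_bit(coeffs, k, bit)
```

Mid-frequency coefficients are a common hiding spot: low frequencies carry most of the visible structure, while high frequencies are aggressively discarded by compression.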

Deep Learning Methods: Recently, deep learning-based steganography methods have been proposed to encode (binary) text messages in images (Baluja 2017; Zhu et al. 2018). Prior works (Husien and Badi 2015; Pibre et al. 2015) mostly focused on the decoding of the hidden information (in terms of determining which bits to extract from the container image) to increase the transmission accuracy. Besides, deep learning methods are also increasingly employed in the context of steganalysis to detect the information potentially hidden (Pibre et al. 2016; Qian et al. 2015; Tan and Li 2014; Xu, Wu, and Shi 2016). In this work, we build on the ideas of

(Baluja 2017; Hayes and Danezis 2017) that proposed an entire steganographic process based on deep networks, encompassing a preparation, an encoding (hiding) and a decoding (revealing) network. To the best of our knowledge, this work presents the first analysis of a deep neural network based data leakage attack targeting sensitive financial accounting data.

Deep Steganography Threat Model

To establish such a data leakage attack, we introduce a threat model depicted in Fig. 1, in which a person or group of people, referred to as the perpetrator, within an organization intends to transmit ‘leak’ sensitive data of the organization to an external destination or recipient. To initiate the attack, a perpetrator will query the ERP system to extract the sensitive accounting data to be leaked, such as detailed journal entry data or customer and vendor master data tables (1). Afterwards, the extracted entries are converted into binary images, referred to as secret images, exhibiting a ‘one-hot’ encoded representation of the data. One could think of converting the information into a Quick Response (QR) code representation using designated publicly available software3.

Following, a population of images referred to as cover images is selected. Usually, the cover images contain unobtrusive content to leak the secret information covertly. Such images can, for example, be obtained from prominent and publicly available image databases such as (Thomee et al. 2016; Russakovsky et al. 2015). Following, a deep steganographic model is learned utilizing a deliberately designed training process. The process is comprised of a sequence of deep neural networks that process both the secret and the cover images. The model is trained to create a container image that efficiently encodes the information contained in the secret image into the cover image (2). Thereby the encoding is regularized to hide the secret information with minimal distortion of the visual appearance of the cover image. Simultaneously, the model is trained to also decode the encoded secret information from the container image.

Upon successful training, the model is exploited to facilitate the attack (3). Therefore, a single or a series of container images is created that conceal the sensitive data to be leaked. The created container images are then transmitted either (i) electronically (e.g., via email or file sharing) or (ii) physically (e.g., via mobile phone or USB flash drive) to a location outside of the organizational boundaries, usually referred to as the ‘safe zone’ (4). Visually, the container images seem to correspond to unobtrusive ‘everyday’ images but encode the sensitive information to be leaked. As a result, they exhibit a high chance of being judged as non-suspicious throughout an internal audit. Once the container images are successfully transmitted, the secret binary images can be revealed using the steganographic model (5). Ultimately, the sensitive data can be decoded at high accuracy (6).
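Step (1) above, converting an extracted journal entry into a binary ‘one-hot’ secret image, can be sketched as follows. The attribute names and vocabularies are hypothetical and far smaller than in a real ERP extract:

```python
# Hypothetical journal-entry attributes and value vocabularies
# (illustrative only; real ERP extracts have far larger vocabularies).
VOCAB = {
    "fiscal_year": ["2016", "2017"],
    "posting_type": ["invoice", "payment", "reversal"],
    "general_ledger": ["100000", "200000", "300000"],
}

def one_hot_bits(entry):
    """Concatenate one one-hot group per attribute into a flat bit list."""
    bits = []
    for attr, values in VOCAB.items():
        bits += [1 if entry[attr] == v else 0 for v in values]
    return bits

def to_secret_image(bits, width):
    """Zero-pad the bit list and reshape it into a binary H x W 'image'."""
    pad = (-len(bits)) % width
    bits = bits + [0] * pad
    return [bits[i:i + width] for i in range(0, len(bits), width)]

entry = {"fiscal_year": "2017", "posting_type": "payment",
         "general_ledger": "300000"}
image = to_secret_image(one_hot_bits(entry), width=4)
```

Each row of `image` becomes one row of pixels in the secret image; reversing the mapping after the reveal step recovers the original attribute values.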

Deep Steganography Process

Inspired by and building on the work of Baluja (Baluja 2017), we investigate if a deep learning-based steganography process can be misused to leak sensitive accounting data. The

3 GitHub: https://github.com/Bacon/BaconQrCode

proposed process is constituted of three neural networks: a preparation network Pθ, a hiding network Hφ, and a reveal network Rγ. The distinct networks are trained ‘end-to-end’ and θ, φ, and γ denote the trainable parameters of each network respectively.

To establish a potential data leak, a set X of N journal entries x1, x2, ..., xn (or other tabular data records) is extracted from the ERP system to be attacked. Each journal entry xi consists of M accounting specific attributes x_{i,1}, x_{i,2}, ..., x_{i,j}, ..., x_{i,m}. The individual attributes x_j describe the journal entry’s details, e.g., the entry’s fiscal year, posting type, posting date, amount, and general ledger. Afterwards, each entry is sequentially processed by the distinct networks as described in the following.

Prepare Step: Each entry (or multiple entries) is converted into a secret image. The secret image I^sec ∈ {0, 1}^{H×W} of shape H×W contains a binary representation of the information to be leaked. The preparation network Pθ receives I^sec and converts it into an RGB prepared image I^pre ∈ {0, 255}^{C×H×W} that exhibits the desired characteristics to be hidden.

Hide Step: The hiding process is then initiated using the prepared image and a cover image. The RGB cover image I^cov ∈ {0, 255}^{C×H×W} of shape C×H×W corresponds to an arbitrary ‘everyday’ image and is used to hide the prepared image. The hiding network Hφ receives a cover image I^cov and the prepared image I^pre. Using both input images, Hφ produces an RGB container image I^con of the same shape as I^cov.

Reveal Step: Subsequently, the container image is passed to the reveal network Rγ to produce a binary reveal image I^rev ∈ {0, 1}^{H×W} that exhibits a high similarity to the original prepared image I^pre. Ultimately, I^rev is supposed to reveal the secret information of I^sec hidden in I^con. The training of the architecture follows a dual training objective, namely (1) the minimization of the cover distortion (cover loss) and (2) the preservation of the secret information (secret loss) when optimizing the network parameters.

Cover loss: The first training objective requires that the container image I^con exhibits a high visual similarity to the cover image I^cov, formally defined as the mean-squared difference L^cov_{θ,φ} between both images, given by:

L^cov_{θ,φ} = Σ_{x=0}^{C−1} Σ_{y=0}^{H−1} Σ_{z=0}^{W−1} (I^cov_{x,y,z} − I^con_{x,y,z})²,   (1)

where C, H, and W denote the shape of both RGB images and θ and φ the network parameters.

Secret loss: The second training objective requires that the revealed image I^rev should contain the same binary information as encoded in the original secret image I^sec, formally defined as the binary cross-entropy difference L^sec_{θ,φ,γ} between both images, given by:

L^sec_{θ,φ,γ} = − Σ_{x=0}^{H−1} Σ_{y=0}^{W−1} [ I^sec_{x,y} log I^rev_{x,y} + (1 − I^sec_{x,y}) log(1 − I^rev_{x,y}) ],   (2)


Figure 2: The data leakage process as introduced in (Baluja 2017), applied to learn a steganographic model of real-world accounting data. The process is designed to encode and decode sensitive information into unobtrusive ‘day-to-day’ cover images.

where H and W denote the shape of both binary images and θ, φ, and γ the network parameters. We train the network parameters θ, φ, and γ batch-wise using stochastic gradient descent to minimize a combined steganographic loss function L^ALL_{θ,φ,γ} over the distributions of secret and cover images, as defined by:

L^ALL_{θ,φ,γ} = α · (1/N) Σ_{i=0}^{N−1} L^cov_{θ,φ} + β · (1/N) Σ_{i=0}^{N−1} L^sec_{θ,φ,γ},   (3)

where N denotes the size of the training batch and the factors α and β balance both losses. In summary, the approach builds upon the idea of autoencoder neural networks originally proposed in (Hinton and Salakhutdinov 2006). We encode two input images such that the learned intermediate representation I^con comprises an image that appears as similar as possible to I^cov and thereby hides the accounting data information contained in I^sec.
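To make Eqs. (1)–(3) concrete, the following plain-Python sketch computes the three losses (our illustration; the paper optimizes them with stochastic gradient descent over mini-batches, not with loops over Python lists):

```python
import math

def cover_loss(i_cov, i_con):
    """Eq. (1): summed squared pixel difference over C x H x W images."""
    return sum((c - k) ** 2
               for ch_cov, ch_con in zip(i_cov, i_con)
               for row_cov, row_con in zip(ch_cov, ch_con)
               for c, k in zip(row_cov, row_con))

def secret_loss(i_sec, i_rev, eps=1e-7):
    """Eq. (2): binary cross-entropy between secret and revealed images."""
    total = 0.0
    for row_sec, row_rev in zip(i_sec, i_rev):
        for s, r in zip(row_sec, row_rev):
            r = min(max(r, eps), 1.0 - eps)   # clip to avoid log(0)
            total += s * math.log(r) + (1 - s) * math.log(1 - r)
    return -total

def combined_loss(cov_losses, sec_losses, alpha=0.5, beta=1.0):
    """Eq. (3): weighted batch means of cover and secret losses."""
    n = len(cov_losses)
    return alpha * sum(cov_losses) / n + beta * sum(sec_losses) / n
```

A perfect container drives `cover_loss` to zero, and a perfect reveal drives `secret_loss` to zero; α and β trade off how much distortion of the cover is tolerated for recoverability of the secret.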

Datasets and Experimental Setup

Due to the high confidentiality of real-world accounting data and to allow for reproducibility of our results, we evaluate the proposed methodology based on two publicly available datasets of real-world city payments. The payment datasets serve as secret data to be hidden. We use a prominent dataset of everyday images that serves as cover data to hide the payment information.

Secret Data: The secret datasets to be leaked encompass two publicly available payment datasets of the City of Philadelphia and the City of Chicago. The majority of attributes recorded in both datasets (similar to real-world ERP data) correspond to categorical (discrete) variables, e.g., posting date, department, vendor name, document type. For each dataset we pre-process the original payment line-item attributes to (i) remove semantically redundant attributes and (ii) obtain a binary (‘one-hot’) representation of each payment record. The following descriptive statistics summarise both datasets upon successful data pre-processing:

The ‘City of Philadelphia’ dataset4 (dataset A) encompasses the city’s payments of the fiscal year 2017. It represents nearly $4.2 billion in payments obtained from almost 60 city offices, departments, boards and committees. The dataset encompasses n = 238,894 payments comprised of 10 categorical and one numerical attribute. The encoding resulted in a total of D = 8,565 one-hot encoded dimensions for each vendor payment record xi ∈ R^8,565.

4 https://www.phila.gov

The ‘City of Chicago’ dataset5 (dataset B) encompasses the city’s vendor payments ranging from 1996 to 2020. The data is collected from the city’s ‘Vendor, Contract and Payment Search’ and encompasses the procurement of goods and services. The dataset encompasses n = 72,814 payments comprised of 7 categorical and one numerical attribute. The encoding resulted in a total of D = 2,354 one-hot encoded dimensions for each vendor payment record xi ∈ R^2,354.

Cover Data: The cover dataset used to hide the secret data is derived from the ‘ImageNet 2012’ dataset. The ‘ImageNet 2012’ (Russakovsky et al. 2015) dataset6 (ImageNet) is a publicly available dataset commonly used to evaluate machine learning models. The original dataset contains a total of 1.2 million RGB images organized in 1,000 non-overlapping subcategories. Each category corresponds to a real-world concept, such as ‘llama’, ‘guitar’, or ‘pillow’. We use a subset of the entire dataset referred to as the ‘Tiny ImageNet’ dataset7. The dataset consists of 200 ImageNet subcategories. Thereby, each category encompasses 500 images resulting in a total of 100k RGB images. To demonstrate the generalization of the proposed method, we crop from each image a random image patch of size 224 × 224 pixels per channel and resize it to 256 × 256 pixels.
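The per-channel crop-and-resize preprocessing can be sketched in plain Python (our illustration; a real pipeline would use an image library, and the paper does not specify the interpolation method, so nearest-neighbour resizing is an assumption here):

```python
import random

def random_crop(channel, size):
    """Crop a random size x size patch from a 2-D pixel grid (one channel)."""
    h, w = len(channel), len(channel[0])
    top = random.randrange(h - size + 1)
    left = random.randrange(w - size + 1)
    return [row[left:left + size] for row in channel[top:top + size]]

def resize_nearest(channel, out_h, out_w):
    """Nearest-neighbour resize of a 2-D pixel grid."""
    in_h, in_w = len(channel), len(channel[0])
    return [[channel[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)] for r in range(out_h)]

# Synthetic 300 x 300 single-channel image standing in for an ImageNet crop.
channel = [[(r * 7 + c) % 256 for c in range(300)] for r in range(300)]
patch = random_crop(channel, 224)           # random 224 x 224 patch
scaled = resize_nearest(patch, 256, 256)    # upscaled to 256 x 256
```

Repeating the two steps per colour channel yields the C × 256 × 256 cover tensors fed to the hiding network.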

Architectural Setup: The steganographic process as shown in Fig. 2 and described in (Baluja 2017) is trained end-to-end, comprised of the preparation network Pθ, the hiding network Hφ, and the reveal network Rγ. Each network applies a series of 2D convolutions (LeCun, Bengio et al. 1995) of different filter sizes applied onto the input images (architectural details are described in the appendix). The convolutions are followed by non-linear ReLU activation

5 https://data.cityofchicago.org
6 http://image-net.org/download-images
7 https://tiny-imagenet.herokuapp.com

Table 1: Network training losses (L^ALL_{θ,φ,γ}, L^cov_{θ,φ}, L^sec_{θ,φ,γ}), PSNR, SSIM, and BACC obtained on both city payment datasets when using three channel C = 3 RGB (H×W×3) cover images.

Dataset  BPP     α    β    L^ALL        L^cov        L^sec        PSNR          SSIM           BACC
A        0.0436  0.2  1.0  0.63 ± 0.01  0.43 ± 0.06  0.54 ± 0.01  43.91 ± 0.64  0.997 ± 0.001  0.999 ± 0.001
A        0.0436  0.5  1.0  0.74 ± 0.03  0.40 ± 0.04  0.54 ± 0.04  44.23 ± 0.50  0.997 ± 0.001  0.998 ± 0.003
A        0.0436  0.8  1.0  0.81 ± 0.03  0.34 ± 0.03  0.55 ± 0.01  44.97 ± 0.43  0.998 ± 0.001  0.999 ± 0.001
A        0.0436  1.0  1.0  0.92 ± 0.06  0.38 ± 0.05  0.54 ± 0.02  44.42 ± 0.65  0.997 ± 0.001  0.998 ± 0.002
B        0.0120  0.2  1.0  0.67 ± 0.07  0.79 ± 0.03  0.51 ± 0.00  41.38 ± 0.32  0.998 ± 0.002  0.999 ± 0.001
B        0.0120  0.5  1.0  0.77 ± 0.01  0.52 ± 0.02  0.51 ± 0.01  42.94 ± 0.23  0.996 ± 0.001  0.998 ± 0.001
B        0.0120  0.8  1.0  0.87 ± 0.05  0.46 ± 0.07  0.51 ± 0.00  43.51 ± 0.65  0.995 ± 0.003  0.998 ± 0.002
B        0.0120  1.0  1.0  0.99 ± 0.04  0.47 ± 0.04  0.52 ± 0.01  43.35 ± 0.37  0.997 ± 0.001  0.997 ± 0.003

Values of the distinct loss functions L are multiplied by 10^4; variances originate from parameter initialization using four distinct random seeds.

Table 2: Network training losses (L^ALL_{θ,φ,γ}, L^cov_{θ,φ}, L^sec_{θ,φ,γ}), PSNR, SSIM, and BACC obtained on both city payment datasets when using single channel C = 1 grayscale (H×W×1) cover images.

Dataset  BPP     α    β    L^ALL        L^cov        L^sec        PSNR          SSIM           BACC
A        0.1307  0.5  1.0  0.29 ± 0.00  0.01 ± 0.00  0.54 ± 0.01  60.93 ± 1.58  0.999 ± 0.001  0.997 ± 0.002
B        0.0359  0.5  1.0  0.87 ± 0.04  0.64 ± 0.07  0.55 ± 0.02  42.27 ± 0.49  0.989 ± 0.003  0.977 ± 0.002

Values of the distinct loss functions L are multiplied by 10^4; variances originate from parameter initialization using four distinct random seeds.

functions (Nair and Hinton 2010).

Training Setup: The networks are trained on secret images Isec derived from the entire population of payments in each dataset. Throughout the training, the secret images are paired with randomly drawn cover images Icov of the ImageNet dataset. We train with a mini-batch size of m = 6 secret-cover image pairs for a maximum of τ = 800k iterations and apply early stopping once the combined loss defined in Eq. 3 converges. Thereby, the network parameters θ, φ, and γ are optimized with a constant learning rate of η = 10^−5 using Adam optimization (Kingma and Ba 2014), setting β1 = 0.9 and β2 = 0.999. We sweep the weight factor α of the container image loss L^cov_{θ,φ} through α ∈ [0.2, 1.0] to determine an optimal hyperparameter setup.
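Eq. 3 itself is not restated in this section; assuming it is the usual weighted sum of a container image loss and a secret image loss (consistent with the α and β weight columns of Tables 1 and 2, with MSE as an illustrative stand-in distance), the swept objective can be sketched as:

```python
import numpy as np

def combined_loss(i_cov, i_con, i_sec, i_rev, alpha=0.5, beta=1.0):
    """Weighted steganography training objective (sketch): alpha weighs how
    closely the container must mimic the cover, beta how faithfully the
    secret must be revealed. MSE is an illustrative stand-in distance."""
    l_cov = np.mean((i_cov - i_con) ** 2)  # container vs. original cover
    l_sec = np.mean((i_sec - i_rev) ** 2)  # revealed vs. original secret
    return alpha * l_cov + beta * l_sec
```

In the sweep described above, only α is varied over [0.2, 1.0] while β stays fixed at 1.0.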

Experimental Results

In this section, we quantitatively and qualitatively assess the data leakage capability of the introduced steganographic process. The quantitative evaluation is conducted according to: (i) model secrecy, the difficulty of detecting the sensitive accounting data hidden in each image; and (ii) model accuracy, the extent to which the secret information can be revealed correctly from the container image.

Quantitative Evaluation: We evaluate the model secrecy using the Peak Signal-to-Noise Ratio (PSNR) as a proxy. The PSNR measures the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation, as defined by:

$$\mathrm{PSNR} = 10 \cdot \log_{10} \frac{\max|I^{cov}|^2}{\mathcal{L}^{cov}_{\theta,\phi}(I^{cov}, I^{con})}, \qquad (4)$$

where max|I^cov| denotes the maximum possible absolute value of the cover image. In the absence of noise, the cover image Icov and the container image Icon are identical, and thus L^cov_{θ,φ} is zero. In this case, the PSNR is infinite (Salomon 2004). Acceptable values for wireless transmission quality loss are commonly considered to be about 20 dB to 25 dB (Li and Cai 2007). In addition, we measure the Structural Similarity Index Measure (SSIM) (Wang et al. 2004). The SSIM defines a perception-based model considering image degradation as a perceived change in structural information. We quantify the visibility of differences between the distorted container image Icon and the cover image Icov, as defined by:

$$\mathrm{SSIM} = \frac{(2\mu_{cov}\mu_{con} + c_1)(2\sigma_{cov,con} + c_2)}{(\mu_{cov}^2 + \mu_{con}^2 + c_1)(\sigma_{cov}^2 + \sigma_{con}^2 + c_2)}, \qquad (5)$$

where μcov and μcon denote the average pixel values of Icov and Icon respectively, σ²cov and σ²con their variances, and σcov,con the covariance of the pixel values of both images. The SSIM's maximum possible value of 1 indicates that the two images are structurally similar, while a value of 0 indicates no structural similarity.

We measure the model accuracy using the Bit Accuracy (BACC), which denotes the fraction of identical active bits between the secret image and the revealed image, as defined by:

$$\mathrm{BACC} = 1 - \frac{\sum_{x=0}^{H-1}\sum_{y=0}^{W-1} \mathbb{1}\big[\|I^{sec}_{x,y} - M \circ I^{rev}_{x,y}\| > \delta\big]}{\sum_{x=0}^{H-1}\sum_{y=0}^{W-1} \mathbb{1}\big[I^{sec}_{x,y} > 0\big]}, \qquad (6)$$

where 1 denotes the indicator function and M = max(I^rev, k) denotes a binary pixel mask corresponding to the k most active pixel values in I^rev. We set k to the number of active binary pixels in I^sec. The operator ∘ denotes element-wise multiplication, and δ denotes the dissimilarity threshold of the most active pixel values. We set δ = 0.001 in all our experiments.
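The three evaluation metrics can be sketched in a few lines. This is an illustrative reading, not the authors' implementation: the cover loss is assumed to be the mean squared error, pixel intensities are assumed to lie in [0, 1], the SSIM constants c1 and c2 follow common practice rather than values from the paper, and the tie-handling in the BACC mask is an assumption:

```python
import numpy as np

def psnr(i_cov, i_con, max_value=1.0):
    """Peak Signal-to-Noise Ratio (Eq. 4), assuming an MSE cover loss."""
    mse = np.mean((i_cov - i_con) ** 2)
    if mse == 0:
        return float("inf")  # identical cover and container: PSNR is infinite
    return 10.0 * np.log10(max_value ** 2 / mse)

def ssim(i_cov, i_con, c1=0.01 ** 2, c2=0.03 ** 2):
    """Global (single-window) SSIM (Eq. 5) over one image pair."""
    mu_cov, mu_con = i_cov.mean(), i_con.mean()
    cov = ((i_cov - mu_cov) * (i_con - mu_con)).mean()
    num = (2 * mu_cov * mu_con + c1) * (2 * cov + c2)
    den = (mu_cov ** 2 + mu_con ** 2 + c1) * (i_cov.var() + i_con.var() + c2)
    return num / den

def bit_accuracy(i_sec, i_rev, delta=0.001):
    """Bit accuracy (Eq. 6): mask the k most active revealed pixels, where
    k is the number of active pixels in the binary secret image, then
    count the pixels whose deviation exceeds delta as mismatches."""
    k = int((i_sec > 0).sum())
    if k == 0:
        return 1.0
    threshold = np.sort(i_rev, axis=None)[-k]
    mask = (i_rev >= threshold).astype(i_rev.dtype)  # binary mask M
    mismatches = (np.abs(i_sec - mask * i_rev) > delta).sum()
    return 1.0 - mismatches / k
```

For a perfectly revealed secret (i_rev identical to i_sec) the bit accuracy is 1.0, and the PSNR grows without bound as the container approaches the cover.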

Figure 3: Exemplary data hiding results of a deep steganographic process model trained for τ = 800k iterations using RGB cover images with ImageNet labels 'goldfish', 'mushroom', and 'pizza' (columns, left to right): original secret Isec, revealed secret Irev, original cover Icov, created container image Icon, and the pixel-wise residual error Icov − Icon.

Figure 4: Exemplary data hiding results using grayscale cover images after τ = 200k and τ = 800k training iterations (columns, left to right): original secret Isec, revealed secret Irev, prepared secret Iprep, original cover Icov, created container image Icon, and the pixel-wise residual error Icov − Icon.

We analyze the performance of the introduced process in two setups: In the first setup, we use the original three-channel RGB images of the ImageNet dataset to hide the secret information. In the second setup, we reduce the capacity of the cover images by converting the dataset of cover images to single-channel grayscale images. We measure capacity as Bits Per Pixel (BPP) (Zhu et al. 2018), which denotes the number of encoded secret bits in the encoded image, defined by D/(H×W×C). The trained models are evaluated on randomly drawn combinations of 5k secret-cover image pairs. Tables 1 and 4 show the quantitative results of both setups. It can be observed that in the RGB cover image setup, different α parameterizations yield a high secrecy and accuracy of the secret information. Similar results are obtained for the grayscale cover image setup, indicating that the grayscale images provide sufficient capacity to hide the data from the unaided human eye.
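The BPP capacity measure D/(H×W×C) reduces to a one-liner:

```python
def bits_per_pixel(d_bits, height, width, channels):
    """Capacity in Bits Per Pixel (Zhu et al. 2018): D secret bits
    relative to the H x W x C size of the encoded cover image."""
    return d_bits / (height * width * channels)
```

Holding D fixed, dropping from C = 3 RGB channels to C = 1 grayscale triples the BPP, consistent with dataset A's reported 0.0436 (RGB) vs. 0.1307 (grayscale).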

Qualitative Evaluation: Figure 3 shows three exemplary data hiding results using RGB cover images. The reconstructed container images Icon appear almost identical to the original cover images Icov and encode all the information necessary to reconstruct the original payment information. The rightmost column shows the pixel-wise residual error between Icov and Icon. It can be observed that the secret information is encoded across the pixel intensities of Icon and cannot easily be obtained. This learned obfuscation is of high relevance in a scenario where the original cover image becomes accessible to the auditor or fraud examiner. Figure 4 shows two exemplary data hiding results of the grayscale cover image setup. It can be observed that, due to the limited capacity of the cover, the secret information remains partially visible in the pixel-wise residual error, even when training the model for τ = 800k iterations.

Conclusion

In this work, we conducted an analysis of deep steganography as a future challenge to the (internal) audit and fraud examination practice. We introduced a 'threat model' to leak sensitive information recorded in ERP systems. We also provided initial evidence that deep steganographic techniques can be maliciously misused to hide such information in unobtrusive 'everyday' images. In summary, we believe that such an attack vector poses a substantial challenge for the CAATs used by auditors today and in the near future.

References

Ballet, V.; Renard, X.; Aigrain, J.; Laugel, T.; Frossard, P.; and Detyniecki, M. 2019. Imperceptible Adversarial Attacks on Tabular Data. arXiv preprint:1911.03274.

Baluja, S. 2017. Hiding Images in Plain Sight: Deep Steganography. In Advances in Neural Information Processing Systems, 2069-2079.

Chae, J. J.; and Manjunath, B. 1999. Data Hiding in Video. In Proceedings 1999 International Conference on Image Processing, volume 1, 311-315. IEEE.

Chandramouli, R.; and Memon, N. 2001. Analysis of LSB Based Image Steganography Techniques. In Proceedings 2001 International Conference on Image Processing (Cat. No. 01CH37205), volume 3, 1019-1022. IEEE.

Cheddad, A.; Condell, J.; Curran, K.; and Mc Kevitt, P. 2010. Digital Image Steganography: Survey and Analysis of Current Methods. Signal Processing 90(3): 727-752.

El Rahman, S. A. 2018. A Comparative Analysis of Image Steganography based on DCT Algorithm and Steganography Tool to Hide Nuclear Reactors Confidential Information. Computers & Electrical Engineering 70: 380-399.

Eltahir, M. E.; Kiah, L. M.; Zaidan, B. B.; and Zaidan, A. A. 2009. High Rate Video Streaming Steganography. In 2009 International Conference on Information Management and Engineering, 550-553. IEEE.

Fridrich, J.; Goljan, M.; and Du, R. 2001. Detecting LSB Steganography in Color, and Grayscale Images. IEEE Multimedia 8(4): 22-28.

Garner, B. 2014. Black's Law Dictionary. Thomson Reuters.

Goodfellow, I. J.; Shlens, J.; and Szegedy, C. 2014. Explaining and Harnessing Adversarial Examples. arXiv preprint:1412.6572.

Hayes, J.; and Danezis, G. 2017. Generating Steganographic Images via Adversarial Training. In Advances in Neural Information Processing Systems, 1954-1963.

Hinton, G. E.; and Salakhutdinov, R. R. 2006. Reducing the Dimensionality of Data with Neural Networks. Science 313(5786): 504-507.

Holub, V.; and Fridrich, J. 2012. Designing Steganographic Distortion using Directional Filters. In 2012 IEEE International Workshop on Information Forensics and Security (WIFS), 234-239. IEEE.

Holub, V.; Fridrich, J.; and Denemark, T. 2014. Universal Distortion Function for Steganography in an Arbitrary Domain. EURASIP Journal on Information Security 2014(1): 1.

Husien, S.; and Badi, H. 2015. Artificial Neural Network for Steganography. Neural Computing and Applications 26(1): 111-116.

Kaur, B.; Kaur, A.; and Singh, J. 2011. Steganographic Approach for Hiding Image in DCT Domain. International Journal of Advances in Engineering & Technology 1(3): 72.

Kawaguchi, E.; and Eason, R. O. 1999. Principles and Applications of BPCS Steganography. In Multimedia Systems and Applications, volume 3528, 464-473. International Society for Optics and Photonics.

Kingma, D. P.; and Ba, J. 2014. ADAM: A Method for Stochastic Optimization. arXiv preprint:1412.6980.

Krizhevsky, A.; Sutskever, I.; and Hinton, G. E. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems, 1097-1105.

LeCun, Y.; Bengio, Y.; and Hinton, G. 2015. Deep Learning. Nature 521(7553): 436.

LeCun, Y.; Bengio, Y.; et al. 1995. Convolutional Networks for Images, Speech, and Time Series. The Handbook of Brain Theory and Neural Networks 3361(10): 1995.

Lerch-Hostalot, D.; and Megías, D. 2016. Unsupervised Steganalysis Based on Artificial Training Sets. Engineering Applications of Artificial Intelligence 50: 45-59.

Li, X.; and Cai, J. 2007. Robust Transmission of JPEG2000 Encoded Images over Packet Loss Channels. In 2007 IEEE International Conference on Multimedia and Expo, 947-950. IEEE.

Long, M.; and Li, F. 2018. A Formula Adaptive Pixel Pair Matching Steganography Algorithm. Advances in Multimedia 2018.

Luo, W.; Huang, F.; and Huang, J. 2010. Edge Adaptive Image Steganography Based on LSB Matching Revisited. IEEE Transactions on Information Forensics and Security 5(2): 201-214.

Mielikainen, J. 2006. LSB Matching Revisited. IEEE Signal Processing Letters 13(5): 285-287.

Muhammad, K.; Sajjad, M.; Mehmood, I.; Rho, S.; and Baik, S. W. 2018. Image Steganography using Uncorrelated Color Space and its Application for Security of Visual Contents in Online Social Networks. Future Generation Computer Systems 86: 951-960.

Nag, A.; Biswas, S.; Sarkar, D.; and Sarkar, P. P. 2011. A Novel Technique for Image Steganography Based on DWT and Huffman Encoding. International Journal of Computer Science and Security (IJCSS) 4(6): 497-610.

Nair, V.; and Hinton, G. E. 2010. Rectified Linear Units Improve Restricted Boltzmann Machines. In Proceedings of the 27th International Conference on Machine Learning (ICML), 807-814.

Papernot, N.; McDaniel, P.; Goodfellow, I.; Jha, S.; Celik, Z. B.; and Swami, A. 2017. Practical Black-Box Attacks Against Machine Learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, 506-519. ACM.

Pevny, T.; Filler, T.; and Bas, P. 2010. Using High-Dimensional Image Models to Perform Highly Undetectable Steganography. In International Workshop on Information Hiding, 161-177. Springer.

Pibre, L.; Jerome, P.; Ienco, D.; and Chaumont, M. 2015. Deep Learning for Steganalysis is Better than a Rich Model with an Ensemble Classifier, and is Natively Robust to the Cover Source-Mismatch. arXiv preprint:1511.04855.

Pibre, L.; Pasquet, J.; Ienco, D.; and Chaumont, M. 2016. Deep Learning is a Good Steganalysis Tool when Embedding Key is Reused for Different Images, even if there is a Cover Source Mismatch. Electronic Imaging 2016(8): 1-11.

Qazanfari, K.; and Safabakhsh, R. 2017. An Improvement on LSB Matching and LSB Matching Revisited Steganography Methods. arXiv preprint:1709.06727.

Qian, Y.; Dong, J.; Wang, W.; and Tan, T. 2015. Deep Learning for Steganalysis via Convolutional Neural Networks. In Media Watermarking, Security, and Forensics 2015, volume 9409, 94090J. International Society for Optics and Photonics.

Ramani, G.; Prasad, E.; Varadarajan, S.; SVUCE, T.; and Jntuce, K. 2007. Steganography using BPCS to the Integer Wavelet Transformed Image. IJCSNS 7(7): 293-302.

Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; Berg, A. C.; and Fei-Fei, L. 2015. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115(3): 211-252.

Sadek, M. M.; Khalifa, A. S.; and Mostafa, M. G. 2015. Video Steganography: A Comprehensive Review. Multimedia Tools and Applications 74(17): 7063-7094.

Salomon, D. 2004. Data Compression: The Complete Reference. Springer Science & Business Media.

Saon, G.; Kuo, H.-K. J.; Rennie, S.; and Picheny, M. 2015. The IBM 2015 English Conversational Telephone Speech Recognition System. arXiv preprint:1505.05899.

Schreyer, M.; Sattarov, T.; Borth, D.; Dengel, A.; and Reimer, B. 2017. Detection of Anomalies in Large Scale Accounting Data using Deep Autoencoder Networks. arXiv preprint:1709.05254.

Schreyer, M.; Sattarov, T.; Reimer, B.; and Borth, D. 2019a. Adversarial Learning of Deepfakes in Accounting. NeurIPS'19 Workshop on Robust AI in Financial Services.

Schreyer, M.; Sattarov, T.; Schulze, C.; Reimer, B.; and Borth, D. 2019b. Detection of Accounting Anomalies in the Latent Space using Adversarial Autoencoder Neural Networks. KDD'19 Workshop on Anomaly Detection in Finance.

Shabtai, A.; Elovici, Y.; and Rokach, L. 2012. A Survey of Data Leakage Detection and Prevention Solutions. Springer Science & Business Media.

Silver, D.; Schrittwieser, J.; Simonyan, K.; Antonoglou, I.; Huang, A.; Guez, A.; Hubert, T.; Baker, L.; Lai, M.; Bolton, A.; et al. 2017. Mastering the Game of Go Without Human Knowledge. Nature 550(7676): 354.

Spaulding, J.; Noda, H.; Shirazi, M. N.; and Kawaguchi, E. 2002. BPCS Steganography using EZW Lossy Compressed Images. Pattern Recognition Letters 23(13): 1579-1587.

Sun, T. 2019. Applying Deep Learning to Audit Procedures: An Illustrative Framework. Accounting Horizons 33(3): 89-109.

Sutskever, I.; Vinyals, O.; and Le, Q. V. 2014. Sequence to Sequence Learning with Neural Networks. In Advances in Neural Information Processing Systems, 3104-3112.

Swain, G. 2018. Digital Image Steganography using Eight-Directional PVD against RS Analysis and PDH Analysis. Advances in Multimedia 2018.

Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; and Fergus, R. 2013. Intriguing Properties of Neural Networks. arXiv preprint:1312.6199.

Tamimi, A. A.; Abdalla, A. M.; and Al-Allaf, O. 2013. Hiding an Image inside another Image using Variable-Rate Steganography. International Journal of Advanced Computer Science and Applications (IJACSA) 4(10).

Tan, S.; and Li, B. 2014. Stacked Convolutional Autoencoders for Steganalysis of Digital Images. In Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific, 1-4. IEEE.

The Economist. 2017. The World's Most Valuable Resource is No Longer Oil, but Data. New York, NY, USA.

Thomee, B.; Shamma, D. A.; Friedland, G.; Elizalde, B.; Ni, K.; Poland, D.; Borth, D.; and Li, L.-J. 2016. YFCC100M: The New Data in Multimedia Research. Communications of the ACM 59(2): 64-73.

Viji, G.; and Balamurugan, J. 2011. LSB Steganography in Color and Grayscale Images without using the Transformation. Bonfring International Journal of Advances in Image Processing 1: 11-14.

Wang, Z.; Bovik, A. C.; Sheikh, H. R.; and Simoncelli, E. P. 2004. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Transactions on Image Processing 13(4): 600-612.

Xu, G.; Wu, H.-Z.; and Shi, Y.-Q. 2016. Structural Design of Convolutional Neural Networks for Steganalysis. IEEE Signal Processing Letters 23(5): 708-712.

Zhang, S.; Yang, L.; Xu, X.; and Gao, T. 2018. Secure Steganography in JPEG Images Based on Histogram Modification and Hyper Chaotic System. International Journal of Digital Crime and Forensics (IJDCF) 10(1): 40-53.

Zhu, J.; Kaplan, R.; Johnson, J.; and Fei-Fei, L. 2018. HiDDeN: Hiding Data with Deep Networks. In Proceedings of the European Conference on Computer Vision (ECCV), 657-672.

Appendix A: Architectural Details

Table 3: Architectural details of the distinct neural networks that constitute the steganographic process, namely the preparation network Pθ, the hiding network Hφ, and the reveal network Rγ.

Preparation Net Pθ:
  Input: Isec ∈ R^{H×W×1}
  4 × [ Conv^50_{3×3} + ReLU ‖ Conv^50_{4×4} + ReLU ‖ Conv^50_{5×5} + ReLU ]
  Cat^150_{H×W}
  [ Conv^50_{3×3} + ReLU ‖ Conv^50_{4×4} + ReLU ‖ Conv^50_{5×5} + ReLU ]
  Conv^1_{4×4} + ReLU
  Cat^3_{H×W}
  Output: Ipre ∈ R^{H×W×3}

Hiding Net Hφ:
  Input: Ipre, Icov ∈ R^{H×W×3}
  Cat^6_{H×W}
  4 × [ Conv^50_{3×3} + ReLU ‖ Conv^50_{4×4} + ReLU ‖ Conv^50_{5×5} + ReLU ]
  Cat^150_{H×W}
  [ Conv^50_{3×3} + ReLU ‖ Conv^50_{4×4} + ReLU ‖ Conv^50_{5×5} + ReLU ]
  [ Conv^1_{3×3} + ReLU ‖ Conv^1_{4×4} + ReLU ‖ Conv^1_{5×5} + ReLU ]
  Conv^3_{1×1}(Cat^150_{H×W})
  Output: Icon ∈ R^{H×W×3}

Reveal Net Rγ:
  Input: Icon ∈ R^{H×W×3}
  4 × [ Conv^50_{3×3} + ReLU ‖ Conv^50_{4×4} + ReLU ‖ Conv^50_{5×5} + ReLU ]
  Cat^150_{H×W}
  [ Conv^50_{3×3} + ReLU ‖ Conv^50_{4×4} + ReLU ‖ Conv^50_{5×5} + ReLU ]
  Conv^1_{4×4} + ReLU
  Conv^1_{2×2}(Cat^3_{H×W})
  Output: Irev ∈ R^{H×W×1}

Each network applies a series of 2D convolutions Conv^c_{k×k} (LeCun, Bengio et al. 1995) to its respective input images, where c denotes the number of output channels and k×k the filter kernel size; ‖ denotes parallel convolution branches, and Cat^c_{H×W} denotes their channel-wise concatenation into c channels. Each convolution is followed by a non-linear ReLU activation (Nair and Hinton 2010).
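For illustration, the building block Conv^c_{k×k} followed by ReLU from Table 3 can be written out naively as below; this is a didactic sketch with assumed 'same' zero-padding, not the authors' implementation:

```python
import numpy as np

def conv2d_relu(x, weights, bias):
    """Naive 'same'-padded 2D convolution followed by ReLU.
    x: input of shape (H, W, C_in); weights: (k, k, C_in, C_out); bias: (C_out,)."""
    k = weights.shape[0]
    pad = k // 2
    h, w, _ = x.shape
    c_out = weights.shape[3]
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))  # zero-pad spatial dims
    out = np.zeros((h, w, c_out))
    for i in range(h):
        for j in range(w):
            patch = xp[i:i + k, j:j + k, :]  # k×k receptive field
            out[i, j] = np.tensordot(patch, weights, axes=([0, 1, 2], [0, 1, 2])) + bias
    return np.maximum(out, 0.0)  # ReLU
```

Stacking three such calls with kernel sizes 3×3, 4×4, and 5×5 and concatenating their 50-channel outputs along the channel axis reproduces one Cat^150 block of the tables above.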

Appendix B: Additional Experimental Results

Table 4: Network training losses (L^ALL_{θ,φ,γ}, L^cov_{θ,φ}, L^sec_{θ,φ,γ}), PSNR, SSIM, and BACC obtained on both city payment datasets when using single-channel C = 1 grayscale (H×W×1) cover images.

Dataset | BPP    | α   | β   | L^ALL_{θ,φ,γ} | L^cov_{θ,φ} | L^sec_{θ,φ,γ} | PSNR         | SSIM          | BACC
A       | 0.1307 | 0.2 | 1.0 | 0.11 ± 0.01   | 0.01 ± 0.00 | 0.54 ± 0.01   | 64.84 ± 0.49 | 0.999 ± 0.001 | 0.997 ± 0.003
A       | 0.1307 | 0.5 | 1.0 | 0.29 ± 0.00   | 0.01 ± 0.00 | 0.54 ± 0.01   | 60.93 ± 1.58 | 0.999 ± 0.001 | 0.997 ± 0.002
A       | 0.1307 | 0.8 | 1.0 | 0.44 ± 0.00   | 0.01 ± 0.00 | 0.54 ± 0.00   | 60.98 ± 0.13 | 0.999 ± 0.001 | 0.999 ± 0.006
A       | 0.1307 | 1.0 | 1.0 | 0.55 ± 0.01   | 0.01 ± 0.00 | 0.54 ± 0.01   | 60.15 ± 1.79 | 0.999 ± 0.001 | 0.999 ± 0.004
B       | 0.0359 | 0.2 | 1.0 | 0.75 ± 0.02   | 0.82 ± 0.08 | 0.53 ± 0.01   | 39.98 ± 0.33 | 0.985 ± 0.001 | 0.983 ± 0.003
B       | 0.0359 | 0.5 | 1.0 | 0.87 ± 0.04   | 0.64 ± 0.07 | 0.55 ± 0.02   | 42.27 ± 0.49 | 0.989 ± 0.001 | 0.977 ± 0.002
B       | 0.0359 | 0.8 | 1.0 | 0.98 ± 0.05   | 0.52 ± 0.06 | 0.58 ± 0.01   | 43.19 ± 0.53 | 0.990 ± 0.001 | 0.967 ± 0.006
B       | 0.0359 | 1.0 | 1.0 | 0.99 ± 0.13   | 0.41 ± 0.14 | 0.58 ± 0.01   | 44.28 ± 1.34 | 0.991 ± 0.001 | 0.966 ± 0.005

Values of the distinct loss functions L are multiplied by 10^4; variances originate from parameter initialization using four distinct random seeds.