
How OpenText™ InfoArchive

Supports GDPR Requirements

WHITE PAPER

GDPR AND DATA ARCHIVING

© 2018 Flatirons Digital Innovations, Inc. All rights reserved.


TABLE OF CONTENTS

I. Introduction: From GDPR Awareness to GDPR Preparedness

II. Data Archiving and GDPR

III. InfoArchive from OpenText and How It Supports GDPR

A. Data Protection (“Privacy by Design”)

B. Consent

C. Data Portability

D. Right to Erasure (“Right to be Forgotten”)

E. Pseudonymization

F. Data Breach Notification

G. Storage Limitations (Data Retention Periods)

IV. Getting Started


I. Introduction: From GDPR Awareness to GDPR Preparedness

The General Data Protection Regulation, or GDPR, which goes into effect on May 25, 2018, “is the most sweeping revision to European privacy and data protection legislation ever,” says Digital Clarity Group.

They add, “The legal reach of the GDPR isn’t defined by geography but by the use of the personal data of European residents. That means that it applies to any organization, located anywhere in the world that either ‘offers goods and services’ to European residents or ‘monitors their behavior.’” They also note that, for affected organizations, “Every single business process that touches personal data will have to be very carefully reviewed and, in all likelihood, redesigned to comply with the GDPR–or be scrapped.”

GDPR establishes provisions around the processing of personal data. It specifies requirements for “data protection (or privacy) by design”, consent, data portability, the right to erasure (or the “right to be forgotten”), pseudonymization, data breach notifications, storage limitations, and more. A single GDPR violation can result in a fine of up to €20 million or 4% of an organization’s global annual turnover, whichever is greater.
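As a worked illustration of the “whichever is greater” rule, the upper bound of such a fine can be computed as shown below. This is a minimal sketch: the function name and the turnover figures are hypothetical, and the actual amount of any fine is decided case by case by the supervisory authority.

```python
from decimal import Decimal

FIXED_CAP_EUR = Decimal("20000000")   # EUR 20 million
TURNOVER_SHARE = Decimal("0.04")      # 4% of global annual turnover


def max_fine_eur(global_annual_turnover_eur: Decimal) -> Decimal:
    """Return the upper bound of a fine: the greater of the two caps."""
    return max(FIXED_CAP_EUR, TURNOVER_SHARE * global_annual_turnover_eur)


if __name__ == "__main__":
    # Hypothetical turnover figures, for illustration only.
    for turnover in (Decimal("100000000"), Decimal("2000000000")):
        print(f"turnover EUR {turnover}: cap EUR {max_fine_eur(turnover)}")
```

For an organization with a turnover below €500 million, the fixed €20 million cap is the larger of the two; above that threshold, the 4% share applies.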

With this information in hand, organizations now must take action to 1) thoroughly understand the personal data of European Union residents that they handle across all departments, 2) identify the steps they need to take to become GDPR-compliant, and 3) begin implementing their action plans in order to be prepared by May 25, 2018.

This white paper describes how the practice of data archiving with InfoArchive from OpenText facilitates GDPR compliance with respect to the requirements mentioned above. It also describes how Flatirons works with leading enterprise information management provider OpenText to address GDPR requirements across the entire organization.


II. Data Archiving and GDPR

To understand how data archiving contributes to GDPR compliance, one needs to understand what data archiving is and why organizations archive data. Data archiving is the management of static information (information that is no longer being updated or that no longer changes). It involves moving any type of static information, whether structured or unstructured, from production systems or from outdated or otherwise unsupported applications to an archive repository, where it is kept until it is purged at the end of its lifecycle.

There are two general scenarios for archiving data: active archiving and archiving data from legacy systems. Both should be part of a larger data or information governance strategy that primarily aims to support data management requirements and reduce overall data management and storage costs.

Active archiving refers to identifying information on production systems that has reached a point of inactivity (when it effectively becomes static) and moving it automatically to an archive. In contrast, data from outdated, unsupported systems is archived so that those legacy systems can be “turned off” while their data remains accessible and safeguarded. In this context, data is migrated from legacy systems through an extract, transform, and load (ETL) process to a consolidated archiving platform. The archiving platform retains the data until an identified purge or destroy date, when the information reaches the end of its lifecycle.
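The active-archiving trigger described above can be sketched as a simple inactivity rule. This is only an illustration: the `Record` type, the `last_modified` field, the 365-day threshold, and the in-memory lists standing in for a production system and an archive are all assumptions, not features of any particular product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    record_id: str
    last_modified: date


def split_static(records, today, inactivity_days=365):
    """Partition records into (still_active, static) by last-modified age."""
    cutoff = today - timedelta(days=inactivity_days)
    active = [r for r in records if r.last_modified > cutoff]
    static = [r for r in records if r.last_modified <= cutoff]
    return active, static


if __name__ == "__main__":
    # Hypothetical production records.
    production = [
        Record("A-1", date(2015, 3, 1)),
        Record("A-2", date(2018, 1, 15)),
    ]
    production, to_archive = split_static(production, today=date(2018, 5, 25))
    print("stay in production:", [r.record_id for r in production])
    print("move to archive:", [r.record_id for r in to_archive])
```

In practice the inactivity threshold would come from the organization's information governance policy rather than a hard-coded default.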


There are three main reasons why organizations archive information:

1. Compliance – Archiving is most often done in highly regulated industries where compliance requirements mandate that organizations maintain information for specific periods of time (such as records retention policies in healthcare, banking, insurance, and public utilities or state, regional, or local governments). Consolidating information into a single archiving platform makes it easier to manage compliance requirements than when it is spread across multiple production and legacy systems.

2. Cost Savings and IT Simplification – Organizations archive data from more costly Tier 1 storage to less expensive archiving repositories to reduce storage costs and lessen the impact of static or aging data on the data center footprint. Once data that is subject to retention policies or deemed important for business purposes is moved to the archive, legacy applications can be “turned off” or decommissioned, not only simplifying the IT infrastructure, but also making it possible to recoup millions of dollars typically spent on maintenance and support fees.

3. Strategic Business Purposes (analytics) – Archiving also is done for strategic business purposes, such as for analytics incorporating historical information to gain insights that may inform products or services. When static information is spread out over multiple systems (especially aging, unsupported, or redundant business applications that are no longer in use), it can be difficult or nearly impossible to tap into these data sources in order to study their information. Moving legacy data to an archive makes it possible to easily access data for analytics.

For these reasons, organizations archive data as a key component of their information lifecycle management strategies. Now, with the introduction of GDPR, archiving static information from both production and legacy systems becomes even more important.

Specifically:

1. Archiving makes data retrieval easier – When static information is spread out over a number of production and legacy systems, it is harder to access and fully identify all of an individual’s personal data. The task takes longer and it is difficult to demonstrate that all personal records are accounted for. On the other hand, when static information from all active production and legacy systems has been moved to a consolidated repository with built-in search and retrieval tools, one can more quickly and confidently fulfill requests for personal information.

2. Archiving provides tighter control over static information – Effective data archiving supports tighter, more reliable control over information that is seldom used. By moving static information from multiple sources to a single archive, organizations demonstrate an active method of controlling security, access, and other factors that pertain to personal data that may not be in place with, or even supported by, either production or legacy systems.

3. Archiving reduces risk – By consolidating static information to a single archive, providing tighter control over it, and making retrieval easier, organizations reduce the risk of GDPR violations through a better ability to comply with GDPR requirements. Those that do not archive static information, and instead leave it scattered across the organization on multiple systems, take the risk of failing to meet GDPR specifications or even losing the information altogether should a legacy application fail.

III. InfoArchive and How It Supports GDPR Compliance

As organizations evaluate the measures that they must put in place to meet GDPR requirements, data archiving should be high on the list. Among archiving platforms, OpenText™ InfoArchive stores structured and unstructured information from multiple production and legacy applications, removing information silos and streamlining access control and compliance management. InfoArchive not only provides mechanisms to safeguard the integrity of content that is no longer being updated or changed, but also ensures secure, ongoing access to it by defined users throughout the entire retention period. Moreover, InfoArchive provides these controls at the lowest possible cost.

InfoArchive is the only archiving platform based on open standards that archives structured and unstructured content at scale across an enterprise.


InfoArchive represents archived information as XML objects, making unified access, query, and reporting fast and easy. Tens of billions of objects can be archived, reducing cost of ownership. The benefits of archiving with InfoArchive are cumulative, so cost savings increase as more information is archived.
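To illustrate why a common XML representation makes unified query straightforward, the sketch below searches records that originated in two different source systems with a single function. The element names, the `source` attribute, and the sample data are invented for this example and do not reflect InfoArchive's actual schema or query interface; the code uses only Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical archived records; element and attribute names are invented.
ARCHIVE_XML = """
<records>
  <record source="crm">
    <customerId>C-1001</customerId>
    <name>Jane Example</name>
    <country>DE</country>
  </record>
  <record source="billing">
    <customerId>C-1001</customerId>
    <invoiceTotal>129.50</invoiceTotal>
  </record>
  <record source="crm">
    <customerId>C-2002</customerId>
    <name>John Sample</name>
    <country>FR</country>
  </record>
</records>
"""


def find_by_customer(root, customer_id):
    """Return every record, from any source system, for one customer."""
    return [
        rec for rec in root.findall("record")
        if rec.findtext("customerId") == customer_id
    ]


if __name__ == "__main__":
    root = ET.fromstring(ARCHIVE_XML)
    for rec in find_by_customer(root, "C-1001"):
        print(rec.get("source"), {child.tag: child.text for child in rec})
```

Because both source systems share one representation, the same lookup returns the CRM record and the billing record without any system-specific code.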

InfoArchive helps meet compliance requirements such as retention, data encryption, electronic signature, and time stamping. It also supports PCI-DSS (Payment Card Industry Data Security Standard) and complies with the Open Archival Information System (OAIS) standard.

InfoArchive provides unified access to archived information so that authorized users (auditors, partners, employees, customers) can quickly find the information they need. Users can search for data across multiple datasets concurrently. Average search times can be as short as one or two seconds for enterprises that archive millions of documents and data records per day.

Unlike any other archiving solution, InfoArchive’s unified approach to data archiving satisfies compliance regulations, enables organizations to reduce primary storage usage, and facilitates the decommissioning of legacy applications through a single, scalable solution that reduces costs in the process.

InfoArchive and GDPR Compliance

Data archiving with InfoArchive supports GDPR compliance in the following ways:

A. Data Protection (or Privacy) by Design and Default

GDPR Requirement:

Organizations must be able to demonstrate that they embrace and embody the core principles concerning privacy and personal data protection in their policies, processes, and behaviors.

They must be able to show that “every technical or business process that handles personal data has been carefully and conscientiously designed to use as little data as possible, for the shortest possible period of time, while exposing it to the fewest number of people, and deleting the data as quickly as possible when the processing is completed.”

InfoArchive Capability:

InfoArchive provides built-in data protection, encryption, and access controls. Data is encrypted during ingestion into InfoArchive and decrypted when queried and rendered in the user interface.

Through the use of permissions and groups, data can be restricted to identified business units and users.

InfoArchive uses masking to hide personal data from users who do not require it.

Where possible, personal data can be stored separately from other data (that is, pseudonymized). If necessary, the personal data can be linked back to the other data.
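The combination of group-based restriction and masking described for data protection by design can be pictured with the following sketch. It is a generic illustration, not InfoArchive's implementation: the field names, the group name, and the masking rule (keep the first character, replace the rest) are all assumptions.

```python
PERSONAL_FIELDS = {"name", "email"}            # hypothetical field names
GROUPS_WITH_FULL_ACCESS = {"privacy-office"}   # hypothetical group name


def mask_value(value: str) -> str:
    """Keep the first character and replace the rest with asterisks."""
    return value[:1] + "*" * (len(value) - 1) if value else value


def render_record(record: dict, user_groups: set) -> dict:
    """Return a copy of the record, masking personal fields unless permitted."""
    if user_groups & GROUPS_WITH_FULL_ACCESS:
        return dict(record)
    return {
        key: mask_value(val) if key in PERSONAL_FIELDS else val
        for key, val in record.items()
    }


if __name__ == "__main__":
    record = {"name": "Jane", "email": "j@x.org", "order": "42"}
    print(render_record(record, {"sales"}))
    print(render_record(record, {"privacy-office"}))
```

Masking at render time leaves the stored record intact, so users with a legitimate need can still see the full values.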



B. Consent

GDPR Requirement:

Consent is one of six legal grounds that can be used to ensure lawfulness of data processing. Consent must be requested for each process in which personal data is used, and a record of consent must be stored in an auditable manner.

InfoArchive Capability:

InfoArchive stores acknowledgement of consent at a single location.

By storing a record of consent in relation to data requested, organizations can prove their compliance when audited. When consent is withdrawn, the related data can easily be identified.

Note: InfoArchive also helps provide proof or records of any of the six legal grounds on which data processing can occur. In addition to consent, these are contract, legal obligation, protection of the vital interests of the data subject, public interest, and legitimate interests. Evidence of the lawfulness of processing, along with the documents that establish it (or links to them), can be stored in InfoArchive.

C. Data Portability

GDPR Requirement:

Individuals can request that their data be transferred to another firm (in an “easily machine readable format”).

InfoArchive Capability:

InfoArchive is built upon industry-standard XML, which supports extracting, transforming, and transmitting data in any form, for any application. This supports GDPR requirements for data portability by making it easy to respond to data portability requests in a structured, commonly used, and machine-readable format.

D. Right to Erasure (“Right to be Forgotten”)

GDPR Requirement:

Individuals “should have control of their own personal data.” This means that an individual can request erasure of their data, in which case every piece of that individual’s personal data must be completely erased from all systems.

InfoArchive Capability:

During the requirements definition phase of a data archiving project, personal information should be identified so that it can be separated from business logic. This provides the ability to build a specific UI and query for personal data.

When personal data has been identified via a query, a custom purging job can be executed to remove only the personal data, retaining the business logic for insights.
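The approach described for the right to erasure, identifying a data subject's personal data with a query and then purging only that data, might look like the following sketch. The record layout (a `subject_id`, a `personal` section, and a `business` section for the non-personal fields) is an assumption made for illustration and is not InfoArchive's data model or purge job interface.

```python
from copy import deepcopy


def erase_subject(records, subject_id):
    """Return (updated_records, erased_count) for one data subject.

    Personal fields and the subject link are removed; business fields stay.
    """
    updated = deepcopy(records)
    erased = 0
    for record in updated:
        if record["subject_id"] == subject_id:
            record["personal"] = None      # purge the personal data
            record["subject_id"] = None    # drop the link to the individual
            erased += 1
    return updated, erased


if __name__ == "__main__":
    # Hypothetical archived records with personal data kept apart.
    archive = [
        {"subject_id": "S-1", "personal": {"name": "Jane Example"},
         "business": {"product": "P-9", "amount": 120}},
        {"subject_id": "S-2", "personal": {"name": "John Sample"},
         "business": {"product": "P-3", "amount": 75}},
    ]
    archive, count = erase_subject(archive, "S-1")
    print(count, archive[0])
```

Separating personal fields from the rest at design time is what makes this selective purge possible; if the two were interleaved, the whole record would have to be removed.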



E. Pseudonymization

GDPR Requirement:

Pseudonymization is the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.

InfoArchive Capability:

Pseudonymized data can be ingested into InfoArchive through the data preparation and ETL (extract, transform, load) process. InfoArchive can store pseudonymized records separately from their identifying fields. InfoArchive security allows data protection to be customized according to whether the full data set or only the pseudonymized information is required.

F. Data Breach Notifications

GDPR Requirement:

In the case of a personal data breach, data controllers shall without undue delay and, where feasible, not later than 72 hours after having become aware of it, notify the personal data breach to the supervisory authority.

InfoArchive Capability:

Changes to data and system actions can be monitored at the InfoArchive system level in the audit logs. Audit logs are provided via the UI (User Interface) and via the REST API. Database and network monitoring tools can be used as part of an overarching solution.

G. Storage Limitation / Data Retention Periods

GDPR Requirement:

GDPR permits the storage of data for longer periods than necessary where the data is being processed for archiving purposes in the public interest and/or scientific purposes.

InfoArchive Capability:

Managing retention within a single InfoArchive repository simplifies retention management that would otherwise be spread across multiple systems.

Duration-based retention policies can be created using the InfoArchive UI. When the specified retention period is reached (e.g., seven years), a purge job can be activated to remove the data from the archive permanently.
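The logic of a duration-based retention policy can be sketched as follows. This is a generic illustration rather than InfoArchive's policy engine: the function names, the `(record_id, archived_on)` tuples, and the handling of 29 February are assumptions; the seven-year period matches the example given above.

```python
from datetime import date


def purge_date(archived_on: date, retention_years: int) -> date:
    """Return the date on which the retention period ends."""
    target_year = archived_on.year + retention_years
    try:
        return archived_on.replace(year=target_year)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return archived_on.replace(year=target_year, day=28)


def due_for_purge(records, today, retention_years=7):
    """Return the ids of records whose retention period has ended."""
    return [
        record_id
        for record_id, archived_on in records
        if purge_date(archived_on, retention_years) <= today
    ]


if __name__ == "__main__":
    # Hypothetical records: (record id, date archived).
    records = [("r1", date(2010, 1, 1)), ("r2", date(2015, 6, 1))]
    print(due_for_purge(records, today=date(2018, 5, 25)))
```

A purge job would then act only on the returned ids, which keeps the eligibility check separate from the destructive step.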


IV. Getting Started

No matter the size of your organization or the industry you’re in, chances are you have legacy systems in your IT portfolio and static data residing in primary storage or on production systems, both of which potentially contain personal data subject to GDPR compliance.

So how do you get started using InfoArchive to manage your GDPR compliance?

1. Take an inventory of legacy applications. Identify outdated, unsupported systems that are candidates for decommissioning and determine if they contain personal data subject to GDPR requirements. In addition to proactively addressing personal data on legacy systems, you can estimate your return on investment for eliminating legacy systems. Flatirons can guide you in the process and build a cost/return estimate to start you off on the right foot.

2. Identify static personal data on primary storage and production systems. Determine if you have static data taking up valuable space on Tier 1 storage or production systems. If so, determine whether any of that information contains personal data of European residents and is subject to GDPR. By understanding the type of information you’re managing on primary storage and production systems, you can begin to identify personal data that can be moved to an archive that supports GDPR requirements. At the same time, you can start identifying active archiving triggers that can move static data to an archive when it reaches specific stages.

3. Review your information governance strategy. It’s one thing to incorporate data archiving as a foundational component of your information governance strategy to ensure GDPR compliance. It’s a much more significant step to connect data archiving to your organization’s overall approach to information lifecycle management. Consider working with information lifecycle consultants like Flatirons and OpenText to review your information governance strategy across your entire organization. We can facilitate a process with key stakeholders for reviewing your existing information management processes, gaps for meeting GDPR compliance, and the most cost-effective strategy for ensuring GDPR compliance before May 25, 2018.

Flatirons Digital Innovations, Inc., or Flatirons (www.fdiinc.com), builds a more

educated and informed society by enabling transparent and accessible digital

information. It does this by facilitating timely, accurate, and informed conversations

between organizations and their customers that help solve complex content and

data-driven challenges at the heart of business operations. Flatirons specializes in

enterprise content services through technology assessments, solution blueprints,

and implementation, integration and support for projects ranging from Application

Decommissioning and Data Archiving, to Document Capture, Revenue Lifecycle

Management and more. Flatirons is based in Boulder, Colorado.

Flatirons Digital Innovations, Inc. 3005 Center Green Drive, Suite 225

Boulder, CO 80301

1-888-310-3440


www.fdiinc.com

