1. Product Documentation / User · Control File to assign parameters for AWS S3, Hadoop, Security,...



Post on 20-May-2020


Page 1: 1. Product Documentation / User · Control File to assign parameters for AWS S3, Hadoop, Security, Streaming and more .... Amazon AWS S3 Security Credentials and Key Codes Setup all

1. Product Documentation / User
  1.1 Overview
    1.1.1 Installation / Implementation Video Plus Demo Video Plus URL / Key Code / Login
  1.2 Administrator Features
    1.2.1 Administrator link to create and or update User Authority, Password, Email and more
    1.2.2 Control File - A Core to making BigDataRevealed the marvel it is.
      1.2.2.1 Amazon AWS S3 Security Credentials and Key Codes
      1.2.2.2 Control File Additional Hadoop Parameters
      1.2.2.3 Control File and Setup first time in or when changes are warranted
      1.2.2.4 Control File Cloudera Impala / Navigator
      1.2.2.5 Control File Databases
      1.2.2.6 Control File Email Settings
      1.2.2.7 Control File Kerberos Security Configuration Settings
      1.2.2.8 Control File Twitter Connection
    1.2.3 Manage Users and Maintain Users Credentials and Authorities
    1.2.4 Modify RegEx / RegEx Group
      1.2.4.1 Create, Modify, Delete Regular Expression
      1.2.4.2 Create, Review Maintain, Delete Regular Expression Groups
      1.2.4.3 Create and Modify Regular Expression and RegEx Grouping for Pattern Discovery
    1.2.5 Notifications
  1.3 Admin User Section
    1.3.1 Admin User Profile Maintenance
    1.3.2 Create Watches for value ranges for maintenance, AML or for Patterns
    1.3.3 Operational and User Lineage Logs
  1.4 File Content Prep and Selection
    1.4.1 File Content Viewer and Delimiter Validator
    1.4.2 File System Tree for AWS S3 and Hadoop HDFS
  1.5 Jobs
    1.5.1 Display Jobs
      1.5.1.1 Click on Social Security in Quick Classification showing 5 Discoveries found
      1.5.1.2 Executive Summary and Quick Column Classification Graphs and drill
        1.5.1.2.1 Click on Social Security in Quick Classification showing 5 Discoveries found and Drilling into Results
        1.5.1.2.2 Executive Summary - Interactive Graphs and drill for Quick Columnar Classification
    1.5.2 Run Jobs
      1.5.2.1 Quick Classification
        1.5.2.1.1 Quick Classification Running
        1.5.2.1.2 Quick Classification Run Results
      1.5.2.2 Run Pattern Discovery
        1.5.2.2.1 Pattern Job Run and Display Screen
        1.5.2.2.2 Run The Pattern Discovery for User Selected Patterns for Compliance Checking
  1.6 Running of Basic Core Jobs
    1.6.1 Run the Data Discovery Job
    1.6.2 Validating Delimiters before final execution of the Data Discovery Run
    1.6.3 Verify Job is running and view when completed
  1.7 Encrypting Column 11 Social Security vimeo.com/251375791
    1.7.1 Encryption of Compliance Violations - Validation
  1.8 Decrypting one or more columns of data if credentials allowed
    1.8.1 Validation that Column 11 has been Decrypted
  1.9 Indirect Identifiers (the Regulations that will fail the vast majority)
    1.9.1 Direct Identifiers Stage One of 2.
      1.9.1.1 Direct Identifiers Results phase One of Two
      1.9.1.2 Forgotten Identity Screen used to Decrypt Right if Erasure and Change of Consent
      1.9.1.3 Forgotten Identity Screen used to Decrypt Right if Erasure and Change of Consent Cont2
    1.9.2 Indirect Identifiers and the Citizens Right of Erasure AKA Right to be Forgotten.
    1.9.3 Indirect Identifiers completed job Open and Review
    1.9.4 Indirect Identifiers completed job Open and Review Cont.
    1.9.5 Indirect Identifiers completed job Open and Review Cont. 3
    1.9.6 Indirect Identifiers completed job Open and Review Cont. 4
    1.9.7 Indirect Identifiers completed job Open and Review Cont. 5
    1.9.8 Indirect Identifiers completed job Open and Review Cont. 6
    1.9.9 To Discover the Indirect identifiers
  1.10 Live Streaming Remediation / Encryption Results Viewer
    1.10.1 Live Streaming Data Compliance / Remediation / Encryption on the Fly
    1.10.2 Producer File Creator to connect and process data from live streams
    1.10.3 Run Parameters for a Producer to access data and have that data Discovered by BigDataRevealed


Product Documentation / User

BigDataRevealed for EU GDPR and most any Regulatory Data Compliance

BigDataRevealed helps you protect your customers' private data to meet the EU GDPR and most any other regulatory compliance requirement.

 

Core team

Steven Meister, Founder

Tyler Miller, Vice President

Shashank, Senior Developer


Overview

BigDataRevealed was built with the EU GDPR and all governmental data regulatory agencies in mind. It offers a means to collaboratively and extensively Discover (find, locate) Personally Identifiable Information in almost any data format or type, Remediate (quarantine and/or encrypt) the sensitive data protected by regulatory agencies, and encrypt live streams of business and social data on the fly.

BigDataRevealed also Discovers and Remediates the complexity of Indirect Identifiers, supports the Citizen's Right of Erasure (Right to be Forgotten), and allows Citizens to verify, accept and/or deny their consent to all or parts of their personally protected information.

As its name suggests, BigDataRevealed is built with Big Data in mind, though it delivers just the same for SMB/SME businesses, in-house and in the cloud.

BigDataRevealed keeps its overall cost to the Customer down by using the Apache Hadoop open source platform as well as the free to low-cost Amazon AWS S3 environments.

BigDataRevealed is written in Java on Spark, which lets it scale out while remaining cost-effective and secure.

BigDataRevealed plans, in the first quarter, to also source data directly from most RDBMSs, though it remains a believer in a central repository for proper and accurate results.

BigDataRevealed prides itself on the ability to install and implement in minutes and to start delivering Compliance Discovery and Remediation on day one for millions upon millions of rows of data, jump-starting your compliance projects well ahead of in-house, consultative or other third-party applications.

Installation / Implementation Video Plus Demo Video Plus URL / Key Code / Login

Here you can find the video of the installation / implementation process.

Here you can find a basic demo video of the BigDataRevealed application.

You know it is your first time in if you are prompted to enter a Key Code. If you do not have the key code, please email privacyinfo@bigdatarevealed.com or call 847-440-4439.

If logging in for the first time, please use Username hadoop and password revealed, or ask your administrator for your assigned credentials. It is strongly recommended that you change your credentials in the Admin section at top right once logged in.

                                                 


Administrator Features

Administrator link to create and or update User Authority, Password, Email and more

A senior administrator can add or modify authorities; a User can modify their own email and password.

Administrator screen for User rules and permissions. Regular users can update their email and password only.


Control File - A Core to making BigDataRevealed the marvel it is.

The Control File assigns parameters for AWS S3, Hadoop, security, streaming and more.

Amazon AWS S3 Security Credentials and Key Codes

Set up all necessary AWS S3 security credentials to communicate with the server and the buckets of data assets.


Control File Additional Hadoop Parameters

Enter any required additional Hadoop Information

Control File and Setup first time in or when changes are warranted

Server Settings are for configuring Hadoop, Spark, Kafka, Drill and other Hadoop parameters


Control File Cloudera Impala / Navigator

If you are using Cloudera Hadoop and either Impala or Navigator, fill in the appropriate parameters.

Control File Databases

Set-Up Parameters to connect to certain RDBMS Files


Control File Email Settings

Used to contact users when Watches and Warnings occur, such as AML (anti-money laundering) alerts or parts beginning to fail.

Control File Kerberos Security Configuration Settings

Set up the Kerberos security settings if the server that BigDataRevealed runs on has Kerberos implemented.


Control File Twitter Connection

Twitter connection strings for reading and processing streams, as well as for contacting people when alerts and issues are encountered.

Manage Users and Maintain Users Credentials and Authorities


Modify RegEx / RegEx Group

Create, Modify, Delete Regular Expression

These are sharable, collaborative pattern detectors 

Video showing how to create RegEx patterns and a Discovery group, and how to run them in a Pattern Discovery run: vimeo.com/251278398


Create, Review Maintain, Delete Regular Expression Groups

Allows two or more Patterns to be searched for and Discovered at once.
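The idea of a RegEx Group can be illustrated with a minimal Python sketch. The pattern names, the expressions and the group name below are assumptions made up for illustration; they are not the definitions shipped with BigDataRevealed, and the product itself manages patterns through the screens described here rather than through code.

```python
import re

# Hypothetical pattern definitions, for illustration only.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "social_security_number": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "ip_address": re.compile(
        r"(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)"
    ),
}

# A group is a named list of pattern names that are searched together.
GROUPS = {"pii_basic": ["email", "social_security_number", "ip_address"]}


def discover(values, group_name):
    """Return {pattern_name: [matching values]} for every pattern in the group."""
    names = GROUPS[group_name]
    hits = {name: [] for name in names}
    for raw in values:
        value = raw.strip()
        for name in names:
            # fullmatch requires the whole value to fit the pattern
            if PATTERNS[name].fullmatch(value):
                hits[name].append(value)
    return hits
```

Running the whole group in one pass over the data is what distinguishes a group from running each expression separately.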


Create and Modify Regular Expression and RegEx Grouping for Pattern Discovery

Video: vimeo.com/251278398

Notifications

Notifications sent to Users when watches occur


Admin User Section


Admin User Profile Maintenance

Add or Change content in Your User Profile


Create Watches for value ranges for maintenance, AML or for Patterns

For a file, specify a column to monitor at a set interval to detect parts going defective, AML activity, or patterns such as email addresses.
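As a rough illustration of a value-range Watch, the following Python sketch flags rows whose watched column falls outside an allowed range. The `RangeWatch` class, its fields and the `check_watch` function are hypothetical names invented for this example; how BigDataRevealed stores and schedules Watches is not described here.

```python
from dataclasses import dataclass


@dataclass
class RangeWatch:
    """Hypothetical watch definition: flag values outside [low, high]."""
    column: int   # zero-based index of the column to monitor
    low: float
    high: float


def check_watch(rows, watch):
    """Return (row_number, value) pairs whose watched value is out of range."""
    alerts = []
    for row_number, row in enumerate(rows, start=1):
        try:
            value = float(row[watch.column])
        except (IndexError, ValueError):
            continue  # skip rows with a missing or non-numeric value
        if not (watch.low <= value <= watch.high):
            alerts.append((row_number, value))
    return alerts
```

A scheduler would call `check_watch` at the chosen interval and pass any alerts on to the notification step.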

Operational and User Lineage Logs

Logs of operations run by users and System Logs


File Content Prep and Selection

Select folders or files to run BigDataRevealed for Discovery and Remediation

File Content Viewer and Delimiter Validator

Video: https://vimeo.com/251370329
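One simple way to think about delimiter validation is to check that every line splits into the same number of columns. The Python sketch below shows that idea only; it is an assumption about what a validator might check, not a description of the BigDataRevealed screen, and `validate_delimiter` is a hypothetical name.

```python
import csv
import io
from collections import Counter


def validate_delimiter(text, delimiter):
    """Return (is_consistent, {column_count: number_of_lines}) for a delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Count how many non-empty lines produce each column count.
    counts = Counter(len(row) for row in reader if row)
    # Consistent: exactly one column count is seen, and it is more than one column.
    is_consistent = len(counts) == 1 and next(iter(counts)) > 1
    return is_consistent, dict(counts)
```

A wrong delimiter usually shows up either as a single column per line or as several different column counts in the histogram.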


File System Tree for AWS S3 and Hadoop HDFS

https://vimeo.com/251370329 File System Tree for AWS S3 and Hadoop HDFS to be processed by BigDataRevealed

This is where the User selects the folders or files to run for Personally Identifiable Information and Business Columnar Classifications. The files can also be viewed in the File Content Viewer. When a specific folder or file is selected, its prior run results are shown on the Executive Summary Graphs Technical Dashboard.

Jobs

Where the User can Run Jobs or Display Jobs

Display Jobs

Shows jobs that have been submitted and are running, as well as jobs already completed.

Click on Social Security in Quick Classification showing 5 Discoveries found

https://vimeo.com/251371464 Here we see which columns Social Security numbers were found in and what percentage of the data matched. On the next screen we will drill deeper into the data assets.

Executive Summary and Quick Column Classification Graphs and drill

Allows for the simple drilling into the Pattern Discovery results

Click on Social Security in Quick Classification showing 5 Discoveries found and Drilling into Results

We can see this is not a false positive and may need to be Encrypted.

Executive Summary - Interactive Graphs and drill for Quick Columnar Classification

https://vimeo.com/251372458

Run Jobs

Below is a list of Jobs that can be run from this menu against the already selected folder or file.

Quick Classification

Quick Classification Running

Run Quick Classification to discover one or more patterns in a column and the percentage of that column's data that matches each pattern.
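The percentage reported per column can be pictured with a small Python sketch. The patterns and the `quick_classify` function are assumptions for illustration, and the sketch assumes rectangular input (every row has the same number of columns); the actual job runs on Spark and its internals are not documented here.

```python
import re

# Hypothetical patterns, for illustration only.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "social_security_number": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def quick_classify(rows, patterns):
    """Return {column_index: {pattern_name: percent_of_values_matching}}."""
    columns = list(zip(*rows))  # transpose rows into columns
    report = {}
    for index, values in enumerate(columns):
        matches = {}
        for name, pattern in patterns.items():
            hit_count = sum(1 for v in values if pattern.fullmatch(v.strip()))
            if hit_count:
                matches[name] = round(100.0 * hit_count / len(values), 1)
        if matches:
            report[index] = matches  # only columns with at least one hit
    return report
```

A column in which most values match a pattern is a strong candidate for that classification, while a low percentage suggests the hits should be checked for false positives.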

Quick Classification Run Results

https://vimeo.com/251373215 Shows results for some of the columns in which discovery found matches and what percentage of the data those results represent

Run Pattern Discovery

Pattern Job Run and Display Screen

Here you can see that the pattern job is running and which patterns are being discovered. By clicking on the Discovery Patterns, we can see that the discovery is being run for email, Social Security Number and IP Address Valid.

Run The Pattern Discovery for User Selected Patterns for Compliance Checking

https://vimeo.com/251374195 The User can select groups of patterns or individual patterns for Discovery when auditing for PII compliance

In this example we are running Pattern Discovery for email, IP Address and Social Security Number. After all the runs we will drill down and validate that the results are not false positives; if they are not, the User can decide to encrypt those columns of data.

patterndiscoveryjob.mp4

Running of Basic Core Jobs

The running of the BigDataRevealed Core jobs such as Data Discovery, Columnar Classification and Pattern Discovery

Run the Data Discovery Job

https://vimeo.com/251375024 From the selected folder and/or file, this job creates all the unique data assets and the unique patterns found for the patterns the User selected to run

Validating Delimiters before final execution of the Data Discovery Run

Allows the User to validate that the proper column delimiter is selected for the file; if not, the User may change the delimiter before running the job
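
As a rough illustration of what validating a delimiter means, the sketch below checks whether a chosen delimiter splits every row of a sample into the same number of columns. The function names and the candidate delimiter list are assumptions made for this example, the split is naive (it ignores quoted fields), and it is not how BigDataRevealed itself performs the check.

```python
def delimiter_report(lines, candidates=(",", "\t", "|", ";")):
    """For each candidate delimiter, return (column_count, is_consistent)."""
    rows = [ln for ln in lines if ln.strip()]
    report = {}
    for delim in candidates:
        counts = {len(ln.split(delim)) for ln in rows}
        # Consistent: every row splits into the same number (>1) of columns.
        consistent = len(counts) == 1 and next(iter(counts)) > 1
        report[delim] = (max(counts) if counts else 0, consistent)
    return report


def validate_delimiter(lines, selected):
    """Return True if the selected delimiter splits the sample consistently."""
    return delimiter_report(lines, candidates=(selected,))[selected][1]


sample = ["id|name|ssn", "1|Ann|123-45-6789", "2|Bob|987-65-4321"]
print(validate_delimiter(sample, ","))  # comma does not split these rows
print(validate_delimiter(sample, "|"))  # pipe yields three columns per row
```

If the selected delimiter fails the check, the report for the other candidates suggests which delimiter to switch to before running the job.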

Verify Job is running and view when completed

This screen allows the User to see the currently running jobs and to view and drill into the jobs that have completed

Encrypting Column 11 Social Security

https://vimeo.com/251375791

Encryption of Compliance Violations - Validation

https://vimeo.com/251375791 As we can see from the file content screen, column 11 (SSN) is encrypted

Decrypting one or more columns of data if credentials allow

Validation that Column 11 has been Decrypted

https://vimeo.com/251375791

Indirect Identifiers (the Regulations that the vast majority will fail)

Discovery and Remediation of Indirect Identifiers is probably the most difficult of the GDPR requirements to master. Perhaps as many as 90% of companies will falter when attempting to discover and protect Indirect Identifiers that are spread across multiple files.

Indirect Identifiers are fields that by themselves do not uniquely identify an individual but, when grouped together, do identify an individual or a very small group of individuals.

A good example of Indirect Identifiers would be Date of Birth, Postal Code and Gender. Only a handful of individuals will share the same values in these three fields, so the combination can constitute a GDPR violation.
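
The Date of Birth, Postal Code and Gender example can be made concrete with a short sketch that counts how many records share each combination of values and reports the combinations held by very few people. The field names, the sample records and the threshold k are assumptions chosen for illustration; the documentation does not state what group size BigDataRevealed treats as a violation.

```python
from collections import Counter


def small_groups(records, fields, k=2):
    """Return value combinations of `fields` shared by fewer than k records."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical sample data.
records = [
    {"dob": "1980-01-01", "postal_code": "10001", "gender": "F"},
    {"dob": "1980-01-01", "postal_code": "10001", "gender": "F"},
    {"dob": "1975-06-30", "postal_code": "94105", "gender": "M"},
]

# k is an assumed threshold: groups smaller than k are treated as identifying.
risky = small_groups(records, ["dob", "postal_code", "gender"], k=2)
print(risky)
```

Here the third record is the only one with its combination of values, so that combination would point to a single individual.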

Discovery of Cross File Indirect Identifiers. BigDataRevealed’s automated Direct Key finder, working across your enterprise, allows a User to logically ‘Join’ multiple files by using another field found in all the files as the key to execute the logical Join. Rows from all files are then processed to determine if the joined rows contain Indirect Identifiers that constitute a GDPR violation.
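
A minimal sketch of the logical Join described above, assuming each file has already been read into a list of row dictionaries and that email is the Direct Identifier present in every file. The file names, field names and helper functions are hypothetical and stand in for whatever BigDataRevealed does internally.

```python
def join_on_key(files, key):
    """Merge rows from several files (lists of dicts) sharing the same key value."""
    joined = {}
    for rows in files:
        for row in rows:
            if row.get(key):
                # Collect every non-key field seen for this key value.
                joined.setdefault(row[key], {}).update(
                    {k: v for k, v in row.items() if k != key}
                )
    return joined


def keys_with_all_fields(joined, fields):
    """Return key values whose joined row contains every listed field."""
    return [k for k, rec in joined.items() if all(f in rec for f in fields)]


# Hypothetical files, each holding only one Indirect Identifier.
crm = [{"email": "ann@example.com", "gender": "F"}]
billing = [{"email": "ann@example.com", "postal_code": "10001"}]
hr = [
    {"email": "ann@example.com", "dob": "1980-01-01"},
    {"email": "bob@example.com", "dob": "1975-06-30"},
]

joined = join_on_key([crm, billing, hr], "email")
exposed = keys_with_all_fields(joined, ["dob", "postal_code", "gender"])
print(exposed)
```

No single file above holds all three Indirect Identifiers, yet the joined row for one email address does, which is the cross-file case the Join is meant to reveal.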

Direct Identifiers Stage One of Two

https://vimeo.com/251400656 Important Video to Watch on the Indirect Identifiers and Consent Regulatory Requirements.

Step one allows the User to select one or more Direct Identifiers (keys) such as email, Passport Number, National Insurance ID, Social Security Number, Phone Number and so on.

This process returns a list of files that contain one or more of these key patterns, along with the percentage of each file that matches them. This ensures that phase two can join the tables by like keys and then discover which unique domain values (Indirect Identifiers) such as gender, age, postal code, illness and so on are present across two or more of the files the User previously selected.

If a combination of Indirect Identifiers does exist across files, and these combinations of values would allow a hacker or researcher to identify, within a certain range of certainty, a person or small group of people, this would constitute a GDPR compliance violation.

Important to note: all permutations (joins of all Indirect Identifiers) must be attempted and matched against ALL other values in all the columns and rows of all the selected files.
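
To give a feel for why every grouping of Indirect Identifiers has to be tried, the sketch below enumerates each combination of two or more fields over already-joined rows and lists the value combinations shared by fewer than k rows. It uses unordered combinations from itertools rather than ordered permutations, since field order does not change which rows match; the field names, sample rows and threshold k are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations


def risky_combinations(rows, fields, k=2):
    """Map each field combination to its value tuples shared by fewer than k rows."""
    findings = {}
    for size in range(2, len(fields) + 1):
        for combo in combinations(fields, size):
            counts = Counter(tuple(r[f] for f in combo) for r in rows)
            rare = [vals for vals, n in counts.items() if n < k]
            if rare:
                findings[combo] = rare
    return findings


# Hypothetical joined rows.
rows = [
    {"gender": "F", "age": 40, "postal_code": "10001"},
    {"gender": "F", "age": 40, "postal_code": "10002"},
    {"gender": "M", "age": 40, "postal_code": "10001"},
    {"gender": "M", "age": 40, "postal_code": "10001"},
]

findings = risky_combinations(rows, ["gender", "age", "postal_code"], k=2)
for combo, rare in findings.items():
    print(combo, rare)
```

Gender and age together single out nobody in this sample, but adding postal code does, so checking only one grouping would miss the exposure.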

Below are the stage one results: the files and their columns that have Direct Identifiers, which will allow the proper Discovery and joins across all files to identify Indirect Identifier violations. Here we search a folder of files that contain emails, Social Security Numbers and IP Addresses.

Direct Identifiers Results phase One of Two

Here we can see the list of results derived by the BigDataRevealed Pattern Detection: the Direct Key Identifiers found and what percentage of each file they represent. The User now knows which files can be used to detect potential Indirect Identifier violations in their customer Data Assets.

Forgotten Identity Screen used to Decrypt for Right of Erasure and Change of Consent

This screen shows columns that have been encrypted for the purpose of the Citizens Right to be Forgotten or the addition or removal of consent for Private Information

Forgotten Identity Screen used to Decrypt for Right of Erasure and Change of Consent Cont. 2

This screen shows columns that have been encrypted for the purpose of the Citizens Right to be Forgotten or the addition or removal of consent for Private Information

Indirect Identifiers and the Citizens Right of Erasure AKA Right to be Forgotten.

They share commonalities in their Discovery and Remediation / Encryption.

Indirect Identifiers completed job Open and Review

Open the Job and review the results. Then decide which columns of values need to be encrypted to avoid an Indirect Identifier violation.

This same process and results are used to meet the Citizens Right of Erasure AKA Right to be Forgotten Regulation.

Indirect Identifiers completed job Open and Review Cont.

Shows all the Unique Identifier results, which files they were found in, and whether each is an intra-file or cross-file result.

The next phase, selecting a unique value, shows the file, column and row where the Indirect Identifier was found. The value can then be encrypted to comply with the Indirect Identifier requirements, or it can be selected and encrypted for the Citizens Right of Erasure AKA Right to be Forgotten.

Indirect Identifiers completed job Open and Review Cont. 3

This phase, selecting a unique value, shows the file, column and row where the Indirect Identifier was found. The value can then be encrypted to comply with the Indirect Identifier requirements, or it can be selected and encrypted for the Citizens Right of Erasure AKA Right to be Forgotten.

See all rows for the unique identifier, with the file, column, row and pattern type found for each

Indirect Identifiers completed job Open and Review Cont. 4

Row 8 col 11 is being encrypted

Indirect Identifiers completed job Open and Review Cont. 5

This shows that row 8 col 11 is eligible for decryption

Indirect Identifiers completed job Open and Review Cont. 6

Relative row 8 col 11 (Social Security) is now encrypted, as remediated by the User

To Discover the Indirect identifiers

https://vimeo.com/251400656 Select the Direct Identifier key to search across the files and the Indirect Identifier unique values to find across your selected files, joinable by your selected Identifier Key

Live Streaming Remediation / Encryption Results Viewer

Display and review the results of the remediation / encryption process on the live streaming data

Live Streaming Data Compliance / Remediation / Encryption on the Fly

BigDataRevealed discovers Personal Data in live streaming data for regulatory compliance and will encrypt this data before the data in the stream is written to your file system.

Producer File Creator to connect and process data from live streams

Add the name of the stream, URL, credentials and more

Run Parameters for a Producer to access data and have that data Discovered by BigDataRevealed

This screen lets you pick a producer, set the duration of time for reading the streamed data, and select the patterns to search for and remediate with encryption before the data is written to a file.
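
The three run parameters on this screen (producer, read duration and selected patterns) can be pictured with the sketch below, which reads records from a stream for a limited time and protects matches before anything is written. Everything here is an assumption for illustration: the function names, the regular expressions, and in particular the keyed HMAC token, which is a non-reversible stand-in and not the reversible encryption that BigDataRevealed applies.

```python
import hashlib
import hmac
import io
import re
import time

# Hypothetical patterns; the product's real pattern definitions are not shown here.
PATTERNS = {
    "social_security": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
}


def protect(record, selected, secret):
    """Replace each match of the selected patterns with a keyed token."""
    def token(match):
        digest = hmac.new(secret, match.group(0).encode(), hashlib.sha256)
        return "ENC(" + digest.hexdigest()[:16] + ")"

    for name in selected:
        record = PATTERNS[name].sub(token, record)
    return record


def consume(stream, sink, selected, secret, duration_seconds):
    """Read records until the duration elapses; write only protected records."""
    deadline = time.monotonic() + duration_seconds
    written = 0
    for record in stream:
        if time.monotonic() >= deadline:
            break
        sink.write(protect(record, selected, secret) + "\n")
        written += 1
    return written


# Stand-ins for a live producer and the target file.
producer = iter(["ann,123-45-6789", "bob,none"])
sink = io.StringIO()
count = consume(producer, sink, ["social_security"], b"demo-key", duration_seconds=5)
print(sink.getvalue())
```

The point of the ordering is that the raw value never reaches the sink: remediation happens between reading from the producer and writing to the file.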