Data Lake-based Approaches to Regulatory-Driven Technology Challenges

How a Data Lake Approach Improves Accuracy and Cost Effectiveness in the Extract, Transform, and Load Process for Business and Regulatory Purposes

Upload: booz-allen-hamilton

Post on 18-Dec-2014

Category: Data & Analytics



DESCRIPTION

Booz Allen Hamilton has found that a data lake-based approach to CA3 requirements is scalable, extensible, and improves the range and sophistication of analyses that can be supported while providing higher levels of data control and security.

TRANSCRIPT

Page 1: Data Lake-based Approaches to Regulatory-Driven Technology Challenges

Data Lake-based Approaches to Regulatory-Driven Technology Challenges

How a Data Lake Approach Improves Accuracy and Cost Effectiveness in the Extract, Transform, and Load Process for Business and Regulatory Purposes

The concept of big data offers financial institutions an opportunity to build capabilities that both reduce costs and produce better insight. In the area of regulatory compliance, the work required to prepare the organization typically involves modifications to systems, processes, and data to allow Collection, Alignment, Aggregation, and Analysis (CA3) to occur. For example, new rules, such as Dodd-Frank, over-the-counter (OTC) collateral, and risk management requirements, rely on the same legal entity and customer data infrastructure that needs to be upgraded for Anti-Money Laundering/Bank Secrecy Act, Sanctions, and Foreign Account Tax Compliance Act (FATCA) compliance. Linking the data while limiting the modifications to the systems that underpin both the business and compliance requirements improves performance for customer-facing platforms and regulatory compliance systems alike.

The potential is real, but the volume, variety, and velocity of the data are growing so fast that they are outpacing the ability of current tools to take full advantage of it. Much of the problem lies in the need to extensively prepare the data before it can be analyzed. In parallel, the technologies and techniques underpinning big data have matured to the point where they can address the challenge. While early uses focused on deriving insights from very large pools of unstructured data, recent deployments have harnessed multiple tools, including advanced data management, pattern recognition, and adaptive analytics, to address large-scale, high-accuracy, low-latency CA3 of diverse, dispersed data.

Applying Robust Financial Intelligence and Analytics to Stay Ahead

The Extract, Transform, and Load Challenge

For the past 30 years, traditional approaches to sharing and transferring data have all involved some type of Extract, Transform, and Load (ETL) capability that extracts information from one format (database, silo, file, etc.) and transforms it into another data format. The process then loads the data into the target system for use in a set of predetermined analyses. While these approaches to handling data have served some organizations well in the past, they have some notable drawbacks, which become more significant as the volume, variety, and velocity of the data expand.

First and foremost, the process is resource intensive and requires investments in high-cost tools to access the data. For example, each time a new regulation is issued that calls for a new type of analytically derived report, banks must initiate a dedicated IT project, often focused on solving the data ingest issue. This portfolio of projects results in a very large number of data warehouses, each with its own ETL process. Using these diverse data warehouses together calls for the creation of customized Point-to-Point (PtP) solutions. These PtP solutions can certainly meet the short-term goal, but often fail to scale up to meet longer-term organizational goals. As banks move into the era of big data, this PtP approach becomes overly complex and difficult to manage.
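A rough way to see why PtP solutions become difficult to manage is to count interfaces. The sketch below assumes, purely for illustration, a worst case in which every pair of data warehouses needs its own custom interface, and compares that with one ingest path per source into a shared repository. The function names and warehouse counts are hypothetical and are not figures from any bank.

```python
def point_to_point_links(n_warehouses: int) -> int:
    """Custom interfaces needed if every pair of warehouses is linked (worst case)."""
    return n_warehouses * (n_warehouses - 1) // 2


def shared_repository_links(n_sources: int) -> int:
    """Ingest paths needed if every source feeds one shared repository."""
    return n_sources


# Hypothetical portfolio sizes, chosen only to show the growth rate.
for n in (5, 10, 20):
    print(n, point_to_point_links(n), shared_repository_links(n))
```

Under this assumption, doubling the number of warehouses roughly quadruples the number of PtP interfaces, while the number of ingest paths into a shared repository only doubles.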

The Data Lake-based Approach

In stark contrast to the challenges presented by a point-to-point ETL approach, Booz Allen Hamilton, a leading strategy and technology consulting firm, has found that a data lake-based approach to CA3 requirements is scalable, extensible, and improves the range and sophistication of analyses that can be supported while providing higher levels of data control and security.

A data lake-based approach takes advantage of the most recent developments in large-scale distributed computing hardware and software to create an innovative way to ingest, index, and analyze massive amounts of data in batch and real time that can scale to exabytes without compromising integrity, cost-effectiveness, or performance. The Data Lake Approach embeds business rules, often the result of policy and procedure documentation for regulatory compliance, in the cell-level data, allowing alignment, aggregation, and analysis to occur rapidly and with far less upfront work by IT departments. With the data lake, an organization’s repository of information, including structured and unstructured data, is consolidated in a single, large “table.” Every inquiry can use the entire body of information stored in the data lake, and it is all available at the same time.
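The idea of a single large table whose cells carry their own rules can be sketched in a few lines. The following is a minimal illustration, not a description of Booz Allen's implementation: the `Cell` type, the tag names, and the sample records are all hypothetical, and the only rule modeled is that a caller sees a cell when the caller's authorizations cover every tag on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row_id: str
    field: str
    value: object
    tags: frozenset  # hypothetical labels attached to this single cell


# One flat "table" holding structured and unstructured data side by side.
LAKE = [
    Cell("txn-1", "amount", 12500.0, frozenset({"finance"})),
    Cell("txn-1", "customer_name", "Acme Ltd", frozenset({"finance", "pii"})),
    Cell("txn-1", "sanctions_hit", False, frozenset({"compliance"})),
    Cell("doc-7", "text", "Wire instruction received by fax", frozenset({"compliance"})),
]


def view(cells, authorizations):
    """Build a view at query time from the cells the caller is allowed to see."""
    rows = {}
    for cell in cells:
        if cell.tags <= authorizations:  # every tag must be covered
            rows.setdefault(cell.row_id, {})[cell.field] = cell.value
    return rows


print(view(LAKE, {"finance"}))
print(view(LAKE, {"finance", "pii", "compliance"}))
```

Because the tags travel with the data, the same table can serve a business user and a compliance analyst without a separate extract for each; the view is assembled when the query runs.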

This approach, also referred to as “schema on read,” has five core features that can help banks address increasingly demanding, constantly evolving regulatory requirements. In a data lake-based approach:

1. ETL is not done en masse prior to the analysis. Data is ingested rapidly in “raw” form, and the indices and relationships to support the analysis are derived, enriched, and overlaid as needed, or even executed at the time of the analysis, reducing the time to operationalize data.

2. Unified queries can be created quickly to allow access across all information sources, reducing the time and complexity involved in creating and federating queries across multiple databases.

3. Multiple data sources can be more quickly fused to enable a very high degree of data agility to compose new reports that meet emerging requirements (e.g., new regulations).

4. Operations and management (O&M) complexity is significantly reduced, with a corresponding drop in O&M costs, while creating the basis for improved security and data management posture.

Figure 1. Traditional Point-to-Point Solution

Figure 2. Advanced Data Lake-based Approaches

[Diagram labels from the two figures: Transactions, ETL, Federated Query, Tailored Reporting Tools, Lightweight Security Tagging, Runtime Creation of Views]

5. The low-cost, streamlined ingestion process can be performed in near real-time, making the Data Lake Approach a viable alternative for some requirements that would typically be addressed by implementing Straight Through Processing platforms—at far less cost and disruption to the revenue-generating operations of the bank.
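The first two features above, raw ingestion and unified queries, can be illustrated with a small schema-on-read sketch. It is a simplified illustration under stated assumptions: the source names (`otc_desk_json`, `core_banking_csv`), the record layouts, and the figures are hypothetical, and an in-memory list stands in for the data lake.

```python
import csv
import io
import json

RAW_LAKE = []  # (source, raw_payload) pairs, stored exactly as received


def ingest(source, payload):
    """Ingest without transformation: no target schema is required up front."""
    RAW_LAKE.append((source, payload))


def read_trades(raw):
    """Apply a schema at read time, one small parser per (hypothetical) source."""
    for source, payload in raw:
        if source == "otc_desk_json":
            rec = json.loads(payload)
            yield {"entity": rec["legal_entity"], "notional": float(rec["notional"])}
        elif source == "core_banking_csv":
            entity, notional = next(csv.reader(io.StringIO(payload)))
            yield {"entity": entity, "notional": float(notional)}
        # Sources this reader does not understand are skipped, not rejected.


def exposure_by_entity(raw):
    """One unified query across all sources held in the lake."""
    totals = {}
    for trade in read_trades(raw):
        totals[trade["entity"]] = totals.get(trade["entity"], 0.0) + trade["notional"]
    return totals


ingest("otc_desk_json", '{"legal_entity": "ACME", "notional": "1000000"}')
ingest("core_banking_csv", "ACME,250000.50")
ingest("core_banking_csv", "GLOBEX,75000")
print(exposure_by_entity(RAW_LAKE))
```

A new reporting requirement would add another reader over the same raw records rather than a new warehouse and ETL pipeline, which is the agility that the third feature describes.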

Putting the Data Lake to Work

With the Data Lake Approach, it now becomes practical, in terms of time, cost, and analytic ability, to turn big data into a powerful tool to deal with escalating regulatory challenges while meeting business demands. We can now ask more far-reaching and complex questions, and find the often hidden patterns and relationships that can lead to game-changing knowledge and insight. The Data Lake Concept is particularly well suited for challenges that have one or more of the following characteristics:

1. Streaming analytics are performed on large-scale data sets

2. PtP data mart solutions are involved

3. The ETL requirement is data-heavy, not process-heavy

While applying a big data approach to financial regulatory requirements may be innovative, it would not be experimental: Booz Allen has created data lake-based systems for more than a dozen government clients. Each time we addressed a new class of problem (e.g., Homeland Security, Defense), we used a prototype approach to build, test, and tailor the Data Lake Approach. We are prepared to work with your leadership team in a similar manner to introduce this capability.

To launch a prototype project, we work with clients to:

• Identify a small set of business and regulatory critical applications as the basis for the prototype (basically, a subset of projects in process that can be executed quickly to yield results)

• Set up design requirements for information reporting requirements for internal/external users

• Mirror a set of real-world scenarios to create an analytics platform (i.e., a data lake) that we will use to demonstrate the schema on read process against the critical applications identified above

• Develop a results summary on multiple levels (speed, cost, accuracy) and test the data for internal validity and defensibility

Booz Allen knows that a clean-sheet approach is not feasible; any viable solution approach must be able to deal with a diverse base of legacy systems and select from the existing portfolio of regulatory IT project requirements. While such conditions can be challenging, by creating an isolated, parallel analytics platform, we are able to work with live data with no risk to the bank’s production systems.

“With the Data Lake Approach, it now becomes practical, in terms of time, cost, and analytic ability, to turn big data into a powerful tool to deal with escalating regulatory challenges while meeting business demands.”

www.boozallen.com

About Booz Allen

Booz Allen Hamilton has been at the forefront of strategy and technology consulting for nearly a century. Today, Booz Allen is a leading provider of management and technology consulting services to the US government in defense, intelligence, and civil markets, and to major corporations, institutions, and not-for-profit organizations. In the commercial sector, the firm focuses on leveraging its existing expertise for clients in the financial services, healthcare, and energy markets, and to international clients in the Middle East. Booz Allen offers clients deep functional knowledge spanning strategy and organization, engineering and operations, technology, and analytics, which it combines with specialized expertise in clients’ mission and domain areas to help solve their toughest problems.

The firm’s management consulting heritage is the basis for its unique collaborative culture and operating model, enabling Booz Allen to anticipate needs and opportunities, rapidly deploy talent and resources, and deliver enduring results. By combining a consultant’s problem-solving orientation with deep technical knowledge and strong execution, Booz Allen helps clients achieve success in their most critical missions—as evidenced by the firm’s many client relationships that span decades. Booz Allen helps shape thinking and prepare for future developments in areas of national importance, including cybersecurity, homeland security, healthcare, and information technology.

Booz Allen is headquartered in McLean, Virginia, employs approximately 25,000 people, and had revenue of $5.86 billion for the 12 months ended March 31, 2012. For over a decade, Booz Allen’s high standing as a business and an employer has been recognized by dozens of organizations and publications, including Fortune, Working Mother, G.I. Jobs, and DiversityInc. More information is available at www.boozallen.com. (NYSE: BAH)

For more information, contact

Thomas Sanzone Senior Vice President [email protected] 917-305-8003

James Newfrock Vice President [email protected] 917-305-8037

Joshua Sullivan Vice President [email protected] 301-543-4611

Albert Belman Principal [email protected] 917-305-8002

Michael Delurey Principal [email protected] 703-902-6858

03.078.13