BeyeNETWORK: Data Warehousing After the Bubble http://www.b-eye-network.com/print/9222

Data Warehousing After the Bubble

by Lou Agosta

Originally published December 4, 2008

One thing is clear – Basel II and Sarbanes-Oxley did not work. Instead, credit default swaps (CDSs) were regarded as “free money” rather than insurance against which reserves had to be set aside to cover inevitable losses. This data was not captured in the underlying risk data warehouse because the mortgage and CDS products were not supposed to be a hazard. Oops. For my part, I am still trying to understand how a hard-working house cleaner and her house painter husband got a $624K mortgage – a paltry quarter of a million dollars was not enough? Mortgages were considered safe, long-term investments on the part of banks and other long-term loan originators. However, at the risk of 20-20 hindsight, of which there is no shortage, this approach was from the days back when banking was supposed to be boring and innovative financial instruments such as CDSs and packages of subprime mortgages were just a glimmer in the investment banker’s eye.

So what is the lesson here? Meaning is use; and data in isolation is worthless. It is information that is useful as the way to eliminate or reduce uncertainty. No data warehouse can contain all dimensions of a business, industry or market; and when so many variables are changing simultaneously, blind spots happen. The inside of a bubble is a lot more comfortable than the reality outside it. Things were neither as rosy as they seemed; nor are they as dark as they now appear. People still need food to eat; transportation, including cars, to get around; and places to live. Mark Twain is reported to have said, “Buy land. They aren’t making any more of it.” I believe he added – “but make sure it is not under water.” The bursting of the bubble and the ongoing economic challenges will reinforce several trends in data warehousing.

Open source data warehousing. This will accelerate open source’s readiness for enterprise deployment. Functions such as “heart beat” that make data warehouses capable of supporting mission-critical applications that require mirroring, rollback, automatic failover, redo and related components of high availability are now a requirement. These capabilities are a work in progress, but at an accelerating pace, given the urgency of the situation. Open source databases – in particular MySQL from Sun and Postgres Plus from EnterpriseDB – are working their way through the enterprise as components of appliances, column-oriented data marts and related applications. These applications are often support-oriented and are not mission critical, but provide a testing and training ground for the next generation of frontline infrastructure.

Open source is not for the technologically faint of heart. It is optimally deployed in connection with a support package from a vendor that is going to be available on weekends, holidays and when you least expect to need it. Still, the potential for disruption of the existing installed base of the standard relational databases is significant. While information technology is holding up relatively well amidst the recession, it is hard to see how that can continue when customers in finance, retail, hospitality, travel and manufacturing are taking it on the chin. In contrast, open source remains a bright spot where innovation promises to return improved productivity, which after all is the best way of creating new opportunities for cost reduction, efficiency and profitability.

Data warehousing in the clouds. As noted in my recent article, cloud computing has come up fast with companies such as Amazon, Google and Salesforce.com stealing a march on the information technology infrastructure stalwarts such as Dell, HP, IBM, Microsoft and Oracle. Cloud computing differs from all the usual suspects – the grid, software as a service (SaaS), and simple web hosting – by providing virtualization of the entire technology stack and a retail interface for the purchase of business applications in small increments with which to run a medium-sized business or perhaps an enterprise. The service level agreement (SLA) for the application in the cloud is one that a business person can understand, accommodating data persistence, system reliability, redundancy, security and business continuity. However, the catch is that the SLA is still in the process of being defined in an enterprise context. Thus, cloud computing is best suited for small and medium-sized organizations that can afford to be flexible about their requirements in order to save a few nickels on infrastructure.

Back to basics: data quality. Data quality remains an issue as some customers just disappear, leaving only an entry in the data warehouse to be cleared up. As rapid seismic changes in consumer behavior occur, other customers move into demographically different categories and no longer have the same marketing, buying or shopping profiles. Retail discovers the returning popularity of “lay away” plans, which require their own application profile. The basic question of data warehousing remains more important than ever – who is buying or using what product or service, and when and where are they doing so? In any crisis and breakdown of what is ordinary – in this case, the end of living beyond one’s means – the natural tendency is to overreact. In that respect, a single, high quality data point (“fact”) from a data warehouse is worth a thousand opinions. Stay the course.

Back to basics: front end. Perception of business value migrates in the direction of the user interface. In addition to enterprise front end from SAP/Business Objects or IBM/Cognos, upstarts are offering engaging variations on the dashboard theme. For example, for those interested in new options, check out LogiXML and SiSense. A front end vendor that blurs the distinction front/back in an interesting way and reaches back to data sources, providing ETL-like access in addition to analytics, is Lyza Software. Now layer open source on top of this market. Pentaho is more than an open source front end since it aspires to data mining and data integration results. However, it did get its start in reporting and dashboards. When successful, all the laborious work of upstream data integration will result in an "Aha!" experience as the business analyst gains an insight about customer relations, product offerings or market dynamics. A new and better user interface is not in itself the cause of the breakthrough. Without the work of integrating the upstream data, the result would not have been possible.

Back to basics: middle layer. Data integration is arguably a trend with many of the enterprise application integration (EAI), extract, transform and load (ETL) and customer data integration (CDI) service vendors leading the charge. Now layer open source on top of it. An interesting approach to open source ETL is provided by Apatar. In addition to a compelling pricing proposition, Apatar is building a community of users by means of a shareable database of ETL maps submitted and maintained, in the spirit of open ETL, in its Forge database. While it is improbable that anyone else’s application is exactly like yours, business problems fall into categories and one is likely to be close to what you are seeking. This is a great way to avoid reinventing the wheel and to jumpstart a project. Data integration requires schema integration. A schema is a database model (structure) that accurately represents the data in such a way that it is meaningful. To compare entities such as customers, products, sales or store geography across different data stores, the schemas must be reconciled in terms of consistency and meaning. If the meanings differ, then translation (transformation) rules must be designed and implemented. The point is that IT developers cannot “plug into” data integration by purchasing a “plug in” for a tool without also undertaking the design work to integrate (i.e., map and translate) the schemas representing the targets and sources.
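To make the map-and-translate point concrete, here is a minimal Python sketch of reconciling two source schemas into one target schema. Everything in it is an illustrative assumption rather than anything taken from a particular ETL tool: the two source layouts (a "CRM" and a "POS" store), the field names, the region codes and the Customer target are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical source records: two data stores describe the same customer
# entity with different field names, types and code values.
CRM_ROWS = [{"cust_id": "17", "full_name": "Ada Smith", "region": "MW"}]
POS_ROWS = [{"customer": 17, "first": "Ada", "last": "Smith", "store_region": "Midwest"}]

# Translation rule (assumed): the CRM abbreviates regions, the POS spells them out.
REGION_CODES = {"MW": "Midwest", "NE": "Northeast"}


@dataclass(frozen=True)
class Customer:
    """Target (unified) schema for the customer entity."""
    customer_id: int
    name: str
    region: str


def from_crm(row: dict) -> Customer:
    # Map field names, convert the id to an integer, translate the region code.
    return Customer(
        customer_id=int(row["cust_id"]),
        name=row["full_name"],
        region=REGION_CODES[row["region"]],
    )


def from_pos(row: dict) -> Customer:
    # Map field names and join the split name into the target's single field.
    return Customer(
        customer_id=row["customer"],
        name=f'{row["first"]} {row["last"]}',
        region=row["store_region"],
    )


crm_customers = {from_crm(r) for r in CRM_ROWS}
pos_customers = {from_pos(r) for r in POS_ROWS}

# Only after both sources are expressed in the target schema can they be compared.
matched = crm_customers & pos_customers
```

The two mapping functions are where the design work lives: no tool can know that "MW" and "Midwest" mean the same thing until someone decides so and writes the rule down.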

Back to basics: back end. Design consistent and unified definitions of product, customer, channel, sales or store geography, etc. This is the single most important action an IT department can undertake regarding a data warehousing architecture. Front line data warehousing with clickstream applications is here to stay, and key data dimensions and attributes now also include those relevant to the Web such as page hierarchies, sessions, user IDs and shopping carts. Every department (finance, marketing, inventory, production) wants the same data in different form – that’s why the star schema design and its data warehouse implementation were invented. Extensive research is available on how to avoid the religious wars between data warehouses and data marts by means of a flexible data warehouse design. The previously cited comments on open source databases and data warehousing in the clouds are relevant here. According to my calculation, that constitutes front end, middle, and back end open source options from which to assemble a complete system. Obviously, enterprise customers will find value in having even more choices, and those are coming. In my opinion, economic uncertainty will be a benefit to open source and its users. IT benefits from the available bandwidth that developers may have now to start something really engaging (“cool”), and it limits the downside financial risk. Win-win.
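The "same data in different form" idea behind the star schema can be sketched in a few lines of Python using the standard library's sqlite3 module. This is a toy under stated assumptions, not a reference design: the table names, column names and sample rows are hypothetical, and a real warehouse would have many more dimensions and attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table of sales measures surrounded by dimension tables
# (all names and rows here are illustrative assumptions).
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_key   INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE dim_date    (date_key    INTEGER PRIMARY KEY, day TEXT, month TEXT);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    store_key   INTEGER REFERENCES dim_store(store_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    units       INTEGER,
    revenue     REAL
);
INSERT INTO dim_product VALUES (1, 'Paint', 'Home'), (2, 'Mop', 'Home');
INSERT INTO dim_store   VALUES (1, 'Chicago', 'Midwest'), (2, 'Boston', 'Northeast');
INSERT INTO dim_date    VALUES (20081201, '2008-12-01', '2008-12'),
                               (20081202, '2008-12-02', '2008-12');
INSERT INTO fact_sales VALUES
    (1, 1, 20081201, 3, 60.0),
    (2, 1, 20081202, 1, 15.0),
    (1, 2, 20081202, 2, 40.0);
""")


def revenue_by(dimension_table: str, key: str, attribute: str) -> list:
    """Total revenue grouped by one attribute of one dimension.

    The identifiers are interpolated into the SQL, so they must come from
    trusted code (as they do below), never from user input.
    """
    sql = f"""
        SELECT d.{attribute}, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN {dimension_table} AS d ON d.{key} = f.{key}
        GROUP BY d.{attribute}
        ORDER BY d.{attribute}
    """
    return conn.execute(sql).fetchall()


# The same facts, sliced the way two different departments might ask for them.
by_region = revenue_by("dim_store", "store_key", "region")
by_product = revenue_by("dim_product", "product_key", "name")
```

Both queries read the same fact rows; only the dimension joined in changes. That is also why the unified definitions of product, store and so on matter: the dimensions are the shared vocabulary every department's question is phrased in.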

Plenty of blame and finger-pointing is available as the responsibilities for the housing bubble, credit default swaps (CDSs), and packages of toxic mortgage debt get passed around like hot potatoes. Self-scrutiny on the part of Barney Frank, Chuck Schumer and Henry Waxman, members of Congress who urged on the excesses of mortgage lenders Freddie Mac and Fannie Mae, is noticeably absent. It is true that Alan Greenspan in testimony before Congress indicated that one of the problems was that he had “bad data”; but that was in the context of acknowledging that his point of view on regulation was in need of more work.[1] This is the moral equivalent, when decoded from “central banker talk,” of the former Fed chairman saying that he was wrong, the recognition of which I shall cherish no matter how long I live. Fortunately for Greenspan, he already published his book, because his reputation now looks to have been as inflated as the price of housing in the year 2006. However, the one thing that no one has yet done is blame it on the data warehouse. Accurate, timely data is more important than ever before, and the data warehouse is one of the best ways of assuring it. Seriously, I expect there to be more work in the public sector building data warehouses (as well as transactional systems) to navigate through the economic and political dynamics.

End Note:

1. Neil Irwin and Amit Paley, "Greenspan says he was wrong on regulation." The Washington Post, October 24, 2008.

SOURCE: Data Warehousing After the Bubble

Lou Agosta

Lou Agosta is an independent industry analyst, specializing in data warehousing, data mining and data quality. A former industry analyst at Giga Information Group, Agosta has published extensively on industry trends in data warehousing, business and information technology. He is currently focusing on the challenge of transforming America’s healthcare system using information technology (HIT). He can be reached at [email protected].

Copyright 2004 — 2014. Powell Media, LLC. All rights reserved.

BeyeNETWORK™ is a trademark of Powell Media, LLC 
