

TECH DOSSIER

HADOOP INTEGRATION ADDS PROMISE (AND COMPLEXITY) TO BIG DATA PROJECTS

Big data projects hold promise for revolutionizing the way data is managed, analyzed, and shared, but the terrain is rife with challenges. One of the biggest: integrating the increasingly popular, but still relatively complex, Hadoop technology into conventional data warehouse architectures.

Hadoop, an open source framework for storing and processing big data, was originally based on distributed-computing designs published by Google and developed largely at Yahoo as a way to manage massive search workloads. As an Apache open source project, it has been gaining ground in the enterprise ever since.

Unlike relational data warehouses, which store structured data in rows and columns, Hadoop’s sweet spot is storing and managing unstructured data, which it distributes across a cluster of many servers working in parallel. Hadoop’s approach lets organizations store as much data as they need, addressing the scalability issues related to big data. Because it’s built on a foundation of open source software and commodity server hardware, Hadoop deployments can also be more cost-effective than traditional data storage and analytics methods.

These potential benefits are not lost on enterprises that are actively integrating Hadoop into their data warehouse architectures. A recent IDC Market Pulse research study revealed that more than a third (35%) of larger companies say they currently need or will need to support Hadoop as part of a multifaceted big data landscape, alongside centralized data warehouses, in-database analytics, and analytic appliances, among other technologies.

CONVERGED SYSTEMS RISE TO THE CHALLENGE

Integrating Hadoop into the enterprise data warehouse presents some significant challenges, however. For one thing, a well-developed suite of tools for managing, administering, and securing Hadoop environments has yet to emerge, since the technology is still relatively new. Other obstacles include IT’s general lack of familiarity with implementation best practices, along with a shortage of the Hadoop-trained experts required to integrate Hadoop into the larger enterprise data warehouse mix.

And while the core building blocks of Hadoop (open source software and commodity hardware) can help keep expenses in check, any savings can quickly be negated by additional costs associated with hiring consultants and in-house experts to fill those skill gaps.

One emerging approach to mitigating these deployment challenges involves converged systems. Specifically, a single, integrated appliance that can support simultaneous queries of both structured and unstructured data can help reduce the cost and complexity of building a next-generation data warehouse that encompasses a full Hadoop environment.

HP ConvergedSystem 300 for Microsoft Analytics Platform: What’s in the Box?

HP Deployment Accelerator Service (onsite deployment services)

Microsoft Windows 2012 and System Center software

Microsoft Parallel Data Warehouse (PDW) software (priced separately)

HDInsight (Microsoft’s Hadoop-based distribution) (optional)

HP management tools and utilities

HP Converged Infrastructure (includes servers, storage, networking)

Collaborative support services from HP and Microsoft (HP Proactive Care; Microsoft support priced and purchased separately)


As part of an ongoing partnership, Hewlett-Packard and Microsoft offer the HP ConvergedSystem 300 for Microsoft Analytics Platform [also branded as Microsoft Analytics Platform System]. This high-performance, high-availability, integrated, purpose-built appliance delivers an intelligent approach to big data storage and analytics. This comprehensive platform for creating and supporting end-to-end big data solutions is built using Microsoft’s massively parallel processing (MPP) solution, called Parallel Data Warehouse (PDW), along with HDInsight [Microsoft’s 100-percent Apache Hadoop distribution, based on the Hortonworks Data Platform] and PolyBase [the ability to use T-SQL for big data queries]. All this, when coupled with broadly available business intelligence tools like Excel 2013 and Power BI for Office 365, means faster analytics and increased accessibility for all users within an organization.

Despite all its promise and potential, Hadoop’s complexity remains one of the biggest barriers to the adoption of big data storage and analytics. Converged architecture solutions, like the HP ConvergedSystem 300 for Microsoft Analytics Platform, combine the required technologies into a single managed platform that can reduce complexity, help organizations use familiar tools to quickly extract value from Hadoop, and ultimately deliver an enterprise-ready approach to big data.

Syndicated Content from IDG

Cloud BI: Going Where the Data Lives

Researchers at Gartner say that 2014 may be the tipping point for cloud BI. In each of the last four years, around 30% of respondents to a Gartner survey said they’d run their mission-critical BI in the cloud. This year, however, nearly half (45%) said they would adopt cloud BI.

Shifting data analytics to the cloud doesn’t come without its challenges, though. For example, it’s unlikely that all corporate data will move to the cloud, particularly in larger enterprises. That means many businesses will have to map data from both cloud and on-premises sources to the BI software, whether that software itself is on-premises or in the cloud. Also, bandwidth constraints may slow down data transfers and can lead to increased costs if a business must upgrade its connectivity to improve data transfer.

No End in Sight to the Growth of Cloud Analytics

It’s no secret that cloud computing and data analytics are both rapidly growing areas of IT. Put them together, and you get a winning combination that’s expected to grow by more than 26 percent annually over the next five years. Increased adoption of data analytics is one of the major drivers in this market, according to a new report on the global cloud analytics market from Research and Markets. More specifically, many organizations are adopting data analytics in order to better understand consumption patterns, customer acquisition, and various other factors believed to increase revenue, cut costs, and boost customer loyalty.
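To put the growth figure in perspective, the short sketch below compounds a 26 percent annual rate over five years. It is only an illustration of the arithmetic: the function name is hypothetical, and it assumes the rate stays constant and compounds once per year, which the report summary above does not state.

```python
def compound_multiple(annual_rate: float, years: int) -> float:
    """Return the overall growth multiple after compounding annually.

    Assumes a constant annual rate (an assumption made for illustration).
    """
    return (1.0 + annual_rate) ** years


# 26 percent a year for five years, the figure quoted in the article.
multiple = compound_multiple(0.26, 5)
print(f"Market size after 5 years: about {multiple:.2f}x the starting size")
```

Under those assumptions, a market growing at that pace would roughly triple over the five-year period.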

How to Build a Great Data Science Team

Enterprises that want to launch big data initiatives (or, even more ambitiously, seek to create an “analytics culture”) should answer a handful of critical questions before spending money and allocating resources: What’s the business case for analytics? Which big data tools should we use? Should we hire a data analytics vendor to handle everything? If we build an in-house team, where do we get the analytics talent?

Beyond talent acquisition, the fundamental challenge facing enterprises trying to build an effective data analytics team is determining the optimum combination of skills, background, and personality.

Hybrid Cloud Adoption Set for a Big Boost in 2015

Spurred in large part by enterprise interest in the hybrid cloud, the overall cloud market is likely to see strong growth in the coming year. Industry analyst firm IDC predicts that the global cloud market, including private, public, and hybrid clouds, will hit $118 billion in 2015 and crest at $200 billion by 2018. If the market shows that much growth next year, it will mean a 23.2% rise over the $95.8 billion market it reached in 2014.
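The percentages quoted here can be checked with a few lines of arithmetic. The sketch below uses only the dollar figures from the paragraph above; the helper names are hypothetical, and the annualized rate for 2015 to 2018 is a derived illustration that assumes steady year-over-year growth, not a number from IDC.

```python
def percent_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


def annualized_rate(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that takes start to end over the given years."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures in billions of US dollars, as quoted in the article.
market_2014, market_2015, market_2018 = 95.8, 118.0, 200.0

rise = percent_change(market_2014, market_2015)
print(f"2014 to 2015 rise: {rise:.1f}%")  # prints 23.2%, matching the article

# Implied steady annual growth from 2015 to 2018 (illustrative assumption).
implied = annualized_rate(market_2015, market_2018, 3)
print(f"Implied annual growth, 2015 to 2018: {implied:.1%}")
```

The first result reproduces the 23.2% rise stated in the text. The second suggests the 2018 forecast corresponds to annual growth of roughly 19 percent, if growth were spread evenly across the three years.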


For more information, go to

hp.com/go/convergedsystem