The underpinnings of big data success

Upload: cisco-data-center

Post on 04-Dec-2014

Category: Technology


DESCRIPTION

Intel and Cisco team up to deliver an integrated technology foundation that supports a number of big data workloads.

TRANSCRIPT

Page 1: The underpinnings of big data success

When most people consider big data, they think of the end game: analytics. But according to industry experts, the precursor to successful analytics is an integrated technology foundation that is tuned for a variety of big data workloads.

“To maximize the use and results of any enterprise technology implementation, both hardware and software must work well together,” says Boyd Davis, vice president and general manager of data center software at Intel. “Big data is no different. Companies need a foundational layer that provides top-notch manageability and security in support of application-level software and services.”

Intel and Cisco are working together to deliver this foundational layer. The Intel® Distribution for Apache Hadoop software (Intel® Distribution) is being integrated with the Cisco® Common Platform Architecture (CPA) for Big Data—a configuration of the Cisco Unified Computing System™ (Cisco UCS®), which is based on Intel® Xeon® processors. The result is a comprehensive Hadoop platform that delivers exceptional performance, management, and capacity while reducing risk and accelerating deployment.

Creating an enterprise-ready platform

Big data technology, and Apache Hadoop in particular, is finding use in an enormous number of applications and is being evaluated and adopted by enterprises of all sizes. While the technology helps transform large volumes of data into actionable information, many organizations are struggling to deploy effective and reliable Hadoop infrastructure that is appropriate for mission-critical applications.

“Cisco and Intel enjoy a close technology partnership, and we’re extending this relationship to create next generation big data solutions,” says Paul Perez, vice president and general manager of computing systems at Cisco. “We share a vision for a data analytics platform that is seamlessly integrated into an enterprise environment. One that takes advantage of the storage, networking, and built-in automation of Cisco UCS and Intel’s processor and management technologies, making it easy to plan, provision, execute, and scale.”

Supporting a variety of workloads

The combination of Intel Distribution and Cisco UCS is being tuned to support a variety of workloads and investigations, including batch-mode analysis, massively parallel processing (MPP) queries, machine learning, and streaming analytics.

Currently the most common big data investigation, batch-mode analysis includes direct MapReduce jobs or Hive queries involving very large data sets.

Unleashing IT, Big Data Special Edition

Page 2: The underpinnings of big data success

According to Davis, the typical response time is one to several minutes. “An example of batch-mode analysis would be a job that tries to find anomalies in trading transactions that happened over a period of a month or a year,” he explains. “This would be accomplished by combining trading data with other large reference data sets.”
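The batch-mode example Davis describes, combining trading data with a reference data set to find anomalies, could be sketched roughly as follows. This is a minimal illustration in plain Python rather than an actual MapReduce job or Hive query; the record layout, the sample values, the `find_anomalies` function, and the 20 percent tolerance are all assumptions made for the example and do not come from the article.

```python
# Hypothetical sample data; a real batch job would scan a month or a year of trades.
trades = [
    {"id": 1, "symbol": "AAA", "price": 10.1},
    {"id": 2, "symbol": "BBB", "price": 49.0},
    {"id": 3, "symbol": "AAA", "price": 14.0},
    {"id": 4, "symbol": "CCC", "price": 5.0},
]

# Hypothetical reference data set: an expected price per symbol.
reference = {"AAA": 10.0, "BBB": 50.0}


def find_anomalies(trades, reference, tolerance=0.2):
    """Return ids of trades whose price deviates from the reference
    price by more than `tolerance` (expressed as a fraction)."""
    flagged = []
    for trade in trades:
        ref_price = reference.get(trade["symbol"])
        if ref_price is None:
            # No reference record to join against; skip this trade.
            continue
        deviation = abs(trade["price"] - ref_price) / ref_price
        if deviation > tolerance:
            flagged.append(trade["id"])
    return flagged


anomalies = find_anomalies(trades, reference)
print(anomalies)
```

On a Hadoop cluster the same join-and-filter logic would be spread across many nodes, which is why response times of minutes are acceptable for this kind of job.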

MPP queries typically involve data warehouse applications like Hive, with an expectation of browser response time or better. In these queries, the reference data sets are generally smaller than those used in batch-mode analysis.

“MPP queries are often performed to analyze and segment the purchase patterns of customers in a retail chain over short periods of time—using up to a week of data—in order to set prices,” says Davis. “Another example is a pipeline set of queries, used for tasks like malware detection, where an automated job takes output of one query and uses it as input for another. The shorter response time for each query speeds up detection, and therefore improves prevention and response measures.”
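The pipelined pattern mentioned in the quote, where an automated job feeds the output of one query into the next, might look something like the sketch below. It is plain Python standing in for real warehouse queries; the stage functions, field names, sample records, and the failed-login threshold are hypothetical and chosen only to illustrate the chaining.

```python
# Hypothetical input data for the two stages.
login_events = [
    {"host": "a", "failed_logins": 7},
    {"host": "b", "failed_logins": 1},
    {"host": "c", "failed_logins": 9},
]
downloads = [
    {"host": "a", "file": "x.exe"},
    {"host": "b", "file": "y.pdf"},
    {"host": "c", "file": "z.exe"},
    {"host": "c", "file": "notes.txt"},
]


def suspicious_hosts(events, min_failures=5):
    """Stage 1: select hosts with many failed logins."""
    return {e["host"] for e in events if e["failed_logins"] >= min_failures}


def executable_downloads(hosts):
    """Stage 2: given stage 1's hosts, list executables they downloaded."""
    return sorted(
        (d["host"], d["file"])
        for d in downloads
        if d["host"] in hosts and d["file"].endswith(".exe")
    )


def run_pipeline(stages, data):
    """Feed the output of each stage into the next one."""
    result = data
    for stage in stages:
        result = stage(result)
    return result


findings = run_pipeline([suspicious_hosts, executable_downloads], login_events)
print(findings)
```

Because each stage waits on the previous one, the total latency is the sum of the per-query latencies, which is why shorter response times per query speed up detection overall.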

Machine learning includes both predictive analytics and data mining. Using Bayesian classifiers, neural networks, and other algorithms, machines can automatically improve their modeling and prediction capabilities. With data mining, previously unknown relationships within data sets can be discovered.

“We can use predictive analytics to anticipate machine failures,” Davis explains. “And data mining can be used to discover interesting dependencies in, say, social networking data or telecom call records.”

Streaming analytics involves investigating data immediately as it flows into a cluster, rather than pulling it from a static repository. This type of analysis is becoming increasingly important when dealing with sensor data (compiled by smart meters, security systems, and the like), allowing discoveries and decisions to be made in real time.
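As a rough illustration of analyzing data as it arrives instead of querying a stored repository, the sketch below keeps a small sliding window over incoming sensor readings and raises an alert as soon as the window average crosses a threshold. The `stream_alerts` function, the window size, the threshold, and the sample readings are assumptions for the example, not part of any Intel or Cisco product.

```python
from collections import deque


def stream_alerts(readings, window=3, threshold=100.0):
    """Yield (reading, window_average) whenever the average of the
    last `window` readings exceeds `threshold`.

    `readings` can be any iterable, including a live generator, so each
    value is examined as it arrives rather than after being stored."""
    recent = deque(maxlen=window)  # old readings fall out automatically
    for value in readings:
        recent.append(value)
        if len(recent) < window:
            continue  # not enough data yet for a full window
        average = sum(recent) / window
        if average > threshold:
            yield value, average


# Hypothetical smart-meter readings arriving one at a time.
meter_readings = [90, 95, 100, 120, 130, 80]
for reading, average in stream_alerts(meter_readings):
    print(f"alert at reading {reading}: window average {average:.1f}")
```

Only the last few readings are held in memory, which is what lets this style of analysis keep up with a continuous feed.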

“We are optimizing our Hadoop software so it works seamlessly on Cisco UCS, regardless of the workload or application,” says Davis. “It will be as close to plug-and-play as possible, so enterprises can focus on application-level software and services and not worry about the foundational layer.”

“Cisco is committed to big data, open source, and our work with Intel,” says Perez, “to optimize data-intensive computing for on-premise enterprise and hosted as-a-service environments.”

Speak to a Cisco Big Data expert

You have questions, we have answers. For a complimentary consultation with a Cisco Big Data expert about your challenges and opportunities, request a meeting at: www.UnleashingIT.com/BigData/MeetingRequest.aspx.

This article first appeared online at www.unleashingit.com, available after subscribing at www.unleashingit.com/LogIn.aspx.

© 2013 Cisco and/or its affiliates. All rights reserved. Cisco, the Cisco logo, Cisco Unified Computing System, and Cisco UCS are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to this URL: www.cisco.com/go/trademarks. Third party trademarks mentioned are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. (1309)

Intel, the Intel logo, Xeon, and Xeon Inside are trademarks or registered trademarks of Intel Corporation in the U.S. and/or other countries.
