

Technical Case Study

CERN, the European Organization for Nuclear Research

How CERN helps physicists unlock the secrets of the universe by running critical operations on a foundation of Oracle databases on NetApp storage.

Following the Data to Knowledge and Discovery

The physicists at CERN strive to expand humankind’s general understanding of our world, pushing beyond the boundaries of knowledge to fathom the secrets of the universe. Driven by curiosity and a quest for pure knowledge, CERN’s scientific community conducts fundamental research that follows the data wherever it leads in the search for clues and discoveries about how the universe works.

But that does not mean that CERN research is without practical and often revolutionary application in everyday life. In 1989, for example, Tim Berners-Lee, a scientist at CERN, invented the World Wide Web, conceived and developed to meet the demand for automatic information sharing among the global high-energy physics community. CERN also served as the incubator for capacitive touch screens, invented in 1973 by Bent Stumpe and colleagues and originally put to use in the control room of the CERN SPS accelerator. Those innovations (applied research spin-offs, if you will) have transformed modern communications.

The work at CERN

In addition to seeking answers to questions about the universe, the CERN community works to:

• Encourage global collaboration, bringing nations together through science.

• Educate, providing advanced worker training and building enthusiasm for physics among the next generation of scientists.

• Advance the frontiers of technology, cooperating with industry to bring forward new technologies.


Researching the building blocks of the universe

CERN provides some of the world’s most technologically advanced facilities for researching the basic building blocks of the universe. Facilities include particle accelerators and specialized machines to help prove the existence of exotic forms of matter. Research at CERN facilities falls into three major areas of study:

• The origin of mass. Research in this area includes searching for the Higgs particle, a hypothetical and elementary particle predicted by the Standard Model (SM) of particle physics. The Higgs particle belongs to a class of particles known as bosons and is considered the key to explaining why particles have mass.

• Dark matter. Galaxies behave as if they have more mass than can be observed. Theories suggest that there is a partner to every existing particle in the SM. Called supersymmetric particles, these particles could be the unseen dark matter.

• The Big Bang. What happened just after the beginning of the universe? Theorizing that the universe contained a hot, dense mixture of quarks and gluons (called quark‑gluon plasma), scientists want to recreate similar conditions to analyze the properties of that mixture.

About the Large Hadron Collider

The CERN complex hosts a succession of particle accelerators, each able to reach increasingly higher energies. The latest addition to the complex is the Large Hadron Collider (LHC), the world’s largest and most powerful particle accelerator. The CERN Control Centre near Geneva, Switzerland, houses all of the controls for the accelerator, its services, and technical infrastructure.

The LHC, launched in 2008 and installed about 100 meters underground, forms a 27‑kilometer circle that spans the border between France and Switzerland. The ring consists of superconducting magnets with a number of accelerating structures that boost the energy of particles. Traveling in opposite directions in separate pipes, beams inside the LHC are guided around the accelerator by a magnetic field achieved with superconducting magnets pre‑cooled with liquid nitrogen and then filled with liquid helium to drop the temperature to a colder‑than‑outer‑space temperature of about ‑271°C. Beams are directed to collide around the ring at points coinciding with the location of LHC particle detectors. International collaborations currently run four distinct big experiments—each characterized by its unique particle detector—to study LHC collisions and the properties of matter produced in those collisions.

The LHC creates 600 million collisions per second, producing raw data at the rate of 1 million gigabytes per second. Software converts that raw data to readable data objects for later event analysis. Current experiments produce more than 20PB of new data annually, helping CERN scientists push knowledge forward and answer questions about the fundamental laws of nature.

Information Technology department role

CERN’s Information Technology department manages the IT support infrastructure for a staff of about 2,500 and a global research community of more than 10,000 scientists and students representing 608 universities and 113 nationalities. Responsibilities of the CERN scientific and technical staff include designing, building, and ensuring the smooth operation of particle accelerators as well as preparing, running, analyzing, and interpreting data gathered from scientific experiments.

“Our biggest challenge is handling the volume and rate of data growth.”
Frédéric Hemmer, Head, IT Department, CERN


The department provides access to a broad array of IT services and data to a demanding scientific community that comprises nearly half of the world’s particle physicists. “They will turn the knob until it breaks,” remarks Frédéric Hemmer, head of CERN’s IT department. “But addressing the challenges our users present is part of what makes life here at CERN so enjoyable. We’re constantly adapting IT, even on a weekly basis, to facilitate collaboration and communication and to handle the increasing rate and scale of incoming experimental data.”

Balancing Demands for Performance, Scalability, and Reliability with Cost Constraints

The big science being done at CERN introduces equivalently big data management challenges. IT has to anticipate the needs of inventive users conducting experiments with often-unpredictable requirements. To keep pace, Hemmer and his team must be innovators themselves, rapidly and efficiently delivering IT solutions that empower the CERN research community. CERN IT delivers this functionality while facing the universal challenge of providing more services with limited funding and the same or decreasing data center and administrative resources. In choosing foundational elements of the IT infrastructure technology stack, CERN continually balances technical demands for performance, reliability, and scalability with the constancy of financial constraints.

Within the IT team, Database Services owns the responsibility for both the foundational database and the associated storage technologies. CERN first began utilizing Oracle databases and tools in 1982. Today, Oracle technology is used throughout the organization and plays a critical role in accelerator control systems, engineering and administrative applications, and LHC experiments. Oracle technology delivers requisite functionality, including high availability, scalability, and performance, with comprehensive tools for data distribution, protection, and manageability.

On the data storage side, essential requirements include manageability, availability, and scalability to respond to fast‑changing or unexpected requirements. For example, heavy lead ions cause especially complicated collisions that can make estimating data rates an inexact science. In one case, incoming data rates were five times higher than predicted. Hemmer further quantifies: “Data can come into our computer center at rates up to 6GB per second—that’s equivalent to the contents of two DVDs every three seconds. Our job is to ensure that that data is readable and permanently available to our community of physicists. Data is our existence. Our biggest challenge is handling the volume and rate of data growth.”
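The ingest figures Hemmer quotes can be sanity-checked with a few lines of arithmetic. The sketch below is only a rough check, not anything CERN runs: it uses decimal units, and the 8.5GB dual-layer DVD capacity is an assumption that does not appear in the case study. The function names are illustrative.

```python
# Rough check of the ingest figures quoted in the text (decimal units throughout).

PEAK_INGEST_GB_PER_S = 6   # from the text: "up to 6GB per second"
WINDOW_S = 3               # from the text: "every three seconds"
ANNUAL_NEW_DATA_PB = 20    # from the text: "more than 20PB of new data annually"
DVD_GB = 8.5               # assumption: dual-layer DVD capacity, not stated in the text
GB_PER_PB = 1_000_000
SECONDS_PER_DAY = 86_400


def dvds_per_window(rate_gb_per_s: float, window_s: float, dvd_gb: float) -> float:
    """How many DVDs' worth of data arrive during one window at the given rate."""
    return rate_gb_per_s * window_s / dvd_gb


def days_at_peak(total_pb: float, rate_gb_per_s: float) -> float:
    """Days of nonstop ingest at the given rate needed to accumulate total_pb."""
    return total_pb * GB_PER_PB / rate_gb_per_s / SECONDS_PER_DAY


if __name__ == "__main__":
    dvds = dvds_per_window(PEAK_INGEST_GB_PER_S, WINDOW_S, DVD_GB)
    days = days_at_peak(ANNUAL_NEW_DATA_PB, PEAK_INGEST_GB_PER_S)
    print(f"DVDs per {WINDOW_S}-second window: {dvds:.2f}")
    print(f"Days at peak rate to reach {ANNUAL_NEW_DATA_PB}PB: {days:.1f}")
```

Under these assumptions, three seconds at the peak rate carries 18GB, a little over two dual-layer DVDs, and a full year's 20PB would take roughly 39 days of uninterrupted ingest at that rate. That is consistent with 6GB per second being a peak rather than a sustained average.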

Delivering an agile data infrastructure that is intelligent, immortal, and infinite

Hemmer’s team must build an agile data infrastructure that can: 1) deliver rapid impact through intelligent data management; 2) deliver “immortal” data availability, including nondisruptive upgrades to leverage technology advances without introducing downtime to the CERN instruments and scientific activities that run 24/7/365; and 3) provide nearly infinite scaling that will enable storage performance and capacity to grow in lock-step with CERN’s research requirements and databases.

In 2007, after a public tender process, CERN selected NetApp® technology for the LHC logging database built on an Oracle database with Real Application Clusters (RAC) technology. Since that time, CERN has unified its entire Oracle infrastructure on NetApp and today stores 99% of all Oracle data on NetApp solutions. NetApp’s affordable cost of entry with linearly scalable performance and capacity has enabled CERN to grow its storage footprint at the pace of research demand.


Eric Grancher, database services architect within CERN IT, says that NetApp delivers enabling functionality to the Oracle environment: “NetApp’s certification with Oracle RAC over NFS is an asset. NetApp also offers distinct functionality, including 10‑Gigabit Ethernet [10GbE] support, low‑impact snapshot and cloning, the ability to deliver required performance and capacity at an affordable price point [utilizing NetApp Flash Cache intelligent caching with high‑capacity SATA disk drives], support for large files [up to 16TB], and, most recently, Data ONTAP® operating in Cluster‑Mode for more efficient data mobility. We welcome Data ONTAP Cluster‑Mode that lets us move data—for load‑balancing, for moving less‑used or inactive data to lower‑cost drives, or for technology updates—without having to stop the application.”

Oracle on NetApp across the organization

Today, nearly 100 Oracle databases run on NetApp storage. The CERN IT department provides Oracle services for:

• LHC control and logging operations

• Online experiments

• Offline experiments

• Administration, including payroll services

• Engineering services

Grancher emphasizes the critical nature of CERN’s Oracle databases running on NetApp: “Our Oracle on NetApp infrastructure underpins both physics and business operations at CERN. CERN relies on Oracle databases to keep the LHC online and to maintain availability of our administrative databases—if those systems go down, it impacts the work of hundreds of people. One of the key decisions we made in building a highly reliable infrastructure was to deploy storage that we trust, that is simple to manage, and then layer on top of that. We take care of our storage and count on it to provide a stable service on which to build database and application services.”

Figure 1) CERN’s LHC and experiment operations. (Diagram labels: LHC Experiments; LHC Operations; IT/DB Group; Experiment Online Databases; Experiment Offline Databases; Tier-1 Centers; Middleware; Administrative, IT, and Engineering Databases; Accelerators; ACC; CASTOR (CERN Advanced STORage Manager); Data; Raw Data; Streams.)


Meeting the Technical Challenges of a Superscale Environment

CERN IT infrastructure services must be continuously available and must be superscalable to keep pace with prodigious data growth.

CERN science never sleeps: keeping the LHC online

Any problem receiving or managing data can bring the system down, stopping the particle beam in the LHC. The powerful tools that monitor and control the LHC are built on Oracle databases running on a NetApp data infrastructure.

• Controlling database (ACCCON). This database stores accelerator settings and controls. CERN operators monitor the accelerator 24/7, inputting required configuration changes into the database via control-room screens. Should this database become unavailable for even a few minutes, operators would be unable to control the accelerator and would have to dump the beam (that is, extract the beam into huge graphite blocks to diffuse the beam’s energy) to protect the multi-billion-dollar LHC. Out-of-range temperatures, for example, could damage magnets that cost upwards of $1,000,000 each, and complicated repairs could take operations offline for weeks or even months.

• Logging database (ACCLOG). This database records input from thousands of sensors in the LHC, maintaining long-term logs of the status of thousands of magnets and all moving parts, including collimators that protect the beams by scraping off-track particles. This largest and fastest growing of the CERN Oracle databases currently contains 4.1 trillion rows of data (126TB) and, because it contains calibration data, is also essential to keeping the LHC online.

Finding a needle in 20 million haystacks

Another key challenge in providing access to CERN’s massive stores of experimental data is delivering sufficient performance to Oracle index databases. Oracle databases running on NetApp manage the metadata that tracks and enables access to raw research data stored in flat files on the CERN Advanced STORage manager (CASTOR) hierarchical storage management system. CASTOR commodity disk farms and tape silos today provide 40PB of capacity. Over each year of the LHC’s operation, the four giant detectors observing trillions of elementary particle collisions will accumulate more than 10 million gigabytes of data, equivalent to the contents of about 20 million CD-ROMs. At current recording rates, the CERN physics experiments will generate more than 20PB of new data annually that must be managed by the Oracle databases. CERN’s advances in big data analytics help researchers derive maximum and rapid value from these enormous datasets and will ultimately find application in industry, helping to enhance business outcomes through predictive analyses.


Keeping up

CERN’s IT department also must enable database and storage systems to keep up with staggering data growth. Across CERN today, NetApp provides 901TB of capacity to Oracle databases, and CERN database staff expects capacity requirements to grow rapidly. Accelerator databases are expected to grow by 50TB each year. Such rapid growth demands unprecedented scalability and efficiencies in the CERN database and storage technology stack.
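To put the capacity figures in perspective, the short sketch below derives the average storage footprint per row of the logging database and a simple linear capacity projection. It is a rough illustration under stated assumptions: decimal terabytes, growth from the accelerator databases only, and a constant 50TB per year. The function names are hypothetical, and the real growth curve is not given in the case study.

```python
# Capacity arithmetic from the figures in the text (decimal units).

ACCLOG_ROWS = 4.1e12                 # from the text: 4.1 trillion rows
ACCLOG_TB = 126                      # from the text: 126TB
TOTAL_ORACLE_TB = 901                # from the text: NetApp capacity serving Oracle
ACCELERATOR_GROWTH_TB_PER_YEAR = 50  # from the text: accelerator database growth
BYTES_PER_TB = 1e12


def average_row_bytes(total_tb: float, rows: float) -> float:
    """Average stored bytes per row, including whatever overhead is in total_tb."""
    return total_tb * BYTES_PER_TB / rows


def rows_per_year(growth_tb: float, row_bytes: float) -> float:
    """Rows added per year if growth keeps the same average row footprint."""
    return growth_tb * BYTES_PER_TB / row_bytes


def projected_capacity_tb(current_tb: int, growth_tb_per_year: int, years: int) -> int:
    """Linear projection; an assumption, since other databases also grow."""
    return current_tb + growth_tb_per_year * years


if __name__ == "__main__":
    row_bytes = average_row_bytes(ACCLOG_TB, ACCLOG_ROWS)
    print(f"Average bytes per ACCLOG row: {row_bytes:.1f}")
    print(f"Rows added per year at 50TB: {rows_per_year(50, row_bytes):.2e}")
    for years in (1, 3, 5):
        total = projected_capacity_tb(TOTAL_ORACLE_TB, ACCELERATOR_GROWTH_TB_PER_YEAR, years)
        print(f"Projected capacity after {years} year(s): {total}TB")
```

The per-row figure comes out at roughly 31 bytes, which suggests very narrow rows (for example, a timestamp, a sensor identifier, and a value), although the case study does not describe the schema. The projection is a floor rather than a forecast, because it counts only the accelerator databases.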

Key enabling technologies to achieve balance for Oracle environments

Grancher says deploying Oracle databases on NetApp enables the Database Services team to balance requirements for efficiency with necessary stability, performance, and scalability. He cites vital functionality:

• 10GbE offers a known growth path to greater bandwidth plus the cost efficiencies of a widely adopted mainstream technology. Leveraging 10GbE also allows CERN to use the same switches and networking that serve the rest of the lab. That means CERN IT can reduce costs by handing off administration to the networking team that is already staffed to provide 24/7 management and support.

• Oracle Direct NFS (dNFS) enables multiple paths to storage. This technology contributes to scalability and, because it bypasses the server operating system, typically doubles the performance of traditional NFS. But just as importantly, dNFS takes Oracle over NFS from simple to extremely simple: the CERN IT staff does not have to worry about how to configure NFS because Oracle generates NFS requests directly from the database.

• SATA plus NetApp Flash Cache software makes it possible to achieve performance comparable to FC drives at a much lower price point. An FC solution would have been price‑prohibitive at CERN’s performance requirements and growth rates.

• NetApp FlexClone® software enables efficient creation of temporary, writable copies. CERN required space-efficient Snapshot™ copies and writable copies of large databases, but also needed to make sure that replication processes did not impact performance. The CERN tender actually specified the maximum impact that creating a specific number of Snapshot copies could have on given workloads.


• NetApp Data ONTAP 8 operating in Cluster-Mode makes it possible to maintain peak application performance and storage efficiency by adding storage and moving data without disrupting ongoing operations. In CERN’s environment, no application can be stopped, so the infrastructure must deliver continuous availability with nondisruptive upgrade and other administrative operations. Grancher says that Cluster‑Mode works particularly well with Oracle over NFS to give CERN needed agility.
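The FlexClone and Snapshot item in the list above depends on copies that share unchanged blocks with the original instead of duplicating data. The toy model below sketches that general idea in memory. It is not NetApp's implementation, and the class and method names (`ToyVolume`, `snapshot`, `clone`, `restore`) are hypothetical: a snapshot records only which block each address points to, and later writes rebind an address to new data without touching the blocks a snapshot still references.

```python
class ToyVolume:
    """Toy in-memory block store with pointer-based snapshots (illustrative only)."""

    def __init__(self):
        self._blocks = {}     # block number -> bytes, the active view
        self._snapshots = {}  # snapshot name -> frozen block map

    def write(self, block_no: int, data: bytes) -> None:
        # Rebinding a key in the active map leaves earlier snapshot maps unchanged.
        self._blocks[block_no] = data

    def read(self, block_no: int) -> bytes:
        return self._blocks[block_no]

    def snapshot(self, name: str) -> None:
        # Copies the map of references only; the block contents are shared.
        # (A real system would avoid even this per-block map copy.)
        self._snapshots[name] = dict(self._blocks)

    def clone(self, name: str) -> "ToyVolume":
        # A writable copy that starts out sharing every block with the snapshot.
        vol = ToyVolume()
        vol._blocks = dict(self._snapshots[name])
        return vol

    def restore(self, name: str) -> None:
        # Roll the active view back by re-pointing it at the snapshot's blocks.
        self._blocks = dict(self._snapshots[name])


if __name__ == "__main__":
    prod = ToyVolume()
    prod.write(0, b"header")
    prod.write(1, b"row data v1")
    prod.snapshot("before_upgrade")

    prod.write(1, b"row data v2")          # production moves on
    test = prod.clone("before_upgrade")    # writable test copy
    test.write(0, b"header (test)")

    print(prod.read(1), test.read(1), test.read(0))
    prod.restore("before_upgrade")
    print(prod.read(1))
```

Because no block contents are copied, the cost of a snapshot, clone, or restore in this model depends on the amount of metadata, not on the volume of data stored. That is the property behind the requirement for writable copies of large databases with limited impact on running workloads.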

How NetApp Participated in Furthering CERN’s Mission for Research

Hemmer suggests that the most successful technology deployments occur in the presence of a strong partnership. “We count on our providers to be innovative and proactive, helping to increase our cost effectiveness and use of resources.”

Grancher offers an example: “With the rapid growth of the LHC logging database—it expands at 50TB annually—we needed an alternative to our costly FC solution. Moving to SATA would have solved our capacity and cost issues, but we expected performance problems. NetApp’s recommendation to put Flash Cache in front let us keep performance at parity.”

Figure 2) CERN’s NAS-based storage infrastructure. (Diagram labels: Oracle RAC Databases; Storage Interconnect; NetApp FAS Storage Systems; NetApp Disk Shelves.)

Making Oracle Database 11g better

A member since 2005 of the Oak Table network for Oracle scientists, Grancher understands and emphasizes the importance of implementing a storage foundation that enhances database environments. NetApp delivers a single, integrated platform for an agile data infrastructure that is:

• Intelligent. Management simplicity helps the CERN IT team more quickly deliver infrastructure to facilitate research. For example, CERN utilizes NetApp FlexVol® virtual volumes to simplify provisioning and achieve efficiencies with thin‑provisioned volumes. NetApp OnCommand® management software also enables automation that reduces human errors. Says Grancher: “The Oracle over NFS to NetApp storage has simplified how we access and manage data. With the time our database team saves we’re able to offer more services to more users. NetApp has smart tools, and we make good use of them.”


Grancher says that Oracle VM server virtualization on NFS is “simple, extensible, and stable.” In collaboration with Oracle, NetApp developed a Storage Connect plug‑in for Oracle VM 3.0. The plug‑in simplifies and centralizes management of Oracle Database and application environments by integrating advanced NetApp storage functionality—like deduplication and thin‑provisioning capabilities, for example—with Oracle VM 3.0.

NetApp technology also enables more efficient data protection and recoverability. Specifically, NetApp lets CERN protect data while avoiding data duplication, provide multiuse datasets without copying, and eliminate duplicate copies of data. “Without NetApp SnapRestore® technology,” Grancher states, “we’d need weeks to recover just one multiterabyte Oracle Database. NetApp also makes the size of the database irrelevant—we can copy a 1‑ or 10‑terabyte database in seconds and restore it in minutes or hours. It used to take 28 days to restore a 100TB Oracle Database—now it takes 15 minutes. Used in conjunction with Oracle Real Application Testing, SnapRestore technology also lets us quickly replay a workload for testing.”

• Immortal. Stability of storage is a big asset to the stability of CERN database workloads on top. NetApp RAID‑DP® technology, redundant components and high‑availability‑pair controller configurations, and the latest Data ONTAP Cluster‑Mode functionality contribute to CERN’s ability to build a no‑downtime, no‑data‑loss storage foundation.

Grancher points out that NetApp technology has let CERN evolve its Oracle Database solutions with zero downtime: “CERN has never had a downtime outage of SATA drives. Moving from FC SAN to SATA NAS, we’ve maintained exactly the same level of reliability. Since first deploying NetApp storage in 2007, CERN has not lost a single data block on NetApp. We can’t overemphasize the importance of this—if CERN databases don’t run, the accelerator doesn’t run, and physics doesn’t function.”

• Infinite. NetApp has also allowed CERN IT to deliver affordable performance. When the capacity requirements of large-scale Oracle databases made FC-based storage no longer a viable option financially, CERN was able to combine more affordable SATA drives with NetApp Flash Cache to deliver needed capacity without performance sacrifice. Grancher says, “Using Flash Cache with SATA we’re achieving 35,000 IOPS over Ethernet; that’s the equivalent performance of 250 disks. If a big part of your workload fits into the cache, response time can drop into the millisecond range versus the 10–15 milliseconds that would be the standard for SATA alone. We also have flexibility to specify what to cache (for example, we don’t cache archive redo logs), and the cache automatically adapts to workloads. That saves time and minimizes errors.”

With the pace and scope of data growth at CERN, scalable storage capacity and performance are fundamental. States Grancher, “CERN is much like any other organization managing an OLTP or big data environment. Our IT infrastructure has to be adaptable, reliable, scalable, and efficient, and our staff has to be proactive in integrating technologies and making effective use of limited resources—even as we deal with massive data growth. From affordable cost of entry to just‑in‑time storage expansion, NetApp has allowed us to grow our storage infrastructure in step with our ever‑expanding data and research requirements.”

Storage‑efficiency technologies also help CERN achieve its “keep forever” data strategy. Hemmer says, “When data comes in our computer center, it must be stored permanently. Researchers may want to access data years after it was collected, so we never, never throw away data.”
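The Flash Cache figures Grancher quotes above (35,000 IOPS, the equivalent of 250 disks, and millisecond-range versus 10–15 millisecond response times) can be related through a simple weighted-average latency model. The sketch below is a generic caching model, not a NetApp sizing tool. The 1ms cache latency, the 12.5ms disk latency (the midpoint of the quoted range), and the hit ratios are illustrative assumptions.

```python
# Generic cache-latency model applied to the figures quoted in the text.

QUOTED_IOPS = 35_000        # from the text
QUOTED_DISK_EQUIVALENT = 250  # from the text
CACHE_MS = 1.0              # assumption: "millisecond range" taken as 1 ms
DISK_MS = 12.5              # assumption: midpoint of the quoted 10-15 ms for SATA


def iops_per_disk(total_iops: float, disks: int) -> float:
    """Per-disk IOPS implied by a total and a disk-count equivalence."""
    return total_iops / disks


def effective_latency_ms(hit_ratio: float, cache_ms: float, disk_ms: float) -> float:
    """Average response time when hit_ratio of reads are served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


if __name__ == "__main__":
    per_disk = iops_per_disk(QUOTED_IOPS, QUOTED_DISK_EQUIVALENT)
    print(f"Implied IOPS per disk: {per_disk:.0f}")
    for hit_ratio in (0.0, 0.5, 0.9, 0.99):  # hypothetical hit ratios
        latency = effective_latency_ms(hit_ratio, CACHE_MS, DISK_MS)
        print(f"hit ratio {hit_ratio:.2f}: {latency:.2f} ms average")
```

The quoted equivalence implies 140 IOPS per disk. In this model the average latency approaches the cache latency only when most of the workload fits in cache, which matches the condition Grancher attaches to the millisecond-range response times.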

“CERN has never had a downtime outage of SATA drives. Moving from FC SAN to SATA NAS, we’ve maintained exactly the same level of reliability.”
Eric Grancher, Database Services Architect, CERN


A Reliable and Extensible Foundation for Research

Grancher comments on the larger impact of the Oracle on NetApp infrastructure: “Most rewarding for our Database Services team is being able to build something stable, an architecture that’s satisfying in terms of results and that’s not a one-time solution, but rather a flexible foundation for growth. Our customers (CERN’s global community of physicists, students, and staff) can rely on this infrastructure to deliver dependable data access, enable seamless collaboration, and ensure responsive services.”

IT footprint: 2X less space, power, cooling (SATA vs. SAS)

Hemmer adds, “We’ve received a number of spontaneous plaudits from scientists for the way in which our computing infrastructure has contributed to the delivery of physics results. By giving them the tools and data access they need for research, we’re helping physicists find those breakthrough clues and make the big discoveries that will have an impact far beyond the bounds of our organization.”

www.netapp.com

© 2012 NetApp, Inc. All rights reserved. No portions of this document may be reproduced without prior written consent of NetApp, Inc. Specifications are subject to change without notice. NetApp, the NetApp logo, Go further, faster, Data ONTAP, FlexClone, FlexVol, OnCommand, RAID‑DP, SnapRestore, and Snapshot are trademarks or registered trademarks of NetApp, Inc. in the United States and/or other countries. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such. NA‑143‑0812‑A4

Key Products and Technologies

NetApp

• FAS storage systems

• DS4243 disk shelves

• 3TB SATA, 2TB SATA

• 512GB Flash Cache

• Data ONTAP 8

• FlexVol

• FlexClone

• Snapshot technology

• SnapRestore

• OnCommand software

• Thin provisioning

• Large aggregates

• NVRAM

• NFS/CIFS

Oracle

• Oracle Database 11g Enterprise Edition with Real Application Clusters Technology and partitioning options

• Oracle Direct NFS

• Oracle Streams

• Oracle VM

Other

• HP ProCurve 10Gb/s Ethernet switches

• IBM Tivoli TSM tape system and TDPO library

• Servers from multiple vendors, all equipped with 10Gb Ethernet

About CERN

CERN, the European Organization for Nuclear Research, is one of the world’s largest and most respected centers for scientific research. Its business is fundamental physics, finding out what the universe is made of and how it works. At CERN, the world’s largest and most complex scientific instruments are used to study the basic constituents of matter: the fundamental particles. By studying what happens when these particles collide, physicists learn about the laws of nature.

Founded in 1954, the CERN Laboratory sits astride the Franco‑Swiss border near Geneva, Switzerland. It was one of Europe’s first joint ventures and now has 20 Member States. www.cern.ch

About NetApp

NetApp creates innovative storage and data management solutions that deliver outstanding cost efficiency and accelerate business breakthroughs. Discover our passion for helping companies around the world go further, faster at www.netapp.com.

Go further, faster®