

Managing Data Center Power and Cooling

© 2007 FORCE10 NETWORKS, INC.

White Paper

Introduction: Crisis in Power and Cooling

As server microprocessors become more powerful in accordance with Moore’s Law, they also consume more power and generate more heat. Similar geometric improvement in disk storage technology has driven rapid growth of online data, using mass storage systems that often consume as much power as the servers themselves. With the rapid growth in computing power and storage capacity, typical data center power consumption (kWh) and power density (kW/sq. ft.) are both spiraling upward, placing a strain on many existing data center power distribution and cooling systems.

Each watt of power consumed in the data center requires 3.413 BTU/hour of cooling capacity to remove the associated heat. Depending on climate factors, removing heat at the rate of 3.413 BTU/hour requires an additional 0.4-0.6 watts of electrical power. According to HP, data center power densities have grown from 2.1 kW/rack in 1992 to 14 kW/rack in 2006, requiring cooling systems that can deal with local "hot spots" within the computer room.
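The sizing arithmetic behind those figures can be sketched in a few lines. The conversion factor and the 0.4-0.6 W-per-W cooling overhead come from the text above; the function names and the example 5 kW rack are illustrative, not from the paper.

```python
# Rough cooling/power sizing sketch using the figures quoted in the text:
# 3.413 BTU/hr of cooling per watt of IT load, plus 0.4-0.6 W of electrical
# power spent on cooling for every watt of IT load.

BTU_PER_HR_PER_WATT = 3.413

def cooling_btu_per_hr(it_watts: float) -> float:
    """Heat that must be removed, in BTU/hr, for a given IT load in watts."""
    return it_watts * BTU_PER_HR_PER_WATT

def total_facility_watts(it_watts: float, cooling_overhead: float = 0.5) -> float:
    """IT load plus the electrical power spent removing its heat.
    cooling_overhead is the 0.4-0.6 W-per-W range from the text (0.5 midpoint)."""
    return it_watts * (1.0 + cooling_overhead)

rack_watts = 5000  # hypothetical 5 kW rack
heat_load = cooling_btu_per_hr(rack_watts)       # about 17,065 BTU/hr
facility_load = total_facility_watts(rack_watts) # 7,500 W at the 0.5 midpoint
```

The midpoint overhead implies that every watt saved at the server saves roughly a watt and a half at the utility meter, which is why the later sections treat device-level power efficiency as a facility-level lever.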

A Ziff-Davis survey conducted in November 2005 found that 71% of IT decision makers are dealing with or tracking issues related to power consumption and cooling, while 63% are increasing electrical power capacity or expanding the size of the data center. A similar survey by IDC in May 2006 found that power provisioning and power consumption are among the top three issues in the data center.

Another aspect of the power problem is the growing cost of electricity. Currently the 3-year costs of power and cooling in the U.S. are roughly equivalent to the acquisition cost of data center capital equipment, according to IDC. With the demand for computing power and the cost of electrical power continuing to escalate, power can be expected to consume a larger share of IT budgets, possibly as much as 50% in the next few years.

As these trends continue to unfold, data center managers will need to give careful consideration to the impact of each investment on the power and cooling profile of their facility. In addition, the life cycle costs of power and cooling will become increasingly important factors in TCO calculations used to guide selection among competing solutions. Aiding this analysis is an emerging set of data center and server power efficiency metrics being developed by the EPA and industry consortia. In the long run, the EPA hopes that manufacturers of data center equipment will publish EnergyStar metrics to help customers better manage power consumption. The key power efficiency metrics coming out of this and similar efforts are expected to be application workload/watt for servers, GB/watt for storage, and Gbps/watt for networking. In parallel with standards efforts, end users can readily develop their own power efficiency metrics and calculations. For example, for compute power efficiency, one can focus on the performance benchmarks that are meaningful for critical applications and divide by the corresponding power consumption in watts. For High Performance Computing, Mflops/watt is a popular power efficiency metric. For database application workloads, TPC-H Composite Queries-per-Hour per watt (QphH/watt) for a given size of database is one possible power efficiency metric.
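A do-it-yourself metric of the kind described above is just a benchmark score divided by measured wattage. A minimal sketch, with made-up placeholder scores and power draws:

```python
# Generic benchmark-per-watt calculator. The scores and wattages below are
# hypothetical placeholders, not measurements from the paper.

def efficiency(benchmark_score: float, watts: float) -> float:
    """Benchmark-per-watt metric (Mflops/watt, QphH/watt, GB/watt, ...)."""
    if watts <= 0:
        raise ValueError("power draw must be positive")
    return benchmark_score / watts

# Hypothetical HPC node: 40,000 Mflops measured at 400 W at the plug.
mflops_per_watt = efficiency(40_000, 400)  # 100 Mflops/watt

# Hypothetical database server: 12,000 QphH at 600 W.
qphh_per_watt = efficiency(12_000, 600)    # 20 QphH/watt
```

The important discipline is measuring watts at the wall under the benchmark workload, so the metric captures power-supply and cooling-fan losses rather than nameplate ratings.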

There are a number of potential benefits that can be derived from an increased focus on power consumption and power efficiency:

• Extending the life of existing data centers and minimizing retrofits

• Gaining at least partial control of growing expenses for power and cooling

• Optimizing new data center designs


Optimizing the overall power efficiency of the data center requires a comprehensive approach that focuses on technologies and strategies to minimize power consumption and maximize power efficiency at every level within the infrastructure, including CPU chips, power supplies, servers, storage devices, and networking equipment. In addition to measures that maximize power efficiency for hardware devices, there are also software strategies, such as server virtualization, that can play a significant role in reducing power consumption.

CPU Chips: At a given level of compute performance, the basic architecture of the CPU chip can have a significant impact on power consumption. For example, integrated memory controllers on the CPU can reduce overall power consumption of the chipset. Many server manufacturers offer a fairly wide choice of CPUs in a given model of server, allowing power efficiency to be considered when making product selections and tradeoffs.


Beyond architectural differences, all leading-performance CPU chips have reached levels of power consumption that prevent tracking Moore’s Law simply by increasing clock speed. As shown in Figure 1, the geometric growth in transistors per chip is expected to continue unabated through 2010, while power per chip and clock speed are being forced to level off significantly. These technology trends have led chip manufacturers to turn to multi-core chips to take advantage of continued growth in transistor densities. For example, dual-core CPUs can deliver higher performance than single-core CPUs because scaling back clock frequency by only ~15-20% can cut power consumption by ~40%, allowing two cores per die. Over the remainder of this decade we can expect Moore’s Law at the transistor level to drive a doubling in the number of cores per chip every 18-24 months.
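The "~15-20% slower clock cuts power ~40%" tradeoff follows from the standard CMOS dynamic-power relation P ∝ C·V²·f, if one assumes supply voltage can be scaled roughly in proportion to frequency (an assumption of this sketch, not a statement from the paper), which gives P ∝ f³:

```python
# First-order CMOS dynamic-power model: P ~ C * V^2 * f. With voltage scaled
# linearly with frequency, relative power goes as the cube of the frequency
# scale. A common approximation, not an exact vendor specification.

def relative_power(freq_scale: float) -> float:
    """Power relative to baseline when clock (and voltage) scale by freq_scale."""
    return freq_scale ** 3

savings_15 = 1 - relative_power(0.85)  # ~0.39, i.e. roughly 40% power savings
savings_20 = 1 - relative_power(0.80)  # ~0.49
```

At a ~40% power saving per core, two slightly down-clocked cores fit in roughly the power envelope of one full-speed core, which is the arithmetic driving the dual-core designs described above.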

The applications that benefit the most from multi-core chip architectures include multi-threaded applications (such as cluster computing), transaction processing, and multi-tasking. For applications such as these, a dual-core processor can deliver >60% higher performance than a single-core processor dissipating the same power. However, single-core processors will offer better performance for I/O-intensive, single-threaded applications because multiple cores have to contend for memory and I/O bandwidth. From both a performance and a power perspective, the trend to multi-core CPUs should drive a higher priority on multi-threaded programming models for new applications.

Figure 1. Growth trends in transistors/chip, clock speed, and power/chip

A second chip-level technique to improve power efficiency is dynamic Clock Frequency and Voltage Scaling (CFVS). CFVS provides performance-on-demand by dynamically adjusting CPU performance (via clock rate and voltage) to match the workload. With CFVS, the CPU runs at the minimum clock speed (and power level) needed by the current workload. Clock frequency and voltage are controlled by the operating system’s power management utility via industry-standard Advanced Configuration and Power Interface (ACPI) calls.

The benefits of CFVS are depicted conceptually in the charts of Figure 2. In the chart on the left, power consumption is shown as a function of utilization, both with and without CFVS. CFVS can deliver up to 75% power savings at idle and 40-70% power savings for utilization in the 20-80% range. In the chart on the right, power efficiency in workload units per watt is shown as a function of utilization. Because CPU performance is not degraded by CFVS, dramatic improvements in power efficiency are possible with CFVS, as depicted in the chart.

Figure 2. Effect of CFVS on power consumption and power efficiency

Server Packaging: Rack-optimized servers and blade servers can share a number of components, including higher efficiency power supplies and cooling subsystems. Compared to traditional servers, blade servers can reduce power consumption by as much as 20-50%, while also consuming less floor space. The result is lower overall power consumption but higher power densities measured in terms of watts/rack. The higher power densities (in the range of >12-15 kW/rack) may require special cooling capabilities, such as local liquid cooling of blade server racks. Irrespective of server packaging choices, server power supplies (as well as the power supplies of all data center devices) should be selected, wherever possible, for high power conversion efficiencies (e.g., conversion efficiency in excess of 80% and power factors approaching 1.0).

Server Virtualization: Server virtualization allows applications to be consolidated on a smaller number of servers, through elimination of many low-utilization servers dedicated to single applications or operating system versions. Reducing the number of servers can bring power savings of up to 50% depending on the application mix. Server virtualization is attractive because it can be deployed on existing servers, minimizes disruption of existing applications, and has many other TCO benefits besides power conservation.

Storage: For storage devices, power is consumed primarily by spindle motors, which means that power consumption is relatively independent of the capacity of the disk. Therefore, storage power efficiency (measured in GBytes/watt) is maximized by deploying the highest capacity disks that have I/O characteristics compatible with the applications being served. Currently, large drives (~500 GB) are often deployed for less demanding applications, such as data mining, while drives with capacities <100 GB are used for applications requiring higher performance I/O, such as for database applications. Storage virtualization technologies and large-scale tiered storage are strategies that offer the potential to maximize power efficiency by minimizing storage over-provisioning.

Switched Network Infrastructure: Power efficiency for switches and routers is measured by throughput efficiency in Gbps/watt. For high-density, chassis-based switch/routers required for large data centers, power efficiency largely depends on the power characteristics of the device’s backplane. In addition to providing the physical connectivity for the switching fabric carrying data between line cards, the backplane serves as the grid that distributes power to the line cards and control modules of the switch. For passive copper backplanes, power efficiency is primarily a function of the resistance of the copper traces. For example, the Force10 E-Series switch/router uses a patented 4-layer, 4-ounce copper backplane to minimize resistance and power consumption. As a result, the E-Series backplane itself has a power efficiency of 4.5 Gbps/watt. As shown in Figure 3 (which is an excerpt from Force10’s power and power efficiency modeling tool), the E1200 switch/router fully configured with 1000Base-T GbE interfaces running at line rate (LR) has a system-level power efficiency of 0.125 Gbps/watt (8 watts per 1000Base-T port).
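The quoted system-level figures reduce to simple per-port arithmetic. Because the per-port wattage already amortizes chassis, fabric, and power-supply overhead across all ports, the ratio is independent of port count (the helper function below is ours; the per-port figures are from the text):

```python
# Reproduce the quoted system-level throughput efficiency from per-port figures.

def gbps_per_watt(gbps_per_port: float, watts_per_port: float) -> float:
    """System Gbps/watt, assuming watts_per_port includes the port's share of
    chassis, switch-fabric, and power-supply overhead (as in the paper's
    modeling tool). Port count cancels out of the ratio."""
    return gbps_per_port / watts_per_port

gige = gbps_per_watt(1.0, 8.0)     # 0.125 Gbps/watt, as quoted for 1000Base-T
tengig = gbps_per_watt(10.0, 83.0) # ~0.12 Gbps/watt, as quoted for 10 GbE
```

As a cross-check, 672 GbE ports at 8 W each implies roughly 5.4 kW of chassis power for 672 Gbps of line-rate throughput, consistent with the 0.125 Gbps/watt figure.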

In similar fashion, an E1200 fully configured with 4-port 10 GbE XFP line cards running at line rate has a system-level power efficiency of 0.12 Gbps/watt (83 watts per 10 GbE port).

The high reserve capacity of the E-Series backplane will allow future improvements in line card port densities to be achieved on the same power budget. This means that system-level power efficiency improvement will track port densities. For example, when GbE port density is doubled, the system power efficiency will also nearly double.

Ultra high density, power-efficient switch/routers with carrier-class reliability, such as the Force10 E-Series, can be leveraged in two additional ways to further reduce data center power consumption.

1. Switch Consolidation: Consolidating a number of low-density switches into a large, high-density switch with shared redundant power offers power savings analogous to that of the blade server vs. traditional servers. High density also allows the traditional access and aggregation/distribution layers of switching to be collapsed into a single layer of switching that performs both of these functions. The scalability/density of the Force10 E-Series often enables network consolidations with a >3:1 reduction in the number of data center switches. This high reduction factor is due to the combination of the following factors:

• Elimination of a distinct access switching layer (i.e., 2-Tier switching vs. 3-Tier switching)

• More servers per aggregation switch, resulting in fewer aggregation switches

• More aggregation switches per core switch, resulting in fewer core switches

2. Unified Switch Fabric: With the advent of intelligent Ethernet NICs that reduce both host-based latency and CPU utilization for network transfers, Ethernet is well-positioned to function as a unified, or converged, switching fabric that provides LAN connectivity, storage networking, and cluster interconnect across the data center. With a unified Ethernet fabric, power is conserved because only one network adapter is needed for each server and no additional sets of switches are required for specialized fabrics.
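The switch-consolidation arithmetic above can be sketched with a toy topology model. All port counts and fan-out ratios below are hypothetical, chosen only to show the mechanics of the comparison, not Force10's sizing rules:

```python
import math

# Illustrative 3-tier vs. collapsed 2-tier switch-count model.
# Fan-out assumptions (servers per switch, access switches per aggregation
# switch, number of core switches) are made up for this sketch.

def three_tier_switches(servers, servers_per_access=40, access_per_agg=6, cores=2):
    """Total switches in a traditional access/aggregation/core design."""
    access = math.ceil(servers / servers_per_access)
    agg = math.ceil(access / access_per_agg)
    return access + agg + cores

def two_tier_switches(servers, servers_per_big_switch=600, cores=2):
    """Access and aggregation collapsed into one high-density layer."""
    return math.ceil(servers / servers_per_big_switch) + cores

before = three_tier_switches(1200)  # 30 access + 5 aggregation + 2 core = 37
after = two_tier_switches(1200)     # 2 high-density + 2 core = 4
reduction = before / after          # well above the >3:1 claim in this example
```

The exact ratio depends entirely on the fan-out assumptions; the structural point is that eliminating a whole tier multiplies the savings from higher per-switch density.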

Virtual Data Center: Another approach to reducing power consumption in the data center is to move away from a model based on static, dedicated physical resources for each application toward a virtualized model where each application draws on a shared pool of resources in order to satisfy workload requirements. Since workloads of various applications peak at different times in the business cycle, the shared resource model can do the same job with far fewer resources, resulting in far lower power consumption.

Server virtualization, in conjunction with automated system management and a unified Ethernet switching fabric, constitutes a significant step toward a virtualized data center architecture that offers optimized resource utilization and minimal power consumption. A detailed discussion of the Force10 Networks blueprint for the design of a power-efficient Virtual Data Center is available on the Force10 website.

Figure 3. Power Consumption of an E1200 with 672 Line-Rate GbE Ports

* This power consumption table uses the maximum power draw in calculating each element in the E-Series system. Actual average power draw will typically be 10-25% lower.

Conclusion

Until now, IT staff has typically focused on the escalation of computing power and storage capacity, coupled with smaller form factors for servers and storage devices, and the strains they are placing on the power and cooling facilities of the data center. However, according to the Ethernet Alliance, inefficient networking could also be wasting as much as $450M a year, or 5.8 TWh, in the United States, and potentially three times that much worldwide.

To address network energy efficiency, the IEEE has decided to form a study group to scope standards pertaining to Energy Efficient Ethernet (EEE). This group will work to ensure maximum efficiency under normal use scenarios and develop designs for lower energy use at lower utilization and for minimum energy usage over the operational lifetime of networking platforms.

In addition to industry alliance and standards efforts, there are a number of technologies and strategies available to allow data center managers to improve the power efficiency of existing data centers and optimize the power and cooling designs of new data centers. Focusing on power efficiency metrics and power conservation at every level within the data center infrastructure minimizes the cost of physical plants, as well as the recurring cost of electrical power, an increasingly important component of TCO and of overall IT budgets.

© 2007 Force10 Networks, Inc. All rights reserved. Force10 Networks and E-Series are registered trademarks, and Force10, the Force10 logo, P-Series, S-Series, TeraScale and FTOS are trademarks of Force10 Networks, Inc. All other company names are trademarks of their respective holders. Information in this document is subject to change without notice. Certain features may not yet be generally available. Force10 Networks, Inc. assumes no responsibility for any errors that may appear in this document.

WP20 307 v1.1

Force10 Networks, Inc.
350 Holger Way
San Jose, CA 95134 USA
www.force10networks.com

408-571-3500 PHONE

408-571-3550 FACSIMILE
