ISC ENGINEERING ISC PROCUREMENT
Testing of the IBM z-series Modular Refrigeration Unit
Terrence Quinn, IBM
Integrated Supply Chain, Procurement Engineering
10/18/2007 TAQ/548A 2
Overview: This presentation describes the methodology and hardware used to test the current z-series MRU, with a focus on the hardware and software of the tester. Data from 4 years of testing the current generation of MRUs is discussed, showing the evolution of the MRU design as well as its testing requirements and the tester.
Acknowledgement: The data and work presented on the Modular Refrigeration Unit tester are the cumulative teamwork of several individuals:
- Frank Cascio
- Frank Desiano
- Mark Sinclair
- Terrence Quinn
In addition, the Modular Refrigeration Unit is the result of the work of a large team at IBM.
History: IBM began using refrigeration to cool its high-end servers in 1997. The unit was called a Modular Cooling Unit because of its modular design (it is field serviceable and can be replaced while the server is running). The design has evolved over time; the current unit is the 3rd generation of the design.
The cooling system consists of:
- Cold plate (commonly referred to as the Evaporator): attaches to the processor module.
- Modular Refrigeration Unit (MRU): contains the rest of the refrigeration system, including the compressor, the condenser and control valves.
- Blower: moves air through the condenser and air-cools the MRU.
- Control card: electronics that provide the power and control signals used to run the MRU.
What is the Modular Refrigeration Unit?
- The part of the cooling system containing the compressor, control valves, condenser and thermal sensors.
- The current MRU is approximately 3 ft x 1 ft x 1 ft in size and weighs about 70 lbs.
- It was designed to be field serviceable.
Basic Refrigeration Diagram
[Diagram: basic refrigeration loop. Components: Compressor, Condenser, Expansion Valve, Evaporator. Labels: air flow, high pressure gas, liquid, gas.]
Diagram of the refrigeration cycle used in current z-series
[Diagram: z-series refrigeration cycle. Components: Compressor, Condenser, Expansion Valves, Evaporator, Blower, Control Card; the MRU is marked on the diagram. Labels: air flow, high pressure gas, liquid, gas.]
Testing Strategy: What is the best way to test the MRU?
Considerations:
- Safe operation
  - Handling (MRU weight is approximately 70 lbs)
  - Chemicals
- Component quality
  - Approximately 100 components
- Assembly quality
  - Electrical connections
  - Braze joints
- Refrigeration function
  - The MRU will typically run continuously (24 hrs, 7 days/week) in the customer environment.
- Throughput
  - Ease of use
  - Need to support IBM Manufacturing demand fluctuation
  - High level of automation
Resulting Tester Design Decisions:
- Cart/stall based tester design allows for easy movement of the MRU.
- Computer controlled, with minimal operator interaction and a single, easy to use screen.
- Multiple (4) independent test cells per tester.
- 24-hour test with multiple (10) test conditions and multiple start/stop operations.
- Active monitoring of 10 thermistors and other sensors.
- Automatic Pass/Fail assessment.
- Permanent storage of all MRU results (downloaded to permanent media periodically).
- Operation of the MRU under PID (Proportional Integral Derivative) control at different powers: essentially, the MRU is operated to run the evaporator at a target set point. This approximates actual MRU operation in the field.
  - Operation of both loops at maximum power (maximum load on the MRU).
  - Operation of each loop at low power. This verifies that each loop is wired properly and that the MRU has low heatload function.
- Operation of the MRU under manual control at 2 different powers. This is essentially a capacity test: the valves are set to a predetermined position and the resulting evaporator temperature is monitored.
- No leak test is needed in the tester, since helium leak testing is performed during the build operation, before the MRU runs in the tester.
- Refrigerant level is monitored before and after the tester run using scales.
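The PID-controlled test conditions above hold the evaporator at a target set point. As a rough illustration only, the sketch below shows a textbook discrete PID loop adjusting a valve opening against a toy thermal model. The gains, the 0.5 bias opening, the plant model, the starting temperature and the 25 C set point are all assumptions made for this example; the slides do not give the tester's actual control law or parameters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook discrete PID controller. Gains are illustrative assumptions."""
    kp: float
    ki: float
    kd: float
    setpoint: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, measurement: float, dt: float) -> float:
        error = self.setpoint - measurement
        self.integral += error * dt
        # No derivative contribution on the first sample (no previous error yet).
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps: int = 1000, dt: float = 1.0) -> float:
    """Run the loop on a hypothetical plant and return the final temperature (C)."""
    pid = PID(kp=0.2, ki=0.01, kd=0.0, setpoint=25.0)
    temp = 26.0  # assumed starting evaporator temperature
    for _ in range(steps):
        # A warmer-than-target evaporator gives a negative output, so it is
        # subtracted from the assumed 0.5 bias to open the valve further.
        valve = min(1.0, max(0.0, 0.5 - pid.update(temp, dt)))
        # Toy plant: the heat load warms the evaporator, the valve opening cools it.
        temp += 0.1 * (0.5 - valve) * dt
    return temp


final_temp = simulate()
print(f"final evaporator temperature: {final_temp:.3f} C")
```

With these assumed values the simulated temperature settles close to the set point; a real MRU has two coupled loops and very different dynamics.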
MRU Tester
MRU Tester Cart (close up)
Primary Tester Results Monitored:
- First Test Passes: more of a qualitative measure. MRUs that fail can typically be repaired, but require additional process time and handling.
- Evaporator temperature: this is the key function of the MRU, maintaining a constant desired temperature at the evaporator.
- Valve position: provides an indication of valve performance. The valve is the key component for controlling refrigeration performance; if it is not functioning well, MRU performance will suffer. The two valves work from the same refrigerant source, so some 'coupling' of the loops occurs and the valves need to be similar.
- Thermistor performance: there are 10 thermistors in the assembly. 8 are non-critical and are used for monitoring MRU performance and debug; 2 are critical and will shut down the MRU based on their readings. All must be verified to be working properly.
Typical Test Results – Evaporator temperature
Test Results – Valve positions
Test Results – First Test Passes
[Chart: Danu MRU Process Yield. X axis: quarter or week ending, 1Q03 through 3Q07. Y axis: 1st test pass yield (%), scale 40 to 100.]
Yield chart
- Useful way to monitor MRU production.
- Similar to, but not exactly, an SPC chart.
- Early yield problem (1Q03) was a result of process line startup and the learning curve.
- Recent yield problem was due to two unrelated issues:
  - A gradual change in critical temperatures over time (due to the collective effects of various design changes) caused the failure rate to creep up.
  - A valve motor change by the valve supplier (with no initial notification) severely impacted MRU performance. Essentially, the MRU test parameters were so well tuned that the performance change immediately impacted results.
- Historic average of MRU yield is 85%.
- Approximately one quarter of fails (4%) are due to various component failures (thermistor fails, cable wiring defects).
- Another quarter of fails (4%) are due to assembly mistakes (improperly connected/loose cables).
- Remaining failures are a result of valves being too dissimilar to each other to work together. The valve is a commercial part, not designed for this specific operation (dual loops with a single compressor). Due to the nature of the design, it is not possible to pre-sort valves.
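The yield figures above can be checked with simple arithmetic. The sketch below is only an illustration: it assumes the quoted percentages are fractions of all units tested, treats each "quarter" as exactly one quarter of the fails, and assigns whatever is left to valve mismatch. The function names are hypothetical.

```python
def first_pass_yield(passed: int, tested: int) -> float:
    """Fraction of units that pass the tester on the first attempt."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return passed / tested


def fail_breakdown(yield_fraction: float) -> dict:
    """Split the fail rate into the three categories named on the slide.

    Assumes exactly one quarter of fails each for component and assembly
    causes, with the remainder attributed to valve mismatch.
    """
    fail = 1.0 - yield_fraction
    component = fail / 4
    assembly = fail / 4
    valve_mismatch = fail - component - assembly
    return {
        "component": component,
        "assembly": assembly,
        "valve_mismatch": valve_mismatch,
    }


breakdown = fail_breakdown(0.85)  # historic average yield from the slide
for cause, fraction in breakdown.items():
    print(f"{cause}: {100 * fraction:.2f}% of units tested")
```

At 85% yield the fail rate is 15%, so each quarter works out to 3.75% (the slide rounds this to 4%), leaving about 7.5% for valve mismatch under these assumptions.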
Test Results – Evaporator temperatures
[Histogram: MRU Performance, Evaporator 1, PID, 835W - Average Temp. X axis: Average Normalized Temperature (C), bins from < 24.930 to > 25.079. Y axis: Percentage of Population, 0% to 50%.]
[Histogram: MRU Performance, Evaporator 2, PID, 835W - Average Temp. X axis: Average Normalized Temperature (C), bins from < 24.938 to > 25.071. Y axis: Percentage of Population, 0% to 50%.]
Test Results – Evaporator valve position
[Histogram: MRU Performance, Evaporator 1, PID, 835W - Valve Position. X axis: Valve Stepper Motor Position, bins from < 124 to > 269. Y axis: Percentage of Population, 0% to 35%.]
[Histogram: MRU Performance, Evaporator 2, PID, 835W - Valve Position. X axis: Valve Stepper Motor Position, bins from < 124 to > 273. Y axis: Percentage of Population, 0% to 35%.]
Test Results – Evaporator temperatures
[Histogram: MRU Performance, Evaporator 1, PID, 400W - Average Temp. X axis: Average Normalized Temperature (C), bins from < 24.217 to > 25.783. Y axis: Percentage of Population, 0% to 100%.]
[Histogram: MRU Performance, Evaporator 2, PID, 0W - Average Temp. X axis: Average Normalized Temperature (C), bins from < 20.082 to > 25.443. Y axis: Percentage of Population, 0% to 40%.]
Test Results – Evaporator temperatures
[Histogram: MRU Performance, Evaporator 1, Valve @ 200, 500W - Average Temp. X axis: Average Normalized Temperature (C), bins from < 6.327 to > 13.199. Y axis: Percentage of Population, 0% to 30%.]
[Histogram: MRU Performance, Evaporator 2, Valve @ 200, 500W - Average Temp. X axis: Average Normalized Temperature (C), bins from < 6.577 to > 12.697. Y axis: Percentage of Population, 0% to 30%.]
Test Results – Evaporator temperatures
[Histogram: MRU Performance, Evaporator 1, Valve @ 400, 835W - Average Temp. X axis: Average Normalized Temperature (C), bins from < 18.142 to > 23.562. Y axis: Percentage of Population, 0% to 25%.]
[Histogram: MRU Performance, Evaporator 2, Valve @ 400, 835W - Average Temp. X axis: Average Normalized Temperature (C), bins from < 18.558 to > 23.600. Y axis: Percentage of Population, 0% to 30%.]
Population Distribution charts
- Another useful way to monitor MRU production.
- Evaporator temperature distribution shows the population holding within less than +/- 1 C in the first test condition (MRU under active feedback control); the specification is +/- 3 C. MRU performance is extremely consistent.
- Valve stepper motor position (average value) in the same test also shows extreme consistency in the design at the time of initial build.
- Single loop tests show that when only one loop is running, MRU control is even tighter (also under active feedback control). The 'off' evaporator is actually at ambient air temperature.
- Manual/capacity tests show more population variation, but still demonstrate very good consistency in production.
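A population distribution of this kind can be built by binning each MRU's average temperature and reporting the percentage of units per bin, including underflow and overflow bins like the "<" and ">" ends of the charts. The sketch below is an assumption-laden illustration: the sample values, the bin edges and the 25.0 C target are made up for the example, and only the +/- 3 C tolerance comes from the slide.

```python
import bisect


def population_histogram(values, edges):
    """Percent of population per bin.

    Bins are: below edges[0], each [edges[i], edges[i+1]), and
    at or above edges[-1]. `edges` must be sorted ascending.
    """
    counts = [0] * (len(edges) + 1)
    for value in values:
        # bisect_right returns how many edges are <= value, which is the bin index.
        counts[bisect.bisect_right(edges, value)] += 1
    total = len(values)
    return [100.0 * count / total for count in counts]


def fraction_within(values, target, tolerance):
    """Fraction of units whose value lies within target +/- tolerance."""
    return sum(abs(value - target) <= tolerance for value in values) / len(values)


# Hypothetical average evaporator temperatures (C) for four units.
sample = [24.9, 25.0, 25.0, 25.1]
print(population_histogram(sample, edges=[24.95, 25.05]))
print(fraction_within(sample, target=25.0, tolerance=3.0))
```

Comparing the fraction inside a tight band (for example +/- 1 C) with the fraction inside the +/- 3 C specification gives a quick numerical version of the consistency claim made above.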
Field issues
- MRU field failures are re-run on the tester as part of the Failure Analysis process. This helps diagnose problems quickly.
- Field failures are less than 1% (fails/(installs*months of operation)).
- Field failures fall into 4 main categories:
  - Over temperature/loss of refrigeration control
  - Compressor failures
    - The MRU compressor is generally always on (100% duty cycle), which is significantly different from other refrigeration applications.
    - Certain low heatload operations were changed; particular 'standby' conditions caused significant strain on compressors.
  - Loss of refrigerant
    - Traced to handling issues. Eventually corrected by process improvements in IBM Manufacturing.
  - Component failures
    - Only 3 thermistor failures over the course of the current MRU product.
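The field failure metric quoted above is fails/(installs*months of operation). A minimal sketch of that calculation follows; the function name and the example counts are hypothetical, and it assumes "months of operation" means the average months in service per installed unit, which the slide does not spell out.

```python
def field_failure_rate(fails: int, installs: int, months_of_operation: float) -> float:
    """Failures per install-month, following the formula on the slide."""
    if installs <= 0 or months_of_operation <= 0:
        raise ValueError("installs and months_of_operation must be positive")
    return fails / (installs * months_of_operation)


# Hypothetical example: 3 failures across 100 installs running 12 months each.
rate = field_failure_rate(fails=3, installs=100, months_of_operation=12)
print(f"{100 * rate:.2f}% per install-month")
```

Normalizing by install-months lets populations of different sizes and ages be compared on the same scale.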
Conclusion
- The MRU tester has proved an effective way of controlling the incoming quality of MRUs to IBM.
- It has caught several issues before they impacted IBM or our customer base.
- It limits the types of defects that escape to the field.
- It has contributed to design improvements, particularly as part of the failure analysis process.