Storage System Diagnostics & Troubleshooting - Copyright © LSI 2008, All Rights Reserved Page 1
Storage Diagnostics and Troubleshooting Participant Guide
Global Education Services LSI Corporation
3rd edition (July 2008)
Table of Contents
Terms and Conditions
Storage Systems Diagnostics and Troubleshooting Course Outline
Module 1: Storage System Support Data Overview
All Support Data Capture
Major Event Log (MEL) Overview
State Capture Data File
Accessing the Controller Shell
Logging In To the Controller Shell (06.xx)
Logging In To the Controller Shell (07.xx)
Controller Analysis
Additional Output
Knowledge Check
Additional Commands
Debug Queue
Knowledge Check
Modifying Controller States
Diagnostic Data Capture (DDC)
Knowledge Check
Module 3: Configuration Overview and Analysis
Configuration Overview and Analysis
Knowledge Check
Drive and Volume State Management
Volume Mappings Information
Knowledge Check
Portable Volume Groups in 07.xx
RAID 6 Volumes in 07.xx
Troubleshooting Multiple Drive Failures
Offline Volume Groups
Clearing the Configuration
Recovering Lost Volumes
Knowledge Check
Module 4: Fibre Channel Overview and Analysis
Fibre Channel
Fibre Channel Arbitrated Loop (FC-AL)
Fibre Channel Arbitrated Loop (FC-AL) – The LIP
Knowledge Check
Drive Side Architecture Overview
Knowledge Check
Destination Driver Events
Read Link Status (RLS) and Switch-on-a-Chip (SOC)
What is SOC or SBOD?
Field Case
Drive Channel State Management
SAS Backend
Appendix A: SANtricity Managed Storage Systems
6998 /6994 /6091 (Front)
6998 /6994 /6091 (Back)
3992 (Back)
3994 (Back)
4600 16-Drive Enclosure (Back)
4600 16-Drive Enclosure (Front)
Appendix B: Simplicity Managed Storage Systems
1333
1532
1932
SAS Drive Tray (Front)
SAS Expansion Tray (Back)
Appendix C – State, Status, Flags (06.xx)
Appendix D – Chapter 2 - MEL Data Format
Appendix E – Chapter 30 – Data Field Types
Appendix F – Chapter 31 – RPC Function Numbers
Appendix G – Chapter 32 – SYMbol Return Codes
Appendix H – Chapter 5 - Host Sense Data
Appendix I – Chapter 11 – Sense Codes
Terms and Conditions
Agreement
This Educational Services and Products Terms and Conditions (“Agreement”) is between LSI Corporation (“LSI”), a Delaware corporation, doing business in AL, AZ, CA, CO, CT, DE, FL, GA, KS, IL, MA, MD, MN, NC, NH, NJ, NY, OH, OR, PA, SC, UT, TX, VA and WA as LSI Corporation, with a place of business at 1621 Barber Lane, Milpitas, California 95035 and you, the Student. By signing this Agreement, or clicking on the “Accept” button as appropriate, Student accepts all of the terms and conditions set forth below. LSI reserves the right to change or modify the terms and conditions of this Agreement at any time.
Course materials
The course materials are derived from end-user publications and engineering data related to LSI’s Engenio Storage Group (“ESG”) and reflect the latest information available at the time of printing but will not include modifications if they occurred after the date of publication. In all cases, if there is discrepancy between this information and official publications issued by LSI, LSI’s official publications shall take precedence. LSI assumes no obligation for the accuracy or correctness of the course materials and assumes no obligation to correct any errors contained herein or to advise Student of liability for the accuracy or correctness of the course materials provided to Student. LSI makes no commitment to update the course materials and LSI reserves the right to change the course materials, including any terms and conditions, from time to time at its sole discretion. LSI reserves the right to seek all available remedies for any illegal misuse of the course materials by Student. LSI reserves the right to seek all available remedies for any illegal misuse of the course materials.
Certification
Student acknowledges that purchasing or participating in an LSI course does not imply certification with respect to any LSI certification program. To obtain certification, Student must successfully complete all required elements in an applicable LSI certification program. LSI may update or change certification requirements at any time without notice.
Ownership
LSI and its affiliates retain all right, title and interest in and to the course materials, including all copyrights therein. LSI grants Student permission to use the course materials for personal, educational purposes only. The resale, reproduction, or distribution of the course materials, and the creation of derivative works based on the course materials, is prohibited without the prior express written permission of LSI. Nothing in this Agreement shall be construed as an assignment of any patents, copyrights, trademarks, or trade secret information or other intellectual property rights.
Testing
While participating in a course, LSI may test Student's understanding of the subject matter. Furthermore, LSI may record the Student's participation in a course with videotape or other recording means. Student agrees that LSI is the owner of all such test results and recordings, and may use such test results and recordings subject to LSI's privacy policy.
Software license
All software utilized or distributed as course materials, or an element thereof, is licensed pursuant to the license agreement accompanying the software.
Indemnification
Student agrees to indemnify, defend and hold LSI, and all its officers, directors, agents, employees and affiliates, harmless from and against any and all third party claims for loss, damage, liability, and expense (including reasonable attorney's fees and costs) arising out of content submitted by Student, Student's use of course materials (except as expressly outlined herein), or Student's violations of any rights of another.
Disclaimer of warranties
THE COURSE MATERIALS (INCLUDING ANY SOFTWARE) ARE PROVIDED ON AN “AS IS” AND “AS AVAILABLE” BASIS, WITHOUT WARRANTY OF ANY KIND. LSI DOES NOT WARRANT THAT THE COURSE MATERIALS: WILL MEET STUDENT'S REQUIREMENTS; WILL BE UNINTERRUPTED, TIMELY, SECURE, OR ERROR-FREE; OR WILL PRODUCE RESULTS THAT ARE RELIABLE. LSI EXPRESSLY DISCLAIMS ALL WARRANTIES, WHETHER EXPRESS, IMPLIED OR STATUTORY, ORAL OR WRITTEN, WITH RESPECT TO THE COURSE MATERIALS, INCLUDING WITHOUT LIMITATION THE IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE WITH RESPECT TO THE SAME. LSI EXPRESSLY DISCLAIMS ANY WARRANTY WITH RESPECT TO ANY TITLE OR NONINFRINGEMENT OF ANY THIRD-PARTY INTELLECTUAL PROPERTY RIGHTS, OR AS TO THE ABSENCE OF COMPETING CLAIMS, OR AS TO INTERFERENCE WITH STUDENT’S QUIET ENJOYMENT.
Limitation of liability
STUDENT AGREES THAT LSI SHALL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR EXEMPLARY DAMAGES, INCLUDING BUT NOT LIMITED TO, DAMAGES FOR LOSS OF PROFITS, GOODWILL, USE, DATA OR OTHER SUCH LOSSES, ARISING OUT OF THE USE OR INABILITY TO USE THE COURSE MATERIALS, EVEN IF LSI HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. LSI'S LIABILITY FOR DAMAGES TO STUDENT FOR ANY CAUSE WHATSOEVER, REGARDLESS OF THE FORM OF ANY CLAIM OR ACTION, SHALL NOT EXCEED THE AGGREGATE FEES PAID BY STUDENT FOR THE USE OF THE COURSE MATERIALS INVOLVED IN THE CLAIM.
Miscellaneous
Student agrees to not export or re-export the course materials without the appropriate United States and foreign government licenses, and shall otherwise comply with all applicable export laws. In the event that course materials in the form of software is acquired by or on behalf of a unit or agency of the United States government (the “Agency”), the Agency agrees that such software is comprised of “commercial computer software” and “commercial computer software documentation” as such terms are used in 48 C.F.R. 12.212 (Sept. 1995) and is provided to the Agency for evaluation or licensing (A) by or on behalf of civilian agencies, consistent with the policy set forth in 48 C.F.R. 12.212; or (B) by or on behalf of units of the Department of Defense, consistent with the policies set forth in 48 C.F.R. 227-7202-1 (June 1995) and 227.7203-3 (June 1995). This Agreement shall be governed by and construed in accordance with the laws of the State of California, with regard to its choice of law or conflict of law provisions. In the event of any conflict between foreign laws, rules and regulations and those of the United States, the laws, rules and regulations of the United States shall govern. In any action or proceeding to enforce the rights under this Assignment, the prevailing party shall be entitled to recover reasonable costs and attorneys' fees. In the event that any provision of this Agreement shall, in whole or in part, be determined to be invalid, unenforceable or void for any reason, such determination shall affect only the portion of such provision determined to be invalid, unenforceable or void, and shall not affect the remainder of such provision or any other provision of this Agreement. This Agreement constitutes the entire agreement between LSI and Student relating to the course materials and supersedes any prior agreements, whether written or oral, between the parties.
Trademark acknowledgments
Engenio, the Engenio design, HotScale™, SANtricity, and SANshare™ are trademarks or registered trademarks of LSI Corporation. All other brand and product names may be trademarks of their respective companies.
Copyright notice
© 2006, 2007, 2008 LSI Corporation. All rights reserved.
Agreement accepted by Student (Date):
Agreement not accepted by Student (Date):
Storage Systems Diagnostics and Troubleshooting Course Outline
Course Description:
Storage Systems Diagnostics and Troubleshooting is an advanced course that presents the technical aspects of diagnosing and troubleshooting LSI-based storage systems through advanced data analysis and in-depth troubleshooting. The basic objective of this course is to equip the participants with the essential concepts associated with troubleshooting and repairing LSI-based storage systems using either SANtricity™ Storage Management software, analysis of support data or controller shell commands. The information contained in the course is derived from internal engineering publications and is confidential to LSI Corporation. It reflects the latest information available at the time of printing but may not include modifications if they occurred after the date of publication.
Prerequisites:
Ideally, the successful student will have completed both the Installation and Configuration and the Support and Maintenance courses offered by Global Education Services at LSI Corporation.
However, an equivalent knowledge of storage management, installation, basic maintenance and problem determination with LSI-based storage systems can be substituted.
Students should have at least 6 months of field exposure with LSI storage products and technologies in a support function.
Audience:
This course is designed for customer support personnel responsible for diagnosing and troubleshooting LSI storage systems through the use of support data analysis and controller shell access. The course is designed for individuals employed as Tier 3 support of LSI-based storage systems.
It is assumed that the student has in-depth experience and knowledge with Fibre Channel Storage Area Network (SAN) technologies including RAID, Fibre Channel topology, hardware components, installation, and configuration.
Course Length:
Approximately 4 days in length with 60% lecture and 40% hands-on lab.
Course Objectives
Upon completion of this course, the participant will be able to:
• Recognize the underlying behavior of LSI-based storage systems
• Analyze a storage system for failures through the analysis of support data
• Successfully analyze backend Fibre Channel errors
• Successfully interpret configuration errors
Course Modules
1. Storage System Support Data Analysis
2. Storage System Level Overview
3. Configuration Overview and Analysis
4. IO Driver and Drive Side Error Reporting and Analysis
Module 1: Storage System Support Data Overview
Upon completion, the participant should be able to:
• Describe the purpose of the files that are included within the All Support Data Capture
• Analyze the Major Event Log at a high level in order to diagnose an event
Lab
• Gather the support data file
• Analyze a MEL event
• Diagram the events in a MEL that lead to an error
Module 2: Storage System Level Overview
Upon completion, the participant should be able to:
• Log into the controller shell
• Identify and modify the controller states
• Recognize the battery function within the controllers
• Describe the network functionality
• List developer functions available within the controller shell commands
Lab
• Log into the controller shell
• Modify controller states
Module 3: Configuration Overview and Analysis
Upon completion, the participant should be able to:
• Describe the difference between the legacy configuration structures and the new 07.xx firmware configuration database
• Analyze an array’s configuration from shell output and recognize any errors in the configuration
Lab
• Fix configuration errors on a live system
Module 4: IO Driver and Drive Side Error Reporting and Analysis
Upon completion, the participant should be able to:
• Describe how Fibre Channel topology works
• Determine how Fibre Channel topology relates to the different protocols that LSI uses in its storage array products
• Analyze backend errors for problem determination and isolation
Lab
• Analyze backend data case studies
Module 1: Storage System Support Data Overview
Upon completion, the participant should be able to:
• Describe the purpose of the files that are included within the All Support Data Capture
• Analyze the Major Event Log at a high level in order to diagnose an event
All Support Data Capture
• ZIP archive of useful debugging files
• Some files are for development use only, and are not support readable
• Typically the first item requested for new problem analysis
• Benefits
– Provides a point-in-time snapshot of system status.
– Contains all logs needed for a ‘first look’ at system failures.
– Easy customer interface through the GUI.
– Non-disruptive
• Drawbacks
– Requires GUI accessibility.
– Can take some time to gather on a large system.
All Support Data Capture
All Support Data Capture Files - 06.xx.xx.xx
• driveDiagnosticData.bin – Drive log information contained in a binary format.
• majorEventLog.txt – Major Event Log
• NVSRAMdata.txt – NVSRAM settings from both controllers
• objectBundle – Binary format file containing java object properties
• performanceStatistics.csv – Current performance statistics by volume
• persistentReservations.txt – Volumes with persistent reservations will be noted here
• readLinkStatus.csv – RLS diagnostic information in comma separated value format
• recoveryGuruProcedures.html – Recovery Guru procedures for all failures on the system
• recoveryProfile.csv – Log of all changes made to the configuration
• socStatistics.csv – SOC diagnostic information in comma separated value format
• stateCaptureData.dmp/txt – Informational shell commands run on both controllers
• storageArrayConfiguration.cfg – Saved configuration for use in the GUI script engine
• storageArrayProfile.txt – Storage array profile
• unreadableSectors.txt – Unreadable sectors will be noted here, noting the volume and drive LBA
All Support Data Capture Files - 07.xx.xx.xx
• Contains all the same files as the 06.xx.xx.xx releases but adds 3 new files.
– Connections.txt: Lists the physical connections between expansion trays
– ExpansionTrayLog.txt: ESM event log for each ESM in the expansion trays
– featureBundle.txt: Lists all premium features and their status on the system
• Most useful files for first-look system analysis and troubleshooting
– stateCaptureData.dmp/txt
– majorEventLog.txt
– storageArrayProfile.txt
– socStatistics.csv
– readLinkStatus.csv
– recoveryGuruProcedures.html
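Because the All Support Data Capture is a ZIP archive, a quick script can confirm that the first-look files listed above are present before analysis begins. The following is a minimal sketch only: the helper name `first_look_report` and the demo archive are hypothetical, and the actual folder layout inside a real capture may differ.

```python
import io
import zipfile

# First-look files named in this guide; stateCaptureData ships as .dmp or .txt.
FIRST_LOOK_FILES = [
    "stateCaptureData",
    "majorEventLog.txt",
    "storageArrayProfile.txt",
    "socStatistics.csv",
    "readLinkStatus.csv",
    "recoveryGuruProcedures.html",
]


def first_look_report(archive):
    """Map each first-look file name to True if the archive contains it.

    `archive` may be a path or a file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(archive) as zf:
        # Ignore any folder prefix inside the archive.
        members = [name.rsplit("/", 1)[-1] for name in zf.namelist()]
    return {
        wanted: any(member.startswith(wanted) for member in members)
        for wanted in FIRST_LOOK_FILES
    }


# Demo with a small in-memory archive; the file contents are placeholders.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("majorEventLog.txt", "placeholder")
    zf.writestr("stateCaptureData.dmp", "placeholder")
buf.seek(0)

report = first_look_report(buf)
missing = [name for name, present in report.items() if not present]
print("Missing first-look files:", missing)
```

Against a real capture, the path of the saved ZIP file would be passed in place of the in-memory buffer.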
Major Event Log (MEL) Overview
Major Event Log Facts
• Array controllers log events and state transitions to an 8192-event circular buffer.
• Log is written to the DACSTOR region of the drives.
– Log is permanent
– Survives power cycles and controller swaps
• SANtricity can display the log, sort by parameters and save to file.
• Only critical errors send SNMP traps and email alerts
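The practical consequence of a circular buffer is that once 8192 events have been logged, each new event overwrites the oldest one. The sketch below only illustrates that wrap-around behavior with a Python `deque`; it is not the controller's implementation, and the integer sequence numbers are stand-ins for real events.

```python
from collections import deque

MEL_CAPACITY = 8192  # event count stated in this guide

# A deque with maxlen discards the oldest entry when a new one is appended.
mel = deque(maxlen=MEL_CAPACITY)

# Log ten more events than the buffer can hold.
for sequence_number in range(MEL_CAPACITY + 10):
    mel.append(sequence_number)

oldest_retained = mel[0]
newest_retained = mel[-1]
print(len(mel), oldest_retained, newest_retained)
```

After the loop, the ten earliest sequence numbers are gone, which is why an event of interest can age out of the log on a busy system.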
A Details Window from a MEL log (06.xx)
General Raw Data Categories (06.xx)
General Raw Data Categories (07.xx)
Byte Swapping
• When byte swapping, remember to select all of the bytes in the field
• NOTE: Do not swap the nibbles
– e.g. Value is not “00 00 00 00 00 00 01 fa”
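The difference between swapping bytes and swapping nibbles can be sketched in a few lines of Python. The raw field value used here (`af 10 00 00 00 00 00 00`) is an assumed illustration, not taken from the slide; it was chosen so that the incorrect nibble-level reversal reproduces the wrong value quoted above.

```python
def byte_swap(field_hex: str) -> str:
    """Reverse the order of whole bytes in a space-separated hex field."""
    return " ".join(reversed(field_hex.split()))


def nibble_swap_wrong(field_hex: str) -> str:
    """Incorrect approach: reverses every hex digit, so the two nibbles
    inside each byte are swapped as well."""
    digits = field_hex.replace(" ", "")[::-1]
    return " ".join(digits[i:i + 2] for i in range(0, len(digits), 2))


raw_field = "af 10 00 00 00 00 00 00"  # assumed raw data, for illustration only

correct = byte_swap(raw_field)
wrong = nibble_swap_wrong(raw_field)
print("byte swapped   :", correct)  # 00 00 00 00 00 00 10 af
print("nibble swapped :", wrong)    # 00 00 00 00 00 00 01 fa
```

Each two-digit pair stays intact in the correct result; only the order of the pairs changes.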
Comparison of the Locations of the Summary Information and Raw Data (06.xx)
Quick View of the Locations of the Raw Data Fields (06.xx)
MELH - Signature
MEL version - 2 means 5.x code or 06.x code
Event Description - Includes: Event Group, Component, Internal Flags, Log Group & Priority
I/O Origin - Refer to the MEL spec for the event type
Reporting Controller - 0=A 1=B
Valid? - 0=Not valid 1=Valid data
O1 - Number of Optional Data Fields
O2 - Total length of all of the Optional Data Fields in Hex
F1 - Length of this optional data field
F2 - Data field type (If there is a value of 0x8000 this is a continuation of the previous optional data field. This would be read as a continuation of the previous data field type 0x010d.)
F3 - The “cc” means drive side channel and the following value refers to the channel number and is 1 relative.
Sense Data - Vendor specific depending on the component type.
N/U - Not Used
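The continuation rule for the data field type (F2) can be sketched as follows. This is a hedged illustration under stated assumptions: the field type is treated as a 16-bit value in which the 0x8000 bit marks a continuation, the optional fields are assumed to be already split into (type, payload) pairs, and the payload bytes and the second field type `0x0101` are made up for the demo.

```python
CONTINUATION_FLAG = 0x8000  # assumption: high bit of the field type marks a continuation


def merge_continuations(fields):
    """Fold continuation fields into the optional data field that precedes them.

    `fields` is a list of (field_type, payload_bytes) tuples in log order.
    """
    merged = []
    for field_type, payload in fields:
        if field_type & CONTINUATION_FLAG and merged:
            previous_type, previous_payload = merged[-1]
            merged[-1] = (previous_type, previous_payload + payload)
        else:
            merged.append((field_type, payload))
    return merged


# Illustrative input: a 0x010d field, its continuation, then a hypothetical field.
optional_fields = [
    (0x010D, b"\x01\x02"),
    (0x8000, b"\x03\x04"),
    (0x0101, b"\xff"),
]

result = merge_continuations(optional_fields)
for field_type, payload in result:
    print(f"0x{field_type:04x}", payload.hex(" "))
```

Merging first means the later decode step sees one complete payload per data field type.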
Comparison of the Locations of the Summary Information and Raw Data (07.xx)
Quick View of the Locations of the Raw Data Fields (07.xx)
Event Description - Includes: Event Group, Component, Internal Flags, Log Group & Priority
Location - Decode based on the component type
Valid? - 0=Not valid 1=Valid data
1 - I/O Origin
2 - Reserved
3 - Controller reported by (0=A 1=B)
4 - Number of optional data fields present
5 - Total length of optional data
6 - Single optional field length
7 - Data field type; data field types that begin with 0x8000 are a continuation of the previous data field of the same type
Sense Data - Vendor specific depending on the component type.
MEL Summary Information
• Date/Time: Time of the event adjusted to the management station local clock
• Sequence number: Order that the event was written to the MEL
• Event type: Event code, check MEL Specification for a list of all event types
• Event category: Category of the event (Internal, Error, Command)
• Priority: Either informational or critical
• Description: Description of the event type
• Event specific codes: Information related to the event (if available)
• Component type: Component the event is associated with
• Component location: Physical location of the component the event is associated with
• Logged by: Controller which logged the event
Event Specific Codes
• Skey/ASC/ASCQ
– Defined in Chapter 11 (06.xx), 12 (07.xx) of the Software Interface Spec
– AEN Posted events: Event 3101
– Drive returned check condition events: Event 100a
• Return status/RPC function/null
– Defined in Chapter 31 & 32 of the MEL Spec (06.16)
– Controller return status/function call for requested operation events: Event 5023
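The practical point of the list above is that the event type determines how the event specific codes should be read. A minimal lookup sketch follows, assuming the event types are written in hexadecimal (as `100a` suggests); only the three events named in this guide are included, and the function name is hypothetical.

```python
SENSE_FORMAT = "Skey/ASC/ASCQ"
RPC_FORMAT = "Return status/RPC function/null"

# Only the event types named in this guide; others need the MEL Specification.
CODE_FORMAT_BY_EVENT = {
    0x3101: SENSE_FORMAT,  # AEN posted
    0x100A: SENSE_FORMAT,  # drive returned check condition
    0x5023: RPC_FORMAT,    # controller return status/function call
}


def event_code_format(event_type: int) -> str:
    """Return how the event specific codes of this event type are laid out."""
    return CODE_FORMAT_BY_EVENT.get(event_type, "unknown: check the MEL Specification")


for event in (0x3101, 0x100A, 0x5023, 0x2000):  # 0x2000 is an arbitrary unlisted value
    print(f"0x{event:04x}: {event_code_format(event)}")
```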
Controller Return States
• Return status and RPC function call as defined in the MEL Specification
Event Specific Codes
• Return Status
0x01 = RETCODE_OK
• RPC Function Call
0x07 = createVolume_1()
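A decode of the return status and RPC function call might look like the sketch below. It assumes the event specific codes arrive as a slash-separated hexadecimal string such as `1/7/0` (by analogy with the Skey/ASC/ASCQ notation), and the lookup tables hold only the two values shown above; a real decode would use the full tables from the MEL Specification.

```python
# Only the values shown in this guide; the full tables are in the MEL Specification.
RETURN_STATUS = {0x01: "RETCODE_OK"}
RPC_FUNCTION = {0x07: "createVolume_1()"}


def decode_rpc_codes(codes: str):
    """Decode 'status/function/null' (hex, slash separated; assumed format)."""
    status, function, _unused = (int(part, 16) for part in codes.split("/"))
    status_name = RETURN_STATUS.get(status, f"unknown status 0x{status:02x}")
    function_name = RPC_FUNCTION.get(function, f"unknown function 0x{function:02x}")
    return status_name, function_name


decoded = decode_rpc_codes("1/7/0")
print(decoded)
```

Read together, the pair says that a create-volume request completed with an OK status.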
Event Specific Codes
• SenseKey /ASC /ASCQ
6/3f/80 = Drive no longer usable (The controller set the drive state to “Failed – Write Failure”)
AEN Posted for recently logged event (06.xx)
• Byte 14 = 0x7d (FRU)
• Bytes 26 & 27 = 0x02 & 0x05 (FRU Qualifiers)
• Values decoded using the Software Interface Specification Chapter 5 (06.xx)
• FRU Qualifiers are decoded depending on what the FRU value is
Sense Data (SIS Chapter 5)
• Byte 14 FRU = 0x7d
– FRU is Drive Group (Devnum = 0x60000d)
• Byte 26 = 0x02
– Tray ID = 2
• Byte 27 = 0x05
– Slot = 5
AEN posted for recently logged event (06.xx)
• Byte 14 = 0x06 (FRU)
• Bytes 26 & 27 = 0xd5 & 0x69 (FRU Qualifiers)
• Values decoded using the Software Interface Specification Chapter 5 (6.xx)
• FRU Qualifiers are decoded depending on the FRU code
Sense Data (SIS Chapter 5)
• SenseKey / ASC / ASCQ
6/3f/c7 = Non Media Component Failure
• Byte 14 FRU = 0x06
– FRU is Subsystem Group
• Byte 26 = 0xd5
1 1 0 1 0 1 0 1, low seven bits = 0x55 = tray 85
• Byte 27 = 0x69
0 1 1 0 1 0 0 1
– Device State (upper three bits) = 0x3 = Missing
– Device Type Identifier (lower five bits) = 0x09 = Nonvolatile Cache
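The two FRU qualifier examples can be reproduced with a few lines of Python. This is only a sketch: the function names are hypothetical, and the bit boundaries for the Subsystem Group case are inferred from the worked example (0xd5 and 0x69), not taken from the Software Interface Specification, which remains the authority.

```python
def decode_drive_group_qualifiers(byte26: int, byte27: int) -> dict:
    """FRU 0x7d (Drive Group): byte 26 is the tray ID, byte 27 is the slot."""
    return {"tray": byte26, "slot": byte27}


def decode_subsystem_group_qualifiers(byte26: int, byte27: int) -> dict:
    """FRU 0x06 (Subsystem Group), using bit boundaries inferred from the example."""
    return {
        "tray": byte26 & 0x7F,                 # low seven bits of byte 26
        "device_state": (byte27 >> 5) & 0x07,  # upper three bits of byte 27
        "device_type": byte27 & 0x1F,          # lower five bits of byte 27
    }


drive = decode_drive_group_qualifiers(0x02, 0x05)
subsystem = decode_subsystem_group_qualifiers(0xD5, 0x69)
print(drive)
print(subsystem)
```

Which decoder applies depends on the FRU code in byte 14, so the FRU value has to be read first.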
Automatic Volume Transfer
• IO Origin field
– 0x00 = Normal AVT
– 0x01 = Forced AVT
• LUN field
– Number of volumes being transferred
– Will be 0x00 if it is a forced volume transfer
Mode Select Page 2C
• IOP ID Field
– Contains the Host Number that issued the Mode Select (referenced in the tditnall command output)
• Optional data is defined in the Software Interface Specification, section 6.15 (or 5.15)
Module 2: Storage System Analysis
Upon completion, the participant should be able to complete the following:
• Log into the controller shell
• Identify and modify the controller states
• Recognize the battery function within the controllers
• Describe the network functionality
• List developer functions available within the controller shell commands
State Capture Data File
• Series of controller shell commands run against both controllers
• Different firmware levels run different sets of commands
• Some information still needs to be gathered manually
Amethyst/Chromium (06.16.xx, 06.19.xx/06.23.xx)
The following commands are collected in the state capture for the Amethyst and Chromium releases:
moduleList spmShowMaps fcAll
arrayPrintSummary spmShow socShow
cfgUnitList getObjectGraph_MT showEnclosures
vdShow ccmStateAnalyze netCfgShow
cfgUnitList i showSdStatus
cfgUnit ionShow 99 dqprint
ghsList showEnclosuresPage81 printBatteryAge
cfgPhyList fcDump dqlist
Chromium 2 State Capture Additions (06.60.xx.xx)
The release of Chromium 2 (06.60.xx.xx) introduced the following additional commands to the state capture dump.
tditnall luall fcHosts 3
iditnall ionShow 12 svlShow
fcnShow excLogShow getObjectGraph_MT 99*
chall ccmStateAnalyze 99**
* getObjectGraph_MT 99 replaced the individual getObjectGraph_MT calls used in previous releases
** ccmStateAnalyze 99 replaces the ccmStateAnalyze used in previous releases
Crystal (07.10.xx.xx)
The following commands are collected in the state capture for the Crystal release:
evfShowOwnership luall hwLogShow
rdacMgrShow ionShow spmShowMaps
vdmShowDriveTrays fcDump spmShow
vdmDrmShowHSDrives fcAll 10 fcHosts
evfShowVol showSdStatus getObjectGraph_MT
vdmShowVGInfo ionShow 99 ccmShowState
bmgrShow discreteLineTableShow netCfgShow
bidShow ssmShowTree inetstatShow
tditnall ssmDumpEncl dqprint
iditnall socShow dqlist
fcnShow showEnclosuresPage81 taskInfoAll
Accessing the Controller Shell
• Accessed via RS-232 port on communication module
• Default settings are 38,400 baud, 8-N-1, no flow control
• 06.xx firmware controllers allow access to the controller shell over the network via rlogin
• 07.xx firmware controllers allow access to the controller shell over the network via telnet
• Always capture your shell session using your terminal’s capturing functionality
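As a quick reference, the access rules above can be captured in a small Python sketch. The names SerialSettings and network_shell_protocol are hypothetical, and the code only records the settings and picks the protocol; it does not open any connection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SerialSettings:
    """Default RS-232 settings listed above (38,400 baud, 8-N-1, no flow control)."""
    baud: int = 38400
    data_bits: int = 8
    parity: str = "N"
    stop_bits: int = 1
    flow_control: bool = False


def network_shell_protocol(firmware: str) -> str:
    """Return the network protocol used for shell access on a firmware level."""
    major = firmware.split(".")[0]
    if major == "06":
        return "rlogin"
    if major == "07":
        return "telnet"
    raise ValueError(f"firmware level not covered by this guide: {firmware}")


defaults = SerialSettings()
print(defaults)
print(network_shell_protocol("06.60.xx.xx"))
print(network_shell_protocol("07.10.xx.xx"))
```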
Logging In To the Controller Shell (06.xx)
• If logging in serially, get command prompt by sending Break signal, followed by Esc key when prompted.
– Using rlogin you may be prompted for a login name, use “root”
• Enter password when prompted: – Infiniti
• Command prompt is a ‘right arrow’ ( -> )
• The shell allows user to access controller firmware commands & routines directly
Logging In To the Controller Shell (07.xx)
• If logging in serially, get command prompt by sending Break signal, followed by Esc key when prompted.
– Otherwise shell access can be gained via the telnet protocol.
• You will be prompted for a login name, use “shellUsr”
• Enter password when prompted: – wy3oo&w4 –
• Command prompt is a ‘right arrow’ ( -> )
• The shell allows user to access controller firmware commands & routines directly.
Controller Analysis
• bidShow 255 (07.xx)
• Driver level information, similar to bmgrShow but for development use
getObjectGraph_MT / getObjectGraph_MT 99
• Prior to Chromium 2 (06.60.xx.xx), and in Crystal (07.xx), the getObjectGraph_MT command was used several times to collect the following:
– getObjectGraph_MT 1: Controller Information
– getObjectGraph_MT 4: Drive Information
– getObjectGraph_MT 8: Component Status
• As of Chromium 2 (06.60.xx.xx) the state capture utilizes getObjectGraph_MT 99, which collects the entire object graph including controller, drive, component, and volume/configuration data.
• The object graph is used by the Storage Manager software to provide the visual representation of the current array status.
• The output of getObjectGraph_MT can be used to determine individual component status.
• The downside of the getObjectGraph_MT output is that it is somewhat complicated and cryptic. However, it can be very valuable in determining problems with the information being reported to the customer via Storage Manager.
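The release-dependent behavior described above can be summarized in a short Python sketch. The function name is hypothetical, and the rule that every 06.xx level at or above 06.60 uses the single 99 call is an assumption drawn from the "as of Chromium 2" wording.

```python
from typing import List


def object_graph_commands(firmware: str) -> List[str]:
    """Return the getObjectGraph_MT calls expected in a state capture."""
    parts = firmware.split(".")
    major, minor = int(parts[0]), int(parts[1])
    if major == 6 and minor >= 60:
        # Chromium 2: one call collects the entire object graph
        return ["getObjectGraph_MT 99"]
    # Earlier 06.xx releases and Crystal (07.xx): individual calls
    return [
        "getObjectGraph_MT 1",  # controller information
        "getObjectGraph_MT 4",  # drive information
        "getObjectGraph_MT 8",  # component status
    ]


for level in ("06.19.25.00", "06.60.17.00", "07.10.00.00"):
    print(level, object_graph_commands(level))
```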
Additional Output
Knowledge Check
Analyze the storageArrayProfile.txt file to find the following information:
Controller Firmware version:
Board ID:
Network IP Address Controller A:
Controller B:
Volume Ownership (by SSID) Controller A:
Controller B:
ESM Firmware Version:
Find the same information in the StateCaptureData.txt file. List what command was referenced to find the information.
Command Referenced
06.xx 07.xx
Controller Firmware version:
Board ID:
Network IP Address:
Volume Ownership (by SSID):
ESM Firmware Version:
Additional Commands
Debug Queue
• Used to log pertinent information about various firmware functions.
• Each core asset team can write to the debug queue.
• There is no standard for data written to the debug queue; each core asset team writes the information it feels is needed for debug.
• The debug queue output is becoming increasingly important for problem determination and root cause analysis.
• Because so much data is being written to the debug queue, it is important to gather it as soon as possible after the initial failure.
• Because there is no standard for the data written to the debug queue, it is necessary for multiple development teams to work in conjunction to analyze the debug queue.
• This makes it difficult to interpret from a support standpoint without development involvement.
Debug Queue Rules
• First check ‘dqlist’ to verify which trace contains events during the time of failure
• It is possible that no debug queue trace file covers the timeline of the failure; in this case, no information can be gained
• First data capture is a must with the debug queue as information is logged very quickly
• Even though a trace may be available for a certain timeframe, it is not a guarantee that further information can be gained about a failure event
Summary
• Look at the first / last timestamps and remember that they’re in GMT.
• Don’t just type ‘dqprint’ unless you actually want to flush and print the ‘trace’ trace file (the one we’re currently writing new debug queue data to). Only typing ‘dqprint’ can actually make you lose the useful data if you’re not paying attention.
• Keep in mind that the debug queue wasn’t designed for you to read, only for you to collect and someone in development to read.
• Remember, even LSI developers, when looking at debug queue traces, need to go back to the core asset team that actually wrote the code that printed specific debug queue data, in order to decode it.
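The first two points, checking which trace covers the failure and remembering that timestamps are in GMT, can be sketched in Python. Everything here is illustrative: the trace names, the times, and the TraceWindow type are hypothetical, and the first/last timestamps are assumed to have been read by hand from the dqlist output, whose format is not reproduced.

```python
from datetime import datetime, timedelta, timezone
from typing import List, NamedTuple


class TraceWindow(NamedTuple):
    name: str        # hypothetical trace name
    first: datetime  # first timestamp in the trace (GMT)
    last: datetime   # last timestamp in the trace (GMT)


def to_gmt(reported: datetime) -> datetime:
    """Convert a timezone-aware failure time reported by the site to GMT."""
    return reported.astimezone(timezone.utc)


def traces_covering(traces: List[TraceWindow], failure_gmt: datetime) -> List[str]:
    """Return the names of traces whose first/last timestamps span the failure."""
    return [t.name for t in traces if t.first <= failure_gmt <= t.last]


site_zone = timezone(timedelta(hours=-5))  # assumed customer time zone
failure = to_gmt(datetime(2008, 7, 1, 9, 30, tzinfo=site_zone))
traces = [
    TraceWindow("trace_a",
                datetime(2008, 7, 1, 14, 0, tzinfo=timezone.utc),
                datetime(2008, 7, 1, 14, 20, tzinfo=timezone.utc)),
    TraceWindow("trace_b",
                datetime(2008, 7, 1, 14, 25, tzinfo=timezone.utc),
                datetime(2008, 7, 1, 14, 40, tzinfo=timezone.utc)),
]
candidates = traces_covering(traces, failure)
print(candidates)
```

An empty result corresponds to the case where no trace covers the failure and nothing can be gained from the debug queue.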
Knowledge Check
What command would you run to gather the following information:
Detailed process listing:
Available controller memory:
Lock status:
There is no need to capture controller shell login sessions.
True False
The Debug Queue should only be printed at development request.
True False
The Debug Module is needed for access to all controller shell commands.
True False
Modifying Controller States
• Controller states can be modified via the GUI to place a controller offline, in service mode, online, or to reset a controller
• These same functions can be achieved from the controller shell if GUI access is not available
• Commands that end in _MT use the SYMbol layer and require that the network be enabled, but do not require that the controller actually be on the network. The controller must also be through Start Of Day
• The _MT commands are valid for both 06.xx and 07.xx firmware
• The legacy (06.xx and lower) commands are referenced in the ‘Troubleshooting and Technical Reference Guide Volume 1’ on page 27
• To transfer all volumes from the alternate controller and place the alternate controller in service mode
-> setControllerServiceMode_MT 1
-> cmgrSetAltToServiceMode (07.xx only)
• While the controller is in service mode it is still powered on and is available for shell access. However, it is not available for host I/O, similar to a ‘passive’ mode.
• To transfer all volumes from the alternate controller and place the alternate controller offline
-> setControllerToFailed_MT 1
-> cmgrSetAltToFailed (07.xx only)
• While the controller is offline it is powered off and is unavailable for shell access. It is not available for host I/O
• To place the alternate controller back online from either an offline state or from service mode
-> setControllerToOptimal_MT 1
-> cmgrSetAltToOptimal (07.xx only)
• This will place the alternate controller back online and active; however, it will not automatically redistribute the volumes to the preferred controller
• In order to reset a controller
• Soft reset controller
– Reboot
• Reset controller with full POST
– sysReboot
– resetController_MT 0
• Reset the alternate controller (06.xx)
– isp rdacMgrAltCtlReset
• Reset the alternate controller
– altCtlReset 2
– resetController_MT 1
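The alternate-controller state commands listed in this section can be kept in a small lookup table. The sketch below is a memory aid in Python, not a tool that talks to a controller; the table and function names are hypothetical, and it only returns the commands this guide lists as valid for a firmware level.

```python
from typing import Dict, List, Tuple

# target state -> (_MT command valid on 06.xx and 07.xx, command valid on 07.xx only)
ALT_CONTROLLER_COMMANDS: Dict[str, Tuple[str, str]] = {
    "service": ("setControllerServiceMode_MT 1", "cmgrSetAltToServiceMode"),
    "offline": ("setControllerToFailed_MT 1", "cmgrSetAltToFailed"),
    "online": ("setControllerToOptimal_MT 1", "cmgrSetAltToOptimal"),
}


def valid_commands(target_state: str, firmware: str) -> List[str]:
    """List the shell commands the guide gives for moving the alternate controller."""
    mt_command, cmgr_command = ALT_CONTROLLER_COMMANDS[target_state]
    commands = [mt_command]
    if firmware.startswith("07."):
        commands.append(cmgr_command)
    return commands


print(valid_commands("offline", "06.19.xx.xx"))
print(valid_commands("offline", "07.10.xx.xx"))
```

Remember that the _MT commands still require the network to be enabled and the controller to be through Start Of Day.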
Diagnostic Data Capture (DDC)
Brief History
• Multiple ancient IO events in the field
• Need of having better diagnostic capability
• Common infrastructure which can be used for many such events
What is DDC (Diagnostic Data Capture)?
• A mechanism to capture sufficient diagnostic information about the controller/array state at the time of an unusual event, and store the diagnostic data for later retrieval/transfer to LSI Development for further analysis
• Introduced in Yuma 1.2 (06.12.16.00)
• Part of Agate (06.15.23.00)
• All future releases
Unusual events triggering DDC (as of 07.xx)
• Ancient IO
• Master abort due to bad address accessed by the fibre channel chip results in PCI error
• Destination device number registry corruption
• EDC Error returned by the disk drives
• Quiescence failure of volumes owned by the alternate controller
DDC Trigger
• MEL event gets logged whenever DDC logs are available in the system
• A system-wide Needs Attention condition is created for successful DDC capture
• Batteries
– Gets enabled if the system has batteries which are sufficiently charged
– DDC logs triggered by ancient IO MAY survive without batteries, as ancient IO does not cause a hard reboot.
• No new DDC trigger if all of the following are true
– New event is of same type as previous
– New trigger happens within 10 minutes of the previous trigger
– Previous DDC logs have not been retrieved (DDC - NA is set)
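The suppression rule is a three-way AND, which is easy to misread as an OR. A minimal Python sketch follows; the DdcTrigger type and function name are hypothetical, and treating exactly 10 minutes as still "within" the window is an assumption.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional

SUPPRESSION_WINDOW = timedelta(minutes=10)


class DdcTrigger(NamedTuple):
    event_type: str
    time: datetime
    logs_retrieved: bool  # False while the DDC Needs Attention is still set


def should_capture(new_type: str, new_time: datetime,
                   previous: Optional[DdcTrigger]) -> bool:
    """A new capture is skipped only when all three conditions hold."""
    if previous is None:
        return True
    suppressed = (
        new_type == previous.event_type
        and new_time - previous.time <= SUPPRESSION_WINDOW
        and not previous.logs_retrieved
    )
    return not suppressed


start = datetime(2008, 7, 1, 12, 0)
previous = DdcTrigger("ancient_io", start, logs_retrieved=False)
print(should_capture("ancient_io", start + timedelta(minutes=5), previous))
print(should_capture("ancient_io", start + timedelta(minutes=15), previous))
```

Changing any single condition (a different event type, a later time, or retrieved logs) allows a new capture.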
Persistency of DDC Information
• DDC info is persistent across power cycle and controller reboot, provided the following is true:
– System contains batteries which are sufficiently charged
DDC Logs format
• Binary
• Must be sent to LSI development to be analyzed
DDC CLI commands
• Commands to retrieve the DDC information
– save storageArray diagnosticData file="<filename>.zip";
• Command to clear the DDC NA
– reset storageArray diagnosticData;
– CLI calls this command internally in case retrieval is successful
– This can be called without any retrieval (just to clear NA)
DDC MEL Events
• MEL_EV_DDC_AVAILABLE
– Event # 6900
– Diagnostic data is available
– Critical
• MEL_EV_DDC_RETRIEVE_STARTED
– Event # 6901
– Diagnostic data retrieval operation started
– Informational
• MEL_EV_DDC_RETRIEVE_COMPLETED
– Event # 6902
– Diagnostic data retrieval operation completed
– Informational
• MEL_EV_DDC_NEEDS_ATTENTION_CLEARED
– Event # 6903
– Diagnostic data Needs Attention status cleared
– Informational
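When scanning a MEL for DDC activity, the four events above can be kept in a lookup table. In this Python sketch the event numbers are stored as text exactly as printed in this guide, and the function name is hypothetical.

```python
from typing import Dict, Tuple

# event number -> (event name, description, priority), as listed above
DDC_MEL_EVENTS: Dict[str, Tuple[str, str, str]] = {
    "6900": ("MEL_EV_DDC_AVAILABLE", "Diagnostic data is available", "Critical"),
    "6901": ("MEL_EV_DDC_RETRIEVE_STARTED",
             "Diagnostic data retrieval operation started", "Informational"),
    "6902": ("MEL_EV_DDC_RETRIEVE_COMPLETED",
             "Diagnostic data retrieval operation completed", "Informational"),
    "6903": ("MEL_EV_DDC_NEEDS_ATTENTION_CLEARED",
             "Diagnostic data Needs Attention status cleared", "Informational"),
}


def describe_ddc_event(event_number: str) -> str:
    """Return a one-line description, or a note that the event is not a DDC event."""
    if event_number not in DDC_MEL_EVENTS:
        return f"{event_number}: not a DDC event"
    name, description, priority = DDC_MEL_EVENTS[event_number]
    return f"{event_number}: {name} ({priority}) {description}"


print(describe_ddc_event("6900"))
print(describe_ddc_event("5023"))
```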
Knowledge Check
1) A controller can only be placed offline via the controller shell interface.
True False
2) A controller in service mode is available for I/O.
True False
3) An offline controller is not available for shell access.
True False
4) DDC is to be collected and interpreted by support personnel.
True False
Module 3: Configuration Overview and Analysis
Upon completion, the participant should be able to complete the following:
• Describe the difference between the legacy configuration structures and the new 07.xx firmware configuration database
• Analyze an array’s configuration from shell output and recognize any errors in the configuration
Configuration Overview and Analysis
• In 06.xx firmware, the storage array configuration was maintained as data structures resident in controller memory with pointers to related data structures
• The data structures were written to DACstore with physical references (devnums) instead of memory pointer references
• A drawback of this design is that the physical references used in DACstore (devnums) could change, which could cause a configuration error when the controllers read the configuration information from DACstore
• As of 07.xx the storage array configuration has been changed to a database design
• The benefits are as follows:
– A single configuration database that is stored on every drive in a storage array
– Configuration changes are made in a transactional manner, i.e. updates are either made in their entirety or not at all
– Provides support for > 2TB Volumes, increased partitions, increased host ports
– Unlimited Global Hot Spares
– More drives per volume group
– Pieces can be failed on a drive as opposed to the entire drive
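The "in their entirety or not at all" property can be illustrated with a generic Python sketch. This is not how the controller firmware implements its configuration database; the dictionary layout, the update functions, and apply_transaction are all hypothetical and only show the all-or-nothing idea.

```python
import copy
from typing import Callable, Dict, List

Config = Dict[str, dict]


def apply_transaction(config: Config, updates: List[Callable[[Config], None]]) -> Config:
    """Apply every update to a working copy; the caller's config is never modified."""
    working = copy.deepcopy(config)
    for update in updates:
        update(working)  # an exception here abandons the whole working copy
    return working


def rename_volume(cfg: Config) -> None:
    cfg["volume_1"]["label"] = "data"


def fail_missing_volume(cfg: Config) -> None:
    cfg["volume_9"]["state"] = "failed"  # raises KeyError: no such volume


config: Config = {"volume_1": {"label": "old", "state": "optimal"}}
try:
    config = apply_transaction(config, [rename_volume, fail_missing_volume])
except KeyError:
    pass  # nothing was committed, not even the rename

committed = apply_transaction(config, [rename_volume])
print(config["volume_1"]["label"], committed["volume_1"]["label"])
```

Because the failed batch never replaces the original, a partial rename cannot leak into the stored configuration.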
Configuration Overview and Analysis
What does this mean to support?
• Drive States and Volume States have changed slightly
• Shell commands have changed
– cfgPhyList, cfgUnitList, cfgSetDevOper, cfgFailDrive, etc
Configuration Overview and Analysis (06.xx)
• How is the configuration of an 06.xx storage array maintained?
• Each component of the configuration is maintained via data structures
– Piece Structure
– Drive Structure
– Volume Structure
• Each structure contains a reference pointer to associated structures as well as information directly related to its component
• Pieces
– Pieces are simply the slice of a disk that one volume is utilizing. There could be multiple pieces on a drive, but a piece can only reference one drive
• Piece Structures
– Piece structures maintain the following configuration data
o A pointer to the volume structure
o A pointer to the drive structure
o Devnum of drive that the piece resides on
o Spared devnum if a global hot spare has taken over
o The piece’s state
• Drive Structures
– Drive structures maintain the following configuration data
o The drive’s devnum and tray/slot information
o Blocksize, Capacity, Data area start and end
o The drive’s state and status
o The drive’s flags
o The number of volumes resident on the drive (assuming it is assigned)
o Pointers to all pieces that are resident on the drive (assuming it is assigned)
• Volume Structures
– Volume structures maintain the following information
o SSID number
o RAID level
o Capacity
o Segment size
o Volume state
o Volume label
o Current owner
o Pointer to the first piece
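The relationships between the three 06.xx structures can be modeled with Python object references standing in for memory pointers. This is a simplified sketch: the class names, field subset, and devnum values are hypothetical, and the stale-devnum check only illustrates why a changed devnum can leave a piece pointing at the wrong physical reference.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class VolumeStructure:
    ssid: int
    raid_level: int
    label: str
    state: str = "optimal"
    first_piece: Optional["PieceStructure"] = None  # pointer to the first piece


@dataclass(eq=False)
class DriveStructure:
    devnum: int
    tray: int
    slot: int
    state: str = "optimal"
    pieces: List["PieceStructure"] = field(default_factory=list)


@dataclass(eq=False)
class PieceStructure:
    volume: VolumeStructure  # pointer to the volume structure
    drive: DriveStructure    # pointer to the drive structure
    devnum: int              # devnum recorded for the drive the piece resides on
    spared_devnum: Optional[int] = None
    state: str = "optimal"


def pieces_with_stale_devnum(drives: List[DriveStructure]) -> List[PieceStructure]:
    """Find pieces whose recorded devnum no longer matches their drive."""
    return [p for d in drives for p in d.pieces if p.devnum != d.devnum]


volume = VolumeStructure(ssid=0, raid_level=5, label="vol0")
drive = DriveStructure(devnum=0x10, tray=1, slot=3)
piece = PieceStructure(volume=volume, drive=drive, devnum=drive.devnum)
volume.first_piece = piece
drive.pieces.append(piece)

drive.devnum = 0x11  # simulate the drive's devnum changing
stale = pieces_with_stale_devnum([drive])
print(len(stale))
```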
06.xx configuration layout
Configuration Overview and Analysis (07.xx)
• How is the configuration of an 07.xx storage array maintained?
• Each component of the configuration is maintained via ‘records’ in the configuration database
– Piece Records
– Drive Records
– RAID Volume Records
– Volume Group Records
• Each record maintains a reference to its parent record and its own specific state info
• The “Virtual Disk Manager” (VDM) uses this information and facilitates the configuration and I/O behaviors of each volume group
– VDM is the core module that consists of the drive manager, the piece manager, the volume manager, the volume group manager, and exclusive operations manager
• Pieces
– Pieces may also be referenced as ‘Ordinals’. Just remember that piece == ordinal and ordinal == piece
• Piece Records
– Piece records maintain the following configuration data
o A reference to the RAID Volume Record
o Update Timestamp of the piece record
o The persisted ordinal (what piece number, in stripe order, is this record in the RAID Volume)
o The piece’s state
– Note that there is no reference to a drive record
– The update timestamp is set when the piece is failed
– The parent record for a piece is the RAID Volume record it belongs to
• Drive Records
– Drive records maintain the following configuration data
o The physical drive’s WWN
o Blocksize, Capacity, Data area start and end
o The drive’s accessibility, role, and availability states (more on this later)
o The drive’s physical slot and enclosure WWN reference
o The WWN of the volume group the drive belongs to (assuming it is assigned)
o The drive’s ordinal in the volume group (its piece number)
o Reasons for why a drive is marked incompatible, non-redundant, or marked as non-critical fault
o Failure Reason
o Offline Reason
– Note that there is no reference to the piece record itself, only the ordinal value
– The parent record for an assigned drive is the Volume Group record
• RAID Volume Records
– RAID Volume records maintain the following configuration data
o SSID
o RAID level
o Current path
o Preferred path
o Piece length
o Offset
o Volume state
o Volume label
o Segment size
– Volume Records only refer back to their parent volume group record via the WWN of the volume group
• Volume Group Records
– Volume Group records simply maintain the following
o The WWN of the Volume Group
o The Volume Group Label
o The RAID Level
o The current state of the Volume Group
o The Volume Group sequence number
– Note that the Volume Group record does not reference anything but itself
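Because a piece record holds no drive reference, finding the drive behind a piece means walking up to the volume group and matching on the ordinal. The Python sketch below models that lookup. The record classes carry only a subset of the fields listed above, the WWN strings are made up, and keying the piece-to-volume reference on SSID is an assumption, since this guide does not say how that reference is stored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RaidVolumeRecord:
    ssid: int
    volume_group_wwn: str  # reference back to the parent volume group


@dataclass(frozen=True)
class PieceRecord:
    raid_volume_ssid: int  # assumed key for the parent RAID Volume record
    ordinal: int           # piece number in stripe order
    state: str


@dataclass(frozen=True)
class DriveRecord:
    wwn: str
    volume_group_wwn: Optional[str]  # None when the drive is unassigned
    ordinal: Optional[int]


def drive_for_piece(piece: PieceRecord, volumes: List[RaidVolumeRecord],
                    drives: List[DriveRecord]) -> Optional[DriveRecord]:
    """Resolve a piece to its drive via the volume group WWN and the ordinal."""
    volume = next((v for v in volumes if v.ssid == piece.raid_volume_ssid), None)
    if volume is None:
        return None
    for drive in drives:
        if (drive.volume_group_wwn == volume.volume_group_wwn
                and drive.ordinal == piece.ordinal):
            return drive
    return None


volumes = [RaidVolumeRecord(ssid=0, volume_group_wwn="vg-wwn-1")]
drives = [
    DriveRecord("drive-wwn-a", "vg-wwn-1", 0),
    DriveRecord("drive-wwn-b", "vg-wwn-1", 1),
    DriveRecord("drive-wwn-c", None, None),
]
match = drive_for_piece(PieceRecord(0, 1, "optimal"), volumes, drives)
print(match.wwn if match else None)
```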
07.xx configuration layout
Configuration Overview and Analysis
• There are several advantages that may not be immediately obvious
o The 06.xx configuration manager used devnums (which could change) and arbitrary memory locations (which change on every reboot)
o 07.xx configuration uses hard set values such as physical device WWNs, and internally set WWN values for RAID Volumes and Volume Groups, which will not change once created.
• The configuration database is maintained on all drives in the storage array
• Provides for a more robust and reliable means of handling failure scenarios
Knowledge Check
1) Does the 06.xx config use data structures or database records to maintain the configuration?
2) 07.xx config database is stored on every drive.
True False
3) Shell commands to analyze the config did not change between 06.xx and 07.xx.
True False
4) What are the 3 data structures used for 06.xx config?
5) What are the 4 database records used for 07.xx config?
Drive and Volume State Management
Volume State Management
Beginning with Crystal there are different classifications for volume group states
• Complete – All drives in a group are present
• Partially Complete – Drives are missing however redundancy is available to allow I/O operations to continue
• Incomplete – Drives are missing and there is not enough redundancy available to allow I/O operations to continue
• Missing – All drives in a volume group are inaccessible
• Exported – Volume group and associated volumes are offline as a result of a user-initiated export (used in preparation for a drive migration)
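The classifications above can be read as a small decision rule. The following Python sketch is illustrative only and is not controller firmware: the function name and the tolerated_missing input (how many absent drives the group's redundancy can absorb) are assumptions introduced for the example.

```python
from enum import Enum


class VolumeGroupState(Enum):
    COMPLETE = "Complete"
    PARTIALLY_COMPLETE = "Partially Complete"
    INCOMPLETE = "Incomplete"
    MISSING = "Missing"
    EXPORTED = "Exported"


def classify_volume_group(total_drives, present_drives, tolerated_missing, exported=False):
    """Map drive presence onto the volume group states listed above.

    tolerated_missing is a hypothetical input: the number of absent drives
    the group's redundancy can absorb while I/O continues.
    """
    if total_drives <= 0 or not 0 <= present_drives <= total_drives:
        raise ValueError("drive counts are out of range")
    if exported:
        return VolumeGroupState.EXPORTED
    missing = total_drives - present_drives
    if missing == 0:
        return VolumeGroupState.COMPLETE
    if present_drives == 0:
        return VolumeGroupState.MISSING
    if missing <= tolerated_missing:
        return VolumeGroupState.PARTIALLY_COMPLETE
    return VolumeGroupState.INCOMPLETE
```

For example, a five-drive group that tolerates one missing drive would be Partially Complete with four drives present and Incomplete with three.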
Hot Spare Behavior
• Only valid for non-RAID 0 volumes and volume groups
• Not valid if any volumes in the volume group are dead
• A hot spare can spare for a failed drive or NotPresent drive that has failed pieces
• If an InUse hot spare drive fails and that failure causes any volumes in the volume group to transition to failed state, then the failed InUse hot spare will remain integrated in the VG to provide the best chance of recovery
• If none of the volumes in the volume group are in the failed state, then the failed InUse hot spare is de-integrated from the volume group making it a “failed standby” hot spare and another optimal standby hot spare will be integrated
• If the failure occurred due to reconstruction (read error), then the InUse hot spare drive won’t be failed but it will be de-integrated from the volume group. We won’t retry integration with another standby hot spare drive. This “read error” information is not persisted or held in memory, so integration will be retried if the controller is rebooted or if an event occurs that would start integration.
• When copyback completes, the InUse hot spare drive is de-integrated from its group and is transitioned to a Standby Optimal hot spare drive.
• New hot spare features (07.xx)
– An integrated hot spare can be made the permanent member of the volume group it is sparing in via a user action in SANtricity Storage Manager
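The rules above for a failing InUse hot spare can be summarized as a decision function. This is a minimal Python sketch of the stated behavior, not the controller's implementation; the names SpareOutcome, sparing_allowed and failed_inuse_spare_outcome are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpareOutcome:
    spare_marked_failed: bool      # is the InUse hot spare itself failed?
    stays_integrated: bool         # does it remain part of the volume group?
    integrate_another_spare: bool  # is another standby hot spare brought in?


def sparing_allowed(raid_level, any_volume_dead):
    """Sparing applies only to non-RAID 0 groups with no dead volumes."""
    return raid_level != 0 and not any_volume_dead


def failed_inuse_spare_outcome(read_error_during_reconstruction,
                               any_volume_failed,
                               optimal_standby_available):
    """Apply the hot spare rules from the list above, in order."""
    if read_error_during_reconstruction:
        # Spare is de-integrated but not failed; no retry with another spare.
        return SpareOutcome(False, False, False)
    if any_volume_failed:
        # Spare stays integrated to give the best chance of recovery.
        return SpareOutcome(True, True, False)
    # Spare becomes a "failed standby"; another optimal standby is integrated.
    return SpareOutcome(True, False, optimal_standby_available)
```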
Volume Mappings Information
Knowledge Check
1) For 07.xx list all of the possible:
Drive accessibility states:
Drive role states:
Drive availability states:
2) What command(s) would you reference in order to get a quick look at all volume states?
06.xx 07.xx
3) What command(s) would you reference in order to get a quick look at all drive states?
06.xx 07.xx
Portable Volume Groups in 07.xx
• Previously, drive migrations were performed via a system of checking NVSRAM bits, marking volume groups offline, removing drives, and finally carefully re-inserting drives into the receiving system one at a time and waiting for the group to be merged and brought online.
• This procedure is now gone and has been replaced by portable volume group functionality.
• Portable volume group functionality provides a means of safely removing and moving an entire drive group from one storage system to another
• Uses the model of “Exporting” and “Importing” the configuration on the associated disks
• “Exporting” a volume group performs the following
o Volumes are removed from the current configuration and configuration database synchronization ceases
o The Volume Group is placed in the “Exported” state and the drives are marked offline and spun down
o Drive references are removed once all drives in the “Exported” volume group are physically removed from the donor system
• Drives can now be moved to the receiving system
o Once all drives are inserted into the receiving system the volume group does not immediately come online
o The user must specify that the configuration of the new disks be “Imported” to the current system configuration
o Once “Imported” the configuration data on the migrated group and the existing configuration on the receiving system are synchronized and the volume group is brought online
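The export and import sequence described above behaves like a small state machine. The Python sketch below is a toy model of that ordering only; the class and method names are hypothetical and do not correspond to any firmware or SANtricity interface.

```python
class PortableVolumeGroup:
    """Toy model of the export/import sequence; not controller firmware."""

    def __init__(self, label):
        self.label = label
        self.state = "online"   # state as seen on the donor system
        self.system = "donor"

    def export(self):
        # Volumes leave the configuration; drives are marked offline and spun down.
        if self.state != "online":
            raise RuntimeError("only an online volume group can be exported")
        self.state = "exported"

    def move_to_receiver(self):
        # Physically moving drives is only safe after the export has completed.
        if self.state != "exported":
            raise RuntimeError("export the volume group before pulling its drives")
        self.system = "receiver"

    def import_config(self):
        # Insertion alone does not bring the group online; the user must import.
        if self.system != "receiver" or self.state != "exported":
            raise RuntimeError("nothing to import on this system")
        self.state = "online"
```

Note that after move_to_receiver the group is still in the exported state; it comes online only after the explicit import step.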
RAID 6 Volumes in 07.xx
• First we should get the “Marketing” stuff out of the way
o RAID 6 is provided as a premium feature
o RAID 6 will only be supported on the Winterpark (399x) platform due to controller hardware requirements
• XBB-2 (which will release with Emerald 7.3x) will support RAID 6
o RAID 6 Volume Groups can be migrated to systems that do not have RAID 6 enabled via a feature key, but only if the controller hardware supports RAID 6
• The volume group that is migrated will continue to function; however, a needs attention condition will be generated because the premium features will not be within limits
The Technical Bits
• LSI’s RAID 6 implementation is of a P+Q design
o P is for parity, just like we’ve always had for RAID 5 and can be used to reconstruct data
o Q is for the differential polynomial calculation which when used with Gaussian elimination techniques can also be used to reconstruct data
o It’s probably easier to think of the “Q” as CRC data
• A RAID 6 Volume Group can survive up to two drive failures and maintain access to user data
• Minimum number of drives for a RAID 6 Volume Group is five drives with a maximum of 30
• There is some additional capacity overhead due to the need to store both P and Q data (i.e. the capacity of two disks instead of one like in RAID 5)
• Recovery from RAID 6 failures only requires slight modification of RAID 5 recovery procedures
o Revive up to the third drive to fail
o Reconstruct the first AND second drive to fail
• Reconstructions on RAID 6 volume groups will take about twice as long as a normal RAID 5 reconstruction
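The capacity overhead mentioned above is simple arithmetic: RAID 5 gives up the capacity of one drive and RAID 6 the capacity of two. The Python sketch below works that out and enforces the five-to-30 drive limits quoted for RAID 6; the function name is an assumption for illustration, and no RAID 5 drive-count limits are enforced because none are stated here.

```python
RAID6_MIN_DRIVES = 5
RAID6_MAX_DRIVES = 30

# Number of drives' worth of capacity consumed by redundancy data.
REDUNDANCY_DRIVES = {5: 1, 6: 2}


def usable_capacity(raid_level, drive_count, drive_capacity_bytes):
    """Return usable bytes for a RAID 5 or RAID 6 volume group."""
    if raid_level not in REDUNDANCY_DRIVES:
        raise ValueError("this sketch only covers RAID 5 and RAID 6")
    if raid_level == 6 and not RAID6_MIN_DRIVES <= drive_count <= RAID6_MAX_DRIVES:
        raise ValueError("RAID 6 volume groups need between 5 and 30 drives")
    overhead = REDUNDANCY_DRIVES[raid_level]
    if drive_count <= overhead:
        raise ValueError("not enough drives to hold any user data")
    return (drive_count - overhead) * drive_capacity_bytes
```

With five equal drives, RAID 5 leaves four drives of usable capacity and RAID 6 leaves three.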
Troubleshooting Multiple Drive Failures
• When addressing a multiple drive failure, there are several key pieces of information that need to be determined prior to performing any state modifications.
• RAID Level
o Is it a RAID 6?
– RAID 6 volume group failures occur after 3 drives have failed in the volume group
o Is it a RAID 3/5 or RAID 1?
– RAID 5 volume group failures occur after two drives have failed in a volume group.
o RAID 1 volume group failures occur when enough drives fail to cause an incomplete mirror.
– This could be as few as two drives or half the drives + 1.
o RAID 0 volume groups are dead upon the first drive failure
• Despite the drive failures, is each individual volume group configuration complete?
– i.e. Are all drives accounted for, regardless of failed or optimal?
• How many drives have failed and to which volume group does each drive belong?
• In what order did the drives fail in each individual volume group?
• Are there any global hot spares?
o Are any of the hot spares in use?
o Are there any hot spares not in use and, if so, are they in an optimal condition?
• Are there any backend errors that led to the initial drive failures?
o This is the most common cause of multiple drive failures; all backend issues must be fixed or isolated before continuing any further
Multiple Drive Failures – Why RAID Level is Important
• RAID 6 Volume Groups
o RAID 6 volume groups can survive 2 drive failures due to the P+Q redundancy model; after the third drive failure the volume group is marked as failed
o Up until the third drive failure, data in the stripe is consistent across the drives
• RAID 5 and RAID 3 Volume Groups
o After the second drive failure the volume group and associated volumes are marked as failed; no I/Os have been accepted since the second drive failed
o Up until the second drive failure, data in the stripe is consistent across the drives
• RAID 1 Volume Groups
o RAID 1 volume groups can survive multiple drive failures as long as one side of the mirror is still optimal
o RAID 1 volume groups can be failed after only two drives fail if both the data drive and the mirror drive fail
o Until the mirror becomes incomplete the RAID 1 pairs will function normally
• RAID 0
o As there is no redundancy these arrays cannot generally be recovered. However, the drives can be revived and checked – no guarantees can be made that the data will be recovered.
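The per-RAID-level failure thresholds described above can be captured in a few lines. This Python sketch is illustrative: the function name and the mirror_pairs input (a list of data drive and mirror drive identifiers) are assumptions made for the example, not controller data structures.

```python
# Number of failed drives at which the volume group is marked failed.
FAILURE_THRESHOLD = {0: 1, 3: 2, 5: 2, 6: 3}


def volume_group_failed(raid_level, failed_drives, mirror_pairs=()):
    """Return True if the listed drive failures fail the volume group.

    For RAID 1, mirror_pairs is a hypothetical list of (data, mirror) drive
    identifiers; the group fails once both drives of any pair have failed.
    """
    failed = set(failed_drives)
    if raid_level == 1:
        return any(a in failed and b in failed for a, b in mirror_pairs)
    if raid_level not in FAILURE_THRESHOLD:
        raise ValueError("unsupported RAID level")
    return len(failed) >= FAILURE_THRESHOLD[raid_level]
```

A RAID 1 group with pairs (a1, a2) and (b1, b2) survives the loss of a1 and b2, but not the loss of a1 and a2.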
Multiple Drive Failures – Configuration Considerations
• Although there are several mechanisms to ensure configuration integrity there are failure scenarios that may result in configuration corruption
• If the failed volume group’s configuration is incomplete, reviving and reconstructing drives could permanently corrupt user data
• If any of the drives have an ‘offline’ status (06.xx), reviving drives could revert them to an unassigned state
• How can this be avoided?
o Check to see if the customer has an old profile that shows the appropriate configuration for the failed volume group(s)
o If the volume group configuration appears to be incomplete, corrupted, or if there is any doubt – escalate immediately
Multiple Drive Failures – How Many Drives?
• Assuming the volume group configuration is complete and all drives are accounted for, you need to determine how many drives have failed
• Make a list of the failed drives in each failed volume group
• Using the output of ionShow 12 determine whether or not these drives are in an open state
o If the drives are in a closed state they will be inaccessible and attempts to spin up, revive, or reconstruct will likely fail
Multiple Drive Failures – What’s the failure order?
• Failure order is important for RAID 6, RAID 3/5, and RAID 1 volume group failures.
• Determining the failure order is just as important as determining the status of the failed volume group’s configuration
• Failure order should be determined from multiple data points
o The Major Event Log (MEL)
o Timestamp information from the drive’s DACstore (06.xx)
o Timestamp information from the failed piece (07.xx)
• Often times, failures occur close together and will show up either at the same timestamp or within seconds of each other in the MEL
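Once failure timestamps have been collected from the sources above, ordering them and flagging failures that are too close together to trust is mechanical. The following Python sketch assumes the timestamps have already been extracted by hand into (drive, timestamp) pairs; the function name and the 5-second ambiguity window are illustrative assumptions, not values taken from the firmware.

```python
from datetime import datetime, timedelta


def order_failures(events, ambiguity_window=timedelta(seconds=5)):
    """Sort (drive, timestamp) pairs and flag near-simultaneous failures.

    Returns the drives in failure order (first to fail first) and a list of
    adjacent drive pairs whose timestamps fall within ambiguity_window, which
    should be confirmed against a second data point before acting.
    """
    ordered = sorted(events, key=lambda event: event[1])
    ambiguous = [
        (earlier[0], later[0])
        for earlier, later in zip(ordered, ordered[1:])
        if later[1] - earlier[1] <= ambiguity_window
    ]
    return [drive for drive, _ in ordered], ambiguous
```

Any pair reported as ambiguous is a case where a single source should not be trusted on its own for the failure order.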
Multiple Drive Failures – What’s the failure order? (06.xx)
• In order to obtain information from DACstore the drive must be spun up
isp cfgPrepareDrive,0x<phydev>
Note: this is the only command that uses the “phydev” address, not the devnum address
• This command will spin the drive up, but not place it back in service. It will still be listed as failed by the controller. However, since it is spun up, it will service direct disk reads of the DACstore region necessary for the following commands.
I’ve got my failure order, what’s next?
• Using the information on the previous slides you should have now determined what the failure order is of the drives.
• Special considerations need to be made depending on the RAID level of the failed volume group
o For RAID 6 volume groups, the most important piece of information is the first two drives that failed
o For RAID 5 volume groups, the most important piece of information is the first drive that failed
o For RAID 1 volume groups, the most important piece of information is the first drive that failed causing the mirror to break.
• Before making any modifications to the failed drives, any unused global hot spares should be failed to prevent them sparing for drives unnecessarily.
o To fail the hot spares
– Determine which unused hot spares are to be failed
– From the GUI
• Select the drive
• From the Advanced menu select Recovery >> Fail Drive
– From the controller shell
• Determine the devnums of the hot spares that are to be failed
• Using the devnum enter
– isp cfgFailDrive,0x<devnum> (06.xx)
– setDriveToFailed_MT 0x<devnum> (06.xx & 07.xx)
Reviving Drives
• Begin with the last drive that failed and revive drives until the volume group becomes degraded
• From the GUI
o Select the last drive to fail and from the Advanced menu select Recovery >> Revive >> Drive
o Check to see if the volume group is degraded; if not, move on to the next drive (Last -> First) and revive it. Repeat this step until the volume group is degraded
o Volume group and associated volumes should now be in a degraded state.
• From the controller shell
o Using the devnum of the drive perform the following
• isp cfgSetDevOper,0x<devnum> (06.xx)
• setDriveToOptimal_MT 0x<devnum> (06.xx & 07.xx)
o Check to see if the volume group is degraded; if not, move on to the next drive (Last -> First) and revive it. Repeat this step until the volume group is degraded
o The volume group and associated volumes should now be in a degraded state
• Mount volumes in read-only (if possible) and verify data
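Given a confirmed failure order, the split between drives to revive and drives to reconstruct follows directly from the RAID level. The Python sketch below only plans the order and does not issue any commands; the function name is hypothetical, and RAID 1 is left out because the answer there depends on which mirror pairs were broken.

```python
# Drives that failed first hold stale data and are reconstructed, not revived.
RECONSTRUCT_COUNT = {3: 1, 5: 1, 6: 2}


def plan_recovery(raid_level, failure_order):
    """Split failed drives into a revive list and a reconstruct list.

    failure_order lists drives first-to-fail first. The revive list is
    returned last-to-fail first, matching the Last -> First procedure above.
    """
    if raid_level not in RECONSTRUCT_COUNT:
        raise ValueError("this sketch covers RAID 3, 5 and 6 only")
    count = RECONSTRUCT_COUNT[raid_level]
    reconstruct = list(failure_order[:count])
    revive = list(reversed(failure_order[count:]))
    return revive, reconstruct
```

For a RAID 5 group where drives A, B and C failed in that order, this yields revive C then B, and reconstruct A. In practice, stop reviving as soon as the volume group reports degraded.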
Cleanup
• If data checks out, reconstruct the remaining failed drives, replace drives as warranted
– From the GUI
• Select the drive
• From the Advanced menu select Recovery >> Reconstruct Drive
– From the controller shell
• Using the devnum of the drive perform the following
• isp cfgReplaceDrive,0x<devnum> (06.xx)
• startDriveReconstruction_MT 0x<devnum> (06.xx & 07.xx)
• Once reconstructions have begun, the previously failed hot spares can be revived
– From the GUI
• Select the last drive to fail
• From the Advanced menu select Recovery >> Revive >> Drive
– From the controller shell
• Using the devnum of the drive perform the following
– isp cfgSetDevOper,0x<devnum> (06.xx)
– setDriveToOptimal_MT 0x<devnum> (06.xx & 07.xx)
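The fail, revive and reconstruct shell commands quoted in this section differ by firmware level, so a lookup table helps avoid typing a 06.xx-only command on a 07.xx controller. The Python sketch below only formats command strings for review; the command names and their firmware applicability are copied from this section, while the table layout and function name are assumptions for illustration.

```python
# (command template, firmware levels it applies to), as listed in this section.
COMMANDS = {
    "fail": [
        ("isp cfgFailDrive,0x{devnum:x}", {"06.xx"}),
        ("setDriveToFailed_MT 0x{devnum:x}", {"06.xx", "07.xx"}),
    ],
    "revive": [
        ("isp cfgSetDevOper,0x{devnum:x}", {"06.xx"}),
        ("setDriveToOptimal_MT 0x{devnum:x}", {"06.xx", "07.xx"}),
    ],
    "reconstruct": [
        ("isp cfgReplaceDrive,0x{devnum:x}", {"06.xx"}),
        ("startDriveReconstruction_MT 0x{devnum:x}", {"06.xx", "07.xx"}),
    ],
}


def shell_commands(action, devnum, firmware):
    """Return the command strings valid for this action and firmware level."""
    if firmware not in ("06.xx", "07.xx"):
        raise ValueError("firmware must be '06.xx' or '07.xx'")
    if action not in COMMANDS:
        raise ValueError("action must be 'fail', 'revive' or 'reconstruct'")
    return [
        template.format(devnum=devnum)
        for template, levels in COMMANDS[action]
        if firmware in levels
    ]
```

For example, shell_commands("revive", 0x10, "07.xx") returns only the setDriveToOptimal_MT form, since the isp command is listed for 06.xx only.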
Multiple Drive Failures – A Few Final Notes
• If there is any doubt about the failure order, the array configuration, or you are simply not confident – find a senior team member to consult with prior to taking any action.
– Beyond this you can ALWAYS escalate
• You are dealing with a customer’s data, be mindful of this at all times.
– Think about what you are doing, establish a plan based on high level facts
– Take your time
– Write down the information as you review the data
– If something doesn’t look right, ask a co-worker or escalate
• RAID 0 Volume Groups
– Revive the drives, check the data.
– There is no guarantee that data will be recovered, and depending on the nature of the drive failure the array may not stay optimal long enough to use the data.
• If there are multiple drive failures, there is a chance that a backend problem is at fault
– DO NOT PULL AND RESEAT DRIVES
– Every attempt should be made to resolve any backend issues prior to changing drive states.
– Get the failure order information, address the backend issue, spin up drives and restore access.
Offline Volume Groups
Offline Volume Groups (06.xx)
• As a protection mechanism in 06.xx configuration manager, if all members (drives) of a volume group are not present during start of day, the controller will mark the associated volume group offline until all members are available
• This behavior can cause situations where a volume group is left in an offline status with all drives present, or with one drive listed as out of service
Offline Volume Groups (06.xx)
• IMPORTANT: If a group is offline, it is unavailable for configuration changes. That means that if any drives in the associated volume group are failed and revived, they will not be configured into the volume group, but will transition to an unassigned state instead
• In order to bring a volume group online through the controller shell with no pieces out of service, or only one piece out of service
– isp cfgMarkNonOptimalDriveGroupOnline,<SSID>
• Where ‘SSID’ is any volume in the group, this only needs to be run once against any volume in the group
Offline Volume Groups (07.xx)
• Because 07.xx firmware does not implement this functionality, it is not expected that this will be a concern for 07.xx systems
• Volume Groups that do not have all members (drives) present during start of day will transition to their appropriate state
– Partially Complete – Degraded
– Incomplete – Dead
– Missing
• Even though the group is listed as degraded or dead, it is possible that all volumes will still be in an optimal state since no pieces are marked as out of service
Clearing the Configuration
• In extreme situations it may be necessary to clear the configuration from the system
• This can be accomplished by either clearing the configuration information from the appropriate region in DACstore or by completely wiping DACstore from the drives and rebuilding it during start of day
• The configuration can be reset via the GUI
– Advanced >> Recovery >> Reset >> Configuration (06.xx)
– Advanced >> Recovery >> Clear Configuration >> Storage Array (07.xx)
• To wipe the configuration information
– sysWipe
• This command must be run on both controllers.
• For 06.xx systems, the controllers must be rebooted once the command has completed.
• As of 07.xx the controllers will reboot automatically once the command has completed
• To wipe DACstore from all drives
– sysWipeZero 1 (06.xx)
– dsmWipeAll (07.xx)
• After either of these commands, the controllers must be rebooted in order to write new DACstore to all the drives
• To wipe DACstore from a single drive
– isp cfgWipe1,0x<devnum> (06.xx)
• Either the controllers must be rebooted in order to write new DACstore to the drive, or it must be (re)inserted into a system
– dsmWipe 0x<devnum>,<writeNewDacstore> (07.xx)
• Where <writeNewDacstore> is either a 0 to not write new DACstore until start of day or the drive is (re)inserted into a system, or a 1 to write new clean DACstore once it has been cleared
• There are times where the Feature Enable Identifier key becomes corrupt; in order to clear it and generate a new Feature Enable Identifier use the following command.
• safeSysWipe (06.xx and 07.xx)
• For 07.xx systems, you must also remove the safe header from the database
• dbmRemoveSubRecordType 18 (07.xx)
Note: This is a very dangerous command as it wipes out a record in the database – make sure you type “18” and not another number
• Once this has been completed on both controllers, they will need to both be rebooted in order to generate a new ID.
• All premium feature keys will need to be regenerated with the new ID and reapplied.
Recovering Lost Volumes
• There are times that volumes are lost and need to be recovered, either due to a configuration problem with the storage array, or because the customer simply deleted the wrong volume
• Multiple pieces of information must be known about the missing volume in order to ensure data recovery
– Drives and Piece Order of the drives in the missing volume group
– Capacity of each volume in the volume group
– Disk offset where each volume starts
– Segment Size of the volumes
– RAID level of the group
– Last known state of the drives
• This information can be obtained relatively easily from historical capture all support data files
• Finding Drive and Piece order
– Old Profile in the ‘Volume Group’ section
– vdShow or cfgUnit output in the stateCaptureData.dmp file (06.xx)
– evfShowVol output in the stateCaptureData.txt file (07.xx)
• Finding Capacity, Offset, RAID level, and Segment size
– vdShow or cfgUnit output in the stateCaptureData.dmp file (06.xx)
– evfShowVol output in the stateCaptureData.txt file (07.xx)
• The last known state of the drives is a special case where a drive was previously failed in a volume prior to the deletion of the volume, it must be failed again after the recreation of the volume in order to maintain consistent data/parity
• SMcli command to recreate a volume without initializing data on the volume
– recover volume (drive=(trayID,slotID) | drives=(trayID1,slotID1 ... trayIDn,slotIDn) | volumeGroup=volumeGroupNumber) userLabel="volumeName" capacity=volumeCapacity offset=offsetValue raidLevel=(0 | 1 | 3 | 5 | 6) segmentSize=segmentSizeValue [owner=(a | b) cacheReadPrefetch=(TRUE | FALSE)]
• This command is discussed in the EMW help in further detail
– Help >> Contents >> Command Reference Table of Contents >> Commands Listed by Function >> Volume Commands >> Recover RAID Volume
• When specifying the capacity, specify it in bytes for a better chance of data recovery; if it is entered in gigabytes, rounding discrepancies could affect the outcome
• A lost volume can be created using this method as many times as necessary until the data is recovered as long as there are no writes that take place to the volume when it is recreated improperly
• NEVER use this method to create a brand new volume that contains no data. Doing so will cause data corruption upon degradation, since the volume was never initialized during creation.
• If creating volumes using the GUI, instead of the ‘recover volume’ CLI command, steps must first be made in the controller shell in order to prevent initialization
• There is a flag in the controller shell that defines whether or not to initialize the data region of the drives upon new volume creations
– writeZerosFlag
Recovering Lost Volumes – Setup
(Note in the following examples: red denotes what to type, black is the output, blue press <enter> key)
-> writeZerosFlag
value = 0 = 0x0
-> writeZerosFlag=1
-> writeZerosFlag
value = 1 = 0x1
-> VKI_EDIT_OPTIONS
EDIT APPLICATION SCRIPTS (disabled)
Enter ‘I’ to insert statement; ‘D’ to delete statement; ‘C’ to clear all options; + to enable debug options; ‘Q’ to quit
i <enter>
Enter statement to insert (exit insert mode with newline only):
writeZerosFlag=1 <enter>
EDIT APPLICATION SCRIPTS (disabled)
1) writeZerosFlag=1
Enter ‘I’ to insert statement; ‘D’ to delete statement; ‘C’ to clear all options; + to enable debug options; ‘Q’ to quit
+ <enter>
EDIT APPLICATION SCRIPTS (enabled)
1) writeZerosFlag=1
Enter ‘I’ to insert statement; ‘D’ to delete statement; ‘C’ to clear all options; + to enable debug options; ‘Q’ to quit
q <enter>
Commit changes to NVSRAM (y/n) y <enter>
value = 12589824 = 0xc01b00
->
Recovering Lost Volumes
• A lost volume can be created using this method as many times as necessary until the data is recovered as long as there are no writes that take place to the volume when it is recreated improperly
• NEVER use this method to create a brand new volume that contains no data. Doing so will cause data corruption upon degradation, since the volume was never initialized during creation
• Always verify that once the volume has been recreated that the system has been cleaned up from all changes made during the volume recreation process
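When recreating a volume, the earlier advice to give the capacity in bytes rather than gigabytes can be illustrated with a little arithmetic. The Python sketch below assumes that one GB means 2^30 bytes and that a displayed capacity has been rounded to a whole number of GB; both are assumptions for illustration, and the actual rounding used by any given profile or CLI may differ.

```python
GIB = 1024 ** 3  # assumption: 1 GB is taken as 2**30 bytes


def rounding_discrepancy(capacity_bytes):
    """Bytes lost (positive) or gained (negative) by rounding to whole GB."""
    whole_gb = (capacity_bytes + GIB // 2) // GIB  # round half up, integers only
    return capacity_bytes - whole_gb * GIB


# A volume of 50 GB plus 300 MB would be recreated 300 MB too small.
original = 50 * GIB + 300 * 1024 ** 2
print(rounding_discrepancy(original))
```

A recreated volume whose capacity differs from the original no longer lines up with the data that is already on the drives, which is why the exact byte count matters.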
Recovering Lost Volumes – Cleanup
-> writeZerosFlag
value = 1 = 0x1
-> writeZerosFlag=0
-> writeZerosFlag
value = 0 = 0x0
-> VKI_EDIT_OPTIONS
EDIT APPLICATION SCRIPTS (enabled)
1) writeZerosFlag=1
Enter ‘I’ to insert statement; ‘D’ to delete statement; ‘C’ to clear all options; + to enable debug options; ‘Q’ to quit
c <enter>
Clear all options? (y/n) y <enter>
EDIT APPLICATION SCRIPTS (enabled)
Enter ‘I’ to insert statement; ‘D’ to delete statement; ‘C’ to clear all options; + to enable debug options; ‘Q’ to quit
- <enter>
EDIT APPLICATION SCRIPTS (disabled)
Enter ‘I’ to insert statement; ‘D’ to delete statement; ‘C’ to clear all options; + to enable debug options; ‘Q’ to quit
q <enter>
Commit changes to NVSRAM (y/n) y <enter>
value = 12589824 = 0xc01b00
->
Recovering Lost Volumes – IMPORTANT
• IMPORTANT: Do not attempt to recover lost volumes without development help. This procedure deals with customer data, so it is a very sensitive matter
Knowledge Check
1) 06.xx – List the process required to determine the drive failure order for a volume group.
2) 07.xx – List the process required to determine the drive failure order for a volume group.
3) Clearing the configuration is a normal troubleshooting technique that will be used frequently.
True False
4) Recovering a lost volume is a simple process that should be done without needing to take much into consideration.
True False
Module 4: Fibre Channel Overview and Analysis
Upon completion, the participant should be able to do the following:
• Describe how fibre channel topology works
• Determine how fibre channel topology relates to the different protocols that LSI uses in its storage array products
• Analyze backend errors for problem determination and isolation
Fibre Channel
• Fibre Channel is a transport protocol
– Used with upper layer protocols such as SCSI, IP, and ATM
• Provides a maximum of 127 ports in an FC-AL environment
– This is the limiting factor in the number of expansion drive trays that can be used on a loop pair
Fibre Channel Arbitrated Loop (FC-AL)
• Devices are connected in a ‘one way’ loop or ring topology
– Can either be physically connected in a ring fashion or using a hub
• Bandwidth is shared among all devices on the loop
• Arbitration is required for one port (the ‘initiator’) to communicate with another (the ‘target’)
Fibre Channel Arbitrated Loop (FC-AL) – The LIP
• Prior to beginning I/O operations on any drive channel, a Loop Initialization (LIP) must occur.
– This must be done to address devices (ports) on the channel with an ALPA (Arbitrated Loop Physical Address) and build the loop positional map
• A 128-bit (four word) map is passed around the loop by the loop master (the controller)
– Each offset in the map corresponds to an ALPA and has a state of either 0 for unclaimed or 1 for claimed
• There are two steps in the LIP that we will skip
– LISM – Loop Initialization Select Master
   • The “Loop Master” is determined
   • The “Loop Master” assumes the lowest ALPA (0x01)
   • The “A” controller is always the loop master (under optimal conditions)
– LIFA – Loop Initialization Fabric Assigned
   • Fabric-assigned addresses are determined
   • Occurs on HOST side connections
• The three steps we will be looking at are
– LIPA – Loop Initialization Previously Acquired
– LIHA – Loop Initialization Hard Assigned
– LISA – Loop Initialization Soft Assigned
• The LIP process is the same regardless of drive trays attached (JBOD & SBOD)
Fibre Channel Arbitrated Loop (FC-AL) – The LIP
• The LIPA Phase
– The Loop Master sends the loop map out and designates it as the LIPA phase in the header of the frame
– The loop map is passed from device to device in order
– If a device’s port was previously logged in to the loop, it will attempt to assume its previous address by setting the appropriate offset in the map to ‘1’
– If a device was not previously addressed, it will pass the frame on to the next device in the loop
Fibre Channel Arbitrated Loop (FC-AL) – The LIP
• The LIHA Phase
– Once the LIPA phase is complete, the loop master will send the loop map out again, this time specifying the LIHA phase in the header of the frame
– The loop map is once again passed from device to device in the loop
– Each device will check its hard address against the loop map
– If the offset of the loop map that corresponds to the device’s hard address is available (set to 0), it will set that bit to 1, assuming the corresponding ALPA, and pass the loop map on to the next device
– If the hard address is not available, it will pass the loop map on and await the LISA stage of initialization
– Devices that assumed an ALPA in the LIPA phase will simply pass the map on to the next device
Fibre Channel Arbitrated Loop (FC-AL) – The LIP
• How are hard addresses determined?
– Hard Addresses are determined by the ‘ones’ digit of the drive tray ID and the slot position of the device in the drive tray
– Controllers are set via hardware to always assume the same hard IDs to ensure that they assume the lowest two ALPAs in the loop map (0x01 for “A” and 0x02 for “B”)
• What is the benefit?
– By using hard addressing on devices, a LIP can be completed quickly and non-disruptively
– LIPs can occur for a variety of reasons: loss of communication/synchronization, or new devices joining the loop (hot adding drives and ESMs)
– I/Os that were in progress when the LIP occurred can be recovered quickly without the need for lengthy timeouts and retries
Fibre Channel Arbitrated Loop (FC-AL) – The LIP
• The LISA Phase
– Once the LIHA phase has completed, the loop master will send the loop map out again, now designating it as the LISA phase in the frame header
– Devices that did not assume an ALPA on the loop map in the LIPA and LIHA phases of initialization will now take the first available ALPA in the loop map
   • If no ALPA is available, the device will be ‘non-participating’ and will not be addressable on the loop
– When the LISA frame returns to the loop master, the loop master will check the frame header for a specific value that indicates that LISA has completed
Fibre Channel Arbitrated Loop (FC-AL) – The LIP
• Once LISA has completed, the loop master will distribute the loop map again and each device will enter its hex ALPA in the order that it is received
– This is referred to as the LIRP (Loop Initialization Report Position) phase
• The loop master will distribute the completed loop map to all devices to inform them of their relative position in loop to the loop master
– This is referred to as the LILP (Loop Initialization Loop Position) phase
• The loop master ends the LIP by transmitting a CLS (Close) frame to all devices on the loop placing them in monitor mode
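The three address phases described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not controller firmware: the `Port` class and its field names are hypothetical, the map is modeled as a Python integer used as a 128-bit bitmap, and offsets are plain indexes rather than real ALPA byte values.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAP_BITS = 128  # the guide describes a 128-bit (four word) map


@dataclass
class Port:
    name: str
    previous: Optional[int] = None  # offset held before this LIP, if any
    hard: Optional[int] = None      # offset that corresponds to the hard address
    assigned: Optional[int] = None  # offset claimed during this LIP
    phase: Optional[str] = None     # phase in which the offset was claimed


def claim(bitmap: int, offset: Optional[int]) -> Tuple[int, bool]:
    """Set the bit at `offset` if it is still 0 and report whether it was claimed."""
    if offset is None or not 0 <= offset < MAP_BITS:
        return bitmap, False
    mask = 1 << offset
    if bitmap & mask:
        return bitmap, False  # already claimed by another port
    return bitmap | mask, True


def first_free(bitmap: int) -> Optional[int]:
    """Return the lowest unclaimed offset, or None when the map is full."""
    for offset in range(MAP_BITS):
        if not bitmap & (1 << offset):
            return offset
    return None


def run_lip(ports: List[Port]) -> int:
    """Pass the map around the loop once per phase, in loop order."""
    bitmap = 0
    for port in ports:
        port.assigned, port.phase = None, None
    for phase in ("LIPA", "LIHA", "LISA"):
        for port in ports:
            if port.assigned is not None:
                continue  # already addressed: pass the map on unchanged
            if phase == "LIPA":
                wanted = port.previous
            elif phase == "LIHA":
                wanted = port.hard
            else:
                wanted = first_free(bitmap)
            bitmap, claimed = claim(bitmap, wanted)
            if claimed:
                port.assigned, port.phase = wanted, phase
    return bitmap


if __name__ == "__main__":
    loop = [
        Port("ctrl-A", hard=0),
        Port("drive-1", previous=5, hard=5),
        Port("drive-2", hard=5),  # hard address collides with drive-1
    ]
    run_lip(loop)
    for port in loop:
        print(port.name, port.assigned, port.phase)
```

In the demo, drive-2 cannot claim its hard offset because drive-1 already holds it, so it falls through to the LISA phase and takes the first free offset. A port whose `assigned` field is still `None` after `run_lip` corresponds to a non-participating device.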
Fibre Channel Arbitrated Loop (FC-AL) – The LIP
• Hard Address Contention
– Hard address contention occurs when a device is unable to assume the ALPA that corresponds to its hard address and can be caused by
   • The ‘ones’ digit of the tray ID not being unique among the drive trays on a given loop
   • A hardware problem that results in the device reading the incorrect hard address, or the device simply reporting the wrong address during the LIP
– Hard address contention will result in devices taking soft addresses during the LIP
• ALPA Map Corruption
– A bad device on the loop will corrupt the ALPA map, resulting in devices not assuming the correct address or not participating in the loop
• The net of these conditions is that LIPs become a disruptive process that can have adverse effects on the operation of the loop
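The first cause of hard address contention listed above can be checked mechanically. The following is a minimal sketch, assuming the tray IDs on one loop are available as integers; the function name `ones_digit_conflicts` is hypothetical and is not a controller shell command.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def ones_digit_conflicts(tray_ids: Iterable[int]) -> Dict[int, List[int]]:
    """Group the tray IDs on one loop by their ones digit.

    Returns only the groups that hold more than one tray, because those
    trays would contend for the same hard addresses.
    """
    groups: Dict[int, List[int]] = defaultdict(list)
    for tray_id in tray_ids:
        groups[tray_id % 10].append(tray_id)
    return {digit: ids for digit, ids in groups.items() if len(ids) > 1}


if __name__ == "__main__":
    # Trays 1 and 11 share the ones digit 1.
    print(ones_digit_conflicts([1, 2, 11, 23]))
```

An empty result means the ones digits are unique on that loop; it does not rule out the hardware-related causes.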
Fibre Channel Arbitrated Loop (FC-AL) – Communication
• Each port has what is referred to as a Loop Port State Machine (LPSM) that is used to define the behavior when it requires access or use of the loop
• While the loop is idle, the LPSM will be in MONITOR mode and transmitting IDLE frames
• In order for one device to communicate with another, arbitration must be performed
– An ARB frame will be passed along the loop from the initiating device to the target device
– If the ARB frame is received back and still contains the ALPA of the initiating device, that device will transition from MONITOR to ARB_WON
– An OPN (Open) frame will be sent to the device that it wishes to open communication with
– Data is transferred between the two devices
– CLS (Close) is sent and the device ports return to the MONITOR state
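The arbitration sequence above can be sketched as a tiny state machine. This is a simplified illustration, not the full LPSM: MONITOR and ARB_WON come from the text, while the `OPEN` state and the event names are hypothetical labels chosen for this sketch.

```python
from enum import Enum
from typing import Iterable


class State(Enum):
    MONITOR = "MONITOR"
    ARB_WON = "ARB_WON"
    OPEN = "OPEN"  # hypothetical name: the guide does not name this state


# (current state, event) -> next state; event names are illustrative only
TRANSITIONS = {
    (State.MONITOR, "own_arb_received"): State.ARB_WON,
    (State.ARB_WON, "opn_sent"): State.OPEN,
    (State.OPEN, "cls"): State.MONITOR,
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not valid in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None


def run(events: Iterable[str]) -> State:
    """Start in MONITOR and apply the events in order."""
    state = State.MONITOR
    for event in events:
        state = step(state, event)
    return state


if __name__ == "__main__":
    print(run(["own_arb_received", "opn_sent", "cls"]).name)
```

Data transfer happens while the port is in the `OPEN` state; the sketch does not model the transfer itself.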
Knowledge Check
1) The Fibre Channel protocol does not have very much overhead for login and communication.
True False
2) Soft addressing should not cause a problem in an optimal system.
True False
3) List all the LIP phases:
Drive Side Architecture Overview
SCSI Architecture Model Terminology
• nexus: A relationship between two SCSI devices, and the SCSI initiator port and
SCSI target port objects within those SCSI devices.
• I_T nexus: A nexus between a SCSI initiator port and a SCSI target port.
• logical unit: A SCSI target device object, containing a device server and task manager, that implements a device model and manages tasks to process commands sent by an application client.
Role column
FCdr – Fibre Channel drive
SATAdr – SATA drive
SASdr – SAS drive
ORP columns indicate the overall state of the lu for disk device types (normally should be “+++”).
O = Operation – the state of the ITN currently chosen
  +) chosen itn is not degraded
  d) chosen itn is degraded
R = Redundancy – the state of the redundant ITN
  +) alternate itn is up
  d) alternate itn is degraded
  -) alternate itn is down
  x) there is no alternate itn
P = Performance – Are we using the preferred path?
  +) chosen itn is preferred
  -) chosen itn is not preferred
   ) no itn preferences
The Channels column indicates the state of the lu's itn on each channel.
  *) up & chosen
  +) up & not chosen
  D) degraded & chosen
  D) degraded & not chosen
  -) down
  x) not present
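The ORP legend above lends itself to a small lookup. The sketch below assumes the three ORP flags are available as a three-character string such as “+++” and that the “no itn preferences” symbol is a space; `decode_orp` and `is_optimal` are hypothetical helper names, not shell commands.

```python
from typing import Dict

# One (column name, legend) pair per ORP position, in O, R, P order.
ORP_LEGEND = (
    ("operation", {
        "+": "chosen itn is not degraded",
        "d": "chosen itn is degraded",
    }),
    ("redundancy", {
        "+": "alternate itn is up",
        "d": "alternate itn is degraded",
        "-": "alternate itn is down",
        "x": "there is no alternate itn",
    }),
    ("performance", {
        "+": "chosen itn is preferred",
        "-": "chosen itn is not preferred",
        " ": "no itn preferences",  # assumption: the blank symbol is a space
    }),
)


def decode_orp(flags: str) -> Dict[str, str]:
    """Translate a three-character ORP string into its legend text."""
    if len(flags) != 3:
        raise ValueError("ORP must be exactly three characters")
    decoded = {}
    for (column, legend), symbol in zip(ORP_LEGEND, flags):
        if symbol not in legend:
            raise ValueError(f"unknown {column} symbol {symbol!r}")
        decoded[column] = legend[symbol]
    return decoded


def is_optimal(flags: str) -> bool:
    """The guide states that ORP should normally read '+++'."""
    return flags == "+++"


if __name__ == "__main__":
    print(decode_orp("+-+"))
```

Anything other than “+++” is worth a closer look at the Channels column for the same lu.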
Fibre Channel Overview and Analysis
• To reset the backend statistics that are displayed by the previous commands, run
o iopPerfMonRestart
• This must be done on both controllers
• This also flushes the debug queue
Knowledge Check
1) What command will show drive path information?
2) What command will show what hosts are logged in?
Destination Driver Events
Destination Driver Events (Error Codes)
• Target detected errors:
status-sk/asc/ascq = use SCSI definitions
(status=ff means unused, sk=00 means unused)
• Hid detected errors:
02-0b/00/00 IO timeout
ff-00/01/00 ITN fail timeout (ITN has been disconnected for too long)
ff-00/02/00 device fail timeout (all ITNs to device have been disconnected for too long)
ff-00/03/00 cmd breakup error
Destination Driver Events (Error Codes)
• Lite detected errors:
02-0b/xx/xx xx = XCB_STAT code from table below
#define XCB_STAT_GEN_ERROR 0x01
#define XCB_STAT_BAD_ALPA 0x02
#define XCB_STAT_OVERFLOW 0x03
#define XCB_STAT_COUNT 0x04
#define XCB_STAT_LINK_FAILURE 0x05
#define XCB_STAT_LOGOUT 0x06
#define XCB_STAT_OXR_ERROR 0x07
#define XCB_STAT_ABTS_SENDER 0x08
#define XCB_STAT_ABTS_RECEIVER 0x09
#define XCB_STAT_OP_HALTED 0x0a
#define XCB_STAT_DATA_MISMATCH 0x0b
#define XCB_STAT_KILL_IO 0x0c
#define XCB_STAT_BAD_SCSI 0x0d
#define XCB_STAT_MISROUTED 0x0e
#define XCB_STAT_ABTS_REPLY_TIMEOUT 0x0f
#define XCB_STAT_REPLY_TIMEOUT 0x10
#define XCB_STAT_FCP_RSP_ERROR 0x11
#define XCB_STAT_LS_RJT 0x12
#define XCB_STAT_FCP_CHECK_COND 0x13
#define XCB_STAT_FCP_SCSI_STAT 0x14
#define XCB_STAT_FCP_RSP_CODE 0x15
#define XCB_STAT_FCP_SCSICON 0x16
#define XCB_STAT_FCP_RESV_CONFLICT 0x17
#define XCB_STAT_FCP_DEVICE_BUSY 0x18
#define XCB_STAT_FCP_QUEUE_FULL 0x19
#define XCB_STAT_FCP_ACA_ACTIVE 0x1a
#define XCB_STAT_MEMORY_ERR 0x1b
#define XCB_STAT_ILLEGAL_REQUEST 0x1c
#define XCB_STAT_MIRROR_CHANNEL_BUSY 0x1d
#define XCB_STAT_FCP_INV_LUN 0x1e
#define XCB_STAT_FCP_DL_MISMATCH 0x1f
#define XCB_STAT_EDC_ERROR 0x20
#define XCB_STAT_EDC_BLOCK_SIZE_ERROR 0x21
#define XCB_STAT_EDC_ORDER_ERROR 0x22
#define XCB_STAT_EDC_REL_OFFSET_ERROR 0x23
#define XCB_STAT_EDC_UDT_FLUSH_ERROR 0x24
#define XCB_STAT_FCP_IOS 0x25
#define XCB_STAT_FCP_IOS_DUP 0x26
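The status-sk/asc/ascq notation and the tables above can be combined into a small decoder. This is a sketch under assumptions: the helper names are hypothetical, only a subset of the XCB_STAT table is reproduced, and the code assumes that the XCB_STAT value of a Lite detected error is carried in the asc field, which the guide does not state explicitly.

```python
import re
from typing import NamedTuple


class DriverError(NamedTuple):
    status: int
    sk: int
    asc: int
    ascq: int


_CODE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{2})/([0-9a-f]{2})/([0-9a-f]{2})$", re.IGNORECASE
)

# Hid detected errors, as listed in the guide.
HID_ERRORS = {
    (0x02, 0x0B, 0x00, 0x00): "IO timeout",
    (0xFF, 0x00, 0x01, 0x00): "ITN fail timeout",
    (0xFF, 0x00, 0x02, 0x00): "device fail timeout",
    (0xFF, 0x00, 0x03, 0x00): "cmd breakup error",
}

# Subset of the XCB_STAT table above; extend as needed.
XCB_STAT = {
    0x01: "XCB_STAT_GEN_ERROR",
    0x02: "XCB_STAT_BAD_ALPA",
    0x05: "XCB_STAT_LINK_FAILURE",
    0x0E: "XCB_STAT_MISROUTED",
    0x10: "XCB_STAT_REPLY_TIMEOUT",
}


def parse_code(text: str) -> DriverError:
    """Split a 'status-sk/asc/ascq' string into four integers."""
    match = _CODE.match(text.strip())
    if match is None:
        raise ValueError(f"not a status-sk/asc/ascq code: {text!r}")
    return DriverError(*(int(group, 16) for group in match.groups()))


def describe(text: str) -> str:
    """Classify a code as Hid detected, Lite detected, or target detected."""
    err = parse_code(text)
    hid = HID_ERRORS.get(tuple(err))
    if hid is not None:
        return f"Hid detected: {hid}"
    if (err.status, err.sk) == (0x02, 0x0B):
        # ASSUMPTION: the XCB_STAT code is read from the asc field.
        name = XCB_STAT.get(err.asc, f"XCB_STAT 0x{err.asc:02x} (not in subset)")
        return f"Lite detected: {name}"
    return (
        f"Target detected: status 0x{err.status:02x}, "
        f"sk/asc/ascq {err.sk:02x}/{err.asc:02x}/{err.ascq:02x} (use SCSI definitions)"
    )


if __name__ == "__main__":
    for code in ("02-0b/00/00", "02-0b/05/00", "ff-00/02/00"):
        print(code, "->", describe(code))
```

Codes that match neither table fall through to the target-detected branch, where the SCSI definitions apply.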
Read Link Status (RLS) and Switch-on-a-Chip (SOC)
• Each port on each device maintains a Link Error Status Block (LESB) which tracks the following errors
– Invalid Transmission Words
– Loss of Signal
– Loss of Synchronization
– Invalid CRCs
– Link Failures
– Primitive Sequence Errors
• Read Link Status (RLS) is a link service that collects the LESB from each device
• Transmission Words
– Formed by 4 Transmission Characters
– Two types:
   • Data Word – Dxx.y, Dxx.y, Dxx.y, Dxx.y
   • Special Function Word such as Ordered Set – Kxx.y, Dxx.y, Dxx.y, Dxx.y
– Ordered Set consists of Frame Delimiter, Primitive Signal, and Primitive Sequence
• A Transmission Word is Invalid when one of the following conditions is detected:
– At least one Invalid Transmission Character is within the Transmission Word
– Any valid Special Character is at the second, third, or fourth character position of a Transmission Word
– A defined Ordered Set is received with Improper Beginning Running Disparity
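The three invalid-word conditions above can be expressed as a short check. This sketch works on character names in Dxx.y / Kxx.y notation rather than on real 10-bit codes: any token that does not fit the notation stands in for an invalid transmission character, and running disparity is not computed but passed in as a flag. The function name and both simplifications are assumptions made for illustration.

```python
import re
from typing import Optional, Sequence

_CHAR = re.compile(r"^[DK](\d{1,2})\.(\d)$")


def _well_formed(char: str) -> bool:
    """True for names like D21.4 or K28.5 (xx in 0..31, y in 0..7)."""
    match = _CHAR.match(char)
    return bool(match) and int(match.group(1)) <= 31 and int(match.group(2)) <= 7


def invalid_reason(word: Sequence[str], bad_disparity: bool = False) -> Optional[str]:
    """Return why a four-character transmission word is invalid, or None if valid."""
    if len(word) != 4:
        raise ValueError("a transmission word has exactly four characters")
    for position, char in enumerate(word, start=1):
        if not _well_formed(char):
            return f"invalid transmission character at position {position}"
    for position, char in enumerate(word[1:], start=2):
        if char.startswith("K"):
            return f"special character at position {position}"
    if word[0].startswith("K") and bad_disparity:
        return "ordered set with improper beginning running disparity"
    return None


if __name__ == "__main__":
    print(invalid_reason(["K28.5", "D21.4", "D21.5", "D21.5"]))
    print(invalid_reason(["D0.0", "K28.5", "D0.0", "D0.0"]))
```

A real receiver makes these decisions in the 8b/10b decoder; the sketch only mirrors the rule order given in the list.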
RLS Diagnostics
• Analyze RLS Counts:
– Look for “step” or “spike” in error counts
– Identify the first device (in Loop Map Order) that detects a high number of Link Errors
   • Link Error Severity Order: LF > LOS > ITW
– Get the location of the first device
– Get the location of its upstream device
RLS Diagnostics Example
Example:
• Drive [0,9] has high error counts in ITW, LF, and LOS
• Upstream device is Drive [0,8]
• Drive [0,8] and Drive [0,9] are in same tray
• Most likely bad component: Drive [0,8]
Important Note:
• Logs need to be interpreted, not merely read
• The data is representative of errors seen by the devices on the loop
• There is no standard for error counting
• Different devices may count errors at different rates
• RLS counts are still valid in SOC environments
• They are not valid, however, for SATA trays
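The RLS analysis steps can be sketched as a scan over devices in loop map order. The data layout, the `threshold` parameter, and the counts in the demo are all hypothetical; real RLS captures need the interpretation described in the notes above, and a fixed threshold is only a stand-in for spotting a “step”.

```python
from typing import Dict, List, Optional, Tuple

SEVERITY = ("LF", "LOS", "ITW")  # most severe first, per the guide

Device = Tuple[str, Dict[str, int]]


def first_suspect(loop: List[Device], threshold: int) -> Optional[Tuple[str, str, str]]:
    """Find the first device in loop map order with a high link error count.

    Returns (device, upstream device, most severe counter at or above the
    threshold), or None when no device reaches the threshold.
    """
    for index, (name, counts) in enumerate(loop):
        for kind in SEVERITY:
            if counts.get(kind, 0) >= threshold:
                # The loop is a ring, so index 0 wraps to the last device.
                upstream = loop[index - 1][0]
                return name, upstream, kind
    return None


if __name__ == "__main__":
    capture = [
        ("Controller A", {"ITW": 0, "LOS": 0, "LF": 0}),
        ("Drive [0,8]", {"ITW": 3, "LOS": 0, "LF": 0}),
        ("Drive [0,9]", {"ITW": 5000, "LOS": 120, "LF": 150}),
        ("Drive [0,10]", {"ITW": 4800, "LOS": 0, "LF": 0}),
    ]
    print(first_suspect(capture, threshold=100))
```

With these made-up counts the scan stops at Drive [0,9] and names Drive [0,8] as the upstream device, which matches the reasoning in the example above.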
What is SOC or SBOD?
• Switch-On-a-Chip (SOC)
• Switched Bunch Of Disks (SBOD)
Features:
• Crossbar switch (Loop-Switch)
• Supported in FC-AL topologies
• Per device monitoring
SOC Components
• Controllers
– 6091 Controller
– 399x Controller
• Drive Trays
– 2Gb SBOD ESM (2610)
– 4Gb ESM (4600 – Wrigley)
SBOD vs JBOD
What is the SES?
SCSI Enclosure Services
• The SOC provides monitoring and control for the SES
• The SES is the device that consumes the ALPA
• The SES is the brains of the ESM
SOC Statistics
• In order to clear the drive side SOC statistics
clearSocErrorStatistics_MT
• In order to clear the controller side SOC statistics
socShow 1
Determining SFP Ports
• 2Gb SBOD drive enclosure ports go from left to right
• 4Gb SBOD drive enclosure ports start from the center and go to the outside (Wrigley-Husker)
• On all production models, ports are labeled on the drive trays
Port State (PS)
• Inserted – The standard state when a device is present
• Loopback – A connection in which Tx is connected to Rx
• Unknown – Non-deterministic state
• Various forms of bypassed state exist.
– Most commonly seen:
   • Byp_TXFlt is expected when a drive is not inserted
   • Byp_NoFru is expected when an SFP is not present
– Other misc.
   • Bypassed, Byp_LIPF8, Byp_TmOut, Byp_RxLOS, Byp_Sync, Byp_LIPIso, Byp_LTBI, Byp_Manu, Byp_Redn, Byp_Snoop, Byp_CRC, Byp_OS
Port State (PS) meanings
• Bypassed – Generic bypass condition (indication that port was never in use)
• Byp_TXFlt – Bypassed due to transmission fault
• Byp_NoFru – No FRU installed
• Byp_LIPF8 – Bypass on LIP (F8,F8) or No Comma
• Byp_TmOut – Bypassed due to timeout
• Byp_RxLOS – Bypassed due to receiver Loss Of Signal (LOSG)
• Byp_Sync – Bypassed due to Loss Of Synchronization (LOS)
• Byp_LIPIso – Bypass – LIP isolation port
• Byp_LTBI – Loop Test Before Insert testing process
• Byp_Manu – General catch all for a forced bypass state
• Byp_Redn – Redundant port connection
• Byp_CRC – Bypassed due to CRC errors
• Byp_OS – Bypassed due to Ordered Set errors
• Byp_Snoop
Port Insertion Count (PIC)
• Port insertion count – The number of times the device has been inserted into this port.
• The value is incremented each time a port successfully transitions from the bypassed state to the inserted state.
• Range: 0-255 (2^8 values)
Loop State (LS)
• The condition of the loop between the SOC and component
• Possible States:
– Up = Expected state when a device is present
– Down = Expected state when no device is present
– Transition states as loop is coming up (listed in order)
   • Down -> Init -> Open -> Actv -> Up
Loop Up Count (LUC)
• The total number of times that the loop has been identified as having changed from Down to Up during the SOC polling intervals.
– Note: This implies that a loop can go down and up multiple times in one SOC polling cycle and only be detected once.
– Polling cycle is presently 30 ms
– Range: 0-255 (2^8 values)
CRC Error Count (CRCEC)
• Number of CRC (Cyclic Redundancy Check) errors that are detected in frames.
• A single invalid word in a frame will increment the CRC counter
• Range: 0 - 4,294,967,295 (2^32 values)
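Because these counters have a fixed width, the difference between two captures has to account for overflow. The sketch below assumes the counters wrap back to zero when they overflow; the guide does not say whether they wrap or saturate, so treat that as an assumption to verify. `counter_delta` is a hypothetical helper name.

```python
def counter_delta(previous: int, current: int, bits: int) -> int:
    """Events counted between two captures of a `bits`-wide wrapping counter."""
    modulus = 1 << bits  # 2^8 = 256 or 2^32 = 4,294,967,296 possible values
    if not (0 <= previous < modulus and 0 <= current < modulus):
        raise ValueError(f"values must fit in {bits} bits")
    # Modular subtraction handles one wrap between the two captures.
    return (current - previous) % modulus


if __name__ == "__main__":
    # An 8-bit insertion count that read 250 and later reads 4.
    print(counter_delta(250, 4, bits=8))
```

More than one full wrap between captures cannot be detected this way, which is one reason to capture small counters such as the insertion count and loop up count at short intervals.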
Relative Frequency Drive Error Avg. (relFrq count / RFDEA)
• SBODs are connected to multiple devices.
• This leads to the SBOD being in multiple clock domains
• Over time, clocks tend to drift. SBODs employ a clock check feature that compares the relative frequency of all attached devices to the clock connected to the SBOD.
• If one transmitter is transmitting at the slow end of the tolerance range and its partner at the fast end, then the two clocks are within specification but will have extreme difficulty communicating
• Range: 0 - 4,294,967,295 (2^32 values)
Loop Cycle Count (loopCy / LCC)
• The loop cycle is the detection of a Loop transition.
– Unlike Loop Up Count, the Loop Cycle Count does not require the loop to transition to the up state.
• The Loop Cycle Count is more useful in understanding overhead of the FC protocol.
• Until Loop Up goes to 1, no data has been transmitted.
• Loop Cycle allows for an understanding that an attempt is being made to bring up the loop.
– Does not mean the loop has come up
• Range: 0 - 4,294,967,295 (2^32 values)
• Possible States:
– Same as Loop States (LS)
   • Up, Down, Transition states as loop is coming up
Ordered Set Error Count (OSErr / OSEC)
• Number of Ordered Sets that are received with an encoding error.
• Ordered Sets include Idle, ARB, LIP, SOF, EOF, etc.
• Range: 0 - 4,294,967,295 (2^32 values)
Port Connection Held Off Count (hldOff / PCHOC)
• Port connections held off count
• The number of times a device has attempted to connect to a specific port and received busy.
• Range: 0 - 4,294,967,295 (2^32 values)
Port Utilization – Traffic Utilization (PUP)
• The percentage of bandwidth detected over a 240 ms period of time.
Other values
• Sample Time:
– Time in seconds in which that sample was taken
General Rules of Thumb for Analysis
• It requires more energy to transmit (Tx) than receive (Rx)
• In some instances it is not possible to isolate the specific problematic component.
– The recommended replacement order is the following
1. SFP
2. Cable
3. ESM
4. Controller
Analysis of RLS/ SOC
• RLS is an error reporting mechanism that reports errors as seen by the devices on the array.
• SOC counters are controlled by the SOC chip
• SOC is an error reporting mechanism that monitors communication between two devices.
• SOC data does not render RLS information obsolete
• RLS & SOC need to be interpreted not merely read
• Different devices may not count errors at the same rate
• Different devices may have different expected thresholds
• Know the topology/ cabling of the storage array
• When starting analysis always capture both RLS and SOC
• Do not always expect the first capture of the RLS/SOC data to pinpoint the problematic device.
Analysis of SOC
• Errors are generally not propagated through the loop in a SOC Environment.
– What is recorded is the communication statistics between two devices.
• The exception to the rule
– loopUp Count
– CRC Error Count
– OS Error Count
• Focus on the following parameters
– Insertion count
– Loop up count
– Loop cycle count
– CRC error count
– OS error count
• The component connected to the port with the highest errors in the aforementioned stats is the most likely candidate for a bad component
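The ranking suggested above can be sketched as follows. The dictionary layout and key names are hypothetical and do not match the column names in socStatistics.csv; summing the five counters is a deliberately crude score chosen for illustration, not a method taken from the guide.

```python
from typing import Dict, List, Tuple

# Hypothetical keys for the five counters the guide says to focus on.
FOCUS = ("insertion", "loop_up", "loop_cycle", "crc", "os")


def rank_ports(stats: Dict[str, Dict[str, int]]) -> List[Tuple[str, int]]:
    """Order ports by the sum of their focus counters, highest first."""
    totals = {
        port: sum(counters.get(key, 0) for key in FOCUS)
        for port, counters in stats.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    capture = {
        "ESM-A port 1": {"crc": 2},
        "ESM-A port 2": {"insertion": 40, "loop_up": 38, "crc": 900},
        "ESM-A port 3": {"held_off": 1_000_000},  # not a focus counter
    }
    for port, score in rank_ports(capture):
        print(port, score)
```

The component attached to the top-ranked port is the first candidate to examine, subject to the known limitations that follow.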
Known Limitations
• Non-optimal configurations – i.e. improper cabling
• SOC in hub mode
Field Case
• Multiple backend issues reported in MEL
• readLinkStatus.csv
• RLS stats show drive tray 1 & 2 are on channel 1 & 3 (All counts zero)
Field Case (cont)
• socStatistics.csv (Amethyst 2 release)
– SOC stats show problem on: (M = Million)
• Focusing on Drive Tray 1 ESM-A, the user can see that the SES (or brains of the ESM) is bypassed and the loop state is down.
• Recommendation was to replace ESM-A.
• The drive tray can continue to operate after it is up without the SES.
Drive Channel State Management
This feature provides a mechanism for identifying drive-side channels where device paths (IT nexus) are experiencing channel-related I/O problems. This mechanism’s goal is twofold:
1) It aims to provide ample notice to an administrator that some form of problem exists among the components that are present on the channel.
2) It attempts to eliminate, or at least reduce, I/O on drive channels that are experiencing those problems.
• There are two states for a drive channel – OPTIMAL and DEGRADED
• A drive channel will be marked degraded by the controller when a predetermined threshold has been met for channel errors
– Timeout errors
– Controller detected errors: Misrouted FC Frames and Bad ALPA errors, for example
– Drive detected errors: SCSI Parity Errors, for example
– Link Down errors
• When a drive channel is marked degraded, a critical event will be logged to the MEL and a needs attention condition set in Storage Manager
What a degraded drive channel means
• When a controller marks a drive-side channel DEGRADED, that channel will be avoided to the greatest extent possible when scheduling drive I/O operations.
– To be more precise, the controller will always select an OPTIMAL channel over a DEGRADED channel when scheduling a drive I/O operation.
– However, if both paths to a given drive are associated with DEGRADED channels, the controller will arbitrarily choose one of the two.
• This point further reinforces the importance of directing administrative attention to a DEGRADED channel so that it can be repaired and returned to the OPTIMAL state before other potential path problems arise.
• The DEGRADED state of a drive channel persists through a reboot, because the surviving controller directs the rebooting controller to mark the path degraded
– If there is no alternate controller, the drive path will be marked OPTIMAL again
• The drive channel will not automatically transition back to an OPTIMAL state (with the exception of the above situation) unless directed by the user via the Storage Manager software
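The channel-selection behavior described in this section can be modeled as follows. This is a minimal sketch, not the controller firmware: the `ChannelState` enum and `select_channel` function are hypothetical names, and choosing at random when both channels are OPTIMAL is an assumption (the guide only specifies the arbitrary choice for the both-DEGRADED case).

```python
import random
from enum import Enum


class ChannelState(Enum):
    OPTIMAL = "optimal"
    DEGRADED = "degraded"


def select_channel(paths, rng=random):
    """Pick a drive channel for one I/O.

    paths maps a channel id to its ChannelState for the paths to one drive.
    An OPTIMAL channel is always preferred over a DEGRADED one; if every
    path is DEGRADED, one is chosen arbitrarily.
    """
    optimal = [ch for ch, state in paths.items() if state is ChannelState.OPTIMAL]
    candidates = optimal if optimal else list(paths)
    return rng.choice(candidates)


print(select_channel({1: ChannelState.OPTIMAL, 3: ChannelState.DEGRADED}))
```

With one OPTIMAL and one DEGRADED path the OPTIMAL channel is always returned, which is why I/O drains away from a degraded channel until an administrator returns it to OPTIMAL.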
SAS Backend
SAS Backend Overview and Analysis
• Statistics collected from PHYs
– A SAS wide port consists of multiple PHYs, each with independent error counters
• Statistics collected from PHYs on:
– SAS Expanders
– SAS Disks
– SAS I/O Protocol ASICs
• PHYs that do not maintain counters
– Reported as “N/A” or similar in User Interface
– Including SATA Disks
• PHY counters do not wrap (per standard)
– Maximum value of 4,294,967,295 (2^32 − 1)
– Must be manually reset
• Counters defined in SAS 1.1 Standard
– Invalid DWORDs
– Running Disparity Errors
– Loss of DWORD Synchronization
• After dword synchronization has been achieved, the dword synchronization state machine monitors invalid dwords that are received. When an invalid dword is detected, it requires two valid dwords to nullify its effect. When four invalid dwords are detected without nullification, dword synchronization is considered lost.
– PHY Reset Problems
• Additional information returned
– Elapsed time since PHY logs were last cleared
– Negotiated physical link rate for the PHY
– Hardware maximum physical link rate for the PHY
SAS Error counts
• IDWC – Invalid Dword Count
– A dword that is not a data word or a primitive (i.e., in the character context, a dword that contains an invalid character, a control character in other than the first character position, a control character other than K28.3 or K28.5 in the first character position, or one or more characters with a running disparity error). This could mark the beginning of a loss of Dword synchronization. After the fourth non-nullified (if followed by a valid Dword) Invalid Dword, Dword synchronization is lost.
• RDEC – Running Disparity Error Count
– Cumulative encoded signal imbalance between the one and zero signal states. Any Dword with one or more Running Disparity Errors will be considered an invalid Dword.
• LDWSC – Loss of Dword Synchronization Count
– After the fourth non-nullified (if followed by a valid Dword) Invalid Dword, Dword synchronization is lost.
• RPC – Phy Reset Problem Count
– Number of times a phy reset problem occurred. When a phy or link is reset, it will run through its reset sequence (OOB, Speed Negotiation, Multiplexing, Identification).
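The relationship between IDWC and LDWSC is easier to see in a small model. The sketch below is a simplified illustration of the rule stated above (two valid dwords nullify one invalid dword; four un-nullified invalid dwords lose synchronization). The class and attribute names are hypothetical, and the model ignores how synchronization is reacquired.

```python
COUNTER_MAX = 2**32 - 1  # PHY counters saturate at this value instead of wrapping


class DwordSyncModel:
    """Simplified model of invalid-dword tracking after sync is acquired."""

    def __init__(self):
        self.level = 0            # un-nullified invalid dwords (0..3 while in sync)
        self.recovering = False   # one valid dword seen since the last invalid one
        self.in_sync = True
        self.idwc = 0             # Invalid Dword Count
        self.ldwsc = 0            # Loss of Dword Synchronization Count

    def receive(self, valid):
        if not self.in_sync:
            return  # re-acquiring synchronization is outside this sketch
        if valid:
            if self.level == 0:
                return
            if self.recovering:
                # Second consecutive valid dword nullifies one invalid dword.
                self.level -= 1
                self.recovering = False
            else:
                self.recovering = True
            return
        # Invalid dword: count it and discard any partial recovery.
        self.idwc = min(self.idwc + 1, COUNTER_MAX)
        self.recovering = False
        self.level += 1
        if self.level == 4:
            self.in_sync = False
            self.ldwsc = min(self.ldwsc + 1, COUNTER_MAX)


model = DwordSyncModel()
for is_valid in (False, True, True, False, False, False):
    model.receive(is_valid)
print(model.in_sync, model.idwc, model.ldwsc)
```

In this run the first invalid dword is nullified by the two valid dwords that follow it, so the link stays in sync even though four invalid dwords have been counted. This is why a non-zero IDWC does not by itself imply a loss of synchronization.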
SAS Backend Overview and Analysis
• SAS error logs are gathered as part of the Capture all Support Data bundle
– sasPhyErrorLogs.csv
• Not available through the GUI, only through the CLI or the support bundle.
• CLI command to collect SAS PHY Error Statistics
– save storageArray SASPHYCounts file="<file>";
• CLI command to reset SAS PHY Error Statistics
– reset storageArray SASPHYCounts;
• Shell commands to collect SAS PHY Error Statistics
– sasShowPhyErrStats 0
• List phys with errors
– sasShowPhyErrStats 1
• List all phys
– getSasErrorStatistics_MT
• Shell commands to reset SAS PHY Error Statistics
– sasClearPhyErrStats
– clearSasErrorStatistics_MT
SAS Backend Overview and Analysis
• Remember that SAS error statistics are gathered per PHY
• If a PHY has a high error count, look at the device that the PHY is directly attached to
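A per-PHY review could be scripted along these lines. This is a sketch under stated assumptions: the column names (`device`, `phy`, `IDWC`, `RDEC`, `LDWSC`, `RPC`) and the sample rows are hypothetical and are not the real sasPhyErrorLogs.csv layout. It treats "N/A" as "no counters maintained" and flags counters that have reached the maximum value, since a saturated counter no longer tells you whether errors are still occurring.

```python
import csv
import io

COUNTER_FIELDS = ("IDWC", "RDEC", "LDWSC", "RPC")  # hypothetical column names
COUNTER_MAX = 2**32 - 1


def parse_count(text):
    """Return an int, or None for PHYs that do not maintain counters."""
    text = text.strip()
    return None if text.upper() == "N/A" else int(text)


def summarize_phys(csv_text):
    """Rank PHYs by total error count, highest first; PHYs without counters go last."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts = [parse_count(row[name]) for name in COUNTER_FIELDS]
        known = [c for c in counts if c is not None]
        rows.append({
            "device": row["device"],
            "phy": row["phy"],
            "total": sum(known) if known else None,
            "saturated": any(c == COUNTER_MAX for c in known),
        })
    return sorted(rows, key=lambda r: -1 if r["total"] is None else r["total"], reverse=True)


SAMPLE = (
    "device,phy,IDWC,RDEC,LDWSC,RPC\n"
    "expander_A,4,120,118,3,0\n"
    "expander_A,5,0,0,0,0\n"
    "sata_disk_7,0,N/A,N/A,N/A,N/A\n"
    "sas_disk_2,0,4294967295,10,1,0\n"
)

report = summarize_phys(SAMPLE)
for entry in report:
    print(entry)
```

The top entries identify the PHYs to start with; the next step is to look at the device each of those PHYs is directly attached to, and to reset saturated counters before taking a fresh sample.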
Appendix A: SANtricity Managed Storage Systems
• Fully-featured midrange storage designed for wide-ranging open systems environments
• Compute-intensive applications, consolidation, tiered storage
• Fully-featured management software designed to provide administrators with extensive configuration flexibility
• FC and IB connectivity with support for FC/SATA drives
Attribute: 6998 | 6994
Overview: Flagship system targeted at enterprises with compute-intensive applications and large consolidations
Key features:
• Disk performance
• SANtricity robustness
• Dedicated data cache
• 4 Gb/s interfaces
• Switched-loop backend
• FC | SATA intermixing
Host interfaces: Eight 4 Gb/s FC
Drive interfaces: Eight 4 Gb/s FC
Drives: 224 FC or SATA
Data cache: 4, 8, 16 GB (dedicated)
Cache IOPS: 575,000 | 375,000 IOPS
Disk IOPS: 86,000 | 62,000 IOPS
Disk MB/s: 1,600 | 1,280 MB/s
Attribute: 6498
Overview: Targeted at HPC environments utilizing InfiniBand for Linux server clustering interconnect
Key features:
• Native IB interfaces
• Switched-loop backend
• FC | SATA intermixing
• SANtricity robustness
Host interfaces: Four 10 Gb/s IB
Drive interfaces: Eight 4 Gb/s FC
Drives: 224 FC or SATA
Data cache: 2 GB (dedicated)
Cache IOPS: ---
Disk IOPS: ---
Disk MB/s: 1,280 MB/s
6998 /6994 /6091 (Front)
6998 /6994 /6091 (Back)
Attribute: 3994 | 3992
Overview: Fully-featured systems targeted at midrange environments requiring high-end functionality and performance value
Key features:
• Performance value
• SANtricity robustness
• FC | SATA intermixing
• 4 Gb/s interfaces
• Switched-loop backend
Host interfaces: Eight | Four 4 Gb/s FC
Drive interfaces: Four 4 Gb/s FC
Drives: 112 FC or SATA
Data cache: 4 GB | 2 GB (shared)
Cache IOPS: 120,000 IOPS
Disk IOPS: 44,000 | 28,000 IOPS
Disk MB/s: 990 | 740 MB/s
3992 (Back)
3994 (Back)
4600 16-Drive Enclosure (Back)
4600 16-Drive Enclosure (Front)
Appendix B: Simplicity Managed Storage Systems
• Affordable and reliable storage designed for SMB, departmental and remote-site customers
• Intuitive, task-oriented management software designed for sites with limited IT resources that need to be self-sufficient
• FC and SAS connectivity with support for SAS/SATA drives (SATA drive support mid-2007)
Attribute: 1333 | 1331
Overview: Shared DAS targeted at SMB and entry level environments requiring ease of use and reliability. Entry-point storage for Microsoft Cluster Server
Key features:
• Shared DAS
• High availability/reliability
• SAS host interfaces
• Robust, intuitive Simplicity software
• Snapshot / Volume Copy
Host interfaces: Six | Two 3 Gb/s “wide” SAS
Drive interfaces: Two 3 Gb/s “wide” SAS
Drives: 42 SAS
Data cache: 2 GB | 1 GB (shared)
Cache IOPS: 91,000 IOPS
Disk IOPS: 22,000 IOPS
Disk MB/s: 900 MB/s
1333
Attribute: 1532
Overview: iSCSI connectivity – integration into low-cost IP Networks
– Pervasive and well-understood interface technology
– Simple to implement and manage with intuitive easy-to-use storage software
Key features:
• Cost effective and reliable
• iSCSI host connectivity
• Attach to redundant IP switches
Host interfaces: Four 1 Gb/s iSCSI
Drive interfaces: Two 3 Gb/s “wide” SAS
Drives: 42 SAS
Data cache: 2 GB | 1 GB (shared)
Cache IOPS: 64,000 IOPS
Disk IOPS: 22,000 IOPS
Disk MB/s: 320 MB/s
1532
Attribute: 1932
Overview: Ideal for departments or remote offices that need to integrate inexpensive storage into existing FC networks. Also appealing to smaller organizations planning initial SANs.
Key features:
• High availability/reliability
• Robust, intuitive Simplicity software
• 4 Gb/s host interfaces
• Snapshot / Volume Copy
Host interfaces: Four 4 Gb/s FC
Drive interfaces: Two 3 Gb/s “wide” SAS
Drives: 42 SAS
Data cache: 2 GB | 1 GB (shared)
Cache IOPS: 114,000 IOPS
Disk IOPS: 22,000 IOPS
Disk MB/s: 900 MB/s
1932
SAS Drive Tray (Front)
SAS Expansion Tray (Back)
Appendix C – State, Status, Flags (06.xx)
Drive State, Status, Flags
From pp 15 – 16, Troubleshooting and Technical Reference Guide – Volume 1
Drive State Values
0 Optimal
1 Non-existent drive
2 Unassigned, w/DACstore
3 Failed
4 Replaced
5 Removed – optimal pg2A = 0
6 Removed – replaced pg2A = 4
7 Removed – Failed pg2A = 3
8 Unassigned, no DACstore
Drive Status Values
0x0000 Optimal
0x0001 Unknown Channel
0x0002 Unknown Drive SCSI ID
0x0003 Unknown Channel and Drive SCSI ID
0x0080 Format in progress
0x0081 Reconstruction in progress
0x0082 Copy-back in progress
0x0083 Reconstruction initiated but no GHS is integrated
0x0090 Mismatched controller serial number
0x0091 Wrong vendor – lock out
0x0092 Unassigned drive locked out
0x00A0 Format failed
0x00A1 Write failed
0x00A2 Start of Day failed
0x00A3 User failed via Mode Select
0x00A4 Reconstruction failed
0x00A5 Drive failed at Read Capacity
0x00A6 Drive failed for internal reason
0x00B0 No information available
0x00B1 Wrong sector size
0x00B2 Wrong capacity
0x00B3 Incorrect Mode parameters
0x00B4 Wrong controller serial number
0x00B5 Channel Mismatch
0x00B6 Drive Id mismatch
0x00B7 DACstore inconsistent
0x00B8 Drive needs to have a 2MB DACstore
0x00C0 Wrong drive replaced
0x00C1 Drive not found
0x00C2 Drive offline, internal reasons
Drive Flags (d_flags)
0x00000100 Drive is locked for diagnostics
0x00000200 Drive contains config. sundry
0x00000400 Drive is marked deleted by Raid Mgr.
0x00000800 Defined drive without drive
0x00001000 Drive is spinning or accessible
0x00002000 Drive contains a format or accessible
0x00004000 Drive is designated as HOT SPARE
0x00008000 Drive has been removed
0x00010000 Drive has an ADP93 DACstore
0x00020000 DACstore update failed
0x00040000 Sub-volume consistency checked during SOD
0x00080000 Drive is part of a foreign rank (cold added).
0x00100000 Change vdunit number
0x00200000 Expanded DACstore parameters
0x00400000 Reconfiguration performed in reverse VOLUME order
0x00800000 Copy operation is active (not queued).
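Because each d_flags entry above is a single bit, a raw value read from the controller can be decoded by testing each mask in turn. The following is a minimal sketch: the `decode_flags` helper is a hypothetical name, and the table simply restates the values listed above.

```python
# Bit masks and descriptions copied from the d_flags list above.
D_FLAGS = {
    0x00000100: "Drive is locked for diagnostics",
    0x00000200: "Drive contains config. sundry",
    0x00000400: "Drive is marked deleted by Raid Mgr.",
    0x00000800: "Defined drive without drive",
    0x00001000: "Drive is spinning or accessible",
    0x00002000: "Drive contains a format or accessible",
    0x00004000: "Drive is designated as HOT SPARE",
    0x00008000: "Drive has been removed",
    0x00010000: "Drive has an ADP93 DACstore",
    0x00020000: "DACstore update failed",
    0x00040000: "Sub-volume consistency checked during SOD",
    0x00080000: "Drive is part of a foreign rank (cold added).",
    0x00100000: "Change vdunit number",
    0x00200000: "Expanded DACstore parameters",
    0x00400000: "Reconfiguration performed in reverse VOLUME order",
    0x00800000: "Copy operation is active (not queued).",
}


def decode_flags(value, table):
    """Return (descriptions of set bits, leftover bits not in the table)."""
    descriptions = [text for mask, text in table.items() if value & mask]
    known_bits = 0
    for mask in table:
        known_bits |= mask
    return descriptions, value & ~known_bits


descriptions, unknown = decode_flags(0x00005000, D_FLAGS)
print(descriptions, hex(unknown))
```

Returning the leftover bits separately makes it obvious when a value contains flags that are not documented in this table, rather than silently dropping them.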
Volume State, Status, Flags
From pp 17 – 18, Troubleshooting and Technical Reference Guide – Volume 1
VOLUME State (vd_state)
These flags are bit values, and the following flags are valid:
0x0000 optimal
0x0001 degraded
0x0002 reconstructing
0x0003 formatting
0x0004 dead
0x0005 quiescent
0x0006 non-existent
0x0007 dead, awaiting format
0x0008 not spun up yet
0x0009 unconfigured
0x000a LUN is in process of ADP93 upgrade
0x000b Optimal state and reconfig
0x000c Degraded state and reconfig
0x000d Dead state and reconfig
VOLUME Status (vd_status)
These flags are bit values, and the following flags are valid:
0x0000 No sub-state/status available
0x0020 Parity scan in progress
0x0022 Copy operation in progress
0x0023 Restore operation in progress
0x0025 Host parity scan in progress
0x0044 Format in progress on virtual disk
0x0045 Replaced wrong drive
0x0046 Deferred error
VOLUME Flags (vd_flags)
These flags are bit values, and the following flags are valid:
0x00000001 Configured
0x00000002 Open
0x00000004 On-Line
0x00000008 Not Suspended
0x00000010 Resources available
0x00000020 Degraded
0x00000040 Spare piece - VOLUME has Global Hot Spare drive in use
0x00000080 RAID 1 ping-pong state
0x00000100 RAID 5 left asymmetric mapping
0x00000200 Write-back caching enabled
0x00000400 Read caching enabled
0x00000800 Suspension in progress while switching Global Hot Spare drive
0x00001000 Quiescence has been aborted or stopped
0x00010000 Prefetch enabled
0x00020000 Prefetch multiplier enabled
0x00040000 IAF not yet started, don't restart yet
0x00100000 Data scrubbing is enabled on this unit
0x00200000 Parity check is enabled on this unit
0x00400000 Reconstruction read failed
0x01000000 Reconstruction in progress
0x02000000 Data initialization in progress
0x04000000 Reconfiguration in progress
0x08000000 Global Hot Spare copy-back in progress
0x90000000 VOLUME halted; awaiting graceful termination of any reconstruction, verify, or copy-back
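When reading a volume's state and flags together, the two fields need different handling. The sketch below makes one assumption that goes beyond the text: because the listed vd_state values run consecutively from 0x0000 to 0x000d, it treats vd_state as an enumerated value, while vd_flags is tested bit by bit. The tables are abridged from the lists above and `describe_volume` is a hypothetical helper name.

```python
# Abridged from the vd_state list; treated here as enumerated values (assumption).
VD_STATE = {
    0x0000: "optimal",
    0x0001: "degraded",
    0x0002: "reconstructing",
    0x0003: "formatting",
    0x0004: "dead",
    0x0005: "quiescent",
}

# Abridged from the vd_flags list; each entry is a single bit.
VD_FLAGS = {
    0x00000001: "Configured",
    0x00000002: "Open",
    0x00000004: "On-Line",
    0x00000008: "Not Suspended",
    0x00000010: "Resources available",
    0x00000020: "Degraded",
    0x00000200: "Write-back caching enabled",
    0x00000400: "Read caching enabled",
    0x01000000: "Reconstruction in progress",
}


def describe_volume(vd_state, vd_flags):
    """Return (state name, list of set flag names) for one volume."""
    state = VD_STATE.get(vd_state, "unknown state 0x%04x" % vd_state)
    flags = [name for mask, name in VD_FLAGS.items() if vd_flags & mask]
    return state, flags


state, flags = describe_volume(0x0001, 0x0000062F)
print(state, flags)
```

A value such as vd_state 0x0003 decodes to "formatting" under this reading, whereas a bitwise reading would wrongly report it as both "degraded" and "reconstructing".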
From p 27, Troubleshooting and Technical Reference Guide – Volume 1
3.2.5 Controller/RDAC Modify Commands
3.2.5.01 isp rdacMgrSetModeActivePassive
This command sets the controller (that you are talking to) to active mode and the alternate controller mode to passive.
WARNING*** This command does not modify the controller cache setup, only the controller states. Modifying the cache setup may be accomplished by issuing the following command:
isp ccmEventNotify,0x0f
3.2.5.02 isp rdacMgrSetModeDualActive
This command sets both array controller modes to dual active.
WARNING*** This command does not modify the controller cache setup, only the controller states. Modifying the cache setup may be accomplished by issuing the following command:
isp ccmEventNotify,0x0f
3.2.5.03 isp rdacMgrAltCtlFail
Fails the alternate controller and takes ownership of its volumes.
NOTE: In order to fail a controller, it may be necessary to set the controller to a passive state first.
3.2.5.04 isp rdacMgrAltCtlResetRelease
Releases the alternate controller if it is being held in reset or failed.
Appendix D – Chapter 2 - MEL Data Format
Major Event Log Specification 349-1053040 (Software Release 6.16) LSI Logic Confidential
Chapter 2: MEL Data Format
The event viewer formats and displays the most meaningful fields of major event log entries from the controller. The data displayed for individual events varies with the event type and is described in the Events Description section. The raw data contains the entire major event data structure retrieved from the controller subsystem. The event viewer displays the raw data as a character string. Fields that occupy multiple bytes may appear to be byte swapped depending on the host system. Fields that may appear as byte swapped are noted in the table below.
2.1. Overview of the Major Event Log Fields
Table 2-1: MEL Data Fields
2.1.1. Constant Data Field format, No Version Number
Note: If the log entry field does not have a version number, the format will be as shown below.
Table 2-2: Constant Data Field format, No Version Number
2.1.2. Constant Data Field Format, Version 1
If the log entry field contains version 1, the format will be as shown below.
Table 2-3: Constant Data Field Format, Version 1
2.2. Detail of Constant Data Fields
2.2.1. Signature (Bytes 0-3) Field Details
The Signature field is used internally by the controller. The current value is ‘MELH’.
2.2.2. Version (Bytes 4-7) Field Details
When the Version field is present, the value should be 1 or 2, depending on the format of the MEL entry.
2.2.3. Sequence Number (Bytes 8 - 15) Field Details
The Sequence Number field is a 64-bit incrementing value starting from the time the system log was created or last initialized. Resetting the log does not affect this value.
2.2.4. Event Number (Bytes 16 - 19) Field Details
The Event Number is a 4 byte encoded value that includes bits for drive and controller inclusion, event priority, and the event value. The Event Number field is encoded as follows:
Table 2-4: Event Number (Bytes 16 - 19) Encoding
2.2.4.1. Event Number - Internal Flags Field Details
The Internal Flags are used internally within the controller firmware for events that require unique handling. The host application ignores these values.
Table 2-5: Internal Flags Field Values
2.2.4.2. Event Number - Log Group Field Details
The Log Group field indicates what kind of event is being logged. All events are logged in the system log. The values for the Log Group Field are described as follows:
Table 2-6: Log Group Field Values
2.2.4.3. Event Number - Priority Field Details
The Priority field is defined as follows:
Table 2-7: Priority Field Values
2.2.4.4. Event Number - Event Group Field Details
The Event Group field is defined as follows:
Table 2-8: Event Group Field Values
2.2.4.5. Event Number - Component Type Field Details
The Component Type Field Values are defined as follows:
2.2.5. Timestamp (Bytes 20 - 23) Field Details
The Timestamp field is a 4 byte value that corresponds to the real time clock on the controller. The real time clock is set (via the boot menu) at the time of manufacture. It is incremented every second and counts from January 1, 1970.
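Since the Timestamp field counts seconds from January 1, 1970, it can be converted to a readable date with the standard library. This is a small sketch: the function name is hypothetical, and interpreting the value as UTC is an assumption, because the guide does not state which time zone the controller clock is set to.

```python
from datetime import datetime, timezone


def mel_timestamp_to_datetime(seconds):
    """Convert a MEL timestamp (seconds since 1970-01-01) to a datetime.

    Assumes the controller's real time clock is set to UTC.
    """
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(mel_timestamp_to_datetime(1214870400).isoformat())
```

If the converted dates in a log look implausible (for example, far in the past), it is worth checking whether the controller clock was ever set correctly before trusting the event ordering by time.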
2.2.6. Location Information (Bytes 24 - 27) Field Details
The Location Information field indicates the Channel/Drive or Tray/Slot information for the event. Logging of data for this field is optional and is zero when not specified.
2.2.7. IOP ID (Bytes 28-31) Field Details
The IOP ID is used by MEL to associate multiple log entries with a single event or I/O. The IOP ID is guaranteed to be unique for each I/O. A valid IOP ID may not be available for certain MEL entries and some events use this field to log other information. The event descriptions will indicate if the IOP ID is being used for unique log information. Logging of data for this field is optional and is zero when not specified.
2.2.8. I/O Origin (Bytes 32-33) Field Details
The I/O Origin field specifies where the I/O or action originated that caused the event. It uses one of the Error Event Logger defined origin codes:
Table 2-9: I/O Origin Field Values
A valid I/O Origin may not be available for certain MEL entries and some events use this field to log other information. The event descriptions will indicate if the I/O Origin is being used for unique log information. Logging of data for this field is optional and is zero when not specified. When decoding MEL events, additional FRU information can be found in the Software Interface Specification.
2.2.9. LUN/Volume Number (Bytes 36 - 39) Field Details
The LUN/Volume Number field specifies the LUN or volume associated with the event being logged. Logging of data for this field is optional and is zero when not specified.
2.2.10. Controller Number (Bytes 40-43) Field Details
The Controller Number field specifies the controller associated with the event being logged.
Table 2-10: Controller Number (Bytes 40-43) Field Values
Logging of data for this field is optional and is zero when not specified.
2.2.11. Category Number (Bytes 44 - 47) Field Details
This field identifies the category of the log entry. This field is identical to the event group field encoded in the event number.
Table 2-11: Event Group Field Values
2.2.12. Component Type (Bytes 48 - 51) Field Details
Identifies the component type associated with the log entry. This is identical to the Component Group list encoded in the event number.
Table 2-12: Component Type Field Details
2.2.13. Component Location Field Details
The first entry in this field identifies the component based on the Component Type field listed above. The definition of the remaining bytes is dependent on the Component Type.
Table 2-13: Component Type Location Values
2.2.14. Location Valid (Bytes 120-123) Field Details This field contains a value of 1 if the component location field contains valid data. If the component location data is not valid or cannot be determined the value is 0.
2.2.15. Number of Optional Fields Present (Byte 124) Field Details The Number of Optional Fields Present specifies the number (if any) of additional data fields that follow. If this field is zero then there is no additional data for this log entry.
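The two fields above can be read from a raw log entry as sketched below. This is a minimal illustration, not a definitive parser: the function name is hypothetical, and the byte order of the 4-byte Location Valid field is not stated in this section, so it is left as a parameter (little-endian is only an assumed default).

```python
from typing import Tuple


def read_location_and_optional_count(entry: bytes,
                                     byteorder: str = "little") -> Tuple[bool, int]:
    """Return (location_valid, optional_field_count) for one raw log entry.

    Offsets come from the field descriptions: Location Valid occupies
    bytes 120-123 and Number of Optional Fields Present is byte 124.
    The byte order is an assumption and can be overridden by the caller.
    """
    if len(entry) < 125:
        raise ValueError("entry too short: need at least 125 bytes")
    # A value of 1 means the component location field holds valid data.
    location_valid = int.from_bytes(entry[120:124], byteorder) == 1
    # Zero means no additional data follows for this log entry.
    optional_count = entry[124]
    return location_valid, optional_count
```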
2.2.16. Optional Field Data Field Details The format for the individual optional data fields follows: Table 2-14: Optional Field Data Format
2.2.17. Data Length (Byte 128) Field Details The length in bytes of the optional data field data (including the Data Field Type).
2.2.18. Data Field Type (Bytes 130-131) Field Details See Data Field Types on page 14 for the definitions of the various optional data fields.
2.2.19. Data (Byte 132) Field Details Optional field data associated with the Data Field Type. This data may appear as byte swapped when using the event viewer.
Appendix E – Chapter 30 – Data Field Types Major Event Log Specification 349-1053040 (Software Release 6.16) LSI Logic Confidential
Chapter 30: Data Field Types This table describes data field types.
Table 30-1: Data Field Types
Appendix F – Chapter 31 – RPC Function Numbers Major Event Log Specification 349-1053040 (Software Release 6.16) LSI Logic Confidential
Chapter 31: RPC Function Numbers The following table lists SYMbol remote procedure call function numbers:
Table 31-1: SYMbol RPC Functions
Appendix G – Chapter 32 – SYMbol Return Codes Major Event Log Specification 349-1053040 (Software Release 6.16) LSI Logic Confidential
Chapter 32: SYMbol Return Codes This section provides a description of each of the SYMbol return codes.
Return Codes
Appendix H – Chapter 5 - Host Sense Data
Software Interface Specification 349-1062130 - Rev. A1 (Chromium 1 & 2) LSI Logic Confidential
Chapter 5: Host Sense Data
5.1. Request Sense Data Format Sense data returned by the Request Sense command is in one of two formats: Fixed format or Descriptor format. The format is based on the value of the D_SENSE bit (byte 2, bit 2) in the Control Mode Page. When this bit is set to 0, sense data is returned using Fixed format. When the bit is set to 1, sense data is returned using Descriptor format. This parameter defaults to 1b for volumes >= 2 TB in size and to 0b for volumes < 2 TB in size. The setting is persisted on a logical unit basis. See “6.11. Control Mode Page (Page A)” on page 6-232.
The first byte of all sense data contains the response code field, which indicates the error type and format of the sense data:
• If the response code is 0x70 or 0x71, the sense data format is Fixed. See “5.1.1. Request Sense Data - Fixed Format” on page 5-189.
• If the response code is 0x72 or 0x73, the sense data format is Descriptor. See “5.1.2. Request Sense Data - Descriptor Format” on page 5-205.
For more information on sense data response codes, see SPC-3, SCSI Primary Commands.
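The response-code check described above can be sketched as follows. The function name is hypothetical. The mask with 0x7F reflects the SPC-3 layout, in which the response code occupies the low seven bits of byte 0; that detail comes from SPC-3 rather than from this section.

```python
FIXED_CODES = {0x70, 0x71}
DESCRIPTOR_CODES = {0x72, 0x73}


def sense_format(sense: bytes) -> str:
    """Classify a sense buffer as 'fixed', 'descriptor', or 'unknown'."""
    if not sense:
        raise ValueError("empty sense buffer")
    # Per SPC-3, bit 7 of byte 0 is not part of the response code.
    response_code = sense[0] & 0x7F
    if response_code in FIXED_CODES:
        return "fixed"
    if response_code in DESCRIPTOR_CODES:
        return "descriptor"
    return "unknown"
```

A caller would typically branch on the result before applying the byte offsets of section 5.1.1, since those offsets apply only to the Fixed format.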
5.1.1. Request Sense Data - Fixed Format The table below outlines the Fixed format for Request Sense data. Information about individual bytes is given in the paragraphs following the table.
Table 5.1: Request Sense Data Format
5.1.1.1. Incorrect Length Indicator (ILI) - Byte 2 This bit is used to inform the host system that the requested non-zero byte transfer length for a Read or Write Long command does not exactly match the available data length. The information field in the sense data will be set to the difference (residue) of the requested length minus the actual length in bytes. Negative values will be indicated by two's complement notation. Since the controller does not support Read or Write Long, this bit is always zero.
5.1.1.2. Sense Key - Byte 2 Possible sense keys returned are shown in the following table:
Table 5.2: Sense Key - Byte 2
5.1.1.3. Information Bytes - Bytes 3-6 This field is implemented as defined in the SCSI standard for direct access devices. The field can contain either of the following types of information:
• The unsigned logical block address indicating the location of the error being reported.
• The first invalid logical block address if the sense key indicates an illegal request.
5.1.1.4. Additional Sense Length - Byte 7 This value will indicate the number of additional sense bytes to follow. Some errors cannot return valid data in all of the defined fields. For these errors, invalid fields will be zero-filled unless specified in the SCSI-2 standard as containing 0xFF if invalid. The value in this field will be 152 (0x98) in most cases. However, there are situations when only the standard sense data will be returned. For these sense blocks, the additional sense length is 10 (0x0A).
5.1.1.5. Command Specific Information - Bytes 8-11 This field is only valid for sense data returned after an unsuccessful Reassign Blocks command. The logical block address of the first defect descriptor not reassigned will be returned in this field. These bytes will be 0xFFFFFFFF if information about the first defect descriptor not reassigned is not available or if all the defects have been reassigned. The command-specific field will always be zero-filled for sense data returned for commands other than Reassign Blocks.
5.1.1.6. Additional Sense Codes - Bytes 12-13 See “11.2. Additional Sense Codes and Qualifiers” on page 11-329 for details on the supported sense codes and qualifiers returned in these fields.
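The byte positions described so far can be pulled out of a Fixed format sense buffer as sketched below. The type and function names are hypothetical. The bit positions within byte 2 (ILI in bit 5, sense key in the low four bits) and the big-endian byte order of multi-byte fields follow the SCSI fixed-format convention and are not restated in this section, so treat them as assumptions taken from the standard.

```python
from typing import NamedTuple


class FixedSense(NamedTuple):
    ili: bool
    sense_key: int
    information: int
    additional_length: int
    command_specific: int
    asc: int
    ascq: int


def parse_fixed_sense(sense: bytes) -> FixedSense:
    """Decode bytes 2-13 of Fixed format Request Sense data."""
    if len(sense) < 14:
        raise ValueError("need at least 14 bytes of sense data")
    return FixedSense(
        ili=bool(sense[2] & 0x20),                        # byte 2, bit 5
        sense_key=sense[2] & 0x0F,                        # byte 2, bits 3-0
        information=int.from_bytes(sense[3:7], "big"),    # bytes 3-6
        additional_length=sense[7],                       # byte 7
        command_specific=int.from_bytes(sense[8:12], "big"),  # bytes 8-11
        asc=sense[12],                                    # byte 12
        ascq=sense[13],                                   # byte 13
    )
```

An additional_length of 0x98 would indicate the full array sense block, while 0x0A would indicate that only the standard sense data is present.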
5.1.1.7. Field Replaceable Unit Code - Byte 14 A non-zero value in this byte identifies a field replaceable unit that has failed or a group of field replaceable modules that includes one or more failed devices. For some Additional Sense Codes, the FRU code must be used to determine where the error occurred. As an example, the Additional Sense Code for SCSI bus parity error is returned for a parity error detected on either the host bus or one of the drive buses. In this case, the FRU field must be evaluated to determine if the error occurred on the host channel or a drive channel. Because of the large number of replaceable units possible in an array, a single byte is not sufficient to report a unique identifier for each individual field replaceable unit. To provide meaningful information that will decrease field troubleshooting and problem resolution time, FRUs have been grouped. The defined FRU groups are listed below.
5.1.1.7.1. Host Channel Group (0x01) A FRU group consisting of the host SCSI bus, its SCSI interface chip, and all initiators and other targets connected to the bus.
5.1.1.7.2. Controller Drive Interface Group (0x02) A FRU group consisting of the SCSI interface chips on the controller which connect to the drive buses.
5.1.1.7.3. Controller Buffer Group (0x03) A FRU group consisting of the controller logic used to implement the on-board data buffer.
5.1.1.7.4. Controller Array ASIC Group (0x04) A FRU group consisting of the ASICs on the controller associated with the array functions.
5.1.1.7.5. Controller Other Group (0x05) A FRU group consisting of all controller related hardware not associated with another group.
5.1.1.7.6. Subsystem Group (0x06) A FRU group consisting of subsystem components that are monitored by the array controller, such as power supplies, fans, thermal sensors, and AC power monitors. Additional information about the specific failure within this FRU group can be obtained from the additional FRU bytes field of the array sense.
5.1.1.7.7. Subsystem Configuration Group (0x07) A FRU group consisting of subsystem components that are configurable by the user, on which the array controller will display information (such as faults).
5.1.1.7.8. Sub-enclosure Group (0x08) A FRU group consisting of the attached enclosure devices. This group includes the power supplies, environmental monitor, and other subsystem components in the sub-enclosure.
5.1.1.7.9. Redundant Controller Group (0x09) A FRU group consisting of the attached redundant controllers.
5.1.1.7.10. Drive Group (0x10 - 0xFF) A FRU group consisting of a drive (embedded controller, drive electronics, and Head Disk Assembly), its power supply, and the SCSI cable that connects it to the controller; or supporting sub-enclosure environmental electronics. For SCSI drive-side arrays, the FRU code designates the channel ID in the most significant nibble and the SCSI ID of the drive in the least significant nibble. For Fibre Channel drive-side arrays, the FRU code contains an internal representation of the drive’s channel and ID. This representation may change and does not reflect the physical location of the drive. The sense data additional FRU fields will contain the physical drive tray and slot numbers. NOTE: Channel ID 0 is not used because a failure of drive ID 0 on this channel would produce an FRU code of 0x00, which the SCSI-2 standard defines as meaning that no specific unit has been identified as failed or that the data is not available.
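The FRU group codes listed above can be mapped to names with a small lookup, as in the sketch below. The function name and the returned strings are illustrative only. The nibble split for the Drive Group applies to SCSI drive-side arrays; on Fibre Channel drive-side arrays the same byte is an internal representation and the decoded channel and ID should not be read as a physical location.

```python
FRU_GROUPS = {
    0x01: "Host Channel Group",
    0x02: "Controller Drive Interface Group",
    0x03: "Controller Buffer Group",
    0x04: "Controller Array ASIC Group",
    0x05: "Controller Other Group",
    0x06: "Subsystem Group",
    0x07: "Subsystem Configuration Group",
    0x08: "Sub-enclosure Group",
    0x09: "Redundant Controller Group",
}


def decode_fru_code(fru: int) -> str:
    """Translate the FRU code in sense byte 14 into a readable description."""
    if not 0 <= fru <= 0xFF:
        raise ValueError("FRU code must fit in one byte")
    if fru == 0x00:
        return "No specific unit identified"
    if fru in FRU_GROUPS:
        return FRU_GROUPS[fru]
    if fru >= 0x10:
        channel = fru >> 4      # most significant nibble
        scsi_id = fru & 0x0F    # least significant nibble
        return f"Drive Group (SCSI drive side: channel {channel}, ID {scsi_id})"
    # Codes 0x0A-0x0F are not defined in the list above.
    return "Undefined FRU code"
```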
5.1.1.8. Sense Key Specific Bytes - Bytes 15-17 This field is valid for a sense key of Illegal Request when the sense-key specific valid (SKSV) bit is on. The sense-key specific field will contain the data defined below. In this release of the software, the field pointer is only supported if the error is in the CDB.
• C/D = 1 indicates that the illegal parameter is in the CDB.
• C/D = 0 indicates that the illegal parameter is in the parameters sent during a Data Out phase.
• BPV = 0 indicates that the value in the Bit Pointer field is not valid.
• BPV = 1 indicates that the Bit Pointer field specifies which bit of the byte designated by the Field Pointer field is in error. When a multiple-bit error exists, the Bit Pointer field will point to the most significant (left-most) bit of the field.
The Field Pointer field indicates which byte of the CDB or the parameter data was in error. Bytes are numbered from zero. When a multiple-byte field is in error, the pointer will point to the most-significant byte.
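The flags above can be decoded as in the following sketch. The names are hypothetical, and the bit positions are an assumption: they follow the standard SCSI field-pointer layout for an Illegal Request sense key (SKSV in bit 7, C/D in bit 6, BPV in bit 3, and Bit Pointer in bits 2-0 of byte 15, with the Field Pointer in bytes 16-17, most significant byte first), since the layout table is not reproduced in this text.

```python
from typing import NamedTuple, Optional


class SenseKeySpecific(NamedTuple):
    sksv: bool                  # sense-key specific bytes are valid
    in_cdb: bool                # C/D = 1: error is in the CDB
    bit_pointer: Optional[int]  # None when BPV = 0
    field_pointer: int          # zero-based byte index of the error


def parse_sense_key_specific(sense: bytes) -> SenseKeySpecific:
    """Decode bytes 15-17 of Fixed format sense data (assumed SCSI layout)."""
    if len(sense) < 18:
        raise ValueError("need at least 18 bytes of sense data")
    flags = sense[15]
    bpv = bool(flags & 0x08)
    return SenseKeySpecific(
        sksv=bool(flags & 0x80),
        in_cdb=bool(flags & 0x40),
        bit_pointer=(flags & 0x07) if bpv else None,
        field_pointer=int.from_bytes(sense[16:18], "big"),
    )
```

Callers should ignore the remaining fields when sksv is false, because the section states the field is valid only when the SKSV bit is on.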
5.1.1.9. Recovery Actions - Bytes 18-19 This is a bit-significant field that indicates the recovery actions performed by the array controller.
5.1.1.10. Total Number Of Errors - Byte 20 This field contains a count of the total number of errors encountered during execution of the command. The ASC and ASCQ for the last two errors encountered are in the ASC/ASCQ stack field.
6 Downed LUN 5 Failed drive
5.1.1.11. Total Retry Count - Byte 21 The total retry count is for all errors seen during execution of a single CDB set.
5.1.1.12. ASC/ASCQ Stack - Bytes 22-25 These fields store information when multiple errors are encountered during execution of a command. The ASC/ASCQ pairs are presented in order of most recent to least recent error detected.
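The error counters and the ASC/ASCQ stack can be read together, as in this sketch. The function name is hypothetical, and it assumes each stack entry is stored as an ASC byte followed by its ASCQ byte, which the text implies (four bytes for the last two errors) but does not state explicitly.

```python
from typing import List, Tuple


def parse_error_summary(sense: bytes) -> Tuple[int, int, List[Tuple[int, int]]]:
    """Return (total_errors, total_retries, asc_ascq_stack) from sense bytes 20-25.

    The stack is ordered from most recent to least recent error.
    """
    if len(sense) < 26:
        raise ValueError("need at least 26 bytes of sense data")
    total_errors = sense[20]   # byte 20: Total Number Of Errors
    total_retries = sense[21]  # byte 21: Total Retry Count
    # Bytes 22-25: two (ASC, ASCQ) pairs; pair order within the bytes is assumed.
    stack = [(sense[22], sense[23]), (sense[24], sense[25])]
    return total_errors, total_retries, stack
```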
5.1.1.13. Additional FRU Information - Bytes 26-33 These bytes provide additional information about the field replaceable unit identified in byte 14. The first two bytes are qualifier bytes that provide details about the FRU in byte 14. Byte 28 is an additional FRU code which identifies a second field replaceable unit. The value in byte 28 can be interpreted using the description for byte 14. Bytes 29 and 30 provide qualifiers for byte 28, just as bytes 26 and 27 provide qualifiers for byte 14. The table below shows the layout of this field. Following the table is a description of the FRU group code qualifiers. If an FRU group code qualifier is not listed below, bytes 26 and 27 are not used in this release.
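The pairing of FRU codes with their qualifier bytes can be sketched as below. The function name and dictionary keys are hypothetical. Bytes 31-33 are left uninterpreted because their meaning is not described in this text.

```python
from typing import Dict, Tuple, Union


def parse_additional_fru(sense: bytes) -> Dict[str, Union[int, Tuple[int, int]]]:
    """Pair each FRU code with its two qualifier bytes (MSB, LSB)."""
    if len(sense) < 31:
        raise ValueError("need at least 31 bytes of sense data")
    return {
        "primary_fru": sense[14],                       # byte 14
        "primary_qualifier": (sense[26], sense[27]),    # bytes 26 (MSB), 27 (LSB)
        "secondary_fru": sense[28],                     # byte 28
        "secondary_qualifier": (sense[29], sense[30]),  # bytes 29 (MSB), 30 (LSB)
    }
```

The secondary FRU code can be interpreted with the same group definitions that apply to byte 14.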
5.1.1.13.1. FRU Group Qualifiers for the Host Channel Group (Code 0x01) FRU Group Qualifier - Bytes 26 (MSB) & 27 (LSB) - The most significant byte indicates which host channel is reporting the failed component. The least significant byte provides the device type and state of the device being reported.
5.1.1.13.2. Mini-Hub Port Mini-Hub Port indicates which of the Mini-Hub ports is being referenced. For errors where the Mini-Hub port is irrelevant, port 0 is specified.
5.1.1.13.3. Controller Number Controller Number indicates which controller the host interface is connected to.
5.1.1.13.4. Host Channel LSB Format The least significant byte provides the device type and state of the device being reported.
Host Channel Number indicates which channel of the specified controller is being referenced. Values 1 through 4 are valid.
5.1.1.13.4.1. Host Channel Device State Host Channel Device State is defined as:
5.1.1.13.4.2. Host Channel Device Type Identifier The Host Channel Device Type Identifier is defined as:
5.1.1.13.5. FRU Group Qualifiers For Controller Drive Interface Group (Code 0x02) FRU Group Qualifier - Bytes 26 (MSB) & 27 (LSB) - The most significant byte indicates which drive channel is reporting the failed component. The least significant byte provides the device type and state of the device being reported.
5.1.1.13.5.1. Drive Channel MSB Format:
* = Reserved for parallel SCSI
5.1.1.13.5.2. Mini-Hub Port The Mini-Hub Port indicates which of the Mini-Hub ports is being referenced. For errors where the Mini-Hub port is irrelevant, port 0 is specified.
![Page 271: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/271.jpg)
5.1.1.13.5.3. Drive Channel Number Drive Channel Number indicates which channel. Values 1 through 6 are valid.
5.1.1.13.5.4. Drive Channel LSB Format Drive Channel LSB Format (Not used on parallel SCSI)
5.1.1.13.5.4.1. Drive Interface Channel Device State Drive Interface Channel Device State is defined as:
5.1.1.13.5.4.2. Host Channel Device Type Identifier Host Channel Device Type Identifier is defined as:
5.1.1.13.6. FRU Group Qualifiers For The Subsystem Group (Code 0x06) FRU Group Qualifier - Bytes 26 (MSB) & 27 (LSB) - The most significant byte indicates which primary component fault line is reporting the failed component. The information returned depends on the configuration set up by the user. For more information, see OLBS 349-1059780, External NVSRAM Specification for Software Release 7.10. The least significant byte provides the device type and state of the device being reported. The format for the least significant byte is the same as Byte 27 of the FRU Group Qualifier for the Sub-Enclosure Group (0x08).
![Page 272: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/272.jpg)
5.1.1.13.7. FRU Group Qualifiers For The Sub-Enclosure Group (Code 0x08) FRU Group Qualifier - Bytes 26 (MSB) & 27 (LSB) - The most significant byte indicates which enclosure identifier is reporting the failed component. The least significant byte provides the device type and state of the device being reported. Statuses are reported such that the first enclosure for each channel is reported, followed by the second enclosure for each channel.
5.1.1.13.7.1. Sub-Enclosure MSB Format:
5.1.1.13.7.1.1. Tray Identifier Enable (TIE) Bit When the Tray Identifier Enable (TIE) bit is set to 01b, the Sub-Enclosure Identifier field provides the tray identifier for the sub-enclosure being described.
5.1.1.13.7.1.2. Sub-Enclosure Identifier When set to 00b, the Sub-Enclosure Identifier is defined as:
5.1.1.13.7.2. Sub-Enclosure LSB Format
5.1.1.13.7.2.1. Sub-Enclosure Device State
![Page 273: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/273.jpg)
The Sub-Enclosure Device State is defined as:
5.1.1.13.7.2.2. Sub-Enclosure Device Type Identifier
The Sub-Enclosure Device Type Identifier is defined as:
5.1.1.13.8. FRU Group Qualifiers For The Redundant Controller Group (Code 0x09)
![Page 274: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/274.jpg)
FRU Group Qualifier - Bytes 26 (MSB) & 27 (LSB) - The most significant byte indicates which tray contains the failed controller. The least significant byte indicates the failed controller within the tray.
5.1.1.13.8.1. Redundant Controller MSB Format:
5.1.1.13.8.2. Redundant Controller LSB Format:
5.1.1.13.8.2.1. Controller Number Field The Controller Number field is defined as:
5.1.1.13.9. FRU Group Qualifiers For The Drive Group (Code 0x10 – 0xFF) FRU Group Qualifier - Bytes 26 (MSB) & 27 (LSB) - The most significant byte indicates the tray number of the affected drive. The least significant byte indicates the drive’s physical slot within the drive tray indicated in byte 26.
5.1.1.13.9.1. Drive Group MSB Format:
![Page 275: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/275.jpg)
5.1.1.13.9.2. Drive Group LSB Format:
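The FRU group codes named in the headings of section 5.1.1.13 can be gathered into a small lookup, which helps when deciding how to read the qualifier in bytes 26 and 27. This Python sketch is illustrative only: the function name `fru_group_name` is hypothetical, and codes that this section does not name are reported as such rather than guessed.

```python
# Group codes and names taken from the headings of section 5.1.1.13.
NAMED_FRU_GROUPS = {
    0x01: "Host Channel Group",
    0x02: "Controller Drive Interface Group",
    0x06: "Subsystem Group",
    0x08: "Sub-Enclosure Group",
    0x09: "Redundant Controller Group",
}


def fru_group_name(code: int) -> str:
    """Return the FRU group that a one-byte FRU code falls into."""
    if not 0 <= code <= 0xFF:
        raise ValueError("FRU code must fit in one byte")
    if code in NAMED_FRU_GROUPS:
        return NAMED_FRU_GROUPS[code]
    if code >= 0x10:
        # Codes 0x10 through 0xFF belong to the Drive Group.
        return "Drive Group"
    return "not described in this section"
```

For example, `fru_group_name(0x08)` selects the Sub-Enclosure Group, whose qualifier MSB carries the enclosure identifier, while any code from 0x10 upward selects the Drive Group, whose qualifier carries the tray number and slot.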
5.1.1.14. Error Specific Information - Bytes 34-36 This field provides information read from the array controller VLSI chips and other sources. It is intended primarily for development testing, and the contents are not specified.
5.1.1.15. Error Detection Point - Bytes 37-40 The error detection point field indicates where in the software the error was detected. It is intended primarily for development testing, and the contents are not specified.
5.1.1.16. Original CDB - Bytes 41-50 This field contains the original Command Descriptor Block received from the host.
5.1.1.17. Reserved - Byte 51
5.1.1.18. Host Descriptor - Bytes 52-53 This bit position field provides information about the host. Definitions are given below.
![Page 276: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/276.jpg)
5.1.1.19. Controller Serial Number - Bytes 54-69 This sixteen-byte field contains the manufacturing identification of the array hardware. Bytes of this field are identical to the information returned by the Unit Serial Number page in the Inquiry Vital Product Data.
5.1.1.20. Array Software Revision - Bytes 70-73 The Array Application Software Revision Level matches that returned by an Inquiry command.
5.1.1.21. LUN Number - Byte 75 The LUN number field is the logical unit number in the Identify message received from the host after selection.
5.1.1.22. LUN Status - Byte 76 This field indicates the status of the LUN. Its contents are defined in the logical array page description in the Mode Parameters section of this specification, except for the value of 0xFF, which is unique to this field. A value of 0xFF returned in this byte indicates the LUN is undefined or is currently unavailable (reported at Start of Day before the LUN state is known).
5.1.1.23. Host ID - Bytes 77-78 The host ID is the SCSI ID of the host that selected the array controller for execution of this command.
5.1.1.24. Drive Software Revision - Bytes 79-82 This field contains the software revision level of the drive involved in the error if the error was a drive error and the controller was able to retrieve the information.
5.1.1.25. Drive Product ID - Bytes 83-98 This field identifies the Product ID of the drive involved in the error if the error was a drive error and the controller was able to determine this information. This information is obtained from the drive Inquiry command.
5.1.1.26. Array Power-up Status - Bytes 99-100 In this release of the software, these bytes are always set to zero.
5.1.1.27. RAID Level - Byte 101
![Page 277: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/277.jpg)
This byte indicates the configured RAID level for the logical unit returning the sense data. The values that can be returned are 0, 1, 3, 5, or 255. A value of 255 indicates that the LUN RAID level is undefined.
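A minimal Python sketch of how a log-reading script might interpret byte 101, based only on the values listed above. The function name `describe_raid_level` is hypothetical, and flagging any other value as unexpected is an assumption rather than documented controller behavior.

```python
# RAID levels that the text says byte 101 can report.
VALID_RAID_LEVELS = {0, 1, 3, 5}
RAID_LEVEL_UNDEFINED = 255


def describe_raid_level(value: int) -> str:
    """Turn the RAID Level byte (byte 101) into a readable label."""
    if value == RAID_LEVEL_UNDEFINED:
        return "undefined"
    if value in VALID_RAID_LEVELS:
        return f"RAID {value}"
    # Not listed in the text; reported as-is instead of guessing a meaning.
    return f"unexpected value {value}"
```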
5.1.1.28. Drive Sense Identifier - Bytes 102-103 These bytes identify the source of the sense block returned in the next field. Byte 102 identifies the channel and ID of the drive. Refer to the FRU group codes for physical drive ID assignments. Byte 103 is reserved for identification of a drive logical unit in future implementations and is always set to zero in this release.
5.1.1.29. Drive Sense Data - Bytes 104-135 For drive-detected errors, these fields contain the data returned by the drive in response to the Request Sense command from the array controller. If multiple drive errors occur during the transfer, the sense data from the last error will be returned.
5.1.1.30. Sequence Number - Bytes 136-139 This field contains the controller’s internal sequence number for the IO request.
5.1.1.31. Date and Time Stamp - Bytes 140-155 The 16 ASCII characters in this field are three spaces followed by the month, day, year, hour, minute, and second at which the error occurred, in the following format: MMDDYY/HHMMSS.
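The stamp format above maps directly onto a standard date parser. The following Python sketch assumes the 16 bytes have already been sliced out of the sense data; the function name `parse_sense_timestamp` is hypothetical. The field carries only a two-digit year, so the century comes from Python's `%y` rule (00-68 become 2000-2068, 69-99 become 1969-1999), which is a property of the parser and not something the guide states.

```python
from datetime import datetime


def parse_sense_timestamp(field: bytes) -> datetime:
    """Parse the 16-character Date and Time Stamp field (bytes 140-155)."""
    if len(field) != 16:
        raise ValueError("the date and time stamp field is 16 bytes long")
    text = field.decode("ascii")
    if text[:3] != "   ":
        raise ValueError("expected three leading spaces")
    # The remaining 13 characters are MMDDYY/HHMMSS.
    return datetime.strptime(text[3:], "%m%d%y/%H%M%S")
```

For instance, the bytes `b"   070408/153045"` parse to 4 July 2008 at 15:30:45.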
5.1.1.32. Reserved - Bytes 156 – 159
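The byte ranges given in sections 5.1.1.14 through 5.1.1.32 can be collected into one offset table, which makes it easier to cut a raw 160-byte sense buffer into named fields. The Python sketch below is a convenience under stated assumptions: the dictionary keys and the function name `slice_sense_fields` are hypothetical, the fields are returned as raw bytes without interpretation, and byte 74 is omitted because the text does not describe it.

```python
from typing import Dict, Tuple

# Field name -> (first byte, last byte), inclusive, as listed in the text.
SENSE_FIELDS: Dict[str, Tuple[int, int]] = {
    "error_specific_information": (34, 36),
    "error_detection_point": (37, 40),
    "original_cdb": (41, 50),
    "reserved_51": (51, 51),
    "host_descriptor": (52, 53),
    "controller_serial_number": (54, 69),
    "array_software_revision": (70, 73),
    "lun_number": (75, 75),
    "lun_status": (76, 76),
    "host_id": (77, 78),
    "drive_software_revision": (79, 82),
    "drive_product_id": (83, 98),
    "array_power_up_status": (99, 100),
    "raid_level": (101, 101),
    "drive_sense_identifier": (102, 103),
    "drive_sense_data": (104, 135),
    "sequence_number": (136, 139),
    "date_time_stamp": (140, 155),
    "reserved_156_159": (156, 159),
}


def slice_sense_fields(sense: bytes) -> Dict[str, bytes]:
    """Cut a sense buffer into the raw fields listed in SENSE_FIELDS."""
    if len(sense) < 160:
        raise ValueError("expected at least 160 bytes of sense data")
    return {
        name: bytes(sense[first:last + 1])
        for name, (first, last) in SENSE_FIELDS.items()
    }
```

Keeping the ranges inclusive mirrors the way the guide states them, so each table entry can be checked against its section heading at a glance.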
![Page 278: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/278.jpg)
![Page 279: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/279.jpg)
Appendix I – Chapter 11 – Sense Codes
Chapter 11: Sense Codes
11.1. Sense Keys
![Page 280: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/280.jpg)
11.2. Additional Sense Codes and Qualifiers This section lists the Additional Sense Code (ASC) and Additional Sense Code Qualifier (ASCQ) values returned by the array controller in the sense data. SCSI-2 defined codes are used when possible. Array-specific error codes are used when necessary and are assigned SCSI-2 vendor-unique codes 0x80-0xFF. More detailed sense key information may be obtained from the array controller command descriptions or the SCSI-2 standard. Codes defined by SCSI-2 and the array vendor-specific codes are shown below. The table also lists the most probable sense keys returned for each error (the sense keys are listed below for reference). A sense key enclosed in parentheses in the table indicates that the sense key is determined by the value in byte 0x0A. See Section .
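When scanning sense data by script, the split between SCSI-2 defined codes and array-specific codes described above comes down to a range check on the ASC. A minimal Python sketch follows; the function name `is_array_specific_asc` is hypothetical, and the check classifies a code only by range without confirming that the code appears in the table.

```python
def is_array_specific_asc(asc: int) -> bool:
    """Return True when an ASC lies in the vendor-unique range 0x80-0xFF."""
    if not 0 <= asc <= 0xFF:
        raise ValueError("ASC must fit in one byte")
    return asc >= 0x80
```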
![Page 281: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/281.jpg)
![Page 282: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/282.jpg)
![Page 283: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/283.jpg)
![Page 284: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/284.jpg)
![Page 285: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/285.jpg)
![Page 286: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/286.jpg)
![Page 287: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/287.jpg)
![Page 288: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/288.jpg)
![Page 289: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/289.jpg)
![Page 290: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/290.jpg)
![Page 291: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/291.jpg)
![Page 292: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/292.jpg)
![Page 293: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/293.jpg)
![Page 294: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/294.jpg)
![Page 295: Storage_Diagnostics_and_Troubleshooting_Guide](https://reader034.vdocument.in/reader034/viewer/2022050801/549e5131ac795910768b4721/html5/thumbnails/295.jpg)