Locate and replace a broken/degraded/predictive failure disk

posted May 22, 2017, 5:59 AM by Daniele Albrizio   [ updated May 22, 2017, 6:58 AM ]

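Check the array and disk status (the command is named megaraidsas-status or megaraid-status, depending on which one is installed):

# megaraidsas-status
# megaraid-status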

-- Arrays informations --
-- ID | Type | Size | Status
a0d0 | RAID 5 | 272GiB | DEGRADED

-- Disks informations
-- ID | Model | Status | Warnings
a0e248s0 | FUJITSU MAX3147RC 136GiB | online
a0e248s1 | FUJITSU MAX3147RC 136GiB | rebuild
a0e248s3 | FUJITSU MAX3147RC 136GiB a0d0 | predictive-failure

There is at least one disk/array in a NOT OPTIMAL state.

root@vserver02:~# megacli -pdlist -aALL
                                    
Adapter #0

Enclosure Device ID: 248
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 0
WWN:
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 136.732 GB [0x11177330 Sectors]
Non Coerced Size: 136.232 GB [0x11077330 Sectors]
Coerced Size: 136.218 GB [0x11070000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 5205
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x500000e0155d1a02
SAS Address(1): 0x0
Connected Port Number: 0
Inquiry Data: FUJITSU MAX3147RC       5205DQ37P7400YPH@#43CC0
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: Unknown
Link Speed: Unknown
Media Type: Hard Disk Device
Drive Temperature :33C (91.40 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: Unknown
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 248
Slot Number: 1
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 1
WWN:
Sequence Number: 9
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 136.732 GB [0x11177330 Sectors]
Non Coerced Size: 136.232 GB [0x11077330 Sectors]
Coerced Size: 136.218 GB [0x11070000 Sectors]
Sector Size:  0
Firmware state: Rebuild
Device Firmware Level: 5205
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x500000e015437812
SAS Address(1): 0x0
Connected Port Number: 1
Inquiry Data: FUJITSU MAX3147RC       5205DQ37P7400Y76@#43CC0
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: Unknown
Link Speed: Unknown
Media Type: Hard Disk Device
Drive Temperature :30C (86.00 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: Unknown
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 248
Slot Number: 3
Drive's position: DiskGroup: 0, Span: 0, Arm: 2
Enclosure position: N/A
Device Id: 3
WWN:
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 1
Last Predictive Failure Event Seq Number: 826628
PD Type: SAS

Raw Size: 136.732 GB [0x11177330 Sectors]
Non Coerced Size: 136.232 GB [0x11077330 Sectors]
Coerced Size: 136.218 GB [0x11070000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 5205
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x500000e0155710e2
SAS Address(1): 0x0
Connected Port Number: 3
Inquiry Data: FUJITSU MAX3147RC       5205DQ37P7400YKE@#43CC0
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: Unknown
Link Speed: Unknown
Media Type: Hard Disk Device
Drive Temperature :31C (87.80 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: Unknown
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : Yes




Exit Code: 0x00
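
With more than a few drives, the listing above is long to scan by eye. The following is a minimal sketch of a filter that prints the enclosure:slot pair of every drive reporting a non-zero Predictive Failure Count or a S.M.A.R.T alert. It assumes the field labels appear exactly as in the output above; list_suspect_drives is a hypothetical helper name, not part of megacli.

```shell
# Hypothetical helper: reads `megacli -pdlist -aALL` output on stdin and
# prints "enclosure:slot" for each drive that looks suspect.
list_suspect_drives() {
  awk -F': *' '
    # Print the drive collected so far if it was flagged, then reset.
    function report() {
      if (slot != "" && bad) print encl ":" slot
      bad = 0; slot = ""
    }
    { sub(/[ \t\r]+$/, "") }                      # strip trailing whitespace
    /^Enclosure Device ID:/      { report(); encl = $2 }
    /^Slot Number:/              { slot = $2 }
    /^Predictive Failure Count:/ { if ($2 + 0 > 0) bad = 1 }
    /^Drive has flagged a S\.M\.A\.R\.T alert/ { if ($2 ~ /^Yes/) bad = 1 }
    END { report() }
  '
}
```

Usage: megacli -pdlist -aALL | list_suspect_drives. For the listing above this would print 248:3, which is the pair to pass to -physdrv in the next step.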
root@vserver02:~# megacli -pdlocate -start -physdrv[248:3] -a0
                                    
Adapter: 0: Device at EnclId-248 SlotId-3  -- PD Locate Start Command was successfully sent to Firmware

Exit Code: 0x00


The disk LED starts blinking regularly.


root@vserver02:~# megacli -pdlocate -stop -physdrv[248:3] -a0
                                    
Adapter: 0: Device at EnclId-248 SlotId-3  -- PD Locate Stop Command was successfully sent to Firmware

Exit Code: 0x00

The disk LED stops blinking.
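
It is easy to forget the stop command after walking back from the rack. As a sketch only, the start and stop commands shown above can be wrapped so that the LED is switched off again once Enter is pressed. blink_drive is a hypothetical name, and the third argument (adapter number, default 0) is an assumption about how you would want to call it.

```shell
# Hypothetical wrapper around the -pdlocate commands shown above.
# Usage: blink_drive ENCLOSURE SLOT [ADAPTER]
blink_drive() {
  encl=$1; slot=$2; adapter=${3:-0}
  # The bracket argument is quoted so the shell never treats it as a glob.
  megacli -pdlocate -start -physdrv"[$encl:$slot]" -a"$adapter" || return 1
  printf 'Drive [%s:%s] should be blinking; press Enter to stop. ' "$encl" "$slot"
  read -r reply || true   # continue to the stop command even on EOF
  megacli -pdlocate -stop -physdrv"[$encl:$slot]" -a"$adapter"
}
```

For the disk in this example: blink_drive 248 3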

Replace the disk.

If the replacement disk was previously part of another RAID array, it will show up as BAD:

# megaraidsas-status -d
-- Disks informations
-- ID | Model | Status | Warnings
a0e248s0 | FUJITSU MAX3147RC 136GiB | online
a0e248s1 | FUJITSU MAX3147RC 136GiB | online
a0e248s3 | FUJITSU MAX3147RC | BAD

There is at least one disk/array in a NOT OPTIMAL state.


Mark it as good (Unconfigured-Good):

root@vserver02:~# megacli -PDMakeGood  -PhysDrv[248:3] -aALL
                                    
Adapter: 0: EnclId-248 SlotId-3 state changed to Unconfigured-Good.

Exit Code: 0x00

Clear the foreign configuration:

# megacli -CfgForeign -Clear -aALL
                                    
Foreign configuration 0 is cleared on controller 0.

Exit Code: 0x00

Make it a hot spare:

# megacli -PDHSP -Set -PhysDrv[248:3] -aALL
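
The three commands above (make good, clear foreign configuration, set hot spare) can be collected into one helper. The sketch below is a dry run by default: it only prints the commands so they can be reviewed first, and runs them only when APPLY=1 is set. prepare_replacement and APPLY are hypothetical names, and using a single adapter number instead of -aALL is an assumption. The sequence only makes sense when the new disk shows up as BAD, as described above.

```shell
# Hypothetical helper: prints (or, with APPLY=1, runs) the three steps above.
# Usage: prepare_replacement ENCLOSURE SLOT [ADAPTER]
prepare_replacement() {
  encl=$1; slot=$2; adapter=${3:-0}
  run=echo                          # dry run: print each command
  if [ "${APPLY:-0}" = 1 ]; then
    run=                            # APPLY=1: execute each command
  fi
  # Stop at the first step that fails.
  $run megacli -PDMakeGood -PhysDrv"[$encl:$slot]" -a"$adapter" &&
  $run megacli -CfgForeign -Clear -a"$adapter" &&
  $run megacli -PDHSP -Set -PhysDrv"[$encl:$slot]" -a"$adapter"
}
```

Review the printed commands with prepare_replacement 248 3, then run them with APPLY=1 prepare_replacement 248 3.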

Show rebuild progress:

# megacli -PDRbld -ShowProg -PhysDrv[248:3] -a0
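
To follow the rebuild without retyping the command, the progress can be polled until the drive leaves the Rebuild state. The sketch below relies on the "Firmware state:" field of the -pdlist output shown earlier. drive_state and wait_for_rebuild are hypothetical helper names, and the 60-second default interval is an arbitrary choice.

```shell
# Hypothetical helper: reads -pdlist output on stdin and prints the
# "Firmware state" of the drive at ENCLOSURE SLOT.
drive_state() {
  awk -F': *' -v encl="$1" -v slot="$2" '
    { sub(/[ \t\r]+$/, "") }                  # strip trailing whitespace
    /^Enclosure Device ID:/ { e = $2 }
    /^Slot Number:/         { s = $2 }
    /^Firmware state:/      { if (e == encl && s == slot) print $2 }
  '
}

# Hypothetical helper: shows rebuild progress every INTERVAL seconds
# for as long as the drive reports "Rebuild".
# Usage: wait_for_rebuild ENCLOSURE SLOT [ADAPTER] [INTERVAL]
wait_for_rebuild() {
  encl=$1; slot=$2; adapter=${3:-0}; interval=${4:-60}
  while [ "$(megacli -pdlist -a"$adapter" | drive_state "$encl" "$slot")" = "Rebuild" ]; do
    megacli -PDRbld -ShowProg -PhysDrv"[$encl:$slot]" -a"$adapter"
    sleep "$interval"
  done
}
```

For the disk in this example: wait_for_rebuild 248 3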
