{"id":3361,"date":"2026-02-05T11:39:12","date_gmt":"2026-02-05T16:39:12","guid":{"rendered":"https:\/\/blog.lufamily.ca\/kang\/?p=3361"},"modified":"2026-02-05T11:39:14","modified_gmt":"2026-02-05T16:39:14","slug":"replacing-fail-drive-in-existing-vdev","status":"publish","type":"post","link":"https:\/\/blog.lufamily.ca\/kang\/2026\/02\/05\/replacing-fail-drive-in-existing-vdev\/","title":{"rendered":"Replacing Fail Drive in Existing VDEV"},"content":{"rendered":"\n<p>In a previous <a href=\"https:\/\/blog.lufamily.ca\/kang\/2024\/08\/15\/replacing-vdev-in-a-zfs-pool\/\">post<\/a>, I discussed creating a brand new VDEV with new drives to replace an existing VDEV. However, there is another approach that I chose to use in a very recent event for my NAS (Network Attached Storage) hard drive when it started to encounter write errors and later checksum errors.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1176\" height=\"2112\" src=\"https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.45.15-AM.png\" alt=\"\" class=\"wp-image-3363\" srcset=\"https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.45.15-AM.png 1176w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.45.15-AM-167x300.png 167w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.45.15-AM-570x1024.png 570w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.45.15-AM-768x1379.png 768w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.45.15-AM-855x1536.png 855w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.45.15-AM-1140x2048.png 1140w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><figcaption class=\"wp-element-caption\">The output of zpool status -v<\/figcaption><\/figure>\n\n\n\n<p>The affected VDEV is <code>mirror-4<\/code>. Since there are 16 hard drives involved in this storage pool, I had to find out which hard drive is having the issue. 
\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"578\" src=\"https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.51.43-AM-1024x578.png\" alt=\"\" class=\"wp-image-3364\" srcset=\"https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.51.43-AM-1024x578.png 1024w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.51.43-AM-300x169.png 300w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.51.43-AM-768x434.png 768w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.51.43-AM-1536x867.png 1536w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.51.43-AM-1200x677.png 1200w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.51.43-AM.png 1768w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><figcaption class=\"wp-element-caption\">Shell commands to get the Serial Number.<\/figcaption><\/figure>\n\n\n\n<p>The failing drive turned out to be a WD60EFRX, a Western Digital Red 6TB 5400 RPM drive. I was curious to see how old the drive was, so I used the <code>smartctl<\/code> utility to find out the number of power-on hours this particular drive had endured.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"734\" height=\"1024\" src=\"https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.59.51-AM-734x1024.png\" alt=\"\" class=\"wp-image-3365\" srcset=\"https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.59.51-AM-734x1024.png 734w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.59.51-AM-215x300.png 215w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.59.51-AM-768x1071.png 768w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.59.51-AM-1101x1536.png 1101w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.59.51-AM-1468x2048.png 1468w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.59.51-AM-1200x1674.png 1200w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-10.59.51-AM.png 1600w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/figure>\n\n\n\n<p>At roughly 4.2 years (37033 \/ 24 \/ 365 &#8776; 4.2), the drive was well past the 3-year warranty promised by Western Digital, so I took this unfortunate opportunity to get two new <a href=\"https:\/\/www.amazon.ca\/dp\/B0B94KSFTH?ref=ppx_yo2ov_dt_b_fed_asin_title&amp;th=1\" target=\"_blank\" rel=\"noreferrer noopener\">Seagate IronWolf Pro 12TB Enterprise NAS Internal HDD Hard Drive<\/a> units.<\/p>
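\n\n\n\n<p>For reference, the power-on-hours figure above came from the SMART attribute table; reading it with <code>smartctl<\/code> looks roughly like the following, where the device path is a placeholder for the old drive:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Print the SMART attributes and pick out the Power_On_Hours counter.\n# The wwn-0xEXAMPLE path is a placeholder for the failing drive.\nsmartctl -A \/dev\/disk\/by-id\/wwn-0xEXAMPLE | grep -i power_on<\/code><\/pre>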
\n\n\n\n<p>The idea is not just to replace the failing drive but also to expand the pool, and to free up the still-good 6TB drive from the existing mirror so it can serve as part of my offline backup strategy.<\/p>\n\n\n\n<p>Once the new drives arrived and were connected to the system, I simply performed an <code>attach<\/code> command to add them to the mirror VDEV.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"934\" height=\"1024\" src=\"https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-11.12.10-AM-934x1024.png\" alt=\"\" class=\"wp-image-3366\" srcset=\"https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-11.12.10-AM-934x1024.png 934w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-11.12.10-AM-274x300.png 274w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-11.12.10-AM-768x842.png 768w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-11.12.10-AM-1401x1536.png 1401w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-11.12.10-AM-1200x1315.png 1200w, https:\/\/blog.lufamily.ca\/kang\/wp-content\/uploads\/sites\/3\/2026\/02\/Screenshot-2026-02-05-at-11.12.10-AM.png 1792w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><figcaption class=\"wp-element-caption\">Commands to attach the new drive<\/figcaption><\/figure>\n\n\n\n<p>After attaching the new drives, the <code>zfs<\/code> pool automatically began to resilver. The above image was taken several hours after the attachment, and we are now waiting for the last drive to complete its resilvering. Since one of the new drives has already finished resilvering, we have regained full redundancy.<\/p>\n\n\n\n<p>Once the resilvering is complete, I will detach both old drives from the mirror using the <code>detach<\/code> command.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>zpool detach vault \/dev\/disk\/by-id\/wwn-0x50014ee2b9f82b35-part1\nzpool detach vault \/dev\/disk\/by-id\/wwn-0x50014ee2b96dac7c-part1<\/code><\/pre>\n\n\n\n<p>The first drive will be chucked into the garbage bin, and the second drive will be kept for offline backup. Before reusing the second drive, I need to remove all <code>zfs<\/code> information and metadata from it to avoid any unintentional conflicts in the future. We do this using the <code>labelclear<\/code> command, as shown below.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>zpool labelclear \/dev\/disk\/by-id\/wwn-0x50014ee2b96dac7c<\/code><\/pre>\n\n\n\n<p>For extra safety, we can also destroy the old partition table by using <code>parted<\/code> to relabel the disk and create a new partition table (a rough sketch follows the <code>dd<\/code> command below). If the <code>labelclear<\/code> command fails, we can use <code>dd<\/code> to zero out the first 100 MB of the drive.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>dd if=\/dev\/zero of=\/dev\/disk\/by-id\/wwn-0x50014ee2b96dac7c bs=1M count=100<\/code><\/pre>
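\n\n\n\n<p>The <code>parted<\/code> route mentioned above would look roughly like this; the choice of a GPT label is my assumption, and this is destructive, so double-check the device path first:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Hypothetical sketch: overwrite the old partition table with a fresh GPT label.\n# Verify the device path before running; this wipes the partition table on that disk.\nparted --script \/dev\/disk\/by-id\/wwn-0x50014ee2b96dac7c mklabel gpt<\/code><\/pre>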
\n\n\n\n<p>In summary, this is the general strategy moving forward: when a drive in my NAS pool starts to fail (before it fails outright), I take the opportunity to replace all the drives in that mirror with higher-capacity drives, and repurpose the remaining good one as offline backup.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In a previous post, I discussed creating a brand new VDEV with new drives to replace an existing VDEV. However, there is another approach, which I chose to use very recently when one of my NAS (Network Attached Storage) hard drives started to encounter write errors and, later, checksum errors. The affected &hellip; <a href=\"https:\/\/blog.lufamily.ca\/kang\/2026\/02\/05\/replacing-fail-drive-in-existing-vdev\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Replacing a Failed Drive in an Existing VDEV&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[111],"tags":[5,28,159],"class_list":["post-3361","post","type-post","status-publish","format-standard","hentry","category-tech","tag-nas","tag-technology","tag-zfs"],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p7V6i8-Sd","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/posts\/3361","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/comments?post=3361"}],"version-history":[{"count":2,"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/posts\/3361\/revisions"}],"predecessor-version":[{"id":3367,"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/posts\/3361\/revisions\/3367"}],"wp:attachment":[{"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/media?parent=3361"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/categories?post=3361"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.lufamily.ca\/kang\/wp-json\/wp\/v2\/tags?post=3361"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}