A few months ago, the boot disk of our media server began to incur errors such as the ones below:
Dec 17 03:01:35 avs kernel: [32515.068669] EXT4-fs error (device nvme1n1p2): htree_dirblock_to_tree:1080: inode #10354778: comm tar: Directory block failed checksum
Dec 17 03:02:35 avs kernel: [32575.183005] EXT4-fs error (device nvme1n1p2): htree_dirblock_to_tree:1080: inode #13500463: comm tar: Directory block failed checksum
Dec 17 03:02:35 avs kernel: [32575.183438] EXT4-fs error (device nvme1n1p2): htree_dirblock_to_tree:1080: inode #13500427: comm tar: Directory block failed checksum
The boot disk is an NVMe device, and I thought the errors might be due to overheating, so I purchased and installed a heat sink. Unfortunately, the errors persisted after installing the heat sink.
I decided to replace the boot disk with the exact same model, a Samsung 980 Pro 1TB. This should have been a pretty easy maintenance task: clone the drive, then swap in the new drive. However, Murphy was sure to strike!
My usual go-to cloning utility is Clonezilla; unfortunately, this utility did not like cloning NVMe drives. It resulted in a kernel panic even after I tried multiple versions. I am not sure what the problem was. It could be Clonezilla or the USB 3.0 NVMe enclosure that I was using for the new disk.
I resigned myself to using the dd command:
dd if=/dev/source of=/dev/target status=progress
Unfortunately this would have taken way too long, something like 20+ hours, so I gave up on this approach.
I decided to do a good old restore of the nightly backup. I started by cloning the partition table:
sfdisk -d /dev/olddisk | sfdisk /dev/newdisk
I then proceeded with the restore of the nightly backup. Murphy strikes twice! The nightly backup was corrupted! I guess that is not surprising when the root directory’s integrity is in question, which is the whole reason we are doing this exercise in the first place.
Without the nightly backup, I had to resort to a live backup. I booted the system again and performed:
sudo su -
mount /dev/new_disk_root_partition /mnt/newboot
cd /
tar -cvpf - --exclude=/tmp --exclude=/home/kang/log --exclude=/span --exclude="/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Cache" --one-file-system / | tar -xvpf - -C /mnt/newboot --numeric-owner
The above took about an hour. I then copied the /span directory manually, because this directory tends to change while the server is up and running.
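One way to do that manual copy is with rsync; a sketch, assuming the new root partition is still mounted at /mnt/newboot:
rsync -aHAX --numeric-ids /span/ /mnt/newboot/span/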
With all the contents copied, I realized I had forgotten how to install GRUB and had to re-teach myself. I had to boot the machine with a live Ubuntu USB, and then mount the root and EFI partitions respectively.
nvme1n1 259:0 0 931.5G 0 disk
├─nvme1n1p1 259:1 0 1G 0 part /boot/efi
├─nvme1n1p2 259:2 0 915.4G 0 part /
└─nvme1n1p3 259:3 0 15.1G 0 part [SWAP]
And install GRUB:
sudo su -
mkdir /efi
mount /dev/nvme1n1p1 /efi
mount /dev/nvme1n1p2 /mnt
grub-install --efi-directory /efi --root-directory /mnt
I also had to fix /etc/fstab to ensure the root partition and the /boot/efi partition are referenced by their correct UUIDs. The blkid command came in handy to find the UUIDs. For the swap partition, I had to run the mkswap command before I could get its UUID.
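Roughly, the commands involved look like this (a sketch, using the partition layout shown above):
blkid /dev/nvme1n1p1 /dev/nvme1n1p2   # UUIDs for the /boot/efi and / entries in /etc/fstab
mkswap /dev/nvme1n1p3                 # initialize swap; mkswap prints the new UUID to use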
After I rebooted, I reinstalled GRUB one more time with the following as super user:
grub-install /dev/nvme1n1
I also updated the initramfs using:
update-initramfs -c -k all
For something that should have taken less than an hour, it took the majority of the day. The server is now running with the new NVME replacement disk. Hopefully this resolves the file system corruptions. We have to wait and see!
Update: The Day After
The same errors occurred again! I noticed that these corruptions occur when we do a system backup. How ironic! I later confirmed that running the tar command on the root directory during the backup process can cause such an error. I now have to figure out why. I will disable the system backup for the next few days to see whether the errors come back.
I have an Audible membership, and I sometimes also get audio books from other sources. Recently I experimented with consolidating all of my audio books in a centralized place. Since I already have a Plex server running, I thought it would be a good place to do this.
I did a little research and came across a couple of very helpful articles:
I have a single folder to store all of my audio books. Inside the folder, each audio book is stored as an “m4b” file.
Ensure that each audio book has a poster image and that its artist and album_artist tags are set to the author; a quick tagging sketch follows these steps. Where appropriate, the audio book should also contain chapter metadata.
Create a music library in Plex by adding the audio book folder, and set the agent to Audnexus.
Ensure that the advanced option of “Store track progress” is checked.
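For books that are missing the author tags, ffmpeg can set metadata without re-encoding; a sketch (the file names and author are placeholders):
ffmpeg -i book.m4b -map 0 -c copy -metadata artist="Jane Author" -metadata album_artist="Jane Author" book-tagged.m4b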
Each book in the library will be represented as an album, and the author will be mapped to album artist. Once the library is created, you can download and play the audio books from the desktop using the Plex app. However, the more common use case is to listen to the audio books while on the go.
Using Prologue to Play Audio Books
We first have to download the Prologue app. I did not get any of the in-app premium functionality, and just stayed with the free version.
Point the app to my Plex server’s URL, and all the audio books from the library become accessible and playable on an iPhone or iPad, with chapter, bookmark, and resume-position support.
This is a really neat solution, and I am impressed by how Plex and Prologue form such a dynamic duo.
In the first part of this post, I talked about making sure all the new hardware that I recently purchased works. Yesterday, upgrading from Ubuntu 20.04 LTS to 22.04 LTS was super simple. Unfortunately, that was the end of the easy part.
I thought I could just image my old boot drive and make a carbon copy of it on my new boot drive. My old boot drive is a simple 512GB SATA SSD, and my new boot drive is a 1TB NVMe M.2 SSD plugged directly into the motherboard. The copying was pretty simple, but because the drives differ in size, I had to re-lay out the partition table on the new drive once the copy was completed. I did this with the parted command.
Unfortunately the new boot drive did not want to boot. At this point I had to do some research. The most helpful articles were:
Both of the above articles were an excellent refresher on how GRUB works. I have used GRUB since the beginning, but one gets super rusty when these types of tasks are only performed once every three or six years!
Instead of detailing what went wrong, I will just explain what I should have done. This way if I need it again in the future, it is here for my reference.
Step 1: Perform a backup of the old boot drive from a Live USB in shell mode. This is done on my server on a nightly basis. This method is clearly described on the Ubuntu Community Help Wiki.
Following this method I will end up with a compressed tar archive for my entire root directory, skipping some runtime and other unwanted directories.
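It boils down to a command along these lines (a sketch of the method; the archive location and exclude list are illustrative):
tar -cvpzf /media/backup/old-root.tar.gz --exclude=/media/backup --exclude=/proc --exclude=/sys --exclude=/dev --exclude=/tmp --exclude=/run --exclude=/mnt --one-file-system /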
Step 2: After performing a fresh install of the new Ubuntu LTS Server operating system on the new server and boot drive, I proceeded to back up the new boot drive with the same technique used in Step 1. I stored the backup of the new install on another external SSD that I had lying around. It is also important that the new install’s partition layout on the new boot drive contains a swap partition.
Step 3: I then restored the most recent backup (taken in Step 1) of the old boot drive to the new boot drive, and replaced the /boot/grub directory with the contents from the new install that were backed up in Step 2. GRUB itself was already installed on the drive by the brand new installation; we just want to make sure the boot partition matches the /boot/grub contents.
Step 4: We also need to fix up the /etc/fstab file, because it contains references to drive devices from the old hardware. Pay special attention to the main data partition and the swap partition. It should look something like this:
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/nvme1n1p2 during curtin installation
UUID=fc939be4-5292-4252-8120-7ef59b177e5b / ext4 defaults 0 1
# /boot/efi was on /dev/nvme0n1p1 during curtin installation
UUID=5187-A8C6 /boot/efi vfat defaults 0 1
# Swap partition
UUID=512d611e-6944-4a57-9748-ea68e9ec3fad none swap sw 0 0
# /dev/mapper/airvideovg2-airvideo /mnt/airvideo ext4 rw,noatime 0 0
UUID=9e78425c-c1f3-4285-9fa1-96cac9114c55 /mnt/airvideo ext4 rw,noatime 0 0
Notice that I also added the LVM logical volume for /mnt/airvideo, which is my RAID-1 array. The UUIDs can be obtained with the blkid command. Below is a sample output:
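(The listing below is reconstructed from the UUIDs in the fstab above, just to show the shape of blkid’s output; the real output includes every block device on the system.)
/dev/nvme0n1p1: UUID="5187-A8C6" TYPE="vfat"
/dev/nvme1n1p2: UUID="fc939be4-5292-4252-8120-7ef59b177e5b" TYPE="ext4"
/dev/mapper/airvideovg2-airvideo: UUID="9e78425c-c1f3-4285-9fa1-96cac9114c55" TYPE="ext4"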
Step 4B (potentially): If the system boots into the “grub>” prompt, then we will have to persuade GRUB to boot manually by providing the following at the prompt:
grub> set root=(hd9,gpt2)
grub> linux /boot/vmlinuz root=/dev/nvme1n1p2
grub> initrd /boot/initrd.img
grub> boot
To find the root value on the first line, you have to use GRUB’s ls command, which is explained in this article. The root parameter on the linux line references the partition on which the root directory is mounted. In my case, it was /dev/nvme1n1p2.
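A quick sketch of that exploration at the GRUB prompt (the device list and contents shown here are illustrative):
grub> ls
(hd0) (hd0,gpt1) (hd0,gpt2) ... (hd9) (hd9,gpt1) (hd9,gpt2) (hd9,gpt3)
grub> ls (hd9,gpt2)/
bin/ boot/ etc/ home/ ...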
After I rebooted, I reinstalled GRUB with the following as super user:
grub-install /dev/nvme1n1
It may also be required to update our initramfs using:
update-initramfs -c -k all
Step 5: At this point the system should reboot, and all of the old server’s content should now be on the new hardware. Unfortunately, we will need to fix the network interface.
First obtain the MAC address of the network interface using:
% sudo lshw -C network | grep serial
serial: 04:42:1a:05:d3:c4
And then we will have to edit the /etc/netplan/00-installer-config.yaml file.
% cat /etc/netplan/00-installer-config.yaml
# This is the network config written by 'subiquity'
network:
  ethernets:
    enp6s0:
      dhcp4: true
      match:
        macaddress: 04:42:1a:05:d3:c4
      set-name: enp6s0
  version: 2
Ensure that the MAC address matches the one reported by lshw and that the interface name is the same as on the old system. The name in this example is enp6s0. We then need to execute the following commands to generate and apply the interface configuration:
netplan generate
netplan apply
We need to ensure the name matches because many services on the server have configurations that reference the interface name, such as:
Network configuration in /etc/network/interfaces
Samba (SMB) (/etc/samba/smb.conf)
Pihole (/etc/pihole/setupVars.conf)
Homebridge (/var/lib/homebridge/config.json)
Step 6: Fix the router’s DHCP provisioning so that the new server is assigned the same fixed IP address as the old server. This is important because there may be firewall rules referencing this IP address directly. The hostname should have been automatically restored when we restored the partition in Step 3.
Step 7: Our final step is to test the various services and ensure they are working properly. These include:
On May 15th, 2019 (more than three years ago), I gave my media server a performance boost by upgrading its CPU, motherboard, and memory. You can read about that experience in this post.
Today, I am going to be doing the same. It looks like we are on a cadence of roughly every three years for a spec bump. This time around we are changing the same items, but will include the power supply in the swap as well. I also decided to swap the boot drive from an old SSD to an NVMe drive. All of this resulted in the following hardware acquisitions, all from Amazon, which I find to have lower pricing (when factoring in free shipping through Prime) than Newegg, even during Black Friday and Cyber Monday offers.
The plan is to spend the time today to roughly test out all the new hardware.
Test Setup
I quickly did a skeleton setup to make sure Ubuntu 22.04.1 Server Edition works with all the hardware involved, especially the networking.
Memory Test
Once I knew Ubuntu Server was working well, I moved on to testing the server’s new 32GB of DDR4 memory. The test is running as I write this post, and I will let it run overnight.
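One way to run such a test from a live shell is the memtester package (a sketch; not necessarily the tool used here, and the size and iteration count are arbitrary):
sudo apt install memtester
sudo memtester 28G 1   # leave some headroom for the OS; one full pass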
The plan for tomorrow is to upgrade the current media server from Ubuntu 20.04.5 LTS to Ubuntu 22.04.1 LTS. Once this is done, I can then back up everything, move the new hardware into the old casing, and hope everything works.
Our home Network Attached Storage (NAS) media server is getting below 4 terabytes of free space. The Seagate IronWolf 12TB hard drives were on sale, with Amazon offering them below $300. I figured I would swap out two old 6TB drives for these new 12TB drives, resulting in a net increase of 6TB of storage.
The last time this was done was around two years ago when I replaced 4TB and 6TB hard drives with 10TB hard drives.
So far the mdadm and LVM storage architecture has proven to be very flexible. I am able to mix drives of different sizes and able to grow our media storage volume over time.
Previously I had to make two swaps, one for each drive in the array. Effectively, I am exchanging two 6TB drives for two 12TB drives because they are in a RAID 1 array. I cannot swap both at the same time, because I have to incrementally sync the data from the old drives to the new ones.
This has always been inconvenient because it means opening up the physical server twice. This time, however, I used my USB 3.0 HDD dock. I inserted one of the two new 12TB drives into the dock and temporarily created a three-disk RAID 1 array. Once the sync completed, which took 10+ hours, I removed one 6TB drive from the array configuration, physically replaced both 6TB drives in the server chassis with the two new 12TB drives, and placed one old 6TB drive into the dock. The 6TB drive in the dock is the one that was still in the array configuration. I then added the second 12TB drive, already in the server chassis, to the three-disk array. Once again, a sync was required to accommodate the second 12TB drive; this also took 10+ hours. Once the second sync completed, I could finally remove the second 6TB drive in the dock from the array and return the array to a two-disk RAID 1 configuration.
The above description is probably quite confusing, but this technique allowed me to have just a single downtime for the server instead of two when swapping hard drives in the server chassis.
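In mdadm terms, the sequence was roughly the following (a sketch; /dev/md2 and the member device names are placeholders):
mdadm /dev/md2 --add /dev/sdX1                       # first new 12TB drive, in the dock
mdadm --grow /dev/md2 --raid-devices=3               # temporary three-way mirror
cat /proc/mdstat                                     # wait for the 10+ hour resync
mdadm /dev/md2 --fail /dev/sdY1 --remove /dev/sdY1   # drop the first 6TB drive
# swap drives in the chassis, add the second 12TB drive, wait for the second resync, then:
mdadm /dev/md2 --fail /dev/sdZ1 --remove /dev/sdZ1   # drop the last 6TB drive
mdadm --grow /dev/md2 --raid-devices=2               # back to a two-disk mirror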
There will be an additional downtime when I grow or resize the LVM volume and file system.
After this upgrade I should have the following RAID 1 (fully mirrored) arrays:
An array with 2 x 8TB
An array with 2 x 10TB
An array with 2 x 10TB
An array with 2 x 12TB
The above four arrays are combined into a logical volume using LVM that results in a total volume size of 40TB (fully mirrored) or a little over 36TiB of usable space (increasing from the old 31TiB).
I had to increase the size of my media LVM logical volume again. In a previous post, I provided the instructions. I have done this many times. However, this time around, I ran into a snag.
Apparently this is the first time I have tried to increase the logical volume since I implemented LVM caching, which I wrote about in this post.
The steps in the “Linux LVM Super Simple to Expand” post are the same right up to and including the step that resizes the physical volume. After that, in order to resize the logical volume, we first have to temporarily detach the cache.
Note that e2fsck and resize2fs will each take some time, between thirty minutes and an hour. Once the file system is resized, we can reattach the cache.
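Putting it together, the detach, resize, and reattach sequence was along these lines (a sketch; the volume group and logical volume names match this server’s naming, while the cache pool name and SSD physical volume are placeholders):
umount /mnt/airvideo
lvconvert --uncache airvideovg2/airvideo              # detach (and discard) the cache temporarily
lvextend -l +100%FREE airvideovg2/airvideo
e2fsck -f /dev/airvideovg2/airvideo
resize2fs /dev/airvideovg2/airvideo
lvcreate --type cache-pool -L 100G -n lv_cache airvideovg2 /dev/ssd_pv
lvconvert --type cache --cachepool airvideovg2/lv_cache airvideovg2/airvideo
mount /mnt/airvideo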
This past weekend my media NAS server was intolerably slow. When I investigated, I found that one of the RAID-1 partitions was experiencing read errors and timing out. I decided to risk a reboot, and to my surprise the RAID-1 partition did not come back up as degraded with one failed drive; instead mdstat reported it as inactive, something like this:
md2 : inactive sdc1[0](S)
After some Google searching, I found that I had to do the following to resurrect the md2 device.
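From memory, it was along these lines (a sketch rather than the exact incantation; the member device is the one shown in mdstat above):
mdadm --stop /dev/md2
mdadm --assemble --force /dev/md2 /dev/sdc1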
This reactivated the md2 partition. I replaced the failed drive and added the new one to the md2 device. The RAID-1 partition is now rebuilding.
The inactive state is a new experience for me, so this was a bit of a surprise.
During this exercise I also found out that the SATA connectors on my SATA add-on card were loose causing intermittent connections. I will have to find a way to address this in the future.
Two and a half years ago, I performed a CPU and motherboard upgrade on my media server. You can read the account here.
Although the AMD Athlon 5350 APU was energy efficient, it proved to be underpowered for on-demand video transcoding, which Plex needs whenever a player device is not compatible with the video being played. For example, when an Apple TV (not 4K) wants to play 4K material from Plex on my media server, the server has to transcode the 4K material to a compatible 1080p format. Unfortunately, this is very CPU intensive, and if more than one person in the household tries to do the same thing, which is not unheard of, playback stutters.
Given the choice between saving a few dollars a year and usability, I chose usability. I therefore started to research what I needed for the upgrade. My goal was to upgrade the system so that transcoding would not be an issue, and so that I could also use the system for future video encoding of security camera footage, as well as background video encoding of family videos.
I continue to prefer the AMD brand, and decided on the following combo:
Corsair Vengeance LPX 16GB (2x8GB) DDR4 DRAM 2666MHz (CMK16GX4M2A2666C16)
The above were all purchased through Amazon and cost me a grand total of $473.24. The AMD CPU was the most expensive part costing almost $190.
Taking out the old motherboard and CPU combo and replacing them with the new parts went smoothly. The side-facing SATA connectors butted up against one of my HDD chassis, so I opted not to use them and decided to connect all of my RAID SATA cables to the SATA accessory card that I purchased and discussed in this post.
The last time I did an upgrade like this, the Ubuntu operating system had no problems and booted without any issues. Unfortunately, this time was very different. After the machine posted, Ubuntu booted into a blank, black screen. After some research, I learned to boot the Ubuntu kernel with the nomodeset option: press and hold the Shift key to bring up the GRUB menu and select the desired kernel, press the ‘e’ key to modify the boot options, and finally press F10 to boot with the custom changes (effective for that one boot only).
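The edit itself amounts to appending nomodeset to the line that begins with linux, so it ends up looking roughly like this (the kernel version and other boot parameters here are illustrative):
linux /boot/vmlinuz-4.15.0-generic root=UUID=... ro quiet splash nomodeset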
The above trick got me to a login prompt. After I gained access to the command prompt, I noticed that the kernel did not recognize any ethernet devices; I now had a machine that was not connected to the network. After some more Internet research, I found out that the 4.15 Linux kernel I had was insufficient for the Raven Ridge architecture, the AMD code name for the Zen CPU and Vega GPU combination on a single chip. I had to upgrade to the 4.18 Linux kernel.
However, I could not upgrade over the Internet, because the machine was not on the Internet. I had to download the Debian packages onto a USB stick with another machine and install them manually. At this point, I learned that you cannot simply download a single package for this. I had to decide whether to go with the mainline Linux kernel packages or with the Ubuntu HWE (Hardware Enablement) packages. After reading through Ubuntu’s LTS Enablement Stack article, I decided to go with the HWE packages. I found the linux-generic-hwe packages and their prerequisites on pkgs.org. This took several iterations, as I did not get all the dependent packages on the first try.
Once all the packages were installed, the machine booted without the need for the nomodeset option. However, the ethernet interface device was still not there. I had to run the netpath command to find out that the new motherboard’s ethernet device’s logical name was em1. To register the new logical name, I had to edit the /etc/network/interfaces file.
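The relevant stanza ends up looking something like this (a sketch, assuming DHCP on the new logical name em1):
auto em1
iface em1 inet dhcp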
Finally, the machine booted with an active ethernet connection. As a sanity check, I executed:
This ensured that my new media server had all the required kernel packages. We were still not done, though. The IP address of the server had changed, because we now have a different MAC address, so the DHCP server provisioned a different IP. I tried to configure the Unifi Controller to provision a static IP address for this new server, but I was unsuccessful. I suspect that the fact that the new server is also running the Unifi Controller may have something to do with it. Since the IP address changed, I needed to update the following configurations:
Firewall rules
Unifi Controller name space configurations
Samba configurations, because we only allow local machines to access the shares
All of this took from 4:30pm to 11:00pm last night: 6.5 hours of hardware assembly, research with Google, trial and error, and finally success. I cannot imagine doing this if Google and the super helpful community forums did not exist. Fingers crossed that the new media server will run smoothly.
This past weekend I continued to upgrade our NAS server. Last weekend, I upgraded my RAID array with an additional 8TB of mirrored storage. This yielded two old 4TB WD Blue HDDs. I noticed that my case has a total of 9 internal storage bays. One was used by my 500GB SSD boot drive, and six were populated by the HDDs making up the current RAID array. This meant I had two more storage bays left. However, these remaining bays were meant for 5.25″ storage devices like optical disc drives. To place my old 4TB WD Blue HDDs into these bays, I needed 5.25″ to 3.5″ bay converters. I had one, and purchased the other on Amazon. I ended up buying the ORICO Aluminum 5.25 inch to 2.5 or 3.5 Inch Internal Hard Disk Drive Mounting Kit.
I also did not have enough SATA ports, so I purchased the IOCrest SI-PEX40071 SATA III 8 Port Controller Card. This card, along with the 4 built-in SATA ports on the motherboard, gave me enough SATA connections for my 9 drives.
Once I installed the old 4TB drives, I proceeded to create another md RAID level 1 device and a matching physical volume, which I used to extend the current logical volume group. When the setup was completed, I ended up with a 20TB+ fully mirrored NAS server. I love LVM in combination with mdadm.
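The steps boil down to something like this (a sketch; the md device, member partitions, and volume group and logical volume names are placeholders based on this server’s naming):
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sdh1 /dev/sdi1
pvcreate /dev/md3
vgextend airvideovg2 /dev/md3
lvextend -l +100%FREE /dev/airvideovg2/airvideo
resize2fs /dev/airvideovg2/airvideo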
I figured that while everything was fresh in my mind, I might as well proceed with the dreaded 16.04 to 18.04 Ubuntu upgrade.
The upgrade was surprisingly smooth. However, the new version of OpenVPN caused some trouble. It no longer works with my old PureVPN configuration files, because the certificate files that came from PureVPN used an outdated and deprecated hash algorithm. After getting new configuration files from PureVPN, everything worked like a charm.
I also had to reinstall the Unifi Controller along with the Let’s Encrypt certbot utility.
Super happy with the outcome and the upgrades should last another 2 to 3 years.
Our NAS has run out of space again. I saw that the Seagate IronWolf 8TB NAS hard drive was on sale at Newegg for $309 CAD. I jumped at the chance and purchased two.
I am now following the same steps as I outlined in this post, replacing two old 4TB drives with these two new 8TB drives.
So far so good. Hopefully, when all is said and done, my NAS will have a total of 18TB in a RAID 1 configuration across six hard drives: two 4TB, two 6TB, and the two new 8TB.
I noticed that I could fit two more drives in my chassis and may decide to re-add the two old 4TB drives, but first I’ll have to check if my power supply can handle the demand.
I really like this mdadm and LVM setup.
Update: After two mdadm syncs, each of which took around 8 hours, and a pvresize that took another 5 hours, I had to convert the filesystem from 32-bit to 64-bit using these very helpful instructions. Only after converting to 64-bit could I expand the existing filesystem beyond 16TB. It was a learning and yet rewarding experience. The next step is to reuse the two old 4TB drives in the same chassis and add them to the logical volume.
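The conversion itself is roughly the following (a sketch; it requires e2fsprogs 1.43 or newer, and the volume names are placeholders based on this server’s naming):
umount /mnt/airvideo
e2fsck -f /dev/airvideovg2/airvideo
resize2fs -b /dev/airvideovg2/airvideo    # turn on the 64bit feature
resize2fs /dev/airvideovg2/airvideo       # now the filesystem can grow past 16TB
mount /mnt/airvideo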