Tag: NAS

Processing Graphical Subtitles

In the past, whenever I got hold of a video with hdmv_pgs_subtitle subtitle streams, I ignored them. Instead, I tried to find a compatible subtitle in .srt format on the opensubtitles.org website. Today I came across a video that I am trying to archive for which I could not find the subtitles I wanted. None of this would have been an issue if my preferred mp4 format actually supported the hdmv_pgs_subtitle format.

I knew of an OCR (Optical Character Recognition) technique for extracting the subtitles from the hdmv_pgs_subtitle stream, but I was always in a hurry. This time, I bit the bullet and went down this path.

Below are the steps that I had to go through.

First I had to download and install ffmpeg and mkvtoolnix packages on my Linux machine, and then execute the following commands to extract the Chinese subtitles that I wanted.

ffmpeg -y -i archive.mkv -map 0:s:1 -c:s dvdsub -f matroska chi.mkv
mkvextract chi.mkv tracks 0:mysub

After the above commands, I had mysub.idx and mysub.sub files (the VobSub format). The first holds the time index codes and the latter holds the subtitle images.

On a Windows virtual machine, I downloaded Subtitle Edit, a subtitle editor that has OCR functionality, and converted mysub.idx and mysub.sub into mysub.srt, which I could later incorporate back into the archive video file.

Above is a screenshot of the application after the OCR is completed. I found that the Tesseract + LSTM engine mode worked the best. Of course, I had to select the language matching the subtitle. Once I saved the finished product as mysub.srt, I could use this file to create archive.mp4 using ffmpeg.

ffmpeg -i archive.mkv -i mysub.srt -map 0:v -map 0:a -map 1:s -c copy -c:s mov_text -metadata:s:s:0 language=chi archive.mp4

Video file successfully archived!
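
The -map 0:s:1 option in the extraction command selects the second subtitle stream, counting from zero. Here is a minimal sketch for finding that number, assuming ffprobe (installed alongside ffmpeg) is available. The pgs_streams helper name and the sample rows are illustrative assumptions and do not come from my actual file.

```shell
# Print "N,language" for every PGS subtitle stream, where N is the
# subtitle-relative index expected by `-map 0:s:N`.
# Reads ffprobe CSV rows (index,codec_name,language) on stdin.
pgs_streams() {
    awk -F, '$2 == "hdmv_pgs_subtitle" { printf "%d,%s\n", NR - 1, $3 }'
}

# Typical use against a real file:
#   ffprobe -v error -select_streams s \
#       -show_entries stream=index,codec_name:stream_tags=language \
#       -of csv=p=0 archive.mkv | pgs_streams

# Made-up sample rows, only to show the shape of the input and output.
printf '%s\n' '2,subrip,eng' '3,hdmv_pgs_subtitle,chi' '4,hdmv_pgs_subtitle,eng' | pgs_streams
```

Because -select_streams s limits ffprobe to subtitle streams, the row position is the same number that -map 0:s:N expects. Streams without a language tag will show an empty language field.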

Linux Boot with No Networking

GLOTRENDS PA09-HS M.2 NVMe to PCIe 4.0 X4 Adapter

I recently wanted to install an M.2 NVMe to PCIe 4.0 X4 adapter in an existing server. The idea was to add a new NVMe SSD, since the motherboard had no more M.2 sockets available.

The server is running Proxmox with Linux Kernel 6.8.12. I thought this should be a 15-minute exercise. How wrong I was. After installing all the hardware, the system booted up but there was no networking access. This was especially painful because I could no longer remote into the server. I had to go pull out an old monitor and keyboard and perform diagnostics.

I used the journalctl command to diagnose the issue, and found the following entries:

Feb 01 13:36:21 pvproxmox networking[1338]: error: vmbr0: bridge port enp6s0 does not exist
Feb 01 13:36:21 pvproxmox networking[1338]: warning: vmbr0: apply bridge ports settings: bridge configuration failed (missing ports)
Feb 01 13:36:21 pvproxmox /usr/sbin/ifup[1338]: error: vmbr0: bridge port enp6s0 does not exist
Feb 01 13:36:21 pvproxmox /usr/sbin/ifup[1338]: warning: vmbr0: apply bridge ports settings: bridge configuration failed (missing ports)

The above error message indicates that enp6s0 no longer exists. When I looked at earlier messages, I noticed this one:

Feb 01 13:36:15 pvproxmox kernel: r8169 0000:07:00.0 enp7s0: renamed from eth0

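It looks like the interface name changed from enp6s0 to enp7s0. This kind of name is derived from the PCI bus address, and the kernel message shows the network card now sitting at 0000:07:00.0, most likely because the new adapter shifted the bus numbering. The remedy is to edit /etc/network/interfaces to reflect the name change. Below is the new content of the file.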

# cat /etc/network/interfaces
auto lo
iface lo inet loopback

iface enp7s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.188.2/24
        gateway 192.168.188.1
        bridge-ports enp7s0
        bridge-stp off
        bridge-fd 0

iface wlp5s0 inet manual

This would be very annoying if the old interface name were used in many other configuration files. I found one reference on the Internet (https://www.baeldung.com/linux/rename-network-interface) detailing a way to change the network interface name using udev rules. I did not try this, but it is something to keep in mind for the future.
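
For the case where the old name does show up in several files, here is a minimal sketch that previews the rename without touching anything on disk. The rename_iface helper is a hypothetical name of my own, and the word anchors assume GNU sed, which is what Proxmox (Debian) ships.

```shell
# Print FILE with every whole-word occurrence of OLD replaced by NEW.
# Nothing is modified on disk. Assumes GNU sed for the \< and \> anchors.
# Usage: rename_iface OLD NEW FILE
rename_iface() {
    sed "s/\<$1\>/$2/g" "$3"
}

# Find candidate files first, then preview each change:
#   grep -rlw enp6s0 /etc
#   rename_iface enp6s0 enp7s0 /etc/network/interfaces | diff /etc/network/interfaces -
```

Only after the diff looks right would I redirect the output to a new file and move it into place, keeping a copy of the original.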

In a previous post and on another home server, I did fix the name using netplan, but Proxmox is not using it.

Simple File Transfer – NOT

Recently I needed to transfer a private binary file from one household to my server. We wanted this transfer to remain private because the file contains sensitive content.

In the past, I set up a WebDAV server using Apache2.4:

First I had to enable the DAV modules using the following command line on my Ubuntu server:

sudo a2enmod dav
sudo a2enmod dav_fs

I already had a directory set up on my file system called: /mnt/Sites/public_share. I made the following changes to my Apache2 configuration files.

<VirtualHost *:80>
    ServerName share.lufamily.ca
    RewriteEngine On
    RewriteCond %{HTTPS} off
    RewriteRule (.*) https://share.lufamily.ca
</VirtualHost>

<VirtualHost *:443>
    ServerName share.lufamily.ca
    ServerAdmin xxxxxxxx@gmail.com
    DocumentRoot /mnt/Sites/public_share

    <Directory /mnt/Sites/public_share>
        AllowOverride All
    </Directory>

    <Location />
        AuthType None
        DAV On
        Options +Indexes
        RewriteEngine off
    </Location>

    Include /home/xxxxx....xxxxxxx/ssl.lufamily.ca
</VirtualHost>

I did not have any authentication, because I restricted access to this directory with an override .htaccess file which contains the following:

<IfModule mod_headers.c>
    Header set X-XSS-Protection "1; mode=block"
    Header always append X-Frame-Options SAMEORIGIN
    Header set X-Content-Type-Options nosniff
    Header set X-Robots-Tag "noindex, nofollow"
</IfModule>

<Files ".htaccess">
  Order Allow,Deny
  Deny from all
</Files>

<RequireAny>
    Require ip 192.168.0.0/16
    Require ip 172.16.0.0/12
    Require ip 10.0.0.0/8

    # Sending computer external IP
    Require ip AAA.BBB.CCC.DDD
</RequireAny>

With the above setup, the other party just needs to open Finder on macOS or File Explorer on Windows with the URL https://share.lufamily.ca, and copy, delete, and open files like they normally would. The access is private because it is restricted to their external IP address. With macOS, copying many gigabytes via WebDAV posed no issues.

Unfortunately, Windows is another matter. Small files worked, but for files in the gigabyte range Windows appeared to be stuck at 99% complete. Windows caches the large transfer locally and reports 99% within a very short time while the physical transfer catches up. The actual copy across the Internet took so long that Windows became confused, thought we were copying a file that already existed, and raised an unwanted error.

I had to come up with an alternative. We briefly dabbled with the idea of using FTP, but after a few minutes it was clearly a non-starter. FTP passive mode requires a range of ports to be opened on my firewall, which is unrealistic for a long-term solution.

SFTP is a secure file transfer protocol that runs over SSH, which OpenSSH already provides on my server. I also like this approach because access is governed by a pair of SSH keys: the private key stays with the remote user, and the public key is used to configure SSH on my server. I set up an SSH user called sftpuser. To restrict this user to sftp access only, I made the following changes to the sshd configuration file /etc/ssh/sshd_config.

# Use the in-process SFTP server (internal-sftp)
Subsystem sftp internal-sftp

# Configure the local user sftpuser to only do sftp
Match User sftpuser
    ChrootDirectory /home/sftpuser
    PasswordAuthentication no
    ForceCommand internal-sftp
    AllowTcpForwarding no
    X11Forwarding no
    AllowAgentForwarding no

I then created the sftpuser using the following command:

sudo adduser sftpuser
sudo chown root:root /home/sftpuser
sudo mkdir /home/sftpuser/uploads
sudo chown sftpuser:sftpuser /home/sftpuser/uploads
sudo chmod -R 0755 /home/sftpuser/uploads

This user will not be able to log in to a shell and can only use sftp. I also disabled password authentication just in case. For the remote party to upload a file, they need to provide a public SSH key, which is stored in the user's .ssh/authorized_keys file. The contents look something like this:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCliK6NZx6JJBcK0+1GtEe8H6QpN1BHDRgq/vtiEAfwzcjN1dBtQhfplyDxEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXF+OLV9qWMsE/g+1H4oyLRqzQnD8w7S4RBUJzrrZIpLEzYRf43pWSW9Y3220swlIEYxIOIcJIc8prgzDbECt3CR/BsRDYNZA5uxdPYLwh1YtTX8GEqoctJifLrC4OomKkczDek9k/MHdFbWZ0LdK3AB287nr/Q4Lb8GgfU3bEhF+AMSWM8r/OHC1QBPYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYbH8npyFsC3rADnjfFsB4VkkiNDDIZbZkV2vBf3sJ49Q1Y3uHugWxITWImKjfl+YUdGMalbSfP8UueKSx3sDGQQDXZjzrwnX3KPie0Qiz2rQtrppB7dA5CvOb86Q== guest

The above is just a single line in the file.
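
Adding the key by hand is easy to get wrong, so here is a minimal sketch of a helper that appends a key only once and sets the permissions sshd expects. The install_key name and its arguments are my own illustration. On the real server it would need to run with sudo, followed by a chown of the .ssh directory to sftpuser.

```shell
# Append the public key in KEYFILE to HOME_DIR/.ssh/authorized_keys,
# skipping it when the exact same line is already present.
# Usage: install_key HOME_DIR KEYFILE
install_key() {
    ssh_dir="$1/.ssh"
    auth_file="$ssh_dir/authorized_keys"
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"
    # A public key file holds a single line: type, base64 data, comment.
    key=$(head -n 1 "$2")
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}

# Example (hypothetical file name for the key the guest sent over):
#   sudo sh -c '. ./install_key.sh; install_key /home/sftpuser guest.pub'
#   sudo chown -R sftpuser:sftpuser /home/sftpuser/.ssh
```

Running it twice with the same key leaves a single entry in the file, which keeps authorized_keys tidy when guests resend their keys.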

With the above setup, a Linux user can simply do the following to transfer a file to my server in a very secure way.

sftp -P55522 sftpuser@lufamily.ca <<< 'put /usr/bin/bash uploads/sample.bin'

The above command will upload the bash binary to my server.

An attacker trying to login using ssh will get the following:

❯ ssh -p 55522 sftpuser@lufamily.ca
This service allows sftp connections only.
Connection to lufamily.ca closed.

On Linux or macOS, the remote user can use ssh-keygen to create the key pair. By default the public key resides in ~/.ssh/id_rsa.pub (or ~/.ssh/id_ed25519.pub with newer OpenSSH releases). All I need to do is copy the contents of the public key and add it to my .ssh/authorized_keys.

Windows users can generate the key using Windows PowerShell. Below is an example:

> ssh-keygen -t rsa -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (C:\Users\kang/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in C:\Users\kang/.ssh/id_rsa
Your public key has been saved in C:\Users\kang/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:hV6vcChUwpxXXXXXXXXXXXXXXXXXXXXXXXXXX0aTkJZ2M kang@win10
The key's randomart image is:
+---[RSA 4096]----+
|  . Eo.==..      |
|   * *+++=+      |
|  . @ oo.=+* .   |
| . = o..B+=.* .  |
|  .   .oSO.o..   |
|       ..oo.     |
|          .      |
|                 |
|                 |
+----[SHA256]-----+

> cat .\.ssh\id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZQQWgIVShifqFxq78MWQEJrM2xrVQXlPHUncNosEm6P/l0LdWu1nRbIccKMNsmpPK7JOv9XF+CsrtlltnhwDqiuflCGftzhrlmBz8BOJRiwD0Fl1IfQ+Qg7Z1nvIo6+kpkBw7SGPN7fbJxDPPHmc9iPB4RnlG46v6ymd4KM0h1cGlReCly2PTxTG1dcPuDbrBIIdEHoN/40hojrooQf+cQNprvYZY59EjvC0NoZsfiKGDHHq3S7HRPGns9Oo4y8vFl1DrJZFIvBVdjjL28JsmIdeKbMhCynkzIkPLPvsiplxkEF0RQ9fFcIsucuD8leJmMDNPas+8EdueQ== kang@win10

To copy a binary you can do the following:

> sftp -P55522 sftpuser@lufamily.ca
Connected to lufamily.ca.
sftp> put "C:\Windows\System32\tar.exe" uploads/junk.exe
Uploading C:/Windows/System32/tar.exe to /uploads/junk.exe
tar.exe                                                                               100%   54KB  13.1MB/s   00:00
sftp> ls
uploads
sftp> cd uploads
sftp> ls
junk.exe    sample.bin
sftp>

The above is very similar to Linux and the Mac. Windows and its PowerShell have come a long way in terms of adopting POSIX-like capabilities.

For those who want to use WinSCP, a much nicer GUI on Windows, you will need to convert the .ssh/id_rsa private key into ppk format. Use the command below to achieve this.

"c:\Program Files (x86)\WinSCP\WinSCP.com" /keygen id_rsa /output=id_rsa.ppk

You can then set up WinSCP authentication and load the ppk file.

So what I thought would be a simple matter turned out to be quite a deep rabbit hole. Hopefully with this in place, future transfers can be done quite quickly and securely.

Inadequate Power Causing ZFS Scrub Errors


Replacing VDEV in a ZFS Pool

Several months ago, an old 3TB hard drive (HDD) crashed on me. Luckily it was primarily used for backup purposes, so the lost data can quickly be recreated from the source by performing another backup. Since it was not critical that I replace the damaged drive immediately, it was left to fester until today.

Recently I acquired four additional WD Red 6TB HDDs, and I wanted to install these new drives into my NAS chassis. Since I am opening the chassis anyway, I will remove the damaged drive, and also take this opportunity to swap some old drives out of the ZFS pool that I created earlier and add the new drives to the pool.

I first used the following command, once for each pair of new WD Red drives, to add two additional mirror vdevs to the pool.

sudo zpool add vault mirror {id_of_drive_1} {id_of_drive_2}

The drive IDs are located in /dev/disk/by-id and are typically prefixed with ata or wwn.

With the two new vdevs in the pool, I can remove an existing vdev. Doing so automatically starts redistributing the data on the vdev being removed to the other vdevs in the pool. All of this is performed while the pool is still online and running to service the NAS. To remove the old vdev, I executed the following command:

sudo zpool remove vault {vdev_name}

In my case, the old vdev’s name is mirror-5.

Once the remove command is given, the copying of data from the old vdev to the other vdev’s begins. You can check the status with:

sudo zpool status -v vault

The above will show the copying status and the approximate time it will take to complete the job.

Once the removal is completed, the old HDDs of mirror-5 are still labeled for ZFS use. I had to use the labelclear command to clean them so that I could repurpose the drives for backup duty. Below is an example of the command.

sudo zpool labelclear sdb1

The resulting pool now looks like this:

sudo zpool list -v vault

(Output truncated)
NAME                                                    SIZE  ALLOC   FREE
vault                                                  52.7T  38.5T  14.3T
  mirror-0                                             9.09T  9.00T  92.4G
    ata-ST10000VN0008-2JJ101_ZHZ1KMA0-part1                -      -      -
    ata-WDC_WD101EFAX-68LDBN0_VCG6VRWN-part1               -      -      -
  mirror-1                                             7.27T  7.19T  73.7G
    wwn-0x5000c500b41844d9-part1                           -      -      -
    ata-ST8000VN0022-2EL112_ZA1E8S0V-part1                 -      -      -
  mirror-2                                             9.09T  9.00T  93.1G
    wwn-0x5000c500c3d33191-part1                           -      -      -
    ata-ST10000VN0004-1ZD101_ZA2964KD-part1                -      -      -
  mirror-3                                             10.9T  10.8T   112G
    wwn-0x5000c500dc587450-part1                           -      -      -
    wwn-0x5000c500dcc525ab-part1                           -      -      -
  mirror-4                                             5.45T  1.74T  3.72T
    wwn-0x50014ee2b9f82b35-part1                           -      -      -
    wwn-0x50014ee2b96dac7c-part1                           -      -      -
  indirect-5                                               -      -      -
  mirror-6                                             5.45T   372G  5.09T
    wwn-0x50014ee265d315cd-part1                           -      -      -
    wwn-0x50014ee2bb37517e-part1                           -      -      -
  mirror-7                                             5.45T   373G  5.09T
    wwn-0x50014ee265d315b1-part1                           -      -      -
    wwn-0x50014ee2bb2898c2-part1                           -      -      -
cache                                                      -      -      -
  nvme-Samsung_SSD_970_EVO_Plus_500GB_S4P2NF0M419555D   466G   462G  4.05G

The above indirect-5 can be safely ignored. It is just a reference to the old mirror-5.

This time we replaced the entire vdev. Another technique is to replace the actual drives within the vdev using the zpool replace command, possibly after taking the old drive out of service with zpool offline. Doing this successively for every old drive in a mirror, using newer drives with larger capacities, increases the existing vdev's size.
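
As a rough sketch of that drive-by-drive approach, the helper below only prints the commands it would run, so the plan can be reviewed before anything touches the pool. The plan_mirror_upgrade name and the placeholder drive IDs are hypothetical, and I have not run this sequence on my own pool.

```shell
# Print (do not run) the commands for swapping each OLD drive in a mirror
# for a NEW one, one pair at a time.
# Usage: plan_mirror_upgrade POOL OLD1 NEW1 [OLD2 NEW2 ...]
plan_mirror_upgrade() {
    pool=$1
    shift
    while [ "$#" -ge 2 ]; do
        echo "sudo zpool replace $pool $1 $2"
        # Let the resilver finish before touching the next drive, so the
        # mirror never loses both good copies. `zpool wait` needs OpenZFS 2.0+.
        echo "sudo zpool wait -t resilver $pool"
        shift 2
    done
    echo "sudo zpool list -v $pool"
}

# Placeholder IDs; real ones come from /dev/disk/by-id.
plan_mirror_upgrade vault {old_drive_1} {new_drive_1} {old_drive_2} {new_drive_2}
```

The vdev only grows once every drive in the mirror has been replaced, and only if the pool's autoexpand property is on.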

Found Two HBA Cards for My NAS

About three weeks ago, I was casually browsing eBay and found this little gem, a Host Bus Adapter that can do PCIe 2.0 x8 (about 4GB/s in each direction). This is way better than the one that I purchased earlier (GLOTRENDS SA3116J PCIe SATA Adapter Card), which operates on a single lane of PCIe 3.0, yielding only about 1GB/s. I could not pass it up at a price of only $40 CAD, so I purchased two of these to replace the old adapter card I had.
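
The bandwidth comparison is simple arithmetic: per-lane throughput times the number of lanes. Below is a small sketch using the commonly quoted approximate per-lane figures for each PCIe generation, in one direction and after encoding overhead. The pcie_mbps helper name is my own.

```shell
# Approximate one-direction bandwidth of a PCIe link in MB/s.
# Usage: pcie_mbps GENERATION LANES
pcie_mbps() {
    case $1 in
        1) per_lane=250 ;;
        2) per_lane=500 ;;
        3) per_lane=985 ;;
        4) per_lane=1969 ;;
        *) echo "unsupported PCIe generation: $1" >&2; return 1 ;;
    esac
    echo $((per_lane * $2))
}

pcie_mbps 2 8   # LSI 9200-8i on PCIe 2.0 x8 -> 4000
pcie_mbps 3 1   # SA3116J on PCIe 3.0 x1    -> 985
```

So even though the LSI card uses an older PCIe generation, its eight lanes give it roughly four times the bandwidth of the single-lane card.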

**LSI 6Gbps SAS HBA 9200-8i IT Mode ZFS FreeNAS unRAID + 2*SFF-8087 SATA**

This new card LSI 6Gbps SAS HBA 9200-8i only supports 8 SATA ports per card, so I had to get two of them to support all of the hard drives that I have. These SAS HBA cards must have the IT (initiator target) mode firmware because the default firmware (IR mode) supports a version of hardware RAID, which I did not want. With the IT mode, the hard drives will be logically separated on the card and only share the physical bandwidth of the PCIe bus. This is a must for ZFS.

With these new cards, my write throughput to my NAS hard drives now averages around 500MB/s. Previously, I was only getting about half of this.

I wish I had found these sooner. Now I have two spare PCIe SATA expansion cards, one supporting 8 ports, and the other supporting 16 ports. I will place them in another server, perhaps in a future Proxmox cluster project.

LVM to ZFS Migration

In a previous post, I described the hardware changes that I made to facilitate additional drive slots on my NAS Media Server.

We now need to migrate from an LVM system consisting of 40TB of redundant mirrored storage using mdadm to a ZFS system consisting of a single pool and a dataset. Below is a diagram depicting the logical layout of the old and the intended new system.

Before the migration, we must back up all the data from the LVM system. I cobbled together a collection of old hard drives and then proceeded to create another LVM volume as the temporary storage of the content. This temporary volume will not have any redundancy capability, so if any one of the old hard drives fails, then out goes all the content. The original LVM system is mounted on /mnt/airvideo and the temporary LVM volume is mounted on /mnt/av2.

I used the command below to proceed with the backup.

sudo rsync --delete -aAXv /mnt/airvideo/ /mnt/av2 > ~/nohup.avs.rsync.out 2>&1 &

I can then monitor the progress of the backup with:

tail -f ~/nohup.avs.rsync.out

The backup took a little more than 7 days to copy around 32 TB of data from our NAS server. During this entire process, all of the NAS services continued to run, so that downtime was almost non-existent.

Once the backup was completed, I wanted to move all the services to the backup before I started to dismantle the old LVM volume. The following steps were done:

Stop all services on other machines that were using the NAS;
Stop all services on the NAS that were using the /mnt/airvideo LVM volume;
- sudo systemctl stop apache2 smbd nmbd plexmediaserver
Unmount the /mnt/airvideo volume, and create a soft-link of the same name to the backup volume at /mnt/av2;
- sudo umount /mnt/airvideo
- sudo ln -s /mnt/av2 /mnt/airvideo
Restart all services on the NAS and the other machines;
- sudo systemctl start apache2 smbd nmbd plexmediaserver
Once again, the downtime here was minimal;
Remove or comment out the entry in the /etc/fstab file that automatically mounts the old LVM volume on boot. This entry is no longer necessary because ZFS mounts its own datasets automatically;

Now that the services are all up and running, we can then start destroying the old LVM volume (airvideovg2/airvideo) and volume group (airvideovg2). We can obtain a list of all the physical volumes that make up the volume group.

sudo pvdisplay -C --separator ' | ' -o pv_name,vg_name

  PV | VG
  /dev/md1 | airvideovg2
  /dev/md2 | airvideovg2
  /dev/md3 | airvideovg2
  /dev/md4 | airvideovg2
  /dev/nvme0n1p1 | airvideovg2

The /dev/mdX devices are the mdadm mirror devices, each consisting of a pair of hard drives.

sudo lvremove airvideovg2/airvideo
Do you really want to remove and DISCARD active logical volume airvideovg2/airvideo? [y/n]: y
  Flushing 0 blocks for cache airvideovg2/airvideo.
Do you really want to remove and DISCARD logical volume airvideovg2/lv_cache_cpool? [y/n]: y
  Logical volume "lv_cache_cpool" successfully removed
  Logical volume "airvideo" successfully removed

sudo vgremove airvideovg2
  Volume group "airvideovg2" successfully removed

At this point, both the logical volume and the volume group are removed. We say a little prayer to ensure nothing happens with our temporary volume (/mnt/av2), that is currently in operation.

We now have to disassociate the mdadm devices from LVM.

sudo pvremove /dev/md1
Labels on physical volume "/dev/md1" successfully wiped.
sudo pvremove /dev/md2
Labels on physical volume "/dev/md2" successfully wiped.
sudo pvremove /dev/md3
Labels on physical volume "/dev/md3" successfully wiped.
sudo pvremove /dev/md4
Labels on physical volume "/dev/md4" successfully wiped.
sudo pvremove /dev/nvme0n1p1
Labels on physical volume "/dev/nvme0n1p1" successfully wiped.

You can find the physical hard drives associated with each mdadm device using the following:

sudo mdadm --detail /dev/md1
#or
sudo cat /proc/mdstat

We then have to stop all the mdadm devices and zero their superblock so that we can reuse the hard drives to set up our ZFS pool.

sudo mdadm --stop /dev/md1
mdadm: stopped /dev/md1
sudo mdadm --stop /dev/md2
mdadm: stopped /dev/md2
sudo mdadm --stop /dev/md3
mdadm: stopped /dev/md3
sudo mdadm --stop /dev/md4
mdadm: stopped /dev/md4

# Normally you also need to do a --remove after the --stop,
# but it looks like the 6.5 kernel did the remove automatically.
#
# For all partitions used in the md device

for i in sdb1 sdc1 sdp1 sda1 sdo1 sdd1 sdg1 sdn1
do
	sudo mdadm --zero-superblock /dev/${i}
done
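
The partition list in the loop above was typed in by hand. A minimal sketch for collecting it from /proc/mdstat instead is shown below. It has to run before the arrays are stopped, because stopped arrays no longer appear in that file. The md_members helper name is my own.

```shell
# Print the member partitions of one md array, one per line.
# Usage: md_members md1 < /proc/mdstat
md_members() {
    awk -v md="$1" '
        $1 == md && $2 == ":" {
            for (i = 3; i <= NF; i++)
                if ($i ~ /\[[0-9]+\]/) {    # member fields look like sdb1[0]
                    sub(/\[.*/, "", $i)     # drop "[0]" and any "(F)" flag
                    print $i
                }
        }'
}

# Example: save the members of every array before stopping them.
#   for md in md1 md2 md3 md4; do md_members "$md" < /proc/mdstat; done > ~/md-members.txt
```

The saved list can then feed the --zero-superblock loop, which avoids wiping the wrong partition because of a typo.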

Now with all of the old hard drives freed up, we can repurpose them to create our ZFS pool. Instead of using the /dev/sdX reference of the physical device, it is recommended to use /dev/disk/by-id with the manufacturer’s model and serial number so that the ZFS pool can be moved to another machine in the future. We also used the -f switch to let ZFS know that it is okay to erase the existing content on those devices. The command to create the pool we named vault is this:

zpool create -f vault mirror /dev/disk/by-id/ata-ST10000VN0008-2JJ101_ZHZ1KMA0-part1 /dev/disk/by-id/ata-WDC_WD101EFAX-68LDBN0_VCG6VRWN-part1 mirror /dev/disk/by-id/ata-ST8000VN0022-2EL112_ZA1E8GW4-part1 /dev/disk/by-id/ata-ST8000VN0022-2EL112_ZA1E8S0V-part1 mirror /dev/disk/by-id/ata-ST10000VN0004-1ZD101_ZA2C69FN-part1 /dev/disk/by-id/ata-ST10000VN0004-1ZD101_ZA2964KD-part1 mirror /dev/disk/by-id/ata-ST12000VN0008-2YS101_ZRT008SC-part1 /dev/disk/by-id/ata-ST12000VN0008-2YS101_ZV701XQV-part1

# The above created the pool with the old drives from the old LVM volume group
# We then added 4 more drives, 2 x 6TB, and 2 x 4TB drives to the pool

# Adding another 6TB mirror:

sudo zpool add -f vault mirror /dev/disk/by-id/ata-WDC_WD60EFRX-68L0BN1_WD-WX31D87HDU09-part1 /dev/disk/by-id/ata-WDC_WD60EZRZ-00GZ5B1_WD-WX11D374490J-part1

# Adding another 4TB mirror:

sudo zpool add -f vault mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0GTAK-part1 /dev/disk/by-id/ata-WDC_WD40EZRX-00SPEB0_WD-WCC4E0354579-part1

We also want to add the old NVMe as ZFS L2ARC cache.

ls -lh /dev/disk/by-id/nvme-Samsung_SSD_970_EVO_Plus_500GB_S4P2NF0M419555D

lrwxrwxrwx 1 root root 13 Mar  2 16:02 /dev/disk/by-id/nvme-Samsung_SSD_970_EVO_Plus_500GB_S4P2NF0M419555D -> ../../nvme0n1

sudo zpool add vault cache /dev/disk/by-id/nvme-Samsung_SSD_970_EVO_Plus_500GB_S4P2NF0M419555D

We can see the pool using this command:

sudo zpool list -v vault

NAME                                                    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
vault                                                  45.4T  31.0T  14.4T        -         -     0%    68%  1.00x    ONLINE  -
  mirror-0                                             9.09T  8.05T  1.04T        -         -     0%  88.5%      -    ONLINE
    ata-ST10000VN0008-2JJ101_ZHZ1KMA0-part1                -      -      -        -         -      -      -      -    ONLINE
    ata-WDC_WD101EFAX-68LDBN0_VCG6VRWN-part1               -      -      -        -         -      -      -      -    ONLINE
  mirror-1                                             7.27T  6.49T   796G        -         -     0%  89.3%      -    ONLINE
    ata-ST8000VN0022-2EL112_ZA1E8GW4-part1                 -      -      -        -         -      -      -      -    ONLINE
    ata-ST8000VN0022-2EL112_ZA1E8S0V-part1                 -      -      -        -         -      -      -      -    ONLINE
  mirror-2                                             9.09T  7.54T  1.55T        -         -     0%  82.9%      -    ONLINE
    ata-ST10000VN0004-1ZD101_ZA2C69FN-part1                -      -      -        -         -      -      -      -    ONLINE
    ata-ST10000VN0004-1ZD101_ZA2964KD-part1                -      -      -        -         -      -      -      -    ONLINE
  mirror-3                                             10.9T  8.91T  2.00T        -         -     0%  81.7%      -    ONLINE
    ata-ST12000VN0008-2YS101_ZRT008SC-part1                -      -      -        -         -      -      -      -    ONLINE
    ata-ST12000VN0008-2YS101_ZV701XQV-part1                -      -      -        -         -      -      -      -    ONLINE
  mirror-4                                             5.45T  23.5G  5.43T        -         -     0%  0.42%      -    ONLINE
    ata-WDC_WD60EFRX-68L0BN1_WD-WX31D87HDU09-part1         -      -      -        -         -      -      -      -    ONLINE
    ata-WDC_WD60EZRZ-00GZ5B1_WD-WX11D374490J-part1         -      -      -        -         -      -      -      -    ONLINE
  mirror-5                                             3.62T  17.2G  3.61T        -         -     0%  0.46%      -    ONLINE
    ata-ST4000DM004-2CV104_ZFN0GTAK-part1                  -      -      -        -         -      -      -      -    ONLINE
    ata-WDC_WD40EZRX-00SPEB0_WD-WCC4E0354579-part1         -      -      -        -         -      -      -      -    ONLINE
cache                                                      -      -      -        -         -      -      -      -  -
  nvme-Samsung_SSD_970_EVO_Plus_500GB_S4P2NF0M419555D   466G  3.58G   462G        -         -     0%  0.76%      -    ONLINE

Once the pool is created, we wanted to set some pool properties so that in the future when we replace these drives with bigger drives, the pool will automatically expand.

zpool set autoexpand=on vault

With the pool created, we can then create our dataset or filesystem and its associated mount point. We also want to ensure that the filesystem supports posixacl.

zfs create vault/airvideo
zfs set mountpoint=/mnt/av vault/airvideo
zfs set acltype=posixacl vault
zfs set acltype=posixacl vault/airvideo

We mount the new ZFS filesystem on /mnt/av because the /mnt/airvideo is soft-linked to the temporary /mnt/av2 volume that is still in operation. We first have to re-copy all our content from the temporary volume to the new ZFS filesystem.

sudo rsync --delete -aAXv /mnt/av2/ /mnt/av > ~/nohup.avs.rsync.out 2>&1 &

This took around 4 days to complete. We can all breathe easy again because all the data now have redundancy again! We can now bring the new ZFS filesystem live.

sudo systemctl stop apache2.service smbd nmbd plexmediaserver.service
sudo rm /mnt/airvideo
sudo zfs set mountpoint=/mnt/airvideo vault/airvideo
sudo systemctl start apache2.service smbd nmbd plexmediaserver.service

zfs list

NAME             USED  AVAIL     REFER  MOUNTPOINT
vault           31.0T  14.2T       96K  /vault
vault/airvideo  31.0T  14.2T     31.0T  /mnt/airvideo

The above did not take long and the migration is completed!

df -h /mnt/airvideo

Filesystem      Size  Used Avail Use% Mounted on
vault/airvideo   46T   32T   15T  69% /mnt/airvideo

Getting the capacity of our new ZFS filesystem shows that we now have 46TB to work with! This should last for at least a couple of years I hope.

I also did a quick reboot of the system to ensure it can come back up with the ZFS filesystem intact and without issues. It has now been running for 2 days. I have not collected any performance statistics, but the services all feel faster.

Media Server Storage Hardware Reconfiguration

Our media server has reached 89% utilization and needs storage expansion. The storage makeup on the server uses Logical Volume Manager (LVM) and software RAID called mdadm. I can expand the storage by swapping out the hard drives with the least capacity with new hard drives with a larger capacity like I have previously done.

I thought I would try something different this time around. I would like to switch from LVM to ZFS, an LVM alternative that is very popular with modern mass storage systems, especially with TrueNAS.

Before I can attempt the conversion, I will first need to back up all of the content from the media server. The second issue is that I needed more physical expansion space on the server to house more hard drives. The existing housings are all filled except for a single slot, which is going to be insufficient.

A related issue is that I no longer have any free SATA ports available for the new hard drives, so I purchased a GLOTRENDS SA3116J PCIe SATA Adapter Card with 16 SATA Ports. Once this is installed, I have more than enough SATA ports for additional storage.

One downside of the SATA card is that it is limited to PCIe 3.0 x1 speed. This means data transfer is limited to a theoretical maximum of 1GB/s. Given that the physical hard drives top out at 200MB/s, I don’t think we need to be too concerned about this bottleneck. We will see in terms of practical usage in the future.

I am lucky to have extra SATA power cables and extension cables lying around, and my existing 850W power supply has ample power for the additional hard drives.

How do we store the additional hard drives with a full cabinet? I went to Amazon again and purchased a hard drive cage, Jaquiain 3.5 Inch HDD Hard Drive Cage 8X3.5 Inch HDD Cage. I did not have to buy any new hard drives yet, because I had plenty of old hard drives lying around. After I put together the cage with 8 really old and used hard drives, it looks something like this:

With this new additional storage, I am now able to back up the media content from my media server. However, before I do that there is one last thing that I need to do, and that is to experiment with an optimal ZFS pool configuration that will work with my content and usage. I will perform this experimentation with the additional storage before reconfiguring the old storage with ZFS. Please stay tuned for my findings.

After booting the system with 16 hard drives, I measured the power usage and it was hovering around 180W. This is not too bad, less than 2 traditional incandescent light bulbs.

Addendum:

During my setup, I spent hours deciphering an issue: my system did not recognize some of my old hard drives. After many trials, I finally narrowed it down to the GLOTRENDS card appearing to be incompatible with an old 2TB Western Digital Enterprise Drive. This is the first time that I have come across a SATA incompatibility.

Another possibility is that these drives were damaged by the use of an incorrect modular power cable. I found that these drives do not work with my USB 3.0 external HDD dock either, which lends additional credence to the idea that the drives themselves have been damaged.

All my other drives worked fine with the card.

Another discovery is that not all modular power cables will work with my ASUS ROG STRIX 850W power supply. Initially, I thought I would use an 8-pin PCIe to 6-pin adapter along with a 6-pin to SATA power cable designed for Corsair power supplies.

OwlTree PCI-e 6 Pin Male to 4 SATA 1 to 4 SATA Female Power Supply Splitter Supply Cable for Corsair Modular RM650X RM750X RM850X RM1000X

Using the above cables will cause the power supply not to start. I had to hunt for the original cables that came with the STRIX power supply.

I learned a lot from rejigging this media server. My reward is to see my server boot up with 16 hard drives and 2 NVMe SSD drives recognized. I have never built a system with so many drives and so much storage before.

EXT4-fs Errors on NVME SSD

In my previous post, I replaced my NVME boot disk on our media server thinking that the disk was defective because the file system (EXT4-fs) was reporting numerous htree_dirblock_to_tree:1080 errors.

The errors persisted with the new disk, so I can rule out the hardware as the cause of the issue.

I noticed that the htree_dirblock_to_tree:1080 errors were triggered by the tar command, and that the times at which they occurred coincided with the media server being backed up. Apparently, the tar command run by the backup process is what triggers these errors.
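One way to check this kind of correlation is to count the EXT4-fs errors per offending command, using the "comm <name>:" field that the kernel includes in each message. This is a minimal sketch. The function name is made up, and the log path in the usage line assumes Ubuntu's default /var/log/kern.log.

```shell
#!/bin/sh
# Print "<command> <count>" for every command named in EXT4-fs error lines.
summarize_ext4_errors() {
  logfile=$1
  grep 'EXT4-fs error' "$logfile" \
    | sed -n 's/.*comm \([^:]*\):.*/\1/p' \
    | sort \
    | uniq -c \
    | awk '{print $2, $1}'
}

# Usage (assumed log location):
#   summarize_ext4_errors /var/log/kern.log
```

If every line in the output names the same backup command, that points at the backup run rather than at random disk faults.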

This backup process has remained unchanged for quite some time and has worked really well for us. I guess for some reason there is a bug in the kernel or in the tar command that is not quite compatible with NVME devices.

I had to resort to finding an alternative backup methodology. I ended up using the rsync method instead.

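# Leading slashes anchor each pattern to the root of the transfer, so only the
# top-level directories are skipped (a bare 'tmp' would also skip /var/tmp).
# The '/*' form still copies the empty directory, which keeps the mount points.
sudo rsync --delete \
  --exclude '/dev/*' \
  --exclude '/proc/*' \
  --exclude '/sys/*' \
  --exclude '/tmp/*' \
  --exclude '/run/*' \
  --exclude '/mnt/*' \
  --exclude '/media/*' \
  --exclude '/cdrom/*' \
  --exclude '/lost+found' \
  --exclude '/home/kang/log' \
  -aAXv / /mnt/backup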

It looks like this method is faster and can perform incremental backups. However, instead of backing up to an archive file, which I would later need to extract and prepare during the restoration process, I have to back up to a dedicated backup device. Since the old NVMe disk is perfectly fine, I reused it as my backup device. I have partitioned this backup device with the same layout as the current boot disk.

Device          Start        End    Sectors   Size Type
/dev/sdi1        2048    2203647    2201600     1G Microsoft basic data
/dev/sdi2     2203648 1921875967 1919672320 915.4G Linux filesystem
/dev/sdi3  1921875968 1953523711   31647744  15.1G Linux swap

The only exception is that the first partition is not marked as boot and esp, so during the restoration process I will have to mark that partition accordingly by running the following inside parted:

set 1 boot on
set 1 esp on
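The same flag can also be set non-interactively with parted's script mode (-s). The sketch below is an illustration only: the run and mark_esp helpers are made up, and it defaults to a dry run that just prints the commands so that nothing touches a disk by accident. As far as I know, on a GPT disk parted treats boot and esp as the same flag, so setting esp alone should be enough.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Mark partition 1 of the given disk as the EFI System Partition,
# then print the partition table to confirm the flag.
mark_esp() {
  disk=$1   # for example /dev/sdi, the backup device listed above
  run parted -s "$disk" set 1 esp on
  run parted -s "$disk" print
}

# Usage (as root): DRY_RUN=0 mark_esp /dev/sdi
```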

The idea is that at 3am every night, I will back up the root filesystem to the second partition of the backup drive. If anything happens to the current boot disk, the backup drive can act as an immediately available replacement, after the grub-install preparation described in the previous article.
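For the 3am schedule, a system-wide cron entry is one option. The sketch below only prints the line that would go into a file under /etc/cron.d. The script path /usr/local/sbin/backup-root.sh is hypothetical and stands for a wrapper around the rsync command above. Unlike a user crontab, files in /etc/cron.d carry a user field after the five time fields.

```shell
#!/bin/sh
# Print a cron.d line that runs the given script as root at 03:00 every day.
# Fields: minute hour day-of-month month day-of-week user command
cron_line() {
  script=$1
  printf '0 3 * * * root %s\n' "$script"
}

cron_line /usr/local/sbin/backup-root.sh
# prints: 0 3 * * * root /usr/local/sbin/backup-root.sh
```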

Let us see how this new backup process works and hopefully, we can bid a final farewell to the htree_dirblock_to_tree:1080 errors!

Update: 2023-12-22

It looks like even with the rsync command, the htree_dirblock_to_tree:1080 errors still came back during the backup process. I decided to upgrade the kernel from vmlinuz-5.15.0-91-generic to vmlinuz-6.2.0-39-generic. Last night (2023-12-23 early morning) was the first backup after the kernel upgrade, and no errors were recorded. I hope this behavior persists and it is not a one-off.

Replacing NVME Boot Disk

A few months ago, the boot disk of our media server began to report errors, such as the ones below:

Dec 17 03:01:35 avs kernel: [32515.068669] EXT4-fs error (device nvme1n1p2): htree_dirblock_to_tree:1080: inode #10354778: comm tar: Directory block failed checksum
Dec 17 03:02:35 avs kernel: [32575.183005] EXT4-fs error (device nvme1n1p2): htree_dirblock_to_tree:1080: inode #13500463: comm tar: Directory block failed checksum
Dec 17 03:02:35 avs kernel: [32575.183438] EXT4-fs error (device nvme1n1p2): htree_dirblock_to_tree:1080: inode #13500427: comm tar: Directory block failed checksum

The boot disk is an NVMe device and I thought the errors might be due to overheating, so I purchased a heat sink and installed it. Unfortunately, the errors persisted after the heat sink was installed.

I decided to replace the boot disk with the exact same model, a Samsung 980 Pro 1TB. This should have been a pretty easy maintenance task: clone the drive, then swap in the new drive. However, Murphy is sure to strike!

My usual go-to cloning utility is Clonezilla. Unfortunately, it did not like cloning NVMe drives. I tried multiple versions and each one ended in a kernel panic. I am not sure what the problem is here. It could be Clonezilla or the USB 3.0 NVMe enclosure that I was using for the new disk.

I resorted to using the dd command:

dd if=/dev/source of=/dev/target status=progress

Unfortunately, this would have taken far too long, something like 20+ hours, so I gave up on this approach.
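One possible contributor to the slow copy is that dd defaults to 512-byte blocks when no bs is given. This is only a guess, because the USB enclosure may have been the real bottleneck. The sketch below shows the same copy with a larger block size. The clone_disk name and the 4M value are illustrative choices, not something I measured.

```shell
#!/bin/sh
# Copy src to dst in 4 MiB blocks, flushing the output to disk before dd exits.
clone_disk() {
  src=$1
  dst=$2
  dd if="$src" of="$dst" bs=4M conv=fsync status=progress
}

# Usage (as root, with the real device names):
#   clone_disk /dev/source /dev/target
```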

I decided to do a good old restore of the nightly backup. I started by cloning the partition table:

sfdisk -d /dev/olddisk | sfdisk /dev/newdisk

I then proceeded with the restore of the nightly backup. Murphy strikes twice! The nightly backup was corrupted! I guess that is not surprising when the integrity of the root directory is in question, which is the whole reason why we are doing this exercise.

Without the nightly backup, I had to resort to a live backup. I booted the system again, and performed:

sudo su -
mount /dev/new_disk_root_partition /mnt/newboot
cd /
# The archive stream is uncompressed, so the extracting tar must not use -z.
tar -cvpf - --exclude=/tmp --exclude=/home/kang/log --exclude=/span --exclude="/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Cache" --one-file-system / | tar -xvpf - -C /mnt/newboot --numeric-owner

The above took about an hour. I then copied the /span directory manually, because this directory tends to change while the server is up and running.

With all the contents copied, I realized I had forgotten how to install GRUB and had to re-teach myself. I had to boot the machine from a live Ubuntu USB, and then mount both the root and EFI partitions.

nvme1n1                              259:0    0 931.5G  0 disk
├─nvme1n1p1                          259:1    0     1G  0 part  /boot/efi
├─nvme1n1p2                          259:2    0 915.4G  0 part  /
└─nvme1n1p3                          259:3    0  15.1G  0 part  [SWAP]

And install GRUB.

sudo su -
mkdir /efi
mount /dev/nvme1n1p1 /efi
mount /dev/nvme1n1p2 /mnt
grub-install --efi-directory /efi --root-directory /mnt

I also had to fix /etc/fstab to ensure that the root partition and the /boot/efi partition are referenced by their correct UUIDs. The blkid command came in handy for finding the UUIDs. For the swap partition, I had to run the mkswap command first, since that is what assigns the UUID.
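As a sketch of that fstab step: the helper below formats one /etc/fstab line from a UUID, and the usage lines show how blkid can supply the UUID. The function name, the mount options and the device names in the comments are assumptions for illustration. Check them against the original fstab before using them.

```shell
#!/bin/sh
# Print one fstab line: UUID=<uuid> <mountpoint> <fstype> <options> <dump> <pass>
format_fstab_line() {
  uuid=$1
  mountpoint=$2
  fstype=$3
  options=$4
  pass=$5
  printf 'UUID=%s %s %s %s 0 %s\n' "$uuid" "$mountpoint" "$fstype" "$options" "$pass"
}

# Usage (as root; device names taken from the lsblk listing above):
#   format_fstab_line "$(blkid -s UUID -o value /dev/nvme1n1p2)" / ext4 errors=remount-ro 1
#   format_fstab_line "$(blkid -s UUID -o value /dev/nvme1n1p1)" /boot/efi vfat umask=0077 1
#   mkswap /dev/nvme1n1p3   # writes a new swap signature with a fresh UUID
#   format_fstab_line "$(blkid -s UUID -o value /dev/nvme1n1p3)" none swap sw 0
```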

After I rebooted, I reinstalled GRUB one more time with the following as super user:

grub-install /dev/nvme1n1

I also updated the initramfs using:

update-initramfs -c -k all

For something that should have taken less than an hour, it took the majority of the day. The server is now running with the new NVME replacement disk. Hopefully this resolves the file system corruptions. We have to wait and see!

Update: The Day After

The same errors occurred again! I noticed that these corruptions occur when we do a system backup. How ironic! I later confirmed that performing the tar command on the root directory during the backup process can cause such an error. I now have to see why this is. I will disable the system backup for the next few days to see if the errors come back or not.