On Friday, Carol and I, along with our neighbours, attended the Zhou Shen concert at the Coca-Cola Coliseum at Exhibition Place in downtown Toronto.
It was really exciting to see Zhou Shen in person. We all enjoyed his heavenly vocals. The three-hour concert started at 8 pm and ran without an intermission. Time flew by really fast.
In the past, whenever I got hold of a video with hdmv_pgs_subtitle subtitle streams, I ignored them. Instead, I tried to find a compatible subtitle in .srt format on the opensubtitles.org website. Today I came across a video that I am trying to archive for which I could not find the subtitles I wanted. None of this would be an issue if my preferred mp4 format actually supported the hdmv_pgs_subtitle format.
I knew that there was an OCR (Optical Character Recognition) technique for extracting the subtitles from the hdmv_pgs_subtitle stream, but I was always in too much of a hurry to try it. This time, I bit the bullet and went down this path.
Below are the steps that I had to go through.
First, I had to download and install the ffmpeg and mkvtoolnix packages on my Linux machine, and then execute the following commands to extract the Chinese subtitles that I wanted.
After the above commands, I had the mysub.idx and mysub.sup files. The former holds the time index codes and the latter holds the subtitle images.
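When a video carries several subtitle tracks, it helps to confirm which stream is the hdmv_pgs_subtitle one before extracting anything. The following is only a minimal sketch of that check, not the commands I ran above. It assumes ffprobe (which ships with ffmpeg) is on the PATH, and the function names and the file name input.mkv are hypothetical.

```python
import json
import subprocess


def probe_subtitles(path: str) -> str:
    """Ask ffprobe for the subtitle streams of a file and return its JSON output."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "s",  # subtitle streams only
        "-show_entries", "stream=index,codec_name:stream_tags=language",
        "-of", "json",
        path,
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def find_pgs_streams(probe_json: str) -> list:
    """Return the index and language of every hdmv_pgs_subtitle stream."""
    data = json.loads(probe_json)
    found = []
    for stream in data.get("streams", []):
        if stream.get("codec_name") == "hdmv_pgs_subtitle":
            found.append({
                "index": stream.get("index"),
                # "und" (undetermined) is used here when no language tag is present
                "language": stream.get("tags", {}).get("language", "und"),
            })
    return found


# Hypothetical usage:
# print(find_pgs_streams(probe_subtitles("input.mkv")))
```

The parsing is kept separate from the ffprobe call so that it can be tried on saved JSON output without having the video at hand.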
On a Windows virtual machine, I had to download Subtitle Edit, a subtitle editor with OCR functionality, and use it to convert mysub.idx and mysub.sup into mysub.srt, which I could later incorporate back into the archive video file.
Screenshot: Subtitle Edit after the OCR is completed.
Above is a screenshot of the application after the OCR is completed. I found that the Tesseract + LSTM engine mode worked the best. Of course, I had to select the language that matches the subtitle. Once I saved the finished product as mysub.srt, I could then use this file to create archive.mp4 using ffmpeg.
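For future reference, the final muxing step can be sketched roughly as follows. This is an illustration under assumptions rather than the exact command I used: it assumes the video and audio codecs are already mp4-compatible so they can be copied without re-encoding, and the function name, the input file name, and the language code are hypothetical. The subtitle is stored as mov_text, the text subtitle format that the mp4 container supports.

```python
def build_mux_command(video: str, srt: str,
                      output: str = "archive.mp4",
                      language: str = "chi") -> list:
    """Build an ffmpeg argument list that adds an .srt subtitle to an mp4 file."""
    return [
        "ffmpeg",
        "-i", video,                 # input 0: the source video
        "-i", srt,                   # input 1: the OCR-generated subtitle
        "-map", "0:v",               # keep the video stream(s) from input 0
        "-map", "0:a",               # keep the audio stream(s) from input 0
        "-map", "1:0",               # add the subtitle from input 1
        "-c:v", "copy",              # assumption: video codec is mp4-compatible
        "-c:a", "copy",              # assumption: audio codec is mp4-compatible
        "-c:s", "mov_text",          # convert srt to the mp4 text subtitle format
        "-metadata:s:s:0", f"language={language}",
        output,
    ]


# Hypothetical usage:
# import subprocess
# subprocess.run(build_mux_command("input.mkv", "mysub.srt"), check=True)
```

If a stream cannot be copied into mp4 as-is, the corresponding copy option would have to be replaced with a re-encode.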
Effectively, this article reports that after February 26th, you can only download books from the Kindle store to your Kindle device over Wi-Fi. This means that ebooks you purchased can no longer be downloaded and converted into an epub format to be consumed by another e-reader like Apple’s Books.app, which I normally do on my iPad or iPhone.
Of course, this policy change poses an immediate problem for how I read my ebooks. However, in my view, it also crosses an ethical boundary. Digital media such as ebooks, for which you have paid full price, are no longer yours. The buyer of such content is at the mercy of the distribution platform, in this case Amazon. This simply does not sit well with me. In the past, Amazon has also been known to remove purchased content due to changes in distribution rights, which are normally outside of the buyer’s control.
I now have to adopt a new process whenever I buy ebooks from the Kindle platform. I will describe this process in detail below so that it is here should I need to refer to it in the future.
This process removes the Digital Rights Management (DRM) from the ebook that you just purchased and allows you to store a DRM-free ebook in Calibre, an ebook management application. The process only works on Windows, so I had to spin up a Windows virtual machine for this purpose.
Software required:
Calibre (use the link to download the software for Windows);
DeDRM plugin (use the link to download the zip file);
KFX Input plugin (search for it under Calibre's Preferences -> Plugins);
Kindle for PC (must be version 2.4.0 (70904)).
Install the above Calibre plugins through Preferences -> Plugins.
Figure 1
Use Kindle for PC to browse and access the ebooks that have been purchased from Amazon, and download the ebook you would like to convert.
Use Calibre's Add books functionality to import the azw file from the My Kindle Content folder. See Figure 1 for details.
Once the book has been imported, Calibre should have a KFX format of the book. We need to convert it to the epub format for other reader devices using Calibre’s Convert books functionality.
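The same conversion can also be done from the command line with ebook-convert, the conversion tool that is installed alongside Calibre. Below is a minimal sketch, assuming ebook-convert is on the PATH, the KFX Input plugin is already installed, and the book file is already DRM-free. The function name and the file name book.kfx are hypothetical.

```python
from pathlib import Path


def build_convert_command(source: str) -> list:
    """Build an ebook-convert argument list that writes an .epub next to the source file."""
    src = Path(source)
    # ebook-convert infers the input and output formats from the file extensions.
    return ["ebook-convert", str(src), str(src.with_suffix(".epub"))]


# Hypothetical usage:
# import subprocess
# subprocess.run(build_convert_command("book.kfx"), check=True)
```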
I then move the epub-formatted ebook into Calibre on my macOS machine for long-term storage and management.