Pi-hole Installed – Prerequisite for UDM Pro

We are all in with Ubiquiti’s Unifi line of networking products. To date, with the exception of a few unmanaged switches, all of our home networking switches and Wi-Fi access points are Unifi products.

Current Network Layout

At the heart of our network is our USG firewall (USG 3P). We’ve been using this security gateway since December of 2017. During the holiday break, I plan to replace it with the new Unifi Dream Machine Pro (UDM Pro). The main reasons are:

  • The current USG 3P is limited to 85 Mbps when threat detection is turned on;
  • The UDM Pro can run its Intrusion Prevention System (IPS) and Intrusion Detection System (IDS) at my full ISP speeds, which are 1 Gbps download and 30 Mbps upload;
  • I can upgrade my current Unifi Video solution to Unifi Protect, since Unifi Video is no longer supported and I have five Unifi G3 Flex cameras;
  • I will most likely replace my current Kuna solution with Unifi Protect external cameras, so that it is totally private and my security video solution is unified;
  • I gain the ability to connect Unifi Protect with my other devices using Homebridge;
  • The device has a capacity of up to 3.5 Gbps, so we are future-proofed for faster ISP speeds;
  • I can move the UniFi Network Controller software that is currently running on my NAS server to a dedicated piece of hardware, the UDM Pro.

During my research on the UDM Pro, I found a slight regression in functionality that I currently use in my home. I use a feature called static host mapping that allows me to give my NAS server several different host names, so that I can use those host names to route to different in-home services that are hosted by the Apache2 server on my NAS. For example, the host name media.home routes to my home Plex server, and books.home routes to my Calibre server. Long story short, I need to run my own Domain Name System (DNS) server to get this functionality back.

I have deployed and configured a dnsmasq installation before, but when I was exploring ways to reconfigure (hack) the UDM Pro’s dnsmasq implementation, I came to the realization that it is probably best to offload the DNS functionality from the UDM Pro and run a dedicated instance on my NAS server. This way the future upgrade path of the UDM Pro from Ubiquiti stays nice and clean, and I can fully configure my own DNS the way I need it.

With further research, I came across an alternative to dnsmasq: Pi-hole with unbound, a full recursive DNS resolver. With this combination, I can maintain my own local DNS records, satisfying my static host mappings, and also have a more private Internet experience for my entire home network, since domain names are resolved directly by the authoritative name servers.

I chose the Alternative 1 installation methodology on my NAS, but with a twist. Since I already have Apache2 installed, I purged lighttpd, which came with the Pi-hole installation process, after the installation. I also added the following to my Apache2 configuration:

<VirtualHost *:80>
    ServerName pi.hole
    DocumentRoot /var/www/html
</VirtualHost>

For extra security, I also added an .htaccess file containing:

% cat /var/www/html/.htaccess
<RequireAny>
    Require ip 192.168.167.0/24
    Require ip 172.17.168.0/24
</RequireAny>

The above ensures that only local computers or VPN clients on my network can gain access to Pi-hole. I also had to enable the above .htaccess file by adding the following in the main Apache 2 configuration file (/etc/apache2/apache2.conf):

<Directory /var/www/html/>
	AllowOverride All
</Directory>

After a quick restart, Pi-hole was installed and operational. After an afternoon of testing, I noticed that some Google-based shopping links were being blocked. We cannot have that during the holiday season, so I had to whitelist the following sites for them to work:


Configuring unbound was really easy; I just had to follow the instructions here.

Now that I have my own DNS server running, I am no longer using Cloudflare’s or Google’s DNS services. This translates to a little more privacy. I can also track which domains are being blocked and which domains are being accessed when I visit sites. Since this solution is network-wide within my house, no additional work is required on iPhones, iPads, and client computers.

With this prerequisite step completed, I am now all set to migrate to the UDM Pro.

Simple Home Networking

I thought it would be a good idea for me to give a small tutorial on basic home networking, which many may find useful when diagnosing connectivity issues.

A modern, typical home network may look something like this:

Typical Network

Most people in our neighbourhood will have cable-based Internet access. The Internet comes in via a coaxial cable, like the traditional cable that you used for your cable TV. This cable is connected to a cable modem, which in our case is the Hitron CDA3-35. The cable modem then makes the Internet accessible via classic networking cables with RJ45 plugs. Think of the cable modem as your main door to the Internet and nothing else. Since this box is typically provided by your cable company, you should probably not trust it, so it is a main door without a lock.

Some cable modems also do Wi-Fi, like the new Rogers Ignite Hubs. For best performance and better security, I would recommend configuring the cable modem in bridge mode and not in gateway or router mode. This means that it should not be the box provisioning and managing your network, and it will have its Wi-Fi functionality turned off. This also avoids double NAT-ing, something to be avoided in your home, in my opinion.

You should invest in your own Wi-Fi access by purchasing something like the TP-Link AX1800 WiFi Router. This box provisions your residential network and your local Wi-Fi. You can purchase more advanced / expensive Wi-Fi solutions depending on the size and complexity of your residential layout.

If you have more than one Wi-Fi access point, I would recommend that they all have the same SSID but on different Wi-Fi channels. This will make it convenient and optimal for your Wi-Fi devices. Also keep in mind that some old / cheap IoT devices only like the 2.4GHz band. If you are in that situation, then you should create a specific 2.4GHz network with a different SSID.

If you want to try out VoIP (Voice over IP), you may also connect a VoIP adapter. In our example, we have the Linksys PAP2T box. I am not going to go into the details of how to acquire VoIP service or set it up, but this box effectively converts Internet traffic into voice traffic. Traditional landline phones can be linked up to the VoIP adapter using normal phone cables.

Okay, now that we have the different parts of the network defined, let us present a basic diagnostic workflow.

Basic Diagnostic Workflow

I hope the above introduction to the different parts of your home network and a workflow that you can follow will assist you in resolving some common connectivity issues in your home.

My Mistress has a Problem with Its Bottom!

I brought my No. 22 Great Divide Titanium bike to Evolution when I found out that my bottom bracket was making creaking and crunching noises during my last few rides. Finally the crank nearly seized up on my last ride.

I was not sure whether it was the bottom bracket or the crank shaft. I have had wonderful services from Evolution before, so instead of taking apart the crank and the bottom bracket myself, I decided to leave my precious with the capable gentlemen at Evolution.

They did not disappoint. They treated my bike with respect and we had excellent communication in terms of expectation setting, what needed to be done, and the replacement parts that were required. Chris was very knowledgeable and thorough and made sure that I knew all the options.

In the end I got my bike back in the best time possible, given these pandemic schedules with scarce parts. Super thankful to the entire team at Evolution and especially Chris for making it all happen without any surprises.

I will not hesitate to bring my bike to Evolution again for any type of issue in the future.

Wi-Fi 6 Upgrade with HomeKit Headaches

I recently upgraded all my WiFi access points to the Unifi UAP-U6-LR and UAP-U6-Lite. This makes my home Wi-Fi 6 capable.

This was extremely exciting as my 802.11ax capable devices can now get between 100 Mbps and 400 Mbps depending on where we are in the house. It seems even the 802.11ac devices got about a 30% speed bump.

As a result of this upgrade, two UAP-AC-M mesh and one UAP-AC-Pro access points were retired from my house. I don’t recommend buying these devices any more, since the Wi-Fi 6 devices from Ubiquiti are way more capable, with higher performance and increased range compared to their 802.11ac access points.

However, the honeymoon period did not last long. After about a week, HomeKit devices started to show the dreaded “No Response” labels. Specifically, I had connectivity problems with Leviton Smart Decora Dimmers. In the past, all I had to do was power cycle the HomeKit device and it was all good. Another episode of the HomeKit and Leviton dimmer switch nightmare is documented in my previous blog post.

In this particular instance, the Leviton dimmers were able to join the Wi-Fi network, and I could validate that with the Unifi Controller software. However, our HomeKit App was not able to connect to the dimmer switches. It took me some time to figure out that the dimmers were unreachable by other Wi-Fi clients, but were reachable by computers that were physically wired to our network.

I found out which access point the dimmer switches were connected to and used ssh to log into the access point to see if I could ping those devices, and sure enough they were unreachable. Below is a screen capture of the ARP listing from the access point.

Normal ARP listing from the Wi-Fi Access Point

When the dimmers were unreachable, the HW address was set to 00:00:00:00:00:00. After rebooting the culprit access point, I was able to access the offline dimmer switches again from the HomeKit App.
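The all-zero hardware address is easy to scan for. Below is a minimal C++ sketch, assuming the listing follows the Linux /proc/net/arp column layout (IP address, HW type, Flags, HW address, Mask, Device). Whether the access point prints exactly this layout is an assumption, and the function name unresolvedHosts is made up for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the IP addresses whose ARP entry has an all-zero hardware address.
// Assumes the Linux /proc/net/arp column layout:
//   IP address, HW type, Flags, HW address, Mask, Device
// unresolvedHosts is a hypothetical helper name, not part of any Unifi tool.
std::vector<std::string> unresolvedHosts(const std::string& arpTable) {
    std::vector<std::string> result;
    std::istringstream lines(arpTable);
    std::string line;
    std::getline(lines, line);  // skip the header row
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string ip, hwType, flags, mac;
        // The first four whitespace-separated columns are all we need.
        if (fields >> ip >> hwType >> flags >> mac &&
            mac == "00:00:00:00:00:00") {
            result.push_back(ip);
        }
    }
    return result;
}
```

Fed the text of the listing, it returns the IP addresses that are stuck at 00:00:00:00:00:00, which are the candidates for the kind of unreachable device described above.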

In summary, when HomeKit devices are offline with the dreaded “No Response” labels, here are some things to try:

  • Ensure that local DNS is working properly and caches are emptied so that the latest data are available;
  • Ensure the device itself has acquired a valid IP address that is from your network;
  • Ensure that the device is reachable from the HomeKit App, typically from your iPhone or iPad;
  • Trace back through the physical upstream networking equipment that is connected to your HomeKit device, such as switches and access points, and see which one requires rebooting.

Apple could improve the HomeKit experience by allowing users to perform a full backup of the HomeKit configuration, reset the Home, and perform a restore. Unfortunately, the closest thing that I found was the Home+ App, but it does not restore device connectivity, just the configurations.

When HomeKit works, you are literally like god, able to command light and switches with your voice in your home. When it does not work, it is extremely difficult to debug, due to a lack of diagnostics and logging.

After this update, my current networking layout now looks like this:

2021 October Network Layout

Residential Solar Project Initiated

This spring, I installed solar panels on our greenhouse. This project gave me the experience and knowledge of what I wanted for our house. In August of this year, we finally took the plunge and initiated the solar project for our house.

After much research, I settled on the following three vendors:

They all had a web presence and I initiated contact either by phone or through their online registration. For all three, I provided my postal code and my utility bill or usage, and they were able to prepare a quote for me to review. My initial request was for a grid-tie hybrid solution consisting of solar panels and batteries. Specifically, I wanted a full backup of my house’s electrical demands in the case of power outages. I wanted to avoid a typical solar only, net-metering, grid-tie solution. I also did not want a partial backup solution where certain high inductive loads such as air conditioners and dryers would not be available.

All three vendors came back with a simple solar net metering solution, the one that I specifically said I did not want. New Dawn Energy Solutions was the only vendor that gave me multiple options, one of which was a partial backup solution, which did not meet my full house backup requirement. With this initial misunderstanding, I thought it would be best to spend some time detailing exactly what my requirements were. I proceeded to create a slide deck for this purpose.

Long story short, getting a common understanding of my requirements was still a challenge for the vendors, with the exception of New Dawn Energy Solutions. I was able to directly contact the engineer who prepared and designed the solution. This was during the weekend, and we were able to quickly clarify what I wanted and what New Dawn Energy Solutions could provide.

I decided to select New Dawn Energy Solutions and proceeded with a contract with them. While we wait for permits, New Dawn Energy Solutions also helped me start my energy audit for the Canada Greener Homes Grant program. Under this program, we can potentially get up to $5000 CAD back. The first of two audits was already performed by EnerTest. The auditor was super friendly, detailed, informative, and efficient. I would recommend EnerTest if you are going after the same program.

The current solution looks something like this, but it is subject to change after an on-site engineering assessment.

Our Solar Setup

As of this writing, the first energy audit is now completed. Now we await the engineering assessment and the required permits.

I am excited to generate clean energy and will no longer feel guilty about enjoying the full capabilities of my air conditioner during the summer heat.

Ontario Covid-19 Vaccine Receipts

This morning I found out that the Ontario Government has made official Covid-19 Vaccine Receipts available through their website (https://covid19.ontariohealth.ca).

To download your receipts, all you will need is your Health Card. If you are like me and took a photo of your Health Card on your phone, it may not be sufficient, because information from both the front and back of the Health Card is required.

The service will allow you to download a PDF document for each dose of Covid-19 vaccine that you have received. To make it more convenient, I used the Preview app on my Mac to combine the receipts for both doses into a single PDF document. I then attached the PDF document to a note in my iPhone’s Notes App for easy access should I need to show proof of my vaccination in 2021.

I also placed the information on my secured NAS server so that it can be accessed from the Internet by any device that has a secret link, and generated a QR code for easy access by any third party that requires my proof of vaccination.

In the end, it looks something like what is shown above/right.

I also created a Siri Shortcut on the phone for easy access.

Now I have to rinse and repeat for all members of the family.

Ultrasonic Cleaner for Bike Chains

Generic Chain Cleaner

I like riding my bike but not cleaning my bike. Unfortunately, cleaning my road bike, especially the drive train, is a necessity. Of course, the most important part, the chain, is notoriously difficult to clean correctly.

In the past I have tried chain cleaners that look like the one on the right. In short, they don’t work.

SRAM Powerlink

The next evolution is to adopt a chain like the SRAM Powerlink or the Connex link, which can be easily taken apart. I still have to manually scrub the chain and it seems like no matter how many times you scrub the chain, it is still super dirty. Finally I came across the following YouTube video:

The host uses an ultrasonic cleaner and his result was really impressive. I went to Amazon and got myself one.

Flexzion Commercial Ultrasonic Cleaner 2L

I took off the chain and put it in the ultrasonic cleaner with a “cap” full of Simple Green all-purpose cleaner from Canadian Tire and hot tap water. I then ran the cleaner for 10 minutes. After the first cleaning, the chain already looked pretty spectacular. I lifted the chain, repositioned it in the cleaner, and ran it for another 10 minutes. I then took the chain out, rinsed it thoroughly with my garden hose, and put it back on. Here is the result with no scrubbing:


Another nice thing about using this technique is that while the ultrasonic cleaner is doing its job, you can scrub the bike down. This bike cleaning session is the easiest one yet.

In summary, I highly recommend that you get an ultrasonic cleaner!

Update: Someone asked about cleaning the ultrasonic cleaner. There was no issue whatsoever. The grease did not stick to the container, and all I had to do was pour the dirty liquid out and give it a quick rinse. That was it. Simple. I see others on YouTube use a ziplock bag to contain the chain and the detergent, but I opted not to do that.

The Second Jab

At this point in time, all of our family members have had our first dose of the Pfizer vaccine, and we are awaiting our second dose. As the Delta variant of Covid-19 makes its way around the world, numerous reports indicate that those who only have their first dose are only 33% protected from this variant.

Of course, knowing this creates a certain anxiety and urgency to get our second dose. Although we all had our second dose scheduled when we got our first dose, those original appointments are weeks away. Starting at 8am today, York Region began taking appointments for rebooking second doses. This is of course very welcome.

Unfortunately, as expected, the scheduling web site experience leaves a lot to be desired.

I visited the site at around 7:50am, and it told me that I was in line and had a 10-minute wait. At that point, this was very reasonable, since booking officially started at 8am. I took the screenshot above, and as you can see the waiting time hadn’t really gone down even though we had already passed our initial 10 minutes. To make matters more confusing, at around the 5-minute mark, it suddenly presented the booking form.

My sons, who visited the site at 8am sharp, got a waiting period of 1 hour. You can see how discouraging this can be for some. Never mind vaccine hesitancy, it is this type of friction in the booking experience that probably also discourages people from getting their doses. I really don’t understand why it is taking the organization so long to get their act together. Perhaps I’m being too harsh.

On the plus side, I did finally manage to book all members of my family for our second dose, because the site did allow me to make multiple bookings without having to virtually re-queue.

Automatic Transfer Switch

This is an update to my Sunroom Project that I detailed in a previous post.

After the installation of the solar panels, there was a question of whether the solar panels had enough juice to keep the batteries charged for cloudy days or night time operations. After a few days of operation, the answer is a definitive “no”.

The overall load of water pumps, fans, and temperature sensors amounted to about 80W. On sunny days, the two solar panels (100W each, 200W in total) only yield enough to cover this load. The less than optimal positioning on the sunroom’s roof keeps the solar panels from operating at their optimal efficiency. Even on a really sunny day the panels may have a little surplus to give back to the battery, but nothing close to a continuous charge. Most of the time, the solar power is just enough to power the load and leave the battery as is. Left unattended, the batteries will slowly drain to nothing, as they are discharged on cloudy days and at night.
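A rough daily energy budget shows why the batteries can never catch up. The sketch below is only back-of-the-envelope arithmetic in C++: the 80W load and 200W of rated panels come from the numbers above, while the peak sun hours and the derate factor passed in by the caller are illustrative assumptions, not measurements from the sunroom.

```cpp
// Daily energy drawn by a constant load, in watt-hours.
double dailyLoadWh(double loadWatts) {
    return loadWatts * 24.0;
}

// Daily energy harvested by the panels, in watt-hours.
// peakSunHours and derate are illustrative assumptions, not measurements.
double dailySolarWh(double ratedWatts, double peakSunHours, double derate) {
    return ratedWatts * peakSunHours * derate;
}

// Equivalent full-sun hours needed per day just to break even with the load.
double breakEvenSunHours(double loadWatts, double ratedWatts, double derate) {
    return dailyLoadWh(loadWatts) / (ratedWatts * derate);
}
```

For example, dailyLoadWh(80) is 1920 Wh per day. With an assumed 4 peak sun hours and a 0.75 derate, dailySolarWh(200, 4, 0.75) is only 600 Wh, and breakEvenSunHours(80, 200, 0.75) works out to about 12.8 hours of full sun, which no roof will see in a day.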

Automatic Transfer Switch from Amazon

So much for going entirely off grid! Since it is a sunroom, the roof real estate is a bit precious, and two panels are as much as we wanted to take away from the sunshine that feeds the plants. It is not all bad news: we still have the opportunity to time shift our power needs from the grid, from peak time to non-peak time.

For the past few days, I had to go out to the sunroom during the evenings and switch the battery from solar power back to the grid via the 480W DC Power Supply, so that it would charge itself back up during the night. This is of course super inconvenient. What I needed was an Automatic Transfer Switch (ATS). Something like the one shown on the left offered by Amazon.

The price was pretty exorbitant, over $150.00, much more than what I wanted to spend. When there is a need, there is an opportunity to invent. I thought this was a good opportunity to create my own ATS.

My ATS will be a simple 12V relay powered by the battery itself, and controlled by a WiFi capable microcontroller like the ESP32S that is perfect for the job. With a bit of searching on Amazon, I found this gem, a simple 12V relay that is only around $16.

URBEST 8 Pin JQX-12F 2Z DC 12V 30A DPDT 

At most, the relay only needs to handle 10A, so the 30A rating is total overkill, but good enough for my purpose.

I already have an ESP32S that I purchased earlier from AliExpress or eBay. This is a WiFi enabled microcontroller that can be programmed with the popular Arduino IDE. They were less than $5 apiece when I got them, and were literally sitting on my shelf waiting for a project such as this. The master plan is as follows:


The ESP32S will remotely control the relay, which physically connects the battery to either the solar controller or the power supply. When to switch is determined by a remote server on the same home network. The ESP32S will periodically post the battery voltage status and the current state of the relay to the server.

Using this approach, I can place the majority of the smarts on the server instead of the micro controller. I can also change the logic without having to reprogram the ESP32S.

The ESP32S comes with many GPIO pins. We will make use of three of the GPIO pins, one to detect battery voltage via a voltage divider. The other two will send a pulse to a simple latch that will drive the switch position of the relay. The latch will use a popular NE555 chip in bistable mode. Here is a simple schematic that I put together.
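To make the voltage divider idea concrete, here is a minimal C++ sketch of how a raw ADC count maps back to a battery voltage. The function name adcToBatteryVolts is hypothetical, the resistor values are whatever the caller passes in, and the defaults (12-bit readings up to 4095 and a full-scale input of about 3.3V) are assumptions about the ADC configuration rather than values taken from the schematic.

```cpp
#include <cassert>

// Converts a raw ADC count into the battery voltage above a resistor divider.
// rTopOhms sits between the battery and the ADC pin, rBottomOhms between the
// pin and ground. adcMax = 4095 assumes 12-bit readings and vRef = 3.3 assumes
// a full-scale input of about 3.3 V; both are assumptions, not measured values.
double adcToBatteryVolts(double raw, double rTopOhms, double rBottomOhms,
                         double vRef = 3.3, double adcMax = 4095.0) {
    assert(rBottomOhms > 0.0 && adcMax > 0.0);
    double pinVolts = raw / adcMax * vRef;  // voltage seen at the GPIO pin
    // Undo the divider: Vbattery = Vpin * (Rtop + Rbottom) / Rbottom
    return pinVolts * (rTopOhms + rBottomOhms) / rBottomOhms;
}
```

With a hypothetical 30k over 10k divider, a full-scale reading of 4095 corresponds to roughly 13.2V at the battery. The ESP32 ADC is not perfectly linear near the ends of its range, so treat the result as an approximation; comparing raw readings against thresholds calibrated on the real hardware avoids the conversion altogether.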


I prototyped the above circuit on a breadboard, using a desktop power supply to simulate the battery.

Everything worked as expected, and I proceeded to solder everything up on a PCB.

Below is the Arduino sketch that I wrote for the ESP32S to report the battery voltage and the relay switch state to my server.

#include <WiFi.h>
#include <HTTPClient.h>

/* Server and WiFi configurations are fake */

#define SERVER_IP "192.168.1.5"

#ifndef STASSID
#define STASSID "##############-iot"
#define STAPSK  "##################"
#endif

#define uS_TO_S_FACTOR 1000000ULL  // microseconds per second; 64-bit so long sleep times cannot overflow

#define BATT_VOLTAGE_GPIO 36
#define GRID_PULSE_GPIO 25
#define SOLAR_PULSE_GPIO 26

float volt = -1.0;
RTC_DATA_ATTR int currentState = -1;

void setup() {

    pinMode(BATT_VOLTAGE_GPIO, INPUT);
    pinMode(GRID_PULSE_GPIO, OUTPUT);
    pinMode(SOLAR_PULSE_GPIO, OUTPUT);
    pinMode(LED_BUILTIN, OUTPUT);  // status LED that is toggled in loop()

    digitalWrite(GRID_PULSE_GPIO, LOW);
    digitalWrite(SOLAR_PULSE_GPIO, LOW);

    Serial.begin(115200);

    Serial.printf("Connecting to %s..\n", STASSID);

    WiFi.begin(STASSID, STAPSK);
    while (WiFi.status() != WL_CONNECTED) {
        delay(250);
        Serial.print(".");
    }

    Serial.print("\nConnected with IP: ");
    Serial.println(WiFi.localIP());
}

void pulse(int pin) {
    digitalWrite(pin, HIGH);
    delay(200);
    digitalWrite(pin, LOW);
}

// because of the deep sleep at the end, loop() effectively runs once per wake-up
void loop() {

    WiFiClient client;
    HTTPClient http;

    digitalWrite(LED_BUILTIN, HIGH);

    float v = 0;
    for (int i = 0; i < 200; i++) {
        v += analogRead(BATT_VOLTAGE_GPIO);
        delay(2);
    }
    volt = v / 200.0;

    Serial.print("Voltage Read: ");
    Serial.println(volt);

    http.begin(client, "http://" SERVER_IP "/autoTransferSwitch.php"); //HTTP
    http.addHeader("Content-Type", "application/x-www-form-urlencoded");

    // start connection and send HTTP header and body
    String postStr = "id=sunroom";
    postStr += "&volt=" + String(volt);
    postStr += "&state=" + String(currentState);

    int httpResponseCode = http.POST(postStr);
    digitalWrite(LED_BUILTIN, LOW);

    Serial.printf("Response code: %d\nResult: ", httpResponseCode);

    String instructions = http.getString();
    Serial.println(instructions);

    String result = "nothing";
    long sleepTime = 5L;

    char buffer[100];
    strncpy(buffer, instructions.c_str(), sizeof(buffer) - 1);
    buffer[sizeof(buffer) - 1] = '\0';  // strncpy does not terminate on truncation

    const char d[2] = ",";
    int field = 0;
    char* token = strtok(buffer, d);
    while ( token != NULL ) {
        if (field == 0) {
            result = String(token);
        }

        if (field == 1) {
            sleepTime = atol(token);
        }
        token = strtok(NULL, d);
        field++;
    }

    if (result.equals("solar")) {
        pulse(SOLAR_PULSE_GPIO);
        currentState = 1;
    } else if (result.equals("grid")) {
        pulse(GRID_PULSE_GPIO);
        currentState = 0;
    } else {
        digitalWrite(GRID_PULSE_GPIO, LOW);
        digitalWrite(SOLAR_PULSE_GPIO, LOW);
    }

    http.end();

    Serial.print("Sleeping for seconds: ");
    Serial.println(sleepTime);

    esp_sleep_enable_timer_wakeup(uS_TO_S_FACTOR * sleepTime);
    esp_deep_sleep_start();
}

One nice thing about the ESP32S is its ability to go into a deep sleep where it can keep its state with almost zero power. This way, the micro controller doesn’t act as a power sink for the entire system. I took advantage of this feature, so that the server can also tell the ESP32S how long it should sleep.

On the server side, I have a simple PHP script that will take into account time of day, and the current battery charge.

<?php

date_default_timezone_set('America/Toronto');

// I've changed the location to protect my own privacy

$lat    = 42.9293;
$log    = -102.9478;
$zenith = 90;

$nextWait = 900;

// The voltage levels are ADC readings from ESP32 (divided by 10) and not actual volts

// voltage level when battery is fully charged and can either be used or be charged with solar
$solarChargeLevel = 280;

// voltage level when battery is okay to be used with or without solar (come off of grid)
$okChargeLevel    = 260;

// voltage level when battery must be charged (get on grid)
$mustChargeLevel  = 252;

$now = time();

$sr = date_sunrise($now, SUNFUNCS_RET_TIMESTAMP, $lat, $log, $zenith);
$ss = date_sunset($now, SUNFUNCS_RET_TIMESTAMP, $lat, $log, $zenith);

$srStr = date("D M d Y - H:i:s", $sr);
$ssStr = date("D M d Y - H:i:s", $ss);

$logFile = "/home/kang/log/autoTransferSwitch.log";
$dateStr = date("Y-m-d H:i:s");

header("Content-Type: text/plain");

$id      = isset($_POST["id"]) ? $_POST["id"] : null;
$batVolt = isset($_POST["volt"]) ? $_POST["volt"] : null;

// -1 = initial, 0 = grid state, 1 = solar state
$currentState = isset($_POST["state"]) ? intval($_POST["state"]) : -1;

file_put_contents($logFile, "$dateStr, " . 
    "device reported: battery voltage: $batVolt and current state: $currentState\n",
    FILE_APPEND);

if (!is_null($id)) {

    if ($id === "sunroom") {
        $action = "nothing";
        $b      = intval($batVolt) / 10.0;

        // During day is defined as one hour after sunrise to one hour before sunset
        $duringDay   = (($sr + 3600) <= $now && $now <= ($ss - 3600));
        $duringNight = (!$duringDay);

        if ($currentState == -1) {
            if ($b <= $mustChargeLevel) {
                $action       = "grid";
                $currentState = 0;
            } else {
                $action       = "solar";
                $currentState = 1;
            }
        } else if ($currentState == 0) {

            // We are charging from the grid

            if ($duringDay && $b >= $okChargeLevel) {

                // During the day we want to use solar or the battery as much as possible;
                // This is a trickle charge so that we can take advantage of the sun.

                $action       = "solar";
                $currentState = 1;

            } else {

                // Otherwise charge until battery is full

                if ($b >= $solarChargeLevel) {
                    $action       = "solar";
                    $currentState = 1;
                }

            }

            if ($b >= 0.985 * $solarChargeLevel) {
                // We are getting closer to fully charge so
                // reduce communication interval from 15 min to 5 min
                $nextWait = 300;
            }

        } else if ($currentState == 1) {

            // We are either using the battery or the solar panels

            if ($b <= $mustChargeLevel || $now > $ss + 3600) {

                // Charge the batteries if we must or one hour past sunset

                $action       = "grid";
                $currentState = 0;
            }

            if ($b <= 1.025 * $mustChargeLevel) {
                // We are getting closer to require charging so reduce
                // communication interval from 15 min to 5 min
                $nextWait = 300;
            }

        }

        // for testing purpose
        // $nextWait = 15;
        echo ($action . "," . $nextWait);

        file_put_contents($logFile, "$dateStr, $batVolt, $action, " .
            "sun: [$srStr - $ssStr], new state: $currentState, wait: $nextWait\n",
            FILE_APPEND);
    }

}

The above state transition logic is pretty simple to follow, so I am not going to explain it in depth here. There are a couple of features that I would like to expand on.

By default, the sleep time is 15 minutes, but the server will shorten it to 5 minutes when the battery is nearly empty or nearly full. This sampling frequency should be enough for the server to make the appropriate switching decision. Once the relay has switched, the sleep time reverts back to 15 minutes. For debugging purposes, I can also change the server script to a very fast sample rate of every 15 seconds.

The other feature is accounting for day and night. This first version of the algorithm will attempt to use solar and/or battery during the day, and only charge from the grid when it is absolutely necessary. If we do charge from the grid during the day, we don’t need to fill the battery up, but only charge it to a level that can be used again. We will then attempt to top up the battery about 1 hour after sunset.

My ATS has now been installed and operating for an entire day. So far so good, and I don’t have to go into the sunroom any more to perform a manual switch. The algorithm can be further enhanced by getting additional readings from the solar controller, but I didn’t want to go through the trouble. I think what I have so far should be sophisticated enough. We’ll see.

2021-06-18 Update: After several days of operation, I noticed that the ATS, more specifically the ESP32 microcontroller, hangs or fails to wake from deep sleep after a few hours of operation. Upon further investigation, it may be a combination of unstable supply voltage (from the battery), memory leaks in the standard WiFi libraries, or the usage of String types. I am not sure. I had to rewrite my ESP32 Arduino sketch to include a watchdog reset, as well as a timed, software triggered hardware reset of the controller itself every 30 minutes or so. I also eliminated the deep sleep functionality and simply resorted to delays and resets. It has been running for about a week without any hiccups.
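Since String usage is one of the suspects, one way to take it out of the equation is to parse the server reply into fixed-size storage. The following is a minimal sketch in plain C++, not the rewritten sketch mentioned above; the Instruction struct and parseInstruction function are hypothetical names, and the defaults mirror the original sketch ("nothing" and a 5 second wait).

```cpp
#include <cstdlib>
#include <cstring>

struct Instruction {
    char action[16];    // "solar", "grid" or "nothing"
    long sleepSeconds;  // how long to wait before the next report
};

// Parses a reply such as "grid,300" without any heap allocation.
// Missing or malformed fields fall back to "nothing" and a 5 second wait.
// parseInstruction is a hypothetical helper, not part of any ESP32 library.
Instruction parseInstruction(const char* reply) {
    Instruction out;
    std::strcpy(out.action, "nothing");
    out.sleepSeconds = 5;
    if (reply == nullptr) return out;

    // The action is everything before the first comma (or the whole reply).
    const char* comma = std::strchr(reply, ',');
    std::size_t len = comma ? static_cast<std::size_t>(comma - reply)
                            : std::strlen(reply);
    if (len > 0 && len < sizeof(out.action)) {
        std::memcpy(out.action, reply, len);
        out.action[len] = '\0';
    }

    // The sleep time is the number right after the comma, if it is positive.
    if (comma != nullptr) {
        long seconds = std::strtol(comma + 1, nullptr, 10);
        if (seconds > 0) out.sleepSeconds = seconds;
    }
    return out;
}
```

Everything lives on the stack, so repeated calls cannot fragment the heap. Whether String types were actually the cause of the hangs remains an open question.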

Unifi Access Point Disconnected

I have the following Unifi networking setup:

  • Unifi Security Gateway (USG-3P) running 4.4.55.5377096 firmware
  • 3 Unifi Access Points AP-AC-PRO running 4.3.28.11361 firmware
  • 3 Unifi AP-AC-M running 4.3.28.11361 firmware
  • Unifi Controller (6.2.25) running on Ubuntu
  • The AP’s are connected to Unifi Switches running 5.43.36.12724 firmware

The AP’s are separated into two AP Groups. One of the AP Groups contains the 3 AP-AC-PRO and one of the AP-AC-M. This group was the problematic one. The other AP Group, which contains the remaining 2 AP-AC-M, continued to work flawlessly throughout the incident. From here on, when I reference an AP Group, it is the problematic group that contains the 3 AP-AC-PRO and 1 AP-AC-M.

It all started yesterday when I noticed that my WiFi was a bit slow in the backyard and I wanted to change the radio configuration on one of the AP-AC-PRO and one AP-AC-M in the same AP Group. After provisioning, the AP showed as disconnected in the controller.

I used ssh to log into the problematic AP-AC-PRO and discovered many instances of the following log entry in /var/log/messages:

syswrapper: [state is locked] waiting for lock

I attempted to reboot the device but the device remained in the same “locked” state.

Since it is inconvenient for me to physically reset the AP, I attempted to reset the device via ssh using the command:

syswrapper.sh restore-default

Unfortunately, this did not always work, because the command often just showed the same locking message immediately:

syswrapper: [state is locked] waiting for lock

I had to reboot the device with the reboot command from the command line. As soon as I could ssh into the device after the reboot, I immediately executed the restore command above. This took a lot of trial and error because my timing was often off. When I missed the window, I got the lock message again. I found that my chances were higher if I first forgot the device on the controller.
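The manual race described above could in principle be automated with a small retry loop. The following is a minimal sketch, not something tested against a real AP. It assumes key-based ssh login to the device (because of `BatchMode=yes`), relies on the documented behaviour that ssh exits with status 255 when the connection itself fails, and uses hypothetical function names. How the ssh session ends when the restore actually succeeds is also an assumption: if the AP drops the connection, ssh may report 255 as well, and the loop will simply keep retrying until the attempts run out.

```python
import subprocess
import time

RESTORE_COMMAND = "syswrapper.sh restore-default"  # the command quoted in the post
SSH_FAILURE = 255  # exit status ssh uses when the connection itself fails


def run_over_ssh(host, command, timeout_s=15):
    """Run one command on the AP with the system ssh client.

    Returns (exit_status, combined_output). A timeout is reported as an
    ssh failure so that the caller simply retries.
    """
    try:
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=3", host, command],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return SSH_FAILURE, ""
    return result.returncode, result.stdout + result.stderr


def restore_when_reachable(host, run=run_over_ssh, attempts=120,
                           delay_s=1.0, sleep=time.sleep):
    """Keep trying the restore command until ssh manages to connect.

    Returns (exit_status, output) from the first attempt that reached the
    device, or None if the device never became reachable.
    """
    for _ in range(attempts):
        status, output = run(host, RESTORE_COMMAND)
        if status != SSH_FAILURE:
            return status, output
        sleep(delay_s)
    return None
```

The idea is to issue reboot on the AP and then immediately start something like `restore_when_reachable("admin@192.168.1.20")` from another machine (the user and address here are placeholders). The returned output still needs to be checked for the "state is locked" message, since the sketch does not know whether the restore really went through.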

A quick suggestion to the Unifi team: it would be nice if the restore-default command, when it cannot restore immediately because of the lock, would at least set a flag in the device’s persistent flash memory so that the restore is performed on the next reboot. This feature would save me A LOT OF TIME!

Once the device was reset to factory default, I proceeded to reconfigure it for the WiFi networks that I had. Unfortunately, when I tried to provision the changes (adding the re-adopted device back into its original AP Group that is associated with the WiFi networks), it went into the disconnected state again. To make matters worse, the other AP’s in the same AP Group started to misbehave. Some would go into a provisioning state followed by a disconnected state, while others went into an adopting state. This was of course very unnerving and frustrating. However, this observation led me to remember a previous episode that I experienced a few weeks ago.

When I updated the controller to 6.2 and upgraded the AP’s firmware, the AP exhibited a similar locking issue. The solution that I employed was to restore it to factory default and re-adopt the device. However, after re-adopting, I assigned the AP to a brand new AP Group, which I associated with the original WiFi networks. Simply adding the device to the original AP Group did not solve the issue.

When I tried this solution yesterday, it did not work. The device continued to go into a disconnected state immediately after provisioning when I added it to the new AP Group. After many hours and much experimentation, I decided to erase all the WiFi networks and the problematic AP Group. I recreated the WiFi networks, created a new AP Group, and proceeded to add each AP one by one (after a reset to factory default). In summary, here are the final steps that got me out of this pickle:

  1. Forget all AP’s in the affected AP Group.
  2. Remove all WiFi networks from the AP Group.
  3. Delete the AP Group.
  4. Delete all the WiFi networks that were associated with the above AP Group.
  5. Re-create all the WiFi networks and associate them with a brand new AP Group.
  6. For each AP, use ssh to reset to factory default, adopt, and add them one at a time to the AP Group using the controller web UI.
  7. Since there were four AP’s (3 AC-Pro and 1 AC-M), I waited until each AP was fully connected and could serve WiFi clients before continuing with the next one.

I am documenting this so that I can share it with Unifi support. This has happened twice now, and each time I spent multiple hours trying to get my WiFi network working. In these pandemic times, WiFi is as important as electricity and plumbing. Since Tier-1 support was unable to resolve this issue, waiting for Tier-2 support (around 24 hours) is a bit “hard to swallow”.

Anyway, I am glad that I was able to resolve this and bring my WiFi networks back up and running with the 4 affected AP’s, FOR NOW. However, I must admit that these two episodes have made me apprehensive about making configuration changes to these AP’s, thinking that the next provisioning will result in many more lost hours.

I hope the Unifi team can use this information to see if there is an issue relating to AP Group provisioning, since this seems to have triggered the issue in both cases.