GoAccess Bulk Script

June 27, 2026 5:54:50 pm

After putting up the Documentation for TaskFiend, by far my most checked out code, I got curious if anybody was actually accessing it. As a Free Software Zealot™, I wasn’t about to use Google Analytics. I don’t really even want to put more than the bare minimum JavaScript (grumble grumble cookie notice), much less tracking cookies, so I was basically looking for something to make my Apache logs pretty. Enter GoAccess.

GoAccess’ page has a big picture of it as a CLI utility, but it will render an HTML file in real time if so requested:

goaccess [path to access log] -o [desired output file path - must be in an htdocs folder so it can be served] --log-format=COMBINED --real-time-html --daemonize

I wanted to keep it private from you snoops on the internet, so I set up Apache Basic Auth to protect the directory. That was kind of irritating to do, but I just have to do it once.
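For anyone who hasn’t done the Basic Auth dance before, here is a rough sketch of what that one-time setup tends to look like. None of this is the actual config from this server: the directory, the password file path, and the realm name "Analytics" are made-up examples, and the function only prints the block so it can be pasted into a vhost config by hand.

```shell
#!/bin/bash

# Print an Apache <Directory> block that puts Basic Auth in front of a folder.
# Both arguments are hypothetical example paths, not real ones from this post.
basic_auth_block() {
    local protected_dir=$1
    local password_file=$2

    cat <<EOF
<Directory "$protected_dir">
    AuthType Basic
    AuthName "Analytics"
    AuthUserFile "$password_file"
    Require valid-user
</Directory>
EOF
}

basic_auth_block /var/www/vhosts/example.com/htdocs/analytics /etc/apache2/.htpasswd-analytics
```

The password file itself would come from something like htpasswd -c [password file] [username], and Apache needs a reload before the prompt shows up.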

Except this is totally cool, I should do this for all the sites on this server! I do not want to open all those config files. As a not perfect workaround, all the sites’ analytics are served from one website. Then I only need to protect one directory.

After attempting to run this twice (for two different sites), I found that it requires the use of a port, so running it a second time will fail silently because the port is already in use.
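If real-time pages for several sites were the goal, one possible way around the clash would be to give each instance its own port. A sketch, with the assumptions flagged: 7890 is GoAccess’ default WebSocket port and --port changes it; the real-time server also talks through named pipes, so the sketch hands each instance its own --fifo-in and --fifo-out as well, which may or may not be strictly needed on your version. The site names, output paths, and /tmp paths are made up, and the function only prints the commands rather than running them.

```shell
#!/bin/bash

# Print (not run) a real-time goaccess command for one site, giving each
# site its own WebSocket port and its own pair of named pipes.
# The paths below are illustrative assumptions.
realtime_command() {
    local site=$1
    local port=$2

    printf 'goaccess %s -o %s --log-format=COMBINED --real-time-html --daemonize --port=%s --fifo-in=%s --fifo-out=%s\n' \
        "/var/www/vhosts/$site/logs/access.log" \
        "/var/www/vhosts/$site/htdocs/analytics.html" \
        "$port" \
        "/tmp/goaccess-$site.in" \
        "/tmp/goaccess-$site.out"
}

port=7890   # GoAccess' default WebSocket port
for site in site1 site2; do
    realtime_command "$site" "$port"
    port=$((port + 1))
done
```

Each of those ports also has to be reachable from the browser, which is one more thing to configure per site.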

I don’t really need up-to-the-second analytics - I’m just satisfying mild curiosity. I decided¹ to write a script that has it generate a file once, and to put that script on a cron job. Even every 15 minutes is more than I need², but I decided on that based on the scientific principle of yeah sure why not:

*/15 *  *   *   *     /home/kj/bin/update_all_goaccess_sites.sh

At first I hard-coded an array of (site1 site2 etc), but that was unamusing. Instead, I wrote a script to go through all the directories in my vhosts directory searching for files named access.log. It hands each of those paths to GoAccess in turn. Each resulting file is named after its site, sitename.html.

The need for the Dr. Binocs functionality (a list of directories not to process) got refactored away, but I kept it in here because it seemed like the kind of thing that might one day be useful:

#!/bin/bash

base=/var/www/vhosts

dr_binocs=(dead)


for full_path in $(find "$base"/*/logs/ -name access.log); do

    # Split the path on "/" so the site name can be picked out:
    # /var/www/vhosts/<site>/logs/access.log puts <site> at index 4.
    # https://stackoverflow.com/questions/918886/how-do-i-split-a-string-on-a-delimiter-in-bash
    IFS='/' read -ra site <<< "$full_path"

    # Dr. Binocs holds directory names, so compare the site name, not the full path.
    if printf '%s\0' "${dr_binocs[@]}" | grep -F -x -z -q -- "${site[4]}"; then
        echo "SKIP - Skipping $full_path because Dr. Binocs told me to."
    else
        echo "Found ${site[4]}"

        time goaccess "$full_path" -o "$base/[whatever directory corresponds to the site you want to host them from]/htdocs/analytics/${site[4]}.html" --log-format=COMBINED
    fi

    echo ''
done

echo ''
echo 'Fin! https://[your website]/index.php'
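The magic number 4 in the script comes from where the site directory sits in the path. A small sketch of just that step, pulled out into a function; site_from_path is a name invented here for illustration, and example.com stands in for a real site:

```shell
#!/bin/bash

# Pull the site name out of a log path by splitting on "/".
# The leading "/" makes element 0 empty, so for
# /var/www/vhosts/<site>/logs/access.log the pieces are:
#   0:""  1:var  2:www  3:vhosts  4:<site>  5:logs  6:access.log
site_from_path() {
    local full_path=$1
    local parts
    IFS='/' read -ra parts <<< "$full_path"
    echo "${parts[4]}"
}

site_from_path /var/www/vhosts/example.com/logs/access.log   # prints example.com
```

If your vhosts live at a different depth, the index changes with it.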

Note that it won’t create directories if the out path doesn’t exist. mkdir before you get started.
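One way to take care of that, sketched as a helper that could be called just before the goaccess line. ensure_out_dir is a hypothetical name, and the demo runs against a throwaway temp directory rather than a real htdocs path:

```shell
#!/bin/bash

# Make sure the folder for a report exists before goaccess writes into it.
# mkdir -p creates missing parents and does nothing if the folder is there.
ensure_out_dir() {
    local out_file=$1
    mkdir -p "$(dirname "$out_file")"
}

# Demo in a temp directory so nothing real gets touched.
demo_root=$(mktemp -d)
ensure_out_dir "$demo_root/htdocs/analytics/example.com.html"
ls -d "$demo_root/htdocs/analytics"
rm -r "$demo_root"
```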

Although I could deduce any given site’s path, I want to be able to just troll a list of sites whose analytics I can peer into. I wrote a PHP script that basically just runs ls. That’s the index.php file referred to on the last line of the prior script:

<?php

// Create a list of files in this directory
$files = scandir(__DIR__);

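// Skip the directory entries and this script itself
$ignore = ['.', '..', basename(__FILE__)];


foreach ($files as $file) {
    if (in_array($file, $ignore)) {
        continue;
    }

    // Escape the name separately for the URL and for the visible link text
    echo '<p><a href="'.rawurlencode($file).'">'.htmlspecialchars($file).'</a></p>';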
}

A couple obvious improvements could be made but I don’t care that much:

A full HTML file rather than just a random snippet
Remove the .html off the name of the site
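For the curious, both improvements could be sketched without PHP at all, by having the cron script write a static index.html next to the reports. This swaps the technique rather than fixing the PHP: write_index is an invented name, the page title is arbitrary, and the sketch assumes site names contain nothing that needs HTML escaping.

```shell
#!/bin/bash

# Write a complete index.html that links to every other .html file in a
# folder, showing each link without its .html suffix.
write_index() {
    local dir=$1
    local report name

    {
        echo '<!DOCTYPE html>'
        echo '<html><head><meta charset="utf-8"><title>Analytics</title></head><body>'
        for report in "$dir"/*.html; do
            [ -e "$report" ] || continue          # folder has no reports yet
            name=$(basename "$report")
            [ "$name" = index.html ] && continue  # don't list the index itself
            echo "<p><a href=\"$name\">${name%.html}</a></p>"
        done
        echo '</body></html>'
    } > "$dir/index.html"
}

# Demo with two empty stand-in reports in a temp directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/site1.html" "$demo_dir/site2.html"
write_index "$demo_dir"
cat "$demo_dir/index.html"
rm -r "$demo_dir"
```

With this approach the link at the end of the update script would point at index.html instead of index.php.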

Will this give me actionable insights? No. Will it provide amusement for a few days before I forget about it completely? Yes. I share this in case anybody out there is in a similar boat and wants a couple days of unproductive amusement. Enjoy!

¹ Claude decided
² Well, any is more than I need