Got a warning for my blog going over 100GB in bandwidth this month… which sounded incredibly unusual. My blog is text and a couple images and I haven’t posted anything to it in ages… like how would that even be possible?
Turns out it’s possible when you have crawlers going apeshit on your server. Am I even reading this right? 12,181 with 181 zeros at the end for ‘Unknown robot’? This is actually bonkers.


What is that log analysis tool you are using in the picture? Looks pretty neat.
It’s a mix, I put two screenshots together. On the left is my monthly bandwidth usage from cPanel; on the right is AWStats (though I hid some sections so the Robots/Spiders section would be closer to the top).
I thought I recognized it. Hell of a blast from the past, haven’t seen it in fifteen years at least.
I think they’re winding down the project unfortunately, so I might have to get with the times…
I mean, I thought it was long dead. It’s twenty-five years old, and the web has changed quite a bit in that time. No one uses Perl anymore, for starters. I had switched to Open Web Analytics, Webalizer, or some such by 2008 or so. I remember Webalizer being snappy as heck.
I tinkered with log analysis myself back then, peeking into the source of AWStats and others. Learned that one humongous regexp with like two hundred alternatives for the user-agent string was way faster than trying to match them individually, which makes sense: the regex engine compiles the whole pattern into a state machine, essentially a little bytecode program, and you scan each line once instead of once per pattern. My first attempts, in comparison, were laughably naive and slow. Ah, what a time.
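For anyone curious, here’s a rough sketch of that idea in Python rather than AWStats’ Perl. The bot names and user-agent string are placeholders I made up for illustration, not AWStats’ actual list:

    import re

    # Illustrative bot names only; a real list would be much longer.
    BOT_PATTERNS = ["googlebot", "bingbot", "yandexbot", "ahrefsbot", "semrushbot"]

    # Naive approach: one compiled regex per bot, searching the UA string
    # separately for each pattern.
    NAIVE = [re.compile(p, re.IGNORECASE) for p in BOT_PATTERNS]

    def is_bot_naive(user_agent):
        return any(rx.search(user_agent) for rx in NAIVE)

    # Combined approach: one big alternation, compiled once,
    # one search call per UA string.
    COMBINED = re.compile("|".join(BOT_PATTERNS), re.IGNORECASE)

    def is_bot_combined(user_agent):
        return COMBINED.search(user_agent) is not None

    ua = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
    print(is_bot_naive(ua), is_bot_combined(ua))  # True True

How big the win is depends on the engine (Perl and Python both compile to a bytecode-driven backtracking matcher rather than a pure DFA), but you pay the match overhead once per log line instead of once per pattern, which is usually where the speedup comes from.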