Got a warning that my blog went over 100GB in bandwidth this month… which seemed incredibly unusual. My blog is text and a couple of images, and I haven’t posted anything to it in ages… like how would that even be possible?

Turns out it’s possible when you have crawlers going apeshit on your server. Am I even reading this right? 12,181 with 181 zeros at the end for ‘Unknown robot’? This is actually bonkers.

    • benagain@lemmy.ml (OP) · 2 months ago

      I think they’re winding down the project unfortunately, so I might have to get with the times…

      • [object Object]@lemmy.world · 2 months ago

        I mean, I thought it was long dead. It’s twenty-five years old, and the web has changed quite a bit in that time. No one uses Perl anymore, for starters. I used Open Web Analytics, Webalizer, or somesuch by 2008 or so. I remember Webalizer being snappy as heck.

        I tinkered with log analysis myself back then, peeping into the source of AWStats and others. I learned that one humongous regexp with a couple hundred alternatives for the user-agent string was way faster than matching each pattern individually, which of course makes sense: the whole alternation gets compiled into a single state machine in a sort of bytecode, so the engine makes one pass over the string instead of rescanning it for every pattern. My first attempts, in comparison, were laughably naive and slow. Ah, what a time.
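
        To make that concrete, here’s a minimal Python sketch of the same idea (not the actual AWStats Perl code, and the bot names and patterns below are made up for illustration): build one compiled alternation with named groups, then classify each user-agent with a single search instead of looping over every pattern.

            import re

            # Hypothetical stand-in for a real crawler catalogue; the real
            # signature lists run to hundreds of entries.
            BOT_PATTERNS = {
                "googlebot": r"Googlebot",
                "bingbot":   r"bingbot",
                "yandexbot": r"YandexBot",
                "ahrefsbot": r"AhrefsBot",
            }

            # Naive approach: one regex search per known bot, re-scanning the
            # user-agent string for every single pattern.
            def classify_naive(user_agent):
                for name, pattern in BOT_PATTERNS.items():
                    if re.search(pattern, user_agent):
                        return name
                return "unknown"

            # Combined approach: one big alternation with named groups,
            # compiled once, so each user-agent needs only a single search.
            COMBINED = re.compile(
                "|".join(f"(?P<{name}>{pat})" for name, pat in BOT_PATTERNS.items())
            )

            def classify_combined(user_agent):
                m = COMBINED.search(user_agent)
                return m.lastgroup if m else "unknown"

            print(classify_combined("Mozilla/5.0 (compatible; Googlebot/2.1)"))
            # -> googlebot

        Same trick as the giant alternation in AWStats, just in Python rather than Perl.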