• VitoRobles@lemmy.today
    English
    arrow-up
    11
    ·
    6 hours ago

    In emails sent to Patokallio after the DDoS began, “Nora” from Archive.today threatened to create a public association between Patokallio’s name and AI porn and to create a gay dating app with Patokallio’s name. These threats were discussed by Wikipedia editors in their deliberations over whether to blacklist Archive.today, and then editors noticed that Patokallio’s name had been inserted into some Archive.today captures of webpages.

    “Honestly, I’m kind of in shock,” one editor wrote. “Just to make sure I’m understanding the implications of this: we have good reason to believe that the archive.today operator has tampered with the content of their archives, in a manner that suggests they were trying to further their position against the person they are in dispute with???”

    That and their refusal to talk to any journalist who references information about Patokallio’s blog makes archive.today unreliable.

    Fuck them.

  • Doug Holland@lemmy.world
    English
    arrow-up
    18
    ·
    17 hours ago

    Crap. Obviously, I’m gonna have to stop using archive.today, but it’s the only way around paywalls at numerous sites.

    Removepaywalls.com (plural) inserts ads, often for shady operations.

    Removepaywall.com (singular) usually works, but it’s tricky sharing the links (i.e., “choose option 2” or “choose option 4”).

    Byebyepaywall.com has old, dead options.

    Wayback Machine bombs out a lot.

    And ghostarchive.org is successful so rarely it’s really a last resort.

    Anyone know of any others?

    • CharlesDarwin@lemmy.world
      English
      arrow-up
      4
      ·
      30 minutes ago

      The thing that has always annoyed me about archive.is is that using Firefox + VPN seems to result in endless CAPTCHAs, but it works in Chrome, go figure. I’m very suspicious of sites that somehow only work properly under Chrome.

    • Trudge@piefed.social
      English
      arrow-up
      20
      ·
      16 hours ago

      Possibly irrelevant, but some browsers have a “reading mode” which, in conjunction with the ol’ Hitting F11 and Then Esc Trick, will produce the whole article before a paywall can finish loading.

  • paraphrand@lemmy.world
    English
    arrow-up
    10
    ·
    15 hours ago

    Is there a reason self-hosted paywall bypass tools don’t exist? Is it because these services pay for access?

      • paraphrand@lemmy.world
        English
        arrow-up
        5
        ·
        5 hours ago

        At first glance, this does not bypass paywalls. It archives web pages.

        People conflate the two services because some of them bypass paywalls as they archive.

        I specifically asked about paywall bypass on purpose.

        • deceiver@infosec.pub
          English
          arrow-up
          3
          ·
          4 hours ago

          the archiving mechanism itself is what bypasses paywalls. it archives by fetching pages server-side before client-side JavaScript enforces paywalls

          • paraphrand@lemmy.world
            English
            arrow-up
            0
            ·
            edit-2
            2 hours ago

            Can this be done in a browser extension? I’m basically wondering why people don’t tell other people about Paywall bypass software on Lemmy. Is it because it sucks? Doesn’t exist?

            Such software seems like it would be very Lemmy, and very Linux, and very piracy, and very anarchic. So why am I not already aware of any?

            • deceiver@infosec.pub
              English
              arrow-up
              1
              ·
              2 hours ago

              it absolutely can! there’s Bypass Paywalls Clean developed by magnolia1234. the reason you don’t see them shared often is that they’re repeatedly taken down from official extension stores like the Chrome Web Store and Firefox Add-ons, and platforms like GitHub, due to legal and political pressure from publishers, which pushes them to increasingly obscure and/or questionable hosting platforms that most normal users wouldn’t touch - case in point, Bypass Paywalls Clean itself is currently hosted on GitFlic, a Russian code hosting platform, as it’s been pushed outside the reach of Western legal frameworks

    • hemko@lemmy.dbzer0.com
      English
      arrow-up
      3
      arrow-down
      1
      ·
      14 hours ago

      I think a subscribed user of the news site has to upload the “unlocked” article to the archive website.

      • deceiver@infosec.pub
        English
        arrow-up
        3
        ·
        7 hours ago

        no, archive.today (and similar services like the Wayback Machine) work by fetching the page directly through their own servers, essentially acting as a headless browser that renders the page and saves a snapshot. the archive service itself makes the HTTP request, executes JavaScript, and captures the resulting document object model - no subscriber involvement required

          • deceiver@infosec.pub
            English
            arrow-up
            0
            ·
            4 hours ago

            soft paywalls are enforced by JavaScript running in your browser - the server sends the full article content regardless, and then the JavaScript checks if you’re a subscriber and hides or blocks it if not. when archive.today or a self-hosted tool like ArchiveBox fetches the page, it gets the full content directly from the server before any of that JavaScript enforcement runs. the server doesn’t know or care whether you’re a subscriber, it just responds to the request
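
            A minimal sketch of the soft vs. hard paywall difference described above, assuming the article body lives in an `<article>` tag. The two sample pages, the `isSubscriber()` call inside the sample script, and the `extract_article` helper are all made up for illustration and are not how any particular archive service works. The parser reads the raw HTML and never runs the script, so whatever the server sent is what you get:

```python
from html.parser import HTMLParser


class ArticleText(HTMLParser):
    """Collect text inside <article>, skipping <script>/<style> bodies."""

    def __init__(self):
        super().__init__()
        self.in_article = 0
        self.skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.in_article += 1
        elif tag in ("script", "style"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.in_article:
            self.in_article -= 1
        elif tag in ("script", "style") and self.skip:
            self.skip -= 1

    def handle_data(self, data):
        # Scripts are never executed here; their source is simply ignored.
        if self.in_article and not self.skip and data.strip():
            self.chunks.append(data.strip())


def extract_article(html):
    """Return the article text found in the raw server response."""
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical soft paywall: full text is in the response, a script removes it later.
SOFT = """<html><body>
<article><p>First paragraph.</p><p>Second paragraph, normally hidden.</p></article>
<script>if (!isSubscriber()) document.querySelector('article').remove();</script>
</body></html>"""

# Hypothetical hard paywall: the server only ever sends the teaser.
HARD = """<html><body>
<article><p>First paragraph.</p></article>
<div class="paywall">Subscribe to keep reading.</div>
</body></html>"""

print(extract_article(SOFT))
print(extract_article(HARD))
```

            With the soft sample the whole article comes out, because it was in the response all along. With the hard sample there is nothing more to extract, since the server never sent it.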

            • paraphrand@lemmy.world
              English
              arrow-up
              1
              ·
              2 hours ago

              Thanks!

              I always assumed that wasn’t the case because Paywall bypass extensions are not linked in a reply when someone screams about paywalls in a thread on Reddit or Lemmy. Why is that possible, but not possible with a browser extension?

              Are soft paywalls uncommon?

              • Fiery@lemmy.dbzer0.com
                arrow-up
                2
                ·
                14 minutes ago

                Soft paywalls only exist on badly made sites (which make up a large part of all sites so it’s still more effective than it has any right to be).

                Many news sites with paywalls have a proper hard paywall. The only way to get around those is with an account or with an exploit. Neither of those is going to be published for use in an extension though (as it’d get deactivated very fast).

    • dan@upvote.au
      arrow-up
      9
      ·
      edit-2
      19 hours ago

      It works well because they use paid accounts to scrape a bunch of paywalled sites, which is why publishers are trying to figure out who runs it.

      It’s completely untrustworthy now that they’ve shown that they can (and do) edit archived pages.

    • SolacefromSilence@fedia.io
      arrow-up
      3
      arrow-down
      1
      ·
      19 hours ago

      I used to find dead links annoying until I realized that many dead links are also saved in the wayback machine. This comment isn’t only about Wikipedia.

    • dan@upvote.au
      arrow-up
      3
      arrow-down
      2
      ·
      edit-2
      19 hours ago

      Why do you need an archive of Wikipedia though? Each page retains its entire history, so you can easily go back to old versions without using a third-party site (especially one that DDoSes people)

      Wikimedia also provide downloads of the whole of Wikipedia, including page history. You can easily have your own copy of the entirety of Wikipedia if you want to, as long as you’ve got enough disk space and patience to download it.

      Edit: I’m an idiot but I’m leaving this comment here. I didn’t realise you meant dead links on Wikipedia, not to Wikipedia.

      • solrize@lemmy.ml
        arrow-up
        9
        ·
        19 hours ago

        Wikipedia was using archive.today to link to off-wiki articles such as news articles, where the links had stopped working. Similar to how it also uses the archive.org Wayback Machine for the same purpose.
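
        For anyone wanting to check a dead link against the Wayback Machine themselves, here is a rough sketch that only builds the URLs and makes no network request. The function names and the example link are made up for illustration. The two URL shapes are the Wayback availability lookup and the usual snapshot address format as I understand them, so treat them as assumptions and check the archive.org docs before relying on them:

```python
from urllib.parse import urlencode


def wayback_lookup_url(dead_url, timestamp=None):
    """Build an availability-lookup URL for a possibly dead link.

    timestamp is an optional YYYYMMDD string asking for the closest snapshot.
    """
    params = {"url": dead_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def wayback_snapshot_url(dead_url, timestamp):
    """Build a direct snapshot address for a given timestamp."""
    return f"https://web.archive.org/web/{timestamp}/{dead_url}"


dead = "http://example.com/news/story"  # hypothetical dead link
print(wayback_lookup_url(dead, "20200101"))
print(wayback_snapshot_url(dead, "20200101"))
```

        Opening the first URL should return a small JSON document saying whether a snapshot exists; the second is the kind of link you would paste in place of the dead one.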