Innerworld@lemmy.world to News@lemmy.worldEnglish · 18 hours ago

At least 3 major outlets — The New York Times, The Guardian, and Reddit — have blocked the Internet Archive’s Wayback Machine from accessing their content

www.mediapost.com


Not In Our Back Yard: Publishers Block Wayback Machine
www.mediapost.com
They are afraid the Wayback Machine is serving as a back door for AI content scrapers.
  • traxex@lemmy.dbzer0.com
    ↑12 · 3 hours ago

    Reminder to donate to the Internet Archive so they can keep fighting the good fight.

  • Saryn@lemmy.world
    ↑27 · 5 hours ago

    Content scraping is harming the information business in ways that could not have been foreseen.

    What an absolutely ridiculous thing to say.

    • ameancow@lemmy.world
      English · ↑8 ↓1 · 4 hours ago

      “This isn’t letting us shape reality, that’s our entire business model, we are working tirelessly to shape people’s reality so this is definitely a no-go.”

  • Takeshidude@lemmy.world
    ↑9 · 5 hours ago

    Start self-hosting ArchiveBox. They can’t block everyone.

  • green_goglin@thelemmy.club
    ↑25 · 9 hours ago

    Nobody tell the NYT about being able to add another “.” after “.com” to bypass their paywall.

    • gAlienLifeform@lemmy.world
      ↑2 · 5 hours ago

      I’m probably screwing it up here, but neither of these is working for me:

      https://www.nytimes.com.2026/02/04/us/politics/supreme-court-california-congressional-map.html

      https://www.nytimes…com/2026/02/04/us/politics/supreme-court-california-congressional-map.html

      • SocialMediaRefugee@lemmy.world
        ↑5 · 4 hours ago

        Put an extra “.” after the “.com”, so “.com.”

        • gAlienLifeform@lemmy.world
          ↑3 · 3 hours ago

          Ah, https://www.nytimes.com/./2026/02/04/us/politics/supreme-court-california-congressional-map.html won’t work on my usual browser (which just ends up loading the NYT’s homepage), but it does work in a Chrome incognito window.

          Thank you!

      • green_goglin@thelemmy.club
        ↑2 · 3 hours ago

        you’re welcome:

        
        https://www.nytimes.com/./2026/02/04/us/politics/supreme-court-california-congressional-map.html
        
        
        • gAlienLifeform@lemmy.world
          ↑4 · 3 hours ago

          I think autocomplete or something might have messed with what you intended to post, because that link still hits the paywall for me. Using your guidance, though, I was eventually able to figure out that

          nytimes.com./2026 etc.

          works in a Chrome incognito window. The “.” after “com” and the “/” after that “.” are apparently the critical bits

    • stegosaur@lemmy.world
      ↑7 · 7 hours ago

      Awesome, this is the best paywall hack I have ever seen!

  • tackleberry@thelemmy.club
    ↑19 · 10 hours ago

    Fuck Reddit. That website has been selling our data and using it to train AI… I say fuck 'em

    • ameancow@lemmy.world
      English · ↑6 · 4 hours ago

      About 1 out of every 5 posts is an advertisement in disguise, and about 15% or more of users are actually bots.

      All of this is expected as a consequence of partnership with AI companies and Google, and the site is basically walking dead, just a shell of corporate interests, manufactured conversations, algorithmically fed bait posts and so on… but it is a tad creepy how many of the AI bots keep making posts in “explain the joke” subreddits. We are so fucked.

      • tackleberry@thelemmy.club
        ↑2 · 2 hours ago

        Great catch! You can actually see the AI slop when it pops up. Reddit is dead, and you should delete your data from that cesspool.

        • ameancow@lemmy.world
          English · ↑1 · 2 hours ago

          I got 12 years of some of the top submissions and comments of all time, I will leave my data there because I want our granddroids to learn the very best from us.

  • gAlienLifeform@lemmy.world
    ↑12 · 10 hours ago

    Is the Guardian actually blocking the Internet Archive? Seems to work for me

    https://web.archive.org/web/20260224104430/https://www.theguardian.com/us-news/2026/feb/23/trump-iran-airstrikes-nuclear-deal

    Meanwhile,

    https://web.archive.org/web/20260224121247/https://www.mediapost.com/publications/article/413017/ai-basic-training-newsrooms-offer-little-practica.html?initial_article=412911&es_index_start=3&es_index=0

  • CombatWombat@feddit.online
    English · ↑37 ↓2 · 15 hours ago

    I’m certain they’ve wanted to do this for a long time, and AI is a convenient way to justify it, rather than admitting they don’t want humans using it to circumvent the paywall. It does solidify for me personally that the LA Times is the paper of record for the United States going forward, rather than the New York Times.

    • hector@lemmy.today
      ↑1 · 3 hours ago

      I just got a gift subscription to the NYTimes, for the first time since I quit in 2018, and it’s really gone downhill. I am learning about more big scoops from The Guardian via Lemmy posts than I see in their paper. I think Israel’s final solution for Gaza here broke their brain; they had an identity crisis and sided with Israel and fascism over all the fourth estate democracy mumbo jumbo.

      They haven’t broken a single big story that I recall in the past year. Not a single one; even the Wall Street Journal published Epstein’s birthday letter from the president. The NYTimes gave up, and they are no longer the paper of record. Whatever their problems before, they covered events more thoroughly and had the courage to break big stories, and now they don’t.

    • gAlienLifeform@lemmy.world
      ↑11 ↓1 · 10 hours ago

      The LA Times also blocks the Internet Archive, unfortunately. I’d recommend PBS, NPR, ProPublica, or some other nonprofit organization for your US paper of record.

      • CombatWombat@feddit.online
        English · ↑4 · 6 hours ago

        Ugh. Thanks for the heads-up. I’ve definitely posted archive links without noticing they’re blocked before. PBS and NPR have really gone downhill with the budget cuts. ProPublica is great, but their coverage is pretty narrow, so there are a lot of stories they don’t cover at all. It’s getting harder and harder to find a quality source.

        • cecinestpasunbot@lemmy.ml
          English · ↑1 · 2 hours ago

          Unfortunately, I think most quality sources with broad coverage aren’t free. Even the paid sources almost always have a corporate bias. Of those, the Financial Times probably does the least to editorialize. Beyond that, I think you just have to find independent journalists or outlets with a narrower investigative focus that you can trust.

    • WesternInfidels@feddit.online
      English · ↑4 · 8 hours ago

      The South African billionaire paper that wouldn’t endorse Harris? Well, our options all suck, I guess.

  • Tony Bark@pawb.social
    English · ↑107 ↓1 · 18 hours ago

    Really, they think Internet Archive is the problem?

    • ameancow@lemmy.world
      English · ↑3 · edited · 4 hours ago

      Yah it’s a problem for their agenda of manufacturing culture, social discourse and consent for hundreds of millions of people.

    • AmbitiousProcess (they/them)@piefed.social
      English · ↑53 ↓1 · 18 hours ago

      They think AI companies are using it as a “backdoor” to scrape their content. Which is patently ridiculous, but that won’t stop them.

    • ohulancutash@feddit.uk
      English · ↑17 · 17 hours ago

      They think they want their revenue streams

  • 9tr6gyp3@lemmy.world
    English · ↑70 · 18 hours ago

    Wait until they find out that AI is scraping their web sites.

    • ameancow@lemmy.world
      English · ↑4 · 4 hours ago

      All of these companies only benefit from AI being employed to manufacture consent, alter reality and shape people’s social trends and habits. This is why they don’t want their data archived, they want to be able to use mobs of AI agents disguised as people to shape narrative and decide what people think is true.

      It’s already in massive progress across Reddit because it’s so easy to disperse undercover AI instances and create conversations to influence people.

    • The Velour Fog @lemmy.world
      ↑10 · 11 hours ago

      Well, Reddit’s got a contract for AI companies to scrape their content, so pig boy Spez is getting paid, he don’t give a fuck

    • Fuckfuckmyfuckingass@lemmy.world
      ↑15 ↓3 · 18 hours ago

      I’m sure they don’t care, or are all about it.

  • user314_lemmus_v3s@lemmy.world
    ↑12 · 15 hours ago

    I wonder what happened to Archive in 2024 when it was “hacked” and some pages “disappeared”…

  • TrackinDaKraken@lemmy.world
    English · ↑25 ↓1 · 18 hours ago

    Gotta control the press before you can rewrite history.

  • Formfiller@lemmy.world
    ↑15 · 18 hours ago

    That’s very 1984 of them

  • SpicyLizards@reddthat.com
    ↑9 ↓4 · 15 hours ago

    Buuuut they all say that we need to donate to save free speech! It can’t be a lie right?

    • FishFace@piefed.social
      English · ↑5 ↓1 · 12 hours ago

      By “donate” you mean “buy a subscription”?

  • turdburglar@piefed.social
    English · ↑6 · 18 hours ago

    that’s fuken lame.

  • LibertyLizard@slrpnk.net
    ↑4 · 17 hours ago

    Should be fairly easy to defeat, no?


