It’s getting a bit ridiculous out here. I’m using DuckDuckGo, but since it aggregates its results from other sources, it has also gotten bad recently. Is there a search engine out there that blocks domains that spam AI-generated content? Extra points if there’s something like uBlock Origin that filters things based on a community-made list.

Edit: I’m aware of Kagi, but it’s pretty expensive and I’m not a fan of the fact that they, too, host their own AI tools.

  • CrowAirbrush@lemmy.world
    English
    arrow-up
    15
    ·
    2 days ago

    Man, I was looking up info about arrow rests for recurve/Olympic archery yesterday and stumbled on a website that uses some sort of AI fever dream for its images.

    One kinda looked like a violin’s neck brace (I don’t know what those things are called) with some strings attached. It looked like it should be a real thing, but on closer inspection it was actually nothing sensible.

    I think we’ve all seen those images that look like a room filled with items, but when you look at a specific item your mind figures out it’s just weird shapes and colors.

    What a nightmare that was.

  • FourPacketsOfPeanuts@lemmy.world
    English
    arrow-up
    54
    arrow-down
    2
    ·
    2 days ago

    Search is eventually going to be so enshittified that the way to actually find things out is going to fall back on “ask someone you trust who knows things you don’t”. At least by that point those trustworthy people should be better informed than in the past…

    • Echo Dot@feddit.uk
      English
      arrow-up
      11
      ·
      2 days ago

      It’s ultimately self-defeating as well, because any future AI is going to be polluted by past AI’s garbage content, making it even harder to develop intelligent AI systems.

        • rumba@lemmy.zip
          English
          arrow-up
          7
          ·
          2 days ago

          I tried doing some of this. I trained on a corpus of data I wanted it to read, but with such a small amount of training data I found it was overall too lossy. If I asked it a question about something that was in there and it responded, there was a really good chance that its answer was in there. But it often didn’t know something that was definitely in there. It wasn’t completely useless, but I wouldn’t say that it was at the level of being truly helpful.

          I worry that there’s not enough verified data out there to set up for proper training.

          • FourPacketsOfPeanuts@lemmy.world
            English
            arrow-up
            5
            arrow-down
            1
            ·
            2 days ago

            I suspect such a model would have to be far more attuned to its data being smaller but trustworthy. Something like ChatGPT, for example, requires a huge volume because it’s only weakly affected by any particular datum going in. It’s designed to adapt to general conversation norms rather than specific facts. If you could take a generalist like ChatGPT and combine it with an expert model that’s been told that everything it’s given carries a huge weighting, then that would probably be a big step forward.

  • NutWrench@lemmy.world
    English
    arrow-up
    23
    ·
    2 days ago

    I think the best way to make the Internet less sh*tty is to get away from Google search.

    I like the SearX search engine. It gives old-school, relevant search results, not Google-ranked ones.

    https://search.inetol.net/

    It’s also spread out over many separate instances, so you can pick the one that best suits your search needs:

    https://searx.space/

    • MalReynolds@slrpnk.net
      English
      arrow-up
      8
      ·
      2 days ago

      I self-host it on my laptop; it’s pretty easy, and I always have it just the way I want it. Still pushing shit uphill with the AI crap, but it’s better than any one search engine (it amalgamates many). Relevant to OP: I have a large block list enabled, but it’s very much a moving target.
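
For anyone wondering what a block list like this boils down to, here is a minimal sketch in Python. It is not how SearX itself does it; the one-domain-per-line list format, the function names, and the `.example` domains are all made up for illustration.

```python
from urllib.parse import urlsplit


def load_blocklist(text):
    """Parse a hypothetical one-domain-per-line list; '#' starts a comment."""
    domains = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip().lower()
        if entry:
            domains.add(entry)
    return domains


def is_blocked(url, blocklist):
    """Return True if the URL's host, or any parent domain of it, is listed."""
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    # Check "www.a.example", then "a.example", then "example".
    return any(".".join(parts[i:]) in blocklist for i in range(len(parts)))


def filter_results(urls, blocklist):
    """Keep only the result URLs whose host is not on the block list."""
    return [url for url in urls if not is_blocked(url, blocklist)]


# Made-up sample list and results, for illustration only.
SAMPLE_LIST = """
# community-maintained list (hypothetical format)
ai-slop.example
spam.example  # content farm
"""

blocklist = load_blocklist(SAMPLE_LIST)
results = [
    "https://www.ai-slop.example/top-10-arrow-rests",
    "https://archery-club.example/arrow-rests",
]
kept = filter_results(results, blocklist)
```

Matching on parent domains means a single entry also covers subdomains such as `www.`, which is why the list stays a moving target only in terms of new domains, not new subdomains.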

    • pelespirit@sh.itjust.works
      English
      arrow-up
      8
      ·
      2 days ago

      I’ve had good luck with Mojeek as a backup to DuckDuckGo. It’s so old school that it doesn’t always know what you want, but I sometimes want that.

      • blind3rdeye@lemm.ee
        English
        arrow-up
        5
        ·
        2 days ago

        I’ve found Mojeek to be a bit hit and miss; but one thing I really appreciate is that they actually do the indexing and searching themselves (whereas pretty much every other search site uses Bing or Google behind the scenes). So although Mojeek may not be ideal, they are at least making an effort to be independent.

  • plm00@lemmy.ml
    English
    arrow-up
    23
    arrow-down
    6
    ·
    2 days ago

    Kagi! You can block websites so they don’t show up. It’ll also flag websites that contain a lot of spam or ads.

    • tal@lemmy.today
      English
      arrow-up
      9
      arrow-down
      1
      ·
      edit-2
      2 days ago

      Kagi lets you blacklist individual domains yourself, but I think what OP is asking is “is there a search engine that identifies and blacklists AI generated content itself”.

      I think the answer is probably yes: all search engines likely try to block spam websites of any sort, AI-generated or not, or at least downrank them, and they do so all the time. Trying to present relevant, useful material at the top of the results is basically the business that search engines are in.

      Now, do any do so to a level sufficient to fully eliminate them? I’d guess not. SEO spammers have been trying to pollute top results with their hits for about as long as search engines have been around, and trying to cheaply bulk-generate content that looks like something that the user might want is just the latest form this takes. My guess is that that’ll be a cat-and-mouse game for some time to come.

  • Grenfur@lemmy.one
    English
    arrow-up
    5
    ·
    2 days ago

    It won’t block them, but I started to feel like DDG’s results had been awful recently. I couldn’t find simple things. I’ve switched to Startpage and had a much better experience. The results feel more aligned with what I want and I feel like there’s less crap. It’s probably confirmation bias, hah, but it’s working.

  • 93maddie94@lemm.ee
    English
    arrow-up
    13
    ·
    2 days ago

    Unless I need something recent, whenever I search I restrict the results to dates from around 1999 to 2021. That filters out a lot of unnecessary crap.
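
The same date restriction can be typed straight into the query on engines that understand `after:` and `before:` operators (Google does; support elsewhere is an assumption you would need to check). A rough Python sketch, where the function names and the `search.example` base URL are placeholders:

```python
from datetime import date
from urllib.parse import urlencode


def dated_query(terms, after, before):
    """Append after:/before: operators to the search terms.

    Assumes the target engine understands these operators.
    """
    return f"{terms} after:{after.isoformat()} before:{before.isoformat()}"


def search_url(base, terms, after, before):
    """Build a search URL; 'base' and the 'q' parameter name are placeholders."""
    return f"{base}?{urlencode({'q': dated_query(terms, after, before)})}"


query = dated_query("recurve arrow rest", date(1999, 1, 1), date(2021, 12, 31))
url = search_url("https://search.example/search", "recurve arrow rest",
                 date(1999, 1, 1), date(2021, 12, 31))
```

Keeping the cutoff as a parameter makes it easy to move the window when you do need something recent.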

  • Imgonnatrythis@sh.itjust.works
    English
    arrow-up
    20
    arrow-down
    12
    ·
    2 days ago

    Have you given Kagi an actual shake? If you are not interested in saving preferences longer term, you can keep cycling through free accounts. Now more than ever, it is a breath of fresh air. If I want a quick AI answer without scrolling through some ad-ridden web page, I just put a “?” at the end of my query. If not, I have no AI garbage on my results.

    • CubitOom@infosec.pub
      English
      arrow-up
      1
      ·
      2 days ago

      I love Kagi, but I don’t think it actively filters out AI-generated content.

      I know when searching for pictures you can disable AI generated images.

      I think the hard part for a search engine is that unless there is some kind of identifying mark on the content, how do they know that an AI didn’t write a top 10 list of pastebin alternatives?

      • Pringles@lemm.ee
        English
        arrow-up
        1
        ·
        2 days ago

        It’s not immune to it. If you are looking for something highly specific, you will get slop for sure. To give an actual example, a buddy of mine told me that the walls of your house act like a sponge for water when the outer walls are insulated but the basement walls are not insulated on the outside. So I went looking on Kagi for stuff to back that up (not that I didn’t believe him, I just wanted to know more). A lot of the results were completely AI-generated crap websites. There were good and somewhat relevant results, but in the end I gave up (also because we got confirmation that it’s done on our house, so it became irrelevant).

    • Ganbat@lemmy.dbzer0.com
      English
      arrow-up
      2
      ·
      2 days ago

      Looks super cool. Too bad they don’t have a way to add custom SearX instances other than modifying and building the extension yourself.