It’s time for Google to die. They are a truly awful company now, so it’s time to take them out behind the shed like ol’ Blockbuster.
What will be replacing it? Bing?
Kagi (although recent drama leaves me soured)
I don’t fathom paying to have your search history catalogued in correlation to your payment info. This will end as it always does, either hacked or enshittified.
Love Kagi. What happened with recent drama? Must have missed that.
Kagi has started using search results from Brave’s search index. The LGBT community disapproved of this because of past homophobic actions by Brave’s CEO Brendan Eich.
Oh, that. Yeah, I’m not personally worried that they used it very lightly as one of a dozen sources and then stopped.
The problem was mainly their questionable response
Fool me once…
Twitter has a similar problem. The more the CEO injects personal politics into the function of the site, the less confidence people have that a new search won’t be fucked with. Whatever you might say about Google, Bing, and Yahoo, their owners have at least kept their politics closer to the chest.
That’s a terrible reason not to use something that works well, though. I mean, the founder or CEO of any major bank is probably a shit person with bad takes like racism, but does that make their banking service any less useful?
No joke, I’ve been using Bing’s GPT-4 search and it’s helped me much more frequently than Google lately. AI might actually be where Bing out-competes Google.
http://wiby.me/
I don’t think so. Wiby limits its index to specific kinds of websites by design.
I imagine it’s great for entertainment purposes, but not for the things you’d usually use a search engine for (gathering information, troubleshooting issues, etc.)
LMAO as if.
Selfhost
Are we expecting normal people to learn how to self-host?
How are you supposed to self-host a web crawler and indexer without getting a giant server bill?
Having this service at least slightly centralised makes sense resource-wise, but assuming crawling and indexing is free is just foolish. I’d choose something like Kagi, but I guess many people would rather cheap out and go for the next free service, not realising that the company has to make money another way to make up for the high cost of running a search engine.
The Internet was tiny in 1998 but so were Google’s servers. A little searching seems to show they ran everything on a dozen Pentium PCs with a total of 100GB of drives. That’s less power than a single Raspberry Pi today with a $30 SD memory card.
I often feel as though these paid-for services aren’t delivering a meaningfully better product. After all, it isn’t as though Google’s problem is that they don’t have enough cash to spend on optimization. The problem is that they’re a profit-motivated firm fixated on minimizing their cost and maximizing their revenue. Kagi has far less money to optimize than Google and the same profit-chasing incentives.
If there was a GitHub / Linux distro equivalent to a modern search engine - or even a Wikipedia-style curated collaborative effort - I’d be happy to kick in for that (like I donate to these projects). For all Wiki gets shit on as Spook-o-pedia, they do at least have a public change history and an engaged community of participants. If Kagi is just going to kick me back the same Wiki article at a higher point in the return list than Google, why get their premium service when I can just donate to Wiki and search there directly?
If I’m just getting a feed of paywalled news journals like the NYT or WaPo, it’s the same question: why not just pay them directly and use their internal search?
Other than screening out the crap that Google or Bing vomit up, what is the value-add of Kagi? And why shouldn’t I expect to see the same shit-creep in Kagi that I’ve seen in Google or Bing over the last decade? Because I’m paying them? Fuck, I subscribe to Google and Amazon services, and they haven’t gotten any better.
The problem is that it’s just incredibly expensive to keep scanning and indexing the web over and over in a way that makes it possible to search within seconds.
And the problem with search engines is that you can’t make the algorithm completely open source, since that would make it too easy to manipulate the results with SEO, which is exactly what’s destroying Google.
I don’t think “security through obscurity” has ever been an effective precautionary measure. SEO works today because it is possible to intuit the function of the algorithms without ever seeing the interior code.
Knowing the interior of the code gives black hats a chance to manipulate the algorithm, but it also gives white hats the chance to advise alternative optimization strategies. Again, consider an algorithm that biases itself to websites without ads. The means by which you game the system would be contrary to the incentives for click-bait. What’s more, search engines and ad-blockers would now have a common cause, which would have its own knock-on effects.
But this would mean moving towards an internet model that was more friendly to open-sourced, collaboratively managed, and not-for-profit content. That’s not something companies like Google and Microsoft want to encourage. And that’s the real barrier to such an implementation.
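To make the “biases itself to websites without ads” idea concrete, here’s a rough sketch of what such a ranking rule could look like. Everything in it is an assumption for illustration: the `Result` fields, the `ad_penalty` knob, and the example URLs are made up, and no real search engine is claimed to score pages this way.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    relevance: float  # hypothetical 0..1 text-match score
    ad_count: int     # hypothetical number of ad slots detected on the page


def score(result, ad_penalty=0.15):
    """Relevance discounted by ads: every extra ad slot shrinks the score."""
    return result.relevance / (1.0 + ad_penalty * result.ad_count)


def rank(results, ad_penalty=0.15):
    """Order results by ad-discounted score, best first."""
    return sorted(results, key=lambda r: score(r, ad_penalty), reverse=True)


results = [
    Result("clickbait.example", relevance=0.9, ad_count=10),
    Result("plain.example", relevance=0.7, ad_count=0),
]
for r in rank(results):
    print(r.url, round(score(r), 2))
```

With a rule like this published in the open, the cheapest way to climb the rankings is to strip ads off the page, which is the opposite of what click-bait wants to do.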
Before we had Google, we had AltaVista, and before that we had indexes like Yahoo. Maybe we should consider going back. With the help of AI (I know…) it seems feasible to keep up with the ever-growing content.
You can’t really go back. Those old engines worked on more naive algorithms against a significantly smaller pool of websites.
The more modern iteration of AltaVista/AOL/Yahoo has been the aggregation sites like Reddit, where people still post and interact with the site to establish relevancy. Even that’s been enshittified, but it’s a far better source than some basic web crawler that just scans website text and metadata for the word “Horse” and returns a big listicle of results based on a hash weighted by number of link-backs.
That system was gamed decades ago and is almost trivial to undermine in the modern moment. Never mind how hard you’d need to work to recreate the original baseline hash tables that these old engines built up over their own decades of operation.
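For anyone who hasn’t seen how thin that old approach is, here’s a toy sketch of keyword matching ranked by link-back count. The pages, URLs, and function names are all invented for the example, and it’s a deliberately naive stand-in, not how any particular engine actually worked.

```python
from collections import defaultdict

# Hypothetical toy corpus: URL -> (page text, outbound links). All made up.
PAGES = {
    "a.example/horses": ("horse care and horse breeds", ["b.example/farm"]),
    "b.example/farm": ("farm animals including the horse", []),
    "c.example/spam": ("horse horse horse buy now", ["c.example/spam2"]),
    "c.example/spam2": ("horse pills", ["c.example/spam"]),
    "d.example/cars": ("used cars for sale", ["b.example/farm"]),
}


def build_index(pages):
    """Build an inverted index (word -> URLs) and count inbound links per URL."""
    index = defaultdict(set)
    inbound = defaultdict(int)
    for url, (text, links) in pages.items():
        for word in text.lower().split():
            index[word].add(url)
        for target in links:
            inbound[target] += 1
    return index, inbound


def search(query, index, inbound):
    """Return URLs containing the query word, most linked-to first."""
    hits = index.get(query.lower(), set())
    return sorted(hits, key=lambda u: (-inbound[u], u))


index, inbound = build_index(PAGES)
print(search("Horse", index, inbound))
```

In this toy data the two spam pages just link to each other, and that alone is enough to push both of them above the genuine horse page, which nobody links to. That’s the whole gaming problem in miniature.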