- cross-posted to:
- pcgaming@lemmy.ca
A user asked on the official Lutris GitHub two weeks ago “is lutris slop now” and noted an increasing number of “LLM generated commits”. The Lutris creator replied:
It’s only slop if you don’t know what you’re doing and/or are using low quality tools. But I have over 30 years of programming experience and use the best tool currently available. It was tremendously helpful in helping me catch up with everything I wasn’t able to do last year because of health issues / depression.
There are massive issues with AI tech, but those are caused by our current capitalist culture, not the tools themselves. In many ways, it couldn’t have been implemented in a worse way, but it was not AI that bought all the RAM, it was OpenAI. It was not AI that stole copyrighted content, it was Facebook. It wasn’t AI that laid off thousands of employees, it’s deluded executives who don’t understand that this tool is an augmentation, not a replacement for humans.
I’m not a big fan of having to pay a monthly sub to Anthropic, I don’t like depending on cloud services. But a few months ago (and I was pretty much at my lowest back then, barely able to do anything), I realized that this stuff was starting to do a competent job and was very valuable. And at least I’m not paying Google, Facebook, OpenAI or some company that cooperates with the US army.
Anyway, I was suspecting that this “issue” might come up so I’ve removed the Claude co-authorship from the commits a few days ago. So good luck figuring out what’s generated and what is not. Whether or not I use Claude is not going to change society, this requires changes at a deeper level, and we all know that nothing is going to improve with the current US administration.


I tried fitting AI into my workflow just as an experiment and failed. It’ll frequently reference APIs that don’t even exist, or over-engineer the shit out of something that could be written in just a few lines of code. Often it would be a combo of the two.
You might genuinely be using it wrong.
At work we have a big push to use Claude, but as a tool and not a developer replacement. And it’s working pretty damn well when properly set up.
Mostly using Claude Sonnet 4.6 with Claude Code. It’s important to run /init and check the output; that produces a CLAUDE.md file describing your project, which always gets added to your context.
Important: Review everything the AI writes; this is not a hands-off process. For bigger changes, use the planning mode and split tasks up: the smaller the task, the better the output.
Claude Code automatically uses subagents to fetch information, e.g. API documentation. Nowadays it’s extremely rare that it hallucinates something that doesn’t exist. It might use outdated info and need a nudge, like after the recent upgrade to .NET 10 (but just adding that info to the project context file is enough).
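To make the context-file idea concrete, here’s a minimal sketch of what a CLAUDE.md might contain; the project details below are invented for illustration, and the real file is generated by /init from your actual codebase:

```markdown
# CLAUDE.md

## Project
Hypothetical .NET 10 web API (example stack, not a real project).

## Build & test
- Build: `dotnet build`
- Test: `dotnet test`

## Notes for the assistant
- Target framework is .NET 10; don't suggest APIs from older releases
  without checking they still exist.
- Keep changes small and reviewable; one task per change.
```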
Agreed, I don’t understand people not even giving it a chance. They try it for five minutes, it doesn’t do exactly what they want, they give up on it, and shout how shit it is.
Meanwhile, I put the work in, see it do amazing shit after figuring out the basics of how the tech works, write rules and skills for it, have it figure out complex problems, etc.
It’s like handing your 90-year-old grandpa the Internet, and they don’t know what the fuck to do with it. It’s so infuriating.
Probably because, like your 90-year-old grandpa with the Internet, you have to know how to use the search engine. You have to know how to communicate ideas to an LLM, in detail, with fucking context, not just “me needs problem solvey, go do fix thing!”
It’s not really that simple. Yes, it’s a great tool when it works, but in the end it boils down to being a text prediction machine.
So a nice helper to throw shit at, but I trust the output as much as a random Stackoverflow reply with no votes :)
I feel like there needs to be a dedicated post (and I don’t want to write it, but maybe I eventually will) that outlines what a model really is. It is not just a statistical text prediction machine unless you are being so loose with the definition of “statistical” that it doesn’t even mean anything anymore.
A decent example of a statistical text prediction machine is the middle word suggested by your phone when you’re using the keyboard. An LLM is not that.
In the most general terms, this kind of language model tokenizes a corpus of text based on a vocabulary (which is probably more than just the words in the dictionary), then uses an embedding model to translate those tokens into vectors of semantic “meaning” that minimize loss under a bidirectional encoding (probably). That model is then trained against a rubric for one or more topic-area questions, retrained for instruction-following and explainability, retrained with reinforcement learning from human feedback to provide guardrails, and retrained again to make use of supplemental materials not in the original training corpus (retrieval-augmented generation). It’s then distilled, then probably scaled and fine-tuned against topic areas of choice (like coding or Korean or whatever), and maybe THEN made available to people to use. There are generally more parts to curriculum learning even than that, but it’s a representative-ish start.
My point being that, yes, it would be nuts to pose ANY question to a predictor that says “with 84% probability, the word most likely to follow ‘I really like’ is ‘gooning’ on reddit”, but even Grok is wildly more sophisticated than that, and Grok is terrible.
Edit: And also I really like your take at the start of this thread: user error is a pretty huge problem in this space.
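For contrast, here is a toy sketch (in Python, over a made-up two-sentence corpus) of the kind of bigram lookup table that phone-keyboard suggestions are usually described as. It has no notion of context beyond the single previous word, which is exactly what an LLM is not:

```python
# Toy bigram "phone keyboard" predictor: pure lookup statistics,
# no context beyond the immediately preceding word.
from collections import Counter, defaultdict

corpus = "the capital of austria is vienna . the capital of france is paris ."

counts: defaultdict[str, Counter] = defaultdict(Counter)
words = corpus.split()
for prev, nxt in zip(words, words[1:]):
    counts[prev][nxt] += 1

def predict(prev_word: str) -> str:
    # Return the most frequent follower of prev_word in the corpus.
    followers = counts[prev_word]
    return followers.most_common(1)[0][0] if followers else "?"

print(predict("capital"))  # -> "of"
print(predict("is"))       # -> "vienna" (ties broken by first occurrence)
```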
The training is sophisticated, but inference really is a text prediction machine. Technically token prediction, but you get the idea.
It works that way for every single token/word. You input your system prompt, context, and user input, then the output starts:
“The”
Feed the entire context back in with the reply “The” appended.
“The capital”
Feed everything in again with “The capital”.
“The capital of”
Feed everything in again…
“The capital of Austria”
…
It literally works like that, which sounds crazy :)
The only control you as a user have is the sampling: temperature, top-k and so on. But that just softens and randomizes an otherwise deterministic pick.
Edit: I should add that tool and subagent use makes this approach a bit more powerful nowadays. But it all boils down to text prediction again. Even the tools are described to the model in plain text.
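As a rough sketch of that loop, assuming a hypothetical `logits_fn` standing in for the model’s forward pass (temperature and top-k shown as described above):

```python
# Minimal autoregressive decoding sketch with temperature and top-k
# sampling. logits_fn is a stand-in for a real model's forward pass.
import numpy as np

def sample_next(logits: np.ndarray, temperature: float = 0.8, top_k: int = 40) -> int:
    top = np.argsort(logits)[-top_k:]      # keep the k most likely tokens
    scaled = logits[top] / temperature     # temperature rescales confidence
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(np.random.choice(top, p=probs))

def generate(logits_fn, prompt_tokens: list[int], max_new: int, eos: int) -> list[int]:
    tokens = list(prompt_tokens)
    for _ in range(max_new):
        # The whole sequence goes back in on every step, exactly as
        # described above (real engines cache attention state, so the
        # recomputation is cheaper than it looks).
        tokens.append(sample_next(logits_fn(tokens)))
        if tokens[-1] == eos:
            break
    return tokens
```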
Unless that’s how people are designing front ends for models, it literally DOESN’T work like that. It works like that while you’re training an embedding model on masking-related tasks, but that’s the tip of the iceberg. The input, after being tokenized, is ingested wholesale. There’s sometimes funny business to manage the size of a context window effectively, but that isn’t it unless you’re home-rolling and caching your own inputs or something before you hand them to the model.
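For what it’s worth, the two descriptions can be reconciled. Here’s illustrative pseudocode (the `model.forward` signature and its cache are hypothetical) of how inference engines typically split the work into a single prefill pass over the whole prompt, followed by a token-by-token decode loop:

```python
# Illustrative split between "prefill" and "decode" in a typical
# inference engine. model.forward and its cache are hypothetical.

def generate(model, prompt_tokens: list[int], max_new: int) -> list[int]:
    # Prefill: the entire tokenized input is processed in one batched
    # forward pass -- it is not re-read word by word.
    logits, cache = model.forward(prompt_tokens, cache=None)
    out: list[int] = []
    for _ in range(max_new):
        next_tok = int(logits[-1].argmax())  # greedy pick, for simplicity
        out.append(next_tok)
        # Decode: only the new token is fed in; attention over earlier
        # tokens is reused from the cache instead of being recomputed.
        logits, cache = model.forward([next_tok], cache=cache)
    return out
```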
And we’re barely smarter than a bunch of monkeys throwing piles of shit at each other. Being reductive about its origins doesn’t really explain anything.
Yeah, but that’s why there are unit tests. Let it run its own tests and solve its own bugs. How many mistakes have you or I made because we hate writing unit tests? At least the LLM has no problem writing the tests, once you know the code works.
Most people on Lemmy probably haven’t given it a single minute let alone 5 minutes.
At a minimum, the agent should be compiling the code and running tests before handing things back to you. “It references non-existent APIs” isn’t a problem with a modern setup.
Yeah, I mean, it’s not like AI can think. It’s just a glorified text predictor, the same as you have on your phone keyboard.
It’s like having an idiot employee that works for free. Depending on how you manage them, that employee can either do work to benefit you or just get in your way.
Only it’s not free. If you run it in the cloud, it’s heavily subsidized and proactively destroying the planet, and if you run it at home, you’re still using a lot of increasingly unaffordable power, and if you want something smarter than the average American politician, the upfront investment is still very significant.
Yeah I’m not buying the “proactively destroying the planet” angle. I’d imagine there’s a lot of misinformation around AI, given that the products surrounding it are mostly Western, like vaccines…
Vaccines are misinformation? What.
Not even free, just cheaper than an actual employee for now. But greed is inevitable and AI is computationally expensive; it’s only a matter of time before these AI companies start cranking up the prices.
I had the same experience. I asked a local LLM about using only Qt Wayland stuff for keyboard input, and the only documentation was the official one (which wasn’t a lot for a noob), there were no examples of it being used online, and all my attempts at making it work had failed. It hallucinated some functions that didn’t exist, even when I let it do web search (NOT via my browser). This was a few years ago.
That’s 50 years in LLM terms. You might as well have been banging two rocks together.
The symptoms you describe are caused by bad prompting. If an AI is providing over-complicated solutions, 9 times out of 10 it’s because you didn’t constrain your problem enough. If it’s referencing tools that don’t exist, then either you haven’t specified which tools are acceptable or you haven’t provided the context required for it to find them. You may also be expecting too much out of AI. You can’t expect it to do everything for you. You still have to do almost all the thinking and engineering if you want a quality project; the AI is just there to write the code. Sure, you can use an AI to help you learn how to be a better engineer, but AIs typically don’t make good high-level decisions. Treat AI like an intern, not like a principal engineer.
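As an illustration of what “constrained” means here, a prompt might spell out scope, allowed tools, and a verification step (the file and class names below are invented):

```text
Refactor src/session.py so SessionStore stops leaking file handles.
Constraints:
- Use only the standard library; do not add dependencies.
- Keep the public API of SessionStore unchanged.
- Run `pytest tests/test_session.py` and make it pass before finishing.
```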
“it’s your fault that it just made up tools that don’t exist” is a bold statement, bro.
No, it’s not. It doesn’t have intention. It’s literally just a tool. If you don’t get the results you expect with a tool when other people do get those results, then the problem isn’t the tool.
If the tool can’t be consistent in its output, it’s not a reliable or worthwhile tool to use.
There is such a thing as a bad tool.
“It can’t be that stupid, you must be prompting it wrong.”
It’s not about stupid or smart. It’s a tool, not a person. If you don’t get the same results that other people get with the same tool, then what could possibly be the problem other than how the person is using the tool?