You cannot know this a priori. The commenter is clearly producing a stochastic average of the explanations that best advance their material interests.
For instance, many SoTA models are trained with reinforcement learning, so it’s plausible that the model has learned that spamming meaningless tokens can delay a negative reward (this isn’t even a particularly complex behavior). There’s no observable difference in the response; without probing the weights, we’re just yapping.
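To make the “delay negative reward” intuition concrete, here’s a toy sketch; the discount factor, the per-token rewards, and the sequence lengths are all made up for illustration, not taken from any real RLHF pipeline:

```python
# Toy illustration (all numbers hypothetical): with a discount factor
# gamma < 1, a negative reward delivered at step t contributes
# gamma**t * r to the return at step 0, so padding the response with
# filler tokens shrinks the penalty the policy "feels" early on.

def discounted_return(rewards, gamma=0.99):
    """Discounted return from step 0: sum of gamma**t * r_t."""
    return sum(gamma**t * r for t, r in enumerate(rewards))

# Hypothetical episodes: zero reward per token, -1 penalty at the end.
short  = [0.0] * 10  + [-1.0]   # concise answer, penalty arrives quickly
padded = [0.0] * 200 + [-1.0]   # filler tokens push the penalty far away

print(discounted_return(short))   # ~ -0.904
print(discounted_return(padded))  # ~ -0.134 (penalty heavily discounted)
```

The point is only that under discounting, pushing a penalty later in the sequence weakens its effect on early decisions; whether any deployed model actually exploits this is exactly the thing we can’t tell from the outside.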
I’m not sure I understand what you’re saying. By “the commenter” do you mean the human or the AI in the screenshot?
Also,
For instance, many SoTA models are trained with reinforcement learning, so it’s plausible that the model has learned that spamming meaningless tokens can delay a negative reward
What’s a “negative reward”? You mean a penalty? First of all, I don’t think this makes sense either way: if the model were producing garbage tokens, that would be obvious and would have been caught during training.
But even if it weren’t caught, and the model did in fact generate a bunch of garbage that didn’t print out in the Claude UI, and the “simulated progress” explanation was just the model confabulating a plausible story for those garbage tokens, that still wouldn’t make it sentient (or even close).