A screenshot of this question was making the rounds last week, but this article covers testing it against all the well-known models out there.
Also includes outtakes on the ‘reasoning’ models.
I agree that it should be able to infer the intent, but I stand by my view that it remains somewhat unclear and open to interpretation. E.g., if such language were used in a legal contract, it would not be enough to simply say, well, they should understand what I meant.
The people doing this test, I’m sure, are not linguistic masters.
There are lines of work where clarity is essential.