A screenshot of this question was making the rounds last week, but this article covers testing it against all the well-known models out there.

It also includes outtakes from the ‘reasoning’ models.

  • TrackinDaKraken@lemmy.world · 3 hours ago · 24 up, 3 down

    I think it’s worse when they get it right only some of the time. It’s not a matter of opinion; they should not change their “mind”.

    The fucking things are useless for that reason: they’re all just guessing, literally.

    • Tetragrade@leminal.space · 2 hours ago (edited) · 1 up, 2 down

      Same takeaway as the article (everyone read the article, right?).

      Applying it to yourself, can you recall instances when you were asked the same question at different points in time? How did you respond?

    • HugeNerd@lemmy.ca · 2 hours ago · 2 up, 5 down

      “they’re all just guessing, literally”

      They’re literally not.

      • m0darn@lemmy.ca · 2 hours ago · 10 up

        Isn’t it a probabilistic extrapolation? Isn’t that what a guess is?