• Boomer Humor Doomergod@lemmy.world
    link
    fedilink
    English
    arrow-up
    10
    arrow-down
    2
    ·
    edit-2
    18 hours ago

    If you scaled it based on the size of the integer you could get that up to 99.9% test accuracy. Like if it’s less than 10 give it 50% odds of returning false, if it’s under 50 give it 10% odds, otherwise return false.

    • Kairos@lemmy.today
      link
      fedilink
      arrow-up
      3
      ·
      9 hours ago

      That would make it less accurate. It’s much more likely to return true on not a prime than a prime

      • themusicman@lemmy.world
        link
        fedilink
        arrow-up
        1
        ·
        3 hours ago

        Correct. Not are why people are upvoting. If 10% of numbers are prime in a range, and you always guess false, you get 90% right. If you randomly guess true 10% of the time, you get ~80% right.

    • JackbyDev@programming.dev
      link
      fedilink
      English
      arrow-up
      2
      arrow-down
      1
      ·
      10 hours ago

      Makes me wonder where the actual break even would be. Like how long dies making one random number take versus sins lookups. Fuck it, do it in parallel. Fastest wins.

  • Flipper@feddit.org
    link
    fedilink
    arrow-up
    99
    ·
    1 day ago

    Has the same vibes as anthropic creating a C compiler which passes 99% of compiler tests.

    That las percent is really important.

    • PlexSheep@infosec.pub
      link
      fedilink
      arrow-up
      20
      ·
      1 day ago

      If this wasn’t 100% vibe coded, it would be pretty cool.

      A c compiler written in rust, with a lot of basics supported, an automated test suite that compiles well known c projects. Sounds like a fun project or academic work.

    • the rizzler@lemmygrad.ml
      link
      fedilink
      arrow-up
      7
      ·
      1 day ago

      any llm must have several C compilers in its training data, so it would be a reasonably competent almost-clone of gcc/clang/msvc anyway, right?

      is what i would have said if you didn’t put that last part

      • Cethin@lemmy.zip
        link
        fedilink
        English
        arrow-up
        2
        ·
        edit-2
        10 hours ago

        You’re still correct. The thing about LLMs is that they’re statistical models that output one of the most likely responses, from the list of most likely responses. It still has some randomness though. You can tune this, but no randomness is shit, and too much randomness sometimes generates stupid garbage. With a large enough output, you’re statistically likely, with any randomness, to generate some garbage. A compiler is sufficiently large and complex that it’s going to end up creating garbage somewhere, even if it’s trained on these compilers specifically.

        • the rizzler@lemmygrad.ml
          link
          fedilink
          arrow-up
          2
          ·
          9 hours ago

          that’s a great point but wouldn’t the output for a solved problem like “make a working C compiler in rust” work better if the temperature/randomness were zero? or am i fundamentally misunderstanding?

          • Cethin@lemmy.zip
            link
            fedilink
            English
            arrow-up
            2
            ·
            8 hours ago

            Probably. At that point you might as well just copy/paste the existing compiler though. The temperature is basically the thing that makes it seem intelligent, because it gives different responses each time, so it seems like it’s thinking. But yeah, having it just always give the most likely response would probably be better, but also probably wouldn’t play well with copyright laws when you have the exact same code as an existing compiler.

      • CarrotsHaveEars@lemmy.ml
        link
        fedilink
        arrow-up
        2
        ·
        11 hours ago

        This is so fucked up. The AI company has the perfect answer and yet it rolls the die to recreate the same thing by chance. What are they expecting, really?

        • the rizzler@lemmygrad.ml
          link
          fedilink
          arrow-up
          1
          ·
          edit-2
          11 hours ago

          they don’t care. they expect to be able to say “our AI agent made a C compiler that passed 99% of our tests”

  • BradleyUffner@lemmy.world
    link
    fedilink
    English
    arrow-up
    34
    arrow-down
    2
    ·
    edit-2
    1 day ago

    My favorite part of this is that they test it up to 99999 and we see that it fails for 99991, so that means somewhere in the test they actually implemented a properly working function.

    • frank@sopuli.xyz
      link
      fedilink
      arrow-up
      24
      arrow-down
      4
      ·
      1 day ago

      No, it’s always guessing false and 99991 is prime so it isn’t right. This isn’t the output of the program but the output of the program compared with a better (but probably not faster) isprime program

      • BradleyUffner@lemmy.world
        link
        fedilink
        English
        arrow-up
        28
        ·
        1 day ago

        Yes, that’s what I said. They wrote another test program, with a correct implementation of IsPrime in order to test to make sure the pictured one produced the expected output.

          • AItoothbrush@lemmy.zip
            link
            fedilink
            English
            arrow-up
            18
            ·
            1 day ago

            I mean people underestimate how usefull lookup tables are. A lookup table of primes for example is basically always just better except the one case where you are searching for primes which is more maths than computer programming anyways. The modern way is to abstract and reimplement everything when there are much cheaper and easier ways of doing it.

            • ozymandias@sh.itjust.works
              link
              fedilink
              arrow-up
              4
              arrow-down
              1
              ·
              22 hours ago

              more maths than computer programming anyways

              Computer programming is a subset of maths and was invented by a mathematition, originally to solve a maths problem…

              • AItoothbrush@lemmy.zip
                link
                fedilink
                English
                arrow-up
                4
                arrow-down
                1
                ·
                21 hours ago

                Yeah but they slowly develop to be their own fields. You wouldnt argue that physics is math either. Or that chemistry could technically be called a very far branch of philosophy. Computer programing, physics, etc are the applied versions of math. You are no longer studying math, you are studying something else with the help of math. Not that it matters much, just makes distinguising between them easier. You can draw the line anywhere but people do generally have a somewhat shared idea of where that lies.

          • draco_aeneus@mander.xyz
            link
            fedilink
            English
            arrow-up
            3
            ·
            22 hours ago

            For prime numbers, since they’re quite difficult to calculate and there’s not that many of them, that’s what’s most common.

    • anton@lemmy.blahaj.zone
      link
      fedilink
      arrow-up
      8
      ·
      23 hours ago

      That’s a legitimate thing to do if you have a slow implementation that’s easy to verify and a fast implementation that isn’t.

  • zbyte64@awful.systems
    link
    fedilink
    arrow-up
    6
    ·
    20 hours ago

    Pssh, mine uses a random number generator for odd numbers to return true 4% of the time to achieve higher accuracy and a bettor LLM metaphor

  • fckreddit@lemmy.ml
    link
    fedilink
    arrow-up
    33
    arrow-down
    1
    ·
    edit-2
    1 day ago

    LLMs belong to the same category. Seemingly right, but not really right.

  • JustARegularNerd@lemmy.dbzer0.com
    link
    fedilink
    arrow-up
    17
    arrow-down
    5
    ·
    1 day ago

    I’m struggling to follow the code here. I’m guessing it’s C++ (which I’m very unfamiliar with)

    bool is_prime(int x) {
        return false;
    }
    

    Wouldn’t this just always return false regardless of x (which I presume is half the joke)? Why is it that when it’s tested up to 99999, it has a roughly 95% success rate then?

    • kraftpudding@lemmy.world
      link
      fedilink
      arrow-up
      32
      ·
      1 day ago

      I suppose because about 5% of numbers are actually prime numbers, so false is not the output an algorithm checking for prime numbers should return

      • JustARegularNerd@lemmy.dbzer0.com
        link
        fedilink
        arrow-up
        10
        ·
        1 day ago

        Oh I’m with you, the tests are precalculated and expect a true to return on something like 99991, this function as expected returns false, which throws the test into a fail.

        Thank you for that explanation

    • flamingo_pinyata@sopuli.xyz
      link
      fedilink
      arrow-up
      29
      ·
      1 day ago

      That’s the joke. Stochastic means probabilistic. And this “algorithm” gives the correct answer for the vast majority of inputs

  • Kekzkrieger@feddit.org
    link
    fedilink
    English
    arrow-up
    7
    arrow-down
    1
    ·
    1 day ago

    If you think this is bad and not nearly enough accuracy to be called correct, AI is much worse than this.

    It’s not just wrong a lot of times or hallucinates but you can’t pinpoint why or how it produces the result and if you keep putting the same data in, the output may still vary.

  • CanadaPlus@lemmy.sdf.org
    link
    fedilink
    arrow-up
    2
    arrow-down
    6
    ·
    edit-2
    20 hours ago

    I mean, an application could exist where this isn’t even wrong. Maybe as a “subroutine” of another algorithm that only needs a truly composite number most of the time to work.

    That this reads as a joke says a lot about what application we’re intuitively expecting.

    Edit: Not sure why this is being downvoted.

    • ulterno@programming.dev
      link
      fedilink
      English
      arrow-up
      2
      ·
      edit-2
      19 hours ago

      Edit: Not sure why this is being downvoted.

      Perhaps because it would do better, being replaced with noop.

      A link time optimiser might actually do so.

      • CanadaPlus@lemmy.sdf.org
        link
        fedilink
        arrow-up
        2
        ·
        edit-2
        17 hours ago

        That’s kind of what I meant by putting “subroutine” in quotes. You obviously wouldn’t write it like this, you’d just use a random large number with a bit of explanation.

        Oh well, live and learn. I’ll try to be clearer next time.

        • ulterno@programming.dev
          link
          fedilink
          English
          arrow-up
          1
          ·
          edit-2
          6 hours ago

          I just took the above comment as continuing the joke, but this explanation actually confuses me more.
          Well, I don’t really think you need to be very clear in this case. Jokes are more fun when you “get” them rather than have them explained.