• ikidd@lemmy.world · +1/-4 · 1 month ago

    I believe this about as much as I believed the “We’re about to experience the AI singularity” morons.

  • rational_lib@lemmy.world · +14/-3 · 1 month ago

    As I use copilot to write software, I have a hard time seeing how it’ll get better than it already is. The fundamental problem of all machine learning is that the training data has to be good enough to solve the problem. So the problems I run into make sense, like:

    1. Copilot can’t read my mind and figure out what I’m trying to do.
    2. I’m working on an uncommon problem where the typical solutions don’t work
    3. Copilot is unable to tell when it doesn’t “know” the answer, because of course it’s just simulating communication and doesn’t really know anything.

    Problems 2 and 3 could be alleviated, though probably not solved completely, with more and better data or with engineering changes - but AI developers obviously started with the most useful training data and the strategies they thought would work best. Problem 1 seems fundamentally unsolvable.

    I think there could be some more advances in finding more and better use cases, but I’m a pessimist when it comes to any serious advances in the underlying technology.

    • raspberriesareyummy@lemmy.world · +6/-5 · 1 month ago

      So you use other people’s open source code without crediting the authors or respecting their license conditions? Good for you, parasite.

      • constantturtleaction@lemmy.world · +1 · 1 month ago

        Ahh right, so when I use copilot to autocomplete the creation of more tests in exactly the same style as the tests I manually created with my own conscious thought, you're saying that it's really just copying what someone else wrote? If you really believe that, then you clearly don't understand how LLMs work.

        • raspberriesareyummy@lemmy.world · +1/-3 · 1 month ago

          I know LLM mechanisms better than you, it would appear, and my point is not so weak that I would need to fabricate a strawman, claim it is what you said, and then argue against that instead.

          Using LLMs trained on other people’s source code is parasitic behaviour and violates copyrights and licenses.

          • constantturtleaction@lemmy.world · +1 · 1 month ago

            Look, I recognize that it’s possible for LLMs to produce code that is literally someone else’s copyrighted code. However, the way I use copilot is almost exclusively to autocomplete my thoughts. Like, I write enough code until it guesses what I was about to write next. If that happens to be open source code that someone else has written, then it is complete coincidence that I thought of writing that code. Not all thoughts are original.

            Further, holding me at fault for LLM vendors who may be breaking copyright law is like holding me at fault for murder because I drive a car while car manufacturers lobby in ways that get more people killed.

            • raspberriesareyummy@lemmy.world · +2 · 1 month ago

              Not all thoughts are original.

              Agreed, and I am also 100% opposed to software patents. No matter what I wrote, if someone came up with the same idea on their own and found out about my implementation later, I absolutely would not expect them to credit me. In the use case you describe, I do not see a problem of using other people's work in a license-breaking way. I do, however, see a waste of time - you have to triple-check everything an LLM spits out - and of energy (ref: MS trying to buy / restart a nuclear reactor to power their LLM hardware).

              Further, holding me at fault for LLM vendors who may be breaking copyright law is like holding me at fault for murder because I drive a car while car manufacturers lobby in ways that get more people killed.

              If you drive a car on “autopilot” and get someone killed, you are absolutely at fault for murder. Not in the legal sense, because fuck capitalism, but absolutely in the moral sense. Also, there’s legal precedent in a different example: https://www.findlaw.com/legalblogs/criminal-defense/can-you-get-arrested-for-buying-stolen-goods/

              If you unknowingly buy stolen (fenced) goods and it is found out, you will have to return them to the rightful owner without getting your money back - money you would then have to try to recover from the vendor.

              In the case of license agreements, you would still be a participant in a license violation - and if you consider a piece of code that would be well recognizable, just think about the following thought experiment:

              Assume someone trained the LLM on some source code Disney uses for whatever. Your code gets autocompleted with that and you publish it, and Disney finds out about it. Do you honestly think that the evil motherfuckers at Disney would stop at anything short of having your head served on a silver platter?

      • rational_lib@lemmy.world · +7/-1 · 1 month ago

        Very frequently, yes. As well as closed source code and intellectual property of all kinds. Anyone who tells you otherwise is a liar.

        • raspberriesareyummy@lemmy.world · +1/-3 · 1 month ago

          Ah, I guess I’ll have to question why I am lying to myself then. Don’t be a douchebag. Don’t use open source without respecting copyrights & licenses. The authors are already providing their work for free. Don’t shit on that legacy.

      • drake@lemmy.sdf.org · +1 · 1 month ago

        I completely understand where you're coming from, and I absolutely agree with you: genAI is copyright infringement on a weapons-grade scale. That said, I don't know if calling people parasites like this will really convince anyone or change anything. I don't want to tone police you - if you want to tell people to get fucked, go ahead - but I think being a bit more sympathetic to your fellow programmers, and actually trying to help them see things from our perspective, might actually change some minds. Just something to think about. I don't have all the answers, feel free to ignore me. Much love!

        • raspberriesareyummy@lemmy.world · +2 · 1 month ago

          You are right. My apologies, and my congratulations for finding the correct "tone" to respond to me ;) The thing is, I am absolutely fed up with the bullshit of snake-oil vendors selling LLMs as "AI", and even more fed up with corporations on a large scale getting away with - since it's for profit - what I guess must already be called theft of intellectual property.

          When people then use said LLMs to "develop software", I'm kind of convinced they are about as far gone mentally as the MAGA cult, and sometimes I just want to vent. However, I chose the word parasite for a reason, because it is a parasitic way of working: they use the work of other people - which, for more specific algorithms, an LLM will reproduce more or less verbatim - and harm those people by basically copy-pasting such code while omitting the license statement, thereby releasing such code (if open source) into the "wild" under an illegally(*) modified license.

          (*) Illegal, of course, only in countries whose legal systems respect copyright and license texts in the first place.

          Considering, on top of that, the damage done to the environment by the insane energy consumption for little to no gain, people should not be using LLMs at all - not even outside coding. This is just another way of contributing to missing our climate goals by a wide margin. Wasting energy like this - basically because people are too lazy to think for themselves - actually gets people killed through extreme weather events.

          So yeah, you have a valid point, but also, I am fed up with the egocentric bullshit world that social media has created and that has culminated in what will soon be a totalitarian regime in the country that once brought peace to Europe by defeating the Nazis and doing a PROPER reeducation of the people. Hooray for going off on a tangent…

    • ggppjj@lemmy.world · +13 · 1 month ago · edited

      Not copilot, but I run into a fourth problem:
      4. The LLM gets hung up on insisting that a newer feature of the language I’m using is wrong and keeps focusing on “fixing” it, even though it has access to the newest correct specifications where the feature is explicitly defined and explained.
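      To make that fourth problem concrete, here's a hypothetical stand-in in Python (not necessarily the language I was actually using): a newer-but-perfectly-valid construct of the kind the model keeps insisting is wrong and tries to rewrite into an older idiom.

```python
# Valid Python 3.10+ structural pattern matching -- the sort of newer feature
# an assistant may flag as a syntax error and try to "fix" into an if/elif
# chain, even when it has access to the current language specification.
def describe(point):
    match point:
        case (0, 0):
            return "origin"
        case (x, 0):
            return f"on the x-axis at {x}"
        case (0, y):
            return f"on the y-axis at {y}"
        case (x, y):
            return f"at ({x}, {y})"
        case _:
            return "not a 2D point"

print(describe((0, 3)))  # -> "on the y-axis at 3"
```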

      • rumba@lemmy.zip · +5 · 1 month ago

        Oh god yes, ran into this asking for a shell.nix file with a handful of tricky dependencies. It kept trying to do this insanely complicated temporary pull and build from git instead of just a 6 line file asking for the right packages.

        • ggppjj@lemmy.world · +6 · 1 month ago

          “This code is giving me a return value of X instead of Y”

          “Ah the reason you’re having trouble is because you initialized this list with brackets instead of new().”

          “How would a syntax error give me an incorrect return”

          “You’re right, thanks for correcting me!”

          “Ok so like… The problem though.”

          • rumba@lemmy.zip · +2 · 1 month ago

            Yeah, once you have to question its answer, it's all over. It got stuck and gave you the next-best answer in its weights, which was absolutely wrong.

            You can always restart the convo, re-insert the code and say what’s wrong in a slightly different way and hope the random noise generator leads it down a better path :)

            I'm doing some stuff with translation now, and I'm finding you can restart the session, run the same prompt and get better or worse versions of a translation. After a few runs, you can take all the output and ask it to rank each translation on correctness and critique them. I'm still not completely happy with the output, but it does seem that, if you MUST get AI to answer the question, there can be value in making it answer across more than one session.
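            Roughly what I mean, as a sketch (this assumes the OpenAI Python client; the model name and prompt are placeholders, and any chat API with per-call sessions would work the same way): each candidate translation comes from a fresh session, and a final fresh session is asked to rank and critique them.

```python
# Sketch: generate several independent translations (a fresh "session" per
# call, i.e. no shared message history), then ask the model to rank them.
# Assumes the OpenAI Python client; model and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
PROMPT = "Translate the following text into English:\n<text goes here>"

def one_shot(prompt: str) -> str:
    # A brand-new messages list each time means no conversation state carries over.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

candidates = [one_shot(PROMPT) for _ in range(4)]

ranking_prompt = (
    "Rank the following translations by correctness and briefly critique each:\n\n"
    + "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(candidates))
)
print(one_shot(ranking_prompt))
```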

      • obbeel@lemmy.eco.br · +3 · 1 month ago

        I’ve also run into this when trying to program in Rust. It just says that the newest features don’t exist and keeps rolling back to an unsupported library.

    • OsrsNeedsF2P@lemmy.ml · +1 · 1 month ago

      1. Copilot can’t read my mind and figure out what I’m trying to do.

      Try writing comments
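      A short, specific comment plus a signature gives it far more to work with than a blank line. A hypothetical Python example (the function and format are made up for illustration):

```python
# Parse "key=value;key2=value2" config strings into a dict, ignoring empty
# segments and stripping whitespace around keys and values.
def parse_config(line: str) -> dict[str, str]:
    # With the comment above and this signature in place, Copilot usually
    # completes a body close to this one; without them, it has to guess.
    result: dict[str, str] = {}
    for segment in line.split(";"):
        segment = segment.strip()
        if not segment:
            continue
        key, _, value = segment.partition("=")
        result[key.strip()] = value.strip()
    return result

print(parse_config("host = localhost; port = 8080"))  # {'host': 'localhost', 'port': '8080'}
```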

  • Buffalox@lemmy.world · +9/-3 · 1 month ago

    Seems to me the rationale is flawed. Even if it isn't strong or general AI, LLM-based AI has found a lot of uses. I also don't recognize, among people working with it, the claimed ignorance about the limitations of current AI models.

    • ohwhatfollyisman@lemmy.world · +11/-1 · 1 month ago

      While you may be right, one would think that the problem lies in the overestimated perception of the abilities of LLMs, leading to misplaced investor confidence – which in turn leads to a bubble ready to burst.

      • elgordino@fedia.io · +13 · 1 month ago

        Yup. Investors have convinced themselves that this time AI development is going to grow exponentially. The breathless fantasies they’ve concocted for themselves require it. They’re going to be disappointed.

    • taladar@sh.itjust.works · +2/-2 · 1 month ago

      Can you name some of those uses that you see lasting in the long term or even the medium term? Because while it has been used for a lot of things it seems to be pretty bad at the overwhelming majority of them.

      • Buffalox@lemmy.world · +2/-1 · 1 month ago · edited

        AI is already VERY successful in some areas. When you take a photo, it is processed with AI features to improve the image, and when editing photos on your phone, the more sophisticated options are powered by AI. Almost all new cars have AI features.
        These are practical everyday uses that you don't even have to think about when using them.
        But it's completely irrelevant whether I can see use cases that are sustainable or not. The fact is that major tech companies are investing billions in this.
        Of course all the biggest tech companies could be wrong, but I bet they researched the issue more than I did before investing.
        Show me by what logic you believe you know better.

        The claim that it needs to be strong AI to be useful is ridiculous.

        • taladar@sh.itjust.works · +1/-1 · 1 month ago

          The fact is that major tech companies are investing billions in this.

          They have literally invested billions in every single hype cycle of the last few decades that turned out to be a pile of crap in hindsight. This is a bad argument.

          • Buffalox@lemmy.world · +1/-1 · 1 month ago · edited

            And which ones are those? AFAIK there is no other technology that all the major tech companies have invested in the way they have in AI.
            Maybe the dot-com wave way back, but are you arguing the Internet came to nothing?

  • Optional@lemmy.world · +32 · 1 month ago

    “The economics are likely to be grim,” Marcus wrote on his Substack. “Sky high valuation of companies like OpenAI and Microsoft are largely based on the notion that LLMs will, with continued scaling, become artificial general intelligence.”

    “As I have always warned,” he added, “that’s just a fantasy.”

    • Pennomi@lemmy.world · +3/-2 · 1 month ago · edited

      Even Zuckerberg admits that trying to scale LLMs larger doesn’t work because the energy and compute requirements go up exponentially. There must exist a different architecture that is more efficient, since the meat computers in our skulls are hella efficient in comparison.

      Once we figure that architecture out though, it’s very likely we will be able to surpass biological efficiency like we have in many industries.

      • RogueBanana@lemmy.zip · +7 · 1 month ago

        That's a bad analogy. We weren't able to surpass biological efficiency in the industrial sector because we figured out human anatomy and how to improve it; we simply found alternative ways to produce force, like electricity and motors, which have absolutely no relation to how muscles work.

        I imagine it would be the same for computers: simply another, better method of achieving something, but it's so uncertain that it's barely worth discussing.

        • Pennomi@lemmy.world · +4 · 1 month ago

          Of course! It’s not like animals have jet engines!

          Human brains are merely the proof that such energy efficiencies are possible for intelligence. It’s likely we can match or go far beyond that, probably not by emulating biology directly. (Though we certainly may use it as inspiration while we figure out the underlying principles.)

  • CerealKiller01@lemmy.world · +34/-3 · 1 month ago

    Huh?

    Smartphone improvements hit a rubber wall a few years ago (disregarding folding screens, which make up a small market share, the rate of improvement slowed down drastically), and the industry is doing fine. It's not growing like it used to, but that just means people are keeping their smartphones for longer, not that people have stopped using them.

    Even if AI were to completely freeze right now, people will continue using it.

    Why are people reacting like AI is going to get dropped?

    • drake@lemmy.sdf.org · +4 · 1 month ago

      It’s absurdly unprofitable. OpenAI has billions of dollars in debt. It absolutely burns through energy and requires a lot of expensive hardware. People aren’t willing to pay enough to make it break even, let alone profit

      • sugar_in_your_tea@sh.itjust.works · +2 · 1 month ago

        Eh, if the investment dollars start drying up, they’ll likely start optimizing what they have to get more value for fewer resources. There is value in AI, I just don’t think it’s as high as they claim.

    • Ultraviolet@lemmy.world · +8/-5 · 1 month ago

      Because novelty is all it has. As soon as it stops improving in a way that makes people say “oh that’s neat”, it has to stand on the practical merits of its capabilities, which is, well, not much.

      • theherk@lemmy.world · +10/-3 · 1 month ago

        I’m so baffled by this take. “Create a terraform module that implements two S3 buckets with cross-region bidirectional replication. Include standard module files like linting rules and enable precommit.” Could I write that? Yes. But does this provide an outstanding stub to start from? Also yes.

        And beyond programming, it is having a positive impact on science and medicine too. I mean, anybody who doesn't see any merit has their head in the sand. That of course must be balanced with not falling for the hype, but the merits are very real.

        • Eccitaze@yiffit.net · +5 · 1 month ago

          There’s a pretty big difference between chatGPT and the science/medicine AIs.

          And keep in mind that for LLMs and other chatbots, it’s not that they aren’t useful at all but that they aren’t useful enough to justify their costs. Microsoft is struggling to get significant uptake for Copilot addons in Microsoft 365, and this is when AI companies are still in their “sell below cost and light VC money on fire to survive long enough to gain market share” phase. What happens when the VC money dries up and AI companies have to double their prices (or more) in order to make enough revenue to cover their costs?

          • theherk@lemmy.world · +2 · 1 month ago

            Nothing to argue with there. I agree. Many companies will go out of business. Fortunately we'll still have the llama3s and mistrals lying around that I can run locally. On the other hand, cost justification is a difficult equation with many variables, so maybe it is, or will be, worth the cost in some cases. I'm just saying there is some merit.

          • obbeel@lemmy.eco.br · +1 · 1 month ago

            I understand that it makes less sense to spend on model size if it isn't giving back performance, but why would so much money be spent on larger LLMs then?

        • lightstream@lemmy.ml · +3/-2 · 1 month ago

          The merits are real. I do understand the deep mistrust people have for tech companies, but there’s far too much throwing out of the baby with the bath water.

          As a solo developer, LLMs are a game-changer. They’ve allowed me to make amazing progress on some of my own projects that I’ve been stuck on for ages.

          But it's not just technical subjects that benefit from LLMs. ChatGPT has been a great travel guide for me. I uploaded a pic of some architecture in Berlin and it went into the history of it; I asked it about some damage to an old church in Spain - it turned out to be from the Spanish Civil War, where revolutionaries had been mowed down by Franco's firing squads.

          Just today, I was getting help from an LLM for an email to a Portuguese removals company. I sent my message in English with a Portuguese translation, but the guy just replied back with a single sentence in broken English:

          “Yes a can , need tho mow m3 you need delivery after e gif the price”

          The first bit is pretty obviously “Yes I can” but I couldn’t really be sure what he was trying to say with the rest of it. So I asked ChatGPT who responded:

          It seems he’s saying he can handle the delivery but needs to know the total volume (in cubic meters) of your items before he can provide a price. Here’s how I’d interpret it:

          “Yes, I can [do the delivery]. I need to know the [volume] in m³ for delivery, and then I’ll give you the price.”

          Thanks to LLMs, I’m able to accomplish so many things that would have previously taken multiple internet searches and way more effort.

    • ClamDrinker@lemmy.world · +2 · 1 month ago · edited

      People differentiate AI (the technology) from AI (the product being peddled by big corporations) without making that nuance clear (or they mean just LLMs, or they aren't even aware the technology has grassroots adoption outside of those big corporations). It will take time, and the bubble bursting might very well be a good thing for the technology into the future. If something is only known for its capitalistic exploits, it will continue to be seen unfavorably even when it has proven its value to those who care to look at it with an open mind. I read it mostly as those people rejoicing over the big corporations getting shafted for their greedy practices.

      • sugar_in_your_tea@sh.itjust.works · +3 · 1 month ago

        the bubble bursting might very well be a good thing for the technology into the future

        I absolutely agree. It worked wonders for the Internet (dotcom boom in the 90s), and I imagine we'll see the same w/ AI sometime in the next 10 years or so. I do believe we're seeing a bubble here, and we're also seeing a significant shift in how we interact w/ technology, but it's neither as massive nor as useless as proponents and opponents claim.

        I’m excited for the future, but not as excited for the transition period.

        • ArchRecord@lemm.ee · +2 · 1 month ago

          I’m excited for the future, but not as excited for the transition period.

          I have similar feelings.

          I discovered LLMs before the hype ever began (I used GPT-2 well before ChatGPT even existed), and the same with image generation models, just barely before the hype really took off. (I was an early closed beta tester of DALL-E.)

          And as my initial fascination grew, along with the interest of my peers, the hype began to take off, and suddenly, instead of being an interesting technology with some novel use cases, it became yet another technology for companies to show to investors (after slapping it in a product in a way no user would ever enjoy) to increase stock prices.

          Just as you mentioned with the dotcom bubble, I think this will definitely do a lot of good. LLMs have been great for asking specialized questions about things where I need a better explanation, or rewording/reformatting my notes, but I’ve never once felt the need to have my email client generate every email for me, as Google seems to think I’d want.

          If we can just get all the over-hyped corporate garbage out, and replace it with more common-sense development, maybe we’ll actually see it being used in a way that’s beneficial for us.

          • sugar_in_your_tea@sh.itjust.works · +3 · 1 month ago · edited

            I initially started with natural language processing (small language models?) in school, which is a much simpler form of text generation that operates on words instead of whatever they call the symbols in modern LLMs. So when modern LLMs came out, I basically registered that as, “oh, better version of NLP,” with all its associated limitations and issues, and that seems to be what it is.
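            (For what it's worth, the symbols are sub-word tokens. A quick sketch of the difference, using OpenAI's tiktoken library as one example tokenizer; the exact split below is illustrative, not guaranteed:)

```python
# Word-level symbols (old-school NLP) vs sub-word tokens (modern LLMs).
# Assumes `pip install tiktoken`; the encoding name is just one example.
import tiktoken

text = "unbelievably straightforward"

words = text.split()  # 2 word-level symbols
enc = tiktoken.get_encoding("cl100k_base")
token_ids = enc.encode(text)
tokens = [enc.decode([t]) for t in token_ids]  # several sub-word pieces

print(words)
print(tokens)
```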

            So yeah, I think it’s pretty neat, and I can certainly see some interesting use-cases, but it’s really not how I want to interface with computers. I like searching with keywords and I prefer the process of creation more than the product of creation, so image and text generation aren’t particularly interesting to me. I’ll certainly use them if I need to, but as a software engineer, I just find LLMs in all forms (so far) annoying to use. I don’t even like full text search in many cases and prefer regex searches, so I guess I’m old-school like that.

            I’ll eventually give in and adopt it into my workflow and I’ll probably do so before the average person does, but what I see and what the media hypes it up to be really don’t match up. I’m planning to set up a llama model if only because I have the spare hardware for it and it’s an interesting novelty.

    • finitebanjo@lemmy.world · +19 · 1 month ago · edited

      People are dumping billions of dollars into it, mostly into power, but it cannot turn a profit.

      So the companies that, for example, revived a nuclear power facility in order to feed their machine with ever-diminishing returns of quality output are going to shut everything down at massive losses, with countless hours of human work and lifespan thrown down the drain.

      This will have quite a large economic impact as many newly created jobs go up in smoke and businesses that structured themselves around the assumption of continued availability of high-end AI have to reorganize or go out of business.

      Search up the Dot Com Bubble.

    • werefreeatlast@lemmy.world · +1/-5 · 1 month ago

      AI vagina Fleshlight beds. You just find your sleep inside one and it will do you all night long! Telling you stories of any topic. Massaging you in every possible way. Playing your favorite music. It’s like a living room! Oh I’m sleeping in the living room again. Yeah I’m in the dog house. But that’s why you need an AI vagina Fleshlight bed!

        • werefreeatlast@lemmy.world · +4/-1 · 1 month ago

          I woke up at 4 this morning. The fridge made a big ice maker noise that sounded like a door getting slammed. Anyway here I am shit posting and reading shit posts.

  • Someplaceunknown@fedia.io · +236/-3 · 1 month ago

    “LLMs such as they are, will become a commodity; price wars will keep revenue low. Given the cost of chips, profits will be elusive,” Marcus predicts. “When everyone realizes this, the financial bubble may burst quickly.”

    Please let this happen

  • Etterra@lemmy.world · +24/-6 · 1 month ago

    Good. I look forward to all these idiots finally accepting that they drastically misunderstood what LLMs actually are and are not. I know their idiotic brains are only able to understand simple concepts like "line must go up" and follow them like religious tenets, though, so I'm sure they'll waste everyone's time and increase enshittification with some other new bullshit once they quietly remove their broken (and unprofitable) AI from stuff.

  • Zier@fedia.io · +4/-1 · 1 month ago

    It’s gonna crash like a self driving tesla. It’s gonna fall apart like a cybertrukkk.

  • randon31415@lemmy.world · +31/-1 · 1 month ago

    The hype should go the other way. Instead of bigger and bigger models that do more and more - have smaller models that are just as effective. Get them onto personal computers; get them onto phones; get them onto Arduino minis that cost $20 - and then have those models be as good as the big LLMs and Image gen programs.

    • Yaky@slrpnk.net · +23 · 1 month ago

      Other than with language models, this has already happened: take a look at apps such as Merlin Bird ID (identifies birds fairly well by sound and somewhat okay visually), WhoBird (identifies birds by sound), and Seek (visually identifies plants, fungi, insects, and animals). All of them work offline. IMO these are much better uses of ML than spammer-friendly text generation.

      • stringere@sh.itjust.works · +2 · 1 month ago

        Platnet and iNaturalist are pretty good for plant identification as well; I use them all the time to find out what's volunteering in my garden. Just looked them up, and it turns out Seek is by iNaturalist.

      • mm_maybe@sh.itjust.works · +3 · 1 month ago

        Those are all classification problems, which are a fundamentally different kind of problem with less open-ended solutions, so it's not surprising that they are easier to train and deploy.
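        To make the contrast concrete, here's a toy sketch of a closed-set classifier (scikit-learn on the iris dataset, purely illustrative): the model's entire output space is a fixed set of labels, and accuracy is straightforward to define and measure, unlike open-ended text generation.

```python
# Toy closed-set classification: the output space is exactly 3 known labels,
# so training and evaluation are far more constrained than open-ended generation.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(clf.score(X_test, y_test))   # accuracy against known ground-truth labels
print(clf.predict(X_test[:3]))     # predictions are always one of the known classes
```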

    • rumba@lemmy.zip · +10 · 1 month ago

      This has already started to happen. The new llama3.2 model is only 3.7GB and it's WAAAAY faster than anything else. It can throw a wall of text at you in just a couple of seconds. You're still not running it on $20 hardware, but you no longer need a 3090 to have something useful.
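      For example, a rough sketch of querying a small local model through Ollama's HTTP API (this assumes a local Ollama server on its default port with llama3.2 already pulled; any similar local runner works):

```python
# Minimal sketch: ask a small local model a question via Ollama's REST API.
# Assumes `ollama pull llama3.2` has been run and the server is listening
# on the default port 11434; no high-end GPU needed for a ~3B model.
import json
import urllib.request

payload = {
    "model": "llama3.2",
    "prompt": "Summarize why smaller local models are useful, in two sentences.",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```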

    • _NoName_@lemmy.ml · +4/-2 · 1 month ago

      That would be innovation, which I’m convinced no company can do anymore.

      It feels like every time I learn about one of our modern innovations, it turns out it was already thought up and written down in a book in the 1950s, and just wasn't possible at that time due to some limitation in memory, precision, or some other metric. All we did was five decades of marginal improvement to get to it, while not innovating much at all.

    • dustyData@lemmy.world · +8/-1 · 1 month ago · edited

      Well, you see, that's the really hard part of LLMs. Getting good results is a direct function of the size of the model: the bigger the model, the more effective it can be at its task. However, there's something called the compute-efficient frontier (technical but neatly explained video about it). Basically, for any given size, you can't make a model more effective at its computations beyond that boundary. The only ways to make a model better are to make it larger (what most megacorps have been doing) or to radically change the algorithms and methods underlying the model. But the latter has been proving extraordinarily hard, mostly because to understand what is going on inside the model you need to think in rather abstract and esoteric mathematical principles that bend your mind backwards.

      You can compress an already-trained model to run on smaller hardware, but to train one you still need humongously large datasets and power-hungry processing. This is compounded by the fact that larger and larger models are ever more expensive while providing rapidly diminishing returns. Oh, and we are quickly running out of quality usable data, so shoveling in more data after a certain point starts to provide worse results, unless you dedicate thousands of hours of human labor to producing, collecting and cleaning new data. That's all before you even have to address data poisoning, where previously LLM-generated data is fed back to train a model, and it is very hard to prevent the model from devolving into incoherence after a couple of generations.
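      For reference, this scaling behaviour is usually written in roughly the following form (as in neural scaling-law papers such as Chinchilla, with N the parameter count, D the number of training tokens, E the irreducible loss, and A, B, alpha, beta empirically fitted constants):

```latex
L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```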

      • mm_maybe@sh.itjust.works · +1 · 1 month ago

        This is learning completely the wrong lesson. It has been well known for a long time, and very well demonstrated, that smaller models trained on better-curated data can outperform larger ones trained using brute-force "scaling". This idea that "bigger is better" needs to die, quickly, or else we're headed towards not only an AI winter but an even worse climate catastrophe, as the energy requirements of AI inference on huge models obliterate progress on decarbonization overall.