• Kay Ohtie@pawb.social
    4 days ago

    If AIGM (AI-generated music) were like VSTs or Vocaloids, that’d be one thing. But it’s more like imitation of sounds: it synthesizes chunks of songs instead of the instruments and voices themselves.

    The best way to think of it is as something that creates an audio file solely by using the Photoshop clone stamp tool across millions of source files.

      • Kay Ohtie@pawb.social
        3 days ago

        Sure, but we’re talking generative here, as is the article, and to pretend it’s referring to a tool that’s been standard in libraries and even VSTs for over a decade is either misunderstanding the article or being disingenuous on purpose.

        • Tja@programming.dev
          2 days ago

          No, I get it. It’s generative. GPT: Generative Pretrained Transformer. Music generators add a diffusion layer, but it’s fundamentally new music being generated, not copies of existing songs.

          My point is that it’s just another tool, one that automates the process even more. It’s not the same; it’s the next step.

          • Kay Ohtie@pawb.social
            20 hours ago

            A text prompt -> audio system is not a transformer in the sense people are talking about, and you either know it, just don’t care, or don’t wholly understand how these systems work under the hood.

            What I’m referring to are neural models that take input audio and effectively act as a filter implemented as a neural network. Voice mods, instrument adapters, virtual pedals, amp models… These are all actually transformative. There is actual music and effort going into these. And that is not what Bandcamp is after; those were already in heavy use like 15 years ago.

            The things that generate from text are transformers in the most technically correct sense, but not in the sense people mean when they talk about something being transformative.

            They have fundamentally different purposes and usages. With these tools, the vocals aren’t generated from nothing but the lyrics; someone actually sings them, and then a model transforms the sound to match an intended, pre-set trained target rather than generalizing.

            • Tja@programming.dev
              13 hours ago

              They are a transformer in the same sense ChatGPT is a transformer. And hence they do generate new content that shares characteristics and patterns with existing content. It’s no clone tool. The lyrics are new. They probably follow the grammar rules of a certain language, but it’s not copy-paste. Chords will probably be shared, but the melody is new. Etc.