• Technus@lemmy.zip
    3 days ago

    The size of the context window is fixed in the structure of the model. LLMs are still at their core artificial neural networks, so an analogy to biology might be helpful.

    Think of the input layer of the model like the retinas in your eyes. Each token in the context window is first embedded (i.e. converted to a series of numbers, because of course it's all just math under the hood) and then fed to a certain set of input neurons. That's just like the rods and cones in your retina, which capture light and convert it to electrical signals. Those signals are passed to neurons in your optic nerve, which connect to neurons in your visual cortex, with each layer along the way processing and analyzing the signal.

    The number of tokens in the context window is directly proportional to the number of neurons in the input layer of the model. To make the context window bigger, you have to add more neurons to the input layer, but that quickly results in diminishing returns without adding more neurons to the inner layers to be able to process the extra information. Ultimately, you have to make the whole model larger, which means more parameters, which means more data to store and more processing power per prompt.

    • scarabic@lemmy.world
      3 days ago

      Oh… so it’s kind of like taking something that’s few-to-many and making it many-to-many, and the number of connections is what costs you.
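
To put rough numbers on the "number of connections" point: in a standard transformer, self-attention scores every token against every other token, so each attention head in each layer builds a context_len x context_len grid of scores. Here is a minimal sketch of that count. The function name and the default layer and head counts are made up for illustration and don't describe any particular model.

```python
def attention_score_entries(context_len: int, n_layers: int = 32, n_heads: int = 32) -> int:
    """Count token-to-token attention scores for one forward pass.

    Assumes plain (dense) self-attention: every head in every layer
    scores each token against each token in the context.
    n_layers and n_heads are illustrative defaults, not a real model's.
    """
    pairs_per_head = context_len * context_len  # many-to-many: n * n pairs
    return pairs_per_head * n_layers * n_heads


if __name__ == "__main__":
    base = attention_score_entries(1_000)
    for n in (1_000, 2_000, 4_000):
        entries = attention_score_entries(n)
        # Print the context length, the raw count, and growth relative to 1,000 tokens.
        print(n, entries, entries / base)
```

Under these assumptions, doubling the context length quadruples the number of pairwise scores, which is one reason long contexts get expensive quickly.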