• hemko@lemmy.dbzer0.com
    5 hours ago

    Okay, probably fair. I’ve only been working with LLMs that are extremely non-deterministic in their answers. You can ask the same question 17 times and the answers will have some variance.

    You can ask an LLM to create OpenTofu scripts for deploying infrastructure from the same architectural documents 17 times, and you’ll get 17 different answers. Even if some, most, or all of them get the core principles right and follow industry best practices in the details that were never specified (e.g. things we usually consider obvious, such as enforcing TLS 1.2), you still get large differences in the actual code generated.
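    To make the TLS 1.2 point concrete, here’s a rough sketch of the kind of setting I mean, using the azurerm provider’s `min_tls_version` attribute on a storage account (names and values here are just illustrative, not from any real deployment):

    ```hcl
    resource "azurerm_storage_account" "example" {
      name                     = "examplestorage"
      resource_group_name      = "example-rg"
      location                 = "westeurope"
      account_tier             = "Standard"
      account_replication_type = "LRS"

      # Set explicitly even though recent azurerm versions default
      # to TLS1_2 -- an LLM may or may not emit this line, and when
      # it's omitted you're relying on whatever the provider defaults to.
      min_tls_version = "TLS1_2"
    }
    ```

    One generated script might include that line, another might not, and a third might pin a different value entirely, which is exactly the variance I’m talking about.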

    As long as we can’t trust the output to be deterministic, we can’t truly trust that what the LLM produces is actually what we asked for, so human verification is still required.

    If we write IaC for OpenTofu or whatnot ourselves, we can somewhat trust that what we specify is what we will receive, but with the ambiguity of AI we currently can’t be sure whether it is filling in gaps we didn’t know about. With a known provider, say azurerm, we can always look up the defaults for anything we did not specify.
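    That’s the difference in practice: with hand-written IaC, everything left out of a block is a documented provider default, so a minimal resource like this (hypothetical names, just a sketch) is still fully predictable:

    ```hcl
    resource "azurerm_storage_account" "minimal" {
      # Only the required attributes are set; everything else is
      # omitted on purpose. The azurerm provider docs state the
      # default for each omitted attribute (e.g. min_tls_version),
      # so we know exactly what the unspecified gaps resolve to.
      name                     = "minimalstorage"
      resource_group_name      = "example-rg"
      location                 = "westeurope"
      account_tier             = "Standard"
      account_replication_type = "LRS"
    }
    ```

    With LLM-generated code we don’t get that guarantee, because we can’t know in advance which attributes it chose to set, skip, or invent on any given run.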