I’d like to set up a local coding assistant so that I can stop feeding complex questions to Google and sifting through search results.

I really don’t know what I’m doing, or whether anything out there actually respects privacy. I don’t necessarily trust search results for this kind of query either.

I want to run it on my desktop: Ryzen 7 5800XT + Radeon RX 6950 XT + 32 GB of RAM. I don’t need or expect data center performance out of this thing.

Something like LM Studio and Qwen sounds like what I’m looking for, but since I’m unfamiliar with what exists, I figured I’d ask for Lemmy’s opinion.

Is LM Studio + Qwen a good combo for my needs? Are there alternatives?

  • perry@aussie.zone · 4 days ago

    Grab a Qwen coder model from Hugging Face and follow the instructions there to run it in llama.cpp. Once that’s up, install OpenCode and connect it via a custom OpenAI-compatible API endpoint.
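
    To make that last step concrete, here’s a minimal sketch of talking to a llama.cpp server through its OpenAI-compatible API from Python. The port, GGUF filename, and model name are assumptions, not exact values; OpenCode does the equivalent of this once you point its custom OpenAI provider at the same base URL.

    ```python
    # Minimal sketch, assuming llama-server was started with something like:
    #   llama-server -m qwen2.5-coder-7b-instruct-q4_k_m.gguf --port 8080
    # (the filename and port here are placeholders, not exact values).
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8080/v1",  # llama-server's OpenAI-compatible endpoint
        api_key="not-needed",                 # llama.cpp ignores the key by default
    )

    response = client.chat.completions.create(
        model="qwen2.5-coder",  # llama-server generally serves whatever model it loaded
        messages=[
            {"role": "system", "content": "You are a concise coding assistant."},
            {"role": "user", "content": "Write a Python function that reverses a string."},
        ],
    )
    print(response.choices[0].message.content)
    ```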

    You’ll get far better results than trying to use other local options out of the box.

    There may well be better models out there, but I’ve found Qwen 2.5 and the like to be pretty fantastic overall, and definitely a fine option beside Claude/ChatGPT/Gemini. I’ve tested the lot, and results usually come down far more to your instructions and AGENTS.md layout than to the model itself.

    • melfie@lemy.lol · 4 days ago

      The main thing that has stopped me from doing this so far is VRAM. My server has an RTX 4060 with 8 GB, and I’m not sure that can reasonably run a model like this.
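
      For a rough sense of what fits in 8 GB, here’s a back-of-envelope sketch. The bits-per-weight figures are approximations for common llama.cpp quant types, not exact numbers:

      ```python
      # Rough rule of thumb: quantized weights take about
      # params * bits_per_weight / 8 bytes, plus overhead for the
      # KV cache and runtime buffers. Approximate, not exact.

      def approx_weights_gb(params_billion: float, bits_per_weight: float) -> float:
          """Approximate size of the quantized weights in GB."""
          return params_billion * 1e9 * bits_per_weight / 8 / 1e9

      for name, params_b, bits in [
          ("7B @ Q4_K_M (~4.8 bpw)", 7, 4.8),
          ("7B @ Q8_0 (~8.5 bpw)", 7, 8.5),
          ("14B @ Q4_K_M (~4.8 bpw)", 14, 4.8),
      ]:
          print(f"{name}: ~{approx_weights_gb(params_b, bits):.1f} GB of weights")

      # ~4.2 GB for a 7B model at Q4_K_M leaves headroom on an 8 GB card;
      # 14B at the same quant (~8.4 GB) almost certainly won't fit.
      ```

      So a 7B coder model at a 4-bit-ish quant should be workable on a 4060, and llama.cpp can offload any overflow to system RAM at a speed cost.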

    • 70k32@sh.itjust.works · 4 days ago

      This. llama.cpp with the Vulkan backend running under docker-compose, plus a Qwen3-Coder quantization from Hugging Face, with OpenCode pointed at that local setup through the OpenAI-compatible API, is working great for me.
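
      If it helps, here’s a hypothetical sanity check to run before wiring up OpenCode, confirming the containerized server actually answers on the OpenAI-compatible API (port 8080 is an assumption; use whatever your docker-compose.yml publishes):

      ```python
      # Query the server's /v1/models endpoint; the ids it lists are what
      # you'd reference in OpenCode's custom OpenAI provider config.
      import json
      import urllib.request

      BASE = "http://localhost:8080/v1"  # assumed host port mapping

      with urllib.request.urlopen(f"{BASE}/models") as resp:
          models = json.load(resp)

      for m in models.get("data", []):
          print(m.get("id"))
      ```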