I want to extract and process the metadata from PNG images and the first line of .safetensors files for LLMs and LoRAs. I could spend ages farting around with sed or awk, but file formats are constantly changing. I’d like a faster way to see a summary of training and a few other details when they are available.

    • huginn@feddit.it
      3 months ago

      I have a very handy command in my .vimrc for this -

      command! JSON setlocal filetype=json | %!jq .

      Anytime I’m in a JSON file that isn’t formatted, it’s as simple as typing :JSON to have it all sorted.

  • tiredofsametab@kbin.run
    3 months ago

    Previously, I coded something in Rust real quick to spit out and manipulate some JSON, but it looks like the jq/yq below would work fine.

  • Nibodhika@lemmy.world
    3 months ago

    A week ago I would have said jq, but just the other day I discovered nushell and have been loving it. If you deal with structured data often, it’s way easier. Just bear in mind it’s not POSIX compatible.

    • j4k3@lemmy.worldOP
      3 months ago

      I found a Python project that does enough for my needs. jq looks super powerful though. Thanks. I managed to get yq working for PNGs, but I had trouble with both jq and yq on safetensors files. I couldn’t figure out how to parse a string embedded in a binary with an inconsistent start, and with massive files. I could get in and grab the first line with head. I tried some stuff with expansions, but that didn’t work and sent me looking for others who have solved the issue better than I have.
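
For anyone hitting the same wall: as far as I understand the safetensors layout, the file does not really start with a "line". It starts with an 8-byte little-endian unsigned integer giving the header size, followed by that many bytes of UTF-8 JSON, and training details (when present) live under the `__metadata__` key. Here is a minimal Python sketch under that assumption. The function names and the `max_header_bytes` guard are my own, not from any library, and the metadata keys you get back depend entirely on the tool that wrote the file.

```python
import json
import struct


def read_safetensors_header(path, max_header_bytes=100_000_000):
    """Return the JSON header of a .safetensors file as a dict.

    Only the header is read, so this stays fast on multi-gigabyte files.
    max_header_bytes is an arbitrary sanity limit (an assumption, not part
    of the format) to avoid allocating huge buffers on corrupt input.
    """
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short to contain a header length")
        # First 8 bytes: header size as a little-endian unsigned 64-bit int.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > max_header_bytes:
            raise ValueError(f"implausible header length: {header_len}")
        raw = f.read(header_len)
        if len(raw) != header_len:
            raise ValueError("file ends before the header does")
    return json.loads(raw.decode("utf-8"))


def training_summary(header):
    """Return the free-form metadata block, sorted by key, or {} if absent."""
    meta = header.get("__metadata__") or {}
    return dict(sorted(meta.items()))
```

Calling `training_summary(read_safetensors_header("model.safetensors"))` (the filename is a placeholder) gives a plain dict you can print or dump back out with `json.dumps(..., indent=2)`. The other top-level keys in the header describe the tensors themselves (dtype, shape, offsets).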

  • Hammerheart@programming.dev
    3 months ago

    What are some good resources for learning jq? I really struggle when it comes to nested keys/values, which obviously limits my ability to use it.

    • timbuck2themoon@sh.itjust.works
      3 months ago

      Online json parser. Throw in some data and then structure a query.

      It’ll keep updating the results as you tweak your query. A simple search will probably give you twenty that’ll work. I can’t remember what I normally use off the top of my head.

      • Hammerheart@programming.dev
        3 months ago

        I have perused it, but it’s both so dense and so broad that it’s not that helpful unless I know exactly what I’m looking for. I have also tried info and tldr. I actually like tldr the most, although the exhaustiveness of the man pages must be admired. I don’t find it to be the best teacher.

    • Beej Jorgensen@lemmy.sdf.org
      3 months ago

      I hate to do this, but AI chatbots are typically pretty good at giving examples for things like this and you can learn from it.

        • Beej Jorgensen@lemmy.sdf.org
          2 months ago

          I definitely use them a lot, but I think “very” is too strong a word. It’s pretty easy to get confident, contradictory information from them. They’re a good place to start and brainstorm, but all the information has to be verified either by running and testing the code, or by finding a human source.

          • xavier666@lemm.ee
            2 months ago

            True. I wouldn’t use them for very complicated stuff. I currently use them for “what is x?” and “how is x different from y?” kinds of questions.

            One advantage of using an AI is that it removes a lot of fluff that you get on blogs. However, that can change very soon when our AI overlords figure out monetization.

    • pingveno@lemmy.ml
      3 months ago

      Yeah, I’ve been learning some nushell. If you’re dealing with data, it’s just a great tool. So many sharp edges in the POSIX shell come from it being stringly typed, so having a strongly typed shell is extremely helpful.

  • Diplomjodler@lemmy.world
    3 months ago

    Python is very good for working with JSON. Definitely will get you there faster than awk for anything not completely trivial.
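
To make that concrete for the PNG half of the question, here is a rough standard-library sketch. It assumes the metadata sits in uncompressed `tEXt` chunks (keyword, a NUL byte, then Latin-1 text); compressed `zTXt` and international `iTXt` chunks are skipped, and CRCs are not verified. The function names are made up for illustration.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(path):
    """Return {keyword: text} for every tEXt chunk in a PNG file."""
    texts = {}
    with open(path, "rb") as f:
        if f.read(8) != PNG_SIGNATURE:
            raise ValueError("not a PNG file")
        while True:
            head = f.read(8)
            if len(head) < 8:
                break  # truncated file or clean end of data
            # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
            length, chunk_type = struct.unpack(">I4s", head)
            if chunk_type == b"IEND":
                break
            if chunk_type == b"tEXt":
                data = f.read(length)
                keyword, _, text = data.partition(b"\x00")
                texts[keyword.decode("latin-1")] = text.decode("latin-1")
                f.seek(4, 1)  # skip the CRC
            else:
                f.seek(length + 4, 1)  # skip data and CRC of other chunks
    return texts


def parse_json_values(texts):
    """Decode values that happen to be JSON; leave everything else as strings."""
    parsed = {}
    for key, value in texts.items():
        try:
            parsed[key] = json.loads(value)
        except json.JSONDecodeError:
            parsed[key] = value
    return parsed
```

Because image pixels are never decoded, this stays quick even on a folder full of large files. Once the values are ordinary dicts, nested lookups are just `parsed["some_key"]["nested_key"]`, where both key names are placeholders for whatever your tool writes.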

  • CaptPretentious@lemmy.world
    3 months ago

    Probably not a popular opinion, but pwsh (PowerShell). It’s got a lot of tooling built in, which means I don’t have to learn a different tool just because I’m on a different system.