• floofloof@lemmy.ca
    18 hours ago

    I wonder if they have considered crowdsourcing this: having many people type in small chunks of the data by hand, doing their own character recognition? With enough people and enough overlap, the process would have some built-in error correction.
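
The "enough overlap" idea could be sketched as a per-character majority vote over several independent transcriptions of the same chunk. This is only an illustration: the function name `consensus` and the requirement that every transcription has the same length are assumptions, not anything described in the article.

```python
from collections import Counter


def consensus(transcriptions):
    """Return the per-position majority vote over transcriptions of one chunk.

    Assumes (hypothetically) that every volunteer typed the same number of
    characters, so position i always refers to the same glyph.
    """
    if len({len(t) for t in transcriptions}) != 1:
        raise ValueError("transcriptions must be non-empty and of equal length")
    result = []
    for column in zip(*transcriptions):
        # Pick the character that most volunteers typed at this position.
        char, _count = Counter(column).most_common(1)[0]
        result.append(char)
    return "".join(result)
```

A vote like this only corrects independent typos. If most people misread the same glyph the same way, the majority is simply wrong together.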

    • apftwb@lemmy.world (OP)
      18 hours ago

      I mean, the problem is that even with human eyes it’s still really hard to tell l and 1 apart in that font.

      • Kevlar21@piefed.social
        17 hours ago

        Not an expert at all, but I’m genuinely curious how long it would take to check all possibilities for each l or 1. Is that the full length of the hash or whatever? So in this example image we have 2^8 = 256 different possibilities to check? Seems like that would be easy enough for a computer.

        Edit: actually read the article. It’s much more complicated than this. This isn’t really the only issue and the base64 in the example was 76 pages long.
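
For the small case in the question (eight ambiguous characters, so 2^8 = 256 candidates), the brute-force idea might look roughly like the sketch below. It assumes two things the thread does not confirm: that l and 1 are the only confusable pair, and that a known SHA-256 digest of the correct text is available to check candidates against. The names `candidates` and `recover` are made up for illustration.

```python
import hashlib
from itertools import product

# Assumption: only "l" and "1" are confusable in the font.
AMBIGUOUS = {"l": "l1", "1": "l1"}


def candidates(text, ambiguous=AMBIGUOUS):
    """Yield every string obtained by trying each alternative at each ambiguous position."""
    options = [ambiguous.get(ch, ch) for ch in text]
    for combo in product(*options):
        yield "".join(combo)


def recover(text, expected_sha256):
    """Return the candidate whose SHA-256 hex digest matches, or None.

    Assumes (hypothetically) that a trusted digest of the true text exists.
    """
    for cand in candidates(text):
        if hashlib.sha256(cand.encode()).hexdigest() == expected_sha256:
            return cand
    return None
```

The search space doubles with every ambiguous character, so this stays cheap for 8 positions but stops being practical for a 76-page base64 dump unless each small chunk can be verified on its own.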