In the days after the US Department of Justice (DOJ) published 3.5 million pages of documents related to the late sex offender Jeffrey Epstein, multiple users on X asked Grok to “unblur” or remove the black boxes covering the faces of children and women in the images, redactions that were meant to protect their privacy.

  • Paranoidfactoid@lemmy.world · 2 hours ago

    How do these AI models generate nude imagery of children without having been trained with data containing illegal images of nude children?

    • AnarchistArtificer@slrpnk.net · 48 minutes ago

      The datasets they are trained on do in fact include CSAM. These datasets are so huge that it easily slips through the cracks. It’s usually removed whenever it’s found, but I don’t know how that actually affects models that have already been trained on it; to my knowledge, it’s not possible to selectively “untrain” a model, so it would need to be retrained from scratch. Plus, it occasionally crops up in the news that new CSAM has been found in training datasets.

      It’s one of the many, many problems with generative AI

    • Senal@programming.dev · 49 minutes ago

      Easy answer is, they don’t.

      Though that’s just the one admitting to it.

      A slightly more nuanced answer is: it probably depends. There’s likely to be some inference made between age ranges, but my guess is that it’d be sub-par, given that these models sometimes struggle to reproduce images they have a tonne of actual data for.