• Hackworth@piefed.ca · 2 days ago

Adobe’s image generator (Firefly) is trained only on Adobe Stock images and, per Adobe, openly licensed and public-domain content.

      • Hackworth@piefed.ca · 1 day ago

        The Firefly image generator is a diffusion model, and the Firefly video generator is a diffusion transformer. LLMs aren’t involved in either process; rather, the models learn image-text relationships from metadata tags. I believe there are some ChatGPT integrations with Reader and Acrobat, but those are unrelated to Firefly.
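        To illustrate the split (a toy sketch, not Firefly’s published architecture; every module and dimension here is made up): the text encoder just produces embeddings, and the denoiser conditions on them via cross-attention.

        ```python
        import torch
        import torch.nn as nn

        # Toy text-conditioned denoiser, NOT Firefly's real architecture.
        # A text encoder (CLIP-like, frozen) would produce text_emb; the
        # denoiser attends to it while predicting the noise to remove.
        class TinyDenoiser(nn.Module):
            def __init__(self, img_dim=64, txt_dim=32):
                super().__init__()
                self.attn = nn.MultiheadAttention(
                    img_dim, num_heads=4, kdim=txt_dim, vdim=txt_dim,
                    batch_first=True)
                self.out = nn.Linear(img_dim, img_dim)

            def forward(self, noisy_latents, text_emb):
                # noisy_latents: (batch, patches, img_dim)
                # text_emb:      (batch, tokens, txt_dim)
                attended, _ = self.attn(noisy_latents, text_emb, text_emb)
                return self.out(attended)  # predicted noise

        latents = torch.randn(2, 16, 64)  # stand-in image latents
        prompt = torch.randn(2, 8, 32)    # stand-in text-encoder output
        print(TinyDenoiser()(latents, prompt).shape)  # torch.Size([2, 16, 64])
        ```

        No next-token prediction anywhere; the text path exists only to steer the denoising.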

        • utopiah@lemmy.world · 18 hours ago

          Surprising; I would have expected it to rely at some point on something like CLIP in order to be prompted.

          • Hackworth@piefed.ca · 1 hour ago (edited)

            As I understand it, CLIP (and the other text encoders used in diffusion models) isn’t trained like an LLM, exactly. It’s trained on image-text pairs, which you get from the metadata creators upload with their photos to Adobe Stock. OpenAI trained CLIP on alt text from scraped images, but I assume Adobe would want to train its own text encoder on the more extensive tags on the stock images it’s already using.
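            For what it’s worth, the pairing objective looks roughly like this (a minimal CLIP-style sketch; the embeddings are random stand-ins, and nothing here is OpenAI’s or Adobe’s actual code):

            ```python
            import torch
            import torch.nn.functional as F

            # CLIP-style contrastive loss over a batch of image/text pairs.
            # Matching pairs sit on the diagonal of the similarity matrix;
            # every other combination in the batch acts as a negative.
            def clip_loss(image_emb, text_emb, temperature=0.07):
                image_emb = F.normalize(image_emb, dim=-1)
                text_emb = F.normalize(text_emb, dim=-1)
                logits = image_emb @ text_emb.T / temperature
                targets = torch.arange(len(logits))
                return (F.cross_entropy(logits, targets)
                        + F.cross_entropy(logits.T, targets)) / 2

            # Stand-ins for 4 images paired with their 4 metadata captions.
            print(clip_loss(torch.randn(4, 512), torch.randn(4, 512)))
            ```

            So it’s contrastive matching over pairs, not next-token prediction, which is why I wouldn’t call it an LLM.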

            All that said, Adobe hasn’t published their full architecture. And there were some reports during the training of Firefly 1 back in '22 that they weren’t filtering AI-generated images out of the training set. At the time, those made up ~5% of the full stock library. Currently, AI images make up about half of Adobe Stock, though filtering them out seems to work well. We don’t know whether they were included in later versions of Firefly. There’s an incentive for Adobe to filter them out, since AI trained on AI output tends to lose its tails (the ability to handle edge cases well), and that would be pretty devastating for something like generative fill.
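            Conceptually the filter is simple, something like this (hypothetical; the is_ai_generated field is made up, not Adobe Stock’s real schema):

            ```python
            # Hypothetical pre-training filter over stock metadata.
            def training_candidates(stock_assets):
                """Drop assets self-reported as AI-generated."""
                return [a for a in stock_assets
                        if not a.get("is_ai_generated", False)]

            assets = [
                {"id": 1, "is_ai_generated": False},
                {"id": 2, "is_ai_generated": True},  # excluded: model collapse risk
            ]
            print(training_candidates(assets))  # [{'id': 1, 'is_ai_generated': False}]
            ```

            The hard part is that the flag is self-reported.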

            I figure we want to encourage companies to do better, whatever that looks like. For a monopolistic giant like Adobe, they seem to have at least done better. And at some point, they have to rely on the artists uploading stock photos to be honest. Not just about AI, but about release forms, photo shoot working conditions, local laws being followed while shooting, etc. They do have some incentive to be honest, since Adobe pays them, but I don’t doubt there are issues there too.