Thanks to Samantha Cole at 404 Media, we are now aware that Automattic plans to sell user data from Tumblr and WordPress.com (which is the host for my blog) for “AI” products. In respon…
Oh, you meant using a generative network to modify artwork so that generative networks can’t learn to modify artwork. A process that’s totally not intrinsic to adversarial training.
If you make the cost of bypassing Nightshade higher than the cost of convincing people to opt in to their data being used in LLM training, then the outcome is obvious. “If you show me the incentives, I’ll show you the outcome.”
The cost will become negligible for any nigh-invisible data fuckery. Like how “single pixel attacks” aren’t really a thing, anymore. And how alphanumeric Captcha became so hard that humans struggle to discern letters.
(The cost of Nightshade versus LLMs is nothing, because LLMs are for text.)
There will be nothing you can fuck with in an image that changes what all networks see, without changing what all humans see. Only a style-transfer network that removes the artist’s style will ultimately keep training from discerning that style.
This is downright laughable when Nightshade can be applied to any existing image, locally… meaning people training on scraped data could surely identify the presence and impact of Nightshade. We’re talking about networks which already exist that can look at a blob of pixels and pick out which parts look like a Picasso, or an avocado chair, or Hatsune Miku. Stable Diffusion in particular is a denoiser. Identifying damage and nonsense is all it does. If that environment includes deliberate countermeasures, they will be worked into the model through existing training, just like watermarks, JPEG artifacts, and the random noise used to make this shit work in the first place.
No? That’s not what NightShade is. NightShade isn’t DRM.
https://nightshade.cs.uchicago.edu/whatis.html
Oh, you meant using a generative network to modify artwork so that generative networks can’t learn to modify artwork. A process that’s totally not intrinsic to adversarial training.
If you make the cost of bypassing Nightshade higher than the cost of convincing people to opt in to their data being used in LLM training, then the outcome is obvious. “If you show me the incentives, I’ll show you the outcome.”
The cost will become negligible for any nigh-invisible data fuckery. Like how “single pixel attacks” aren’t really a thing, anymore. And how alphanumeric Captcha became so hard that humans struggle to discern letters.
(The cost of Nightshade versus LLMs is nothing, because LLMs are for text.)
There will be nothing you can fuck with in an image that changes what all networks see, without changing what all humans see. Only a style-transfer network that removes the artist’s style will ultimately keep training from discerning that style.
This is downright laughable when Nightshade can be applied to any existing image, locally… meaning people training on scraped data could surely identify the presence and impact of Nightshade. We’re talking about networks which already exist that can look at a blob of pixels and pick out which parts look like a Picasso, or an avocado chair, or Hatsune Miku. Stable Diffusion in particular is a denoiser. Identifying damage and nonsense is all it does. If that environment includes deliberate countermeasures, they will be worked into the model through existing training, just like watermarks, JPEG artifacts, and the random noise used to make this shit work in the first place.
I choose not to make perfect the enemy of good.
Word salad, in this context.
The cost you expect to matter will not exist.