In the rapidly evolving era of generative AI, artists, entertainers, and performers are stepping up their opposition to AI companies that use their creations as training data without consent. The practice has prompted several lawsuits, some directed at OpenAI, the creator of ChatGPT. Central to these grievances is the extensive use of multimedia data, including written and visual content produced by artists, to train commercial AI products.
One practice that has stirred particular controversy is the scraping of material from the internet for training datasets. Artists once welcomed scraping because it indexed their work for search results; they now oppose it because the same material lets AI models generate competing content. But artists are not merely resorting to legal action; they are also harnessing technology to protect their work. One such solution is Nightshade, an open-source tool still in development.
Nightshade was developed by researchers at the University of Chicago, led by computer science professor Ben Zhao. It is offered as an optional feature of Glaze, another tool from the same team that cloaks digital artwork by subtly altering its pixels so that AI models misread its style. Nightshade takes the counterattack a step further by causing AI models to learn incorrect associations between images and the objects and scenes they depict.
For instance, Nightshade can make an image of a dog look, to an AI model, like an image of a cat. After training on just 50 such manipulated images, the model starts producing bizarre dog images with distorted features. With 100 poison samples, it begins generating cats when asked for a dog, and after 300 samples, a request for a dog reliably yields a near-perfect cat image. The researchers demonstrated these results on Stable Diffusion, an open-source text-to-image generation model.
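The mechanism can be illustrated with a deliberately simplified sketch. Here a model's "concept" of dog is just the centroid of all training features captioned "dog", and a poison sample is a point whose caption says "dog" but whose features sit in the cat cluster, a crude stand-in for Nightshade's optimized perturbations. The cluster positions, sample counts, and the centroid model itself are illustrative assumptions, not the paper's method:

```python
import random

# Toy feature space: clean "dog" images cluster near one point,
# clean "cat" images near another.
DOG_CLUSTER, CAT_CLUSTER = (0.0, 0.0), (10.0, 10.0)

def sample_near(center, rng, spread=1.0):
    """Draw a feature vector near a cluster center."""
    return tuple(c + rng.uniform(-spread, spread) for c in center)

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def learned_dog_concept(n_clean, n_poison, seed=0):
    """Return which cluster the learned 'dog' centroid ends up closer to."""
    rng = random.Random(seed)
    dog_features = [sample_near(DOG_CLUSTER, rng) for _ in range(n_clean)]
    # Poison samples: captioned "dog", but their features resemble cats.
    dog_features += [sample_near(CAT_CLUSTER, rng) for _ in range(n_poison)]
    centroid = tuple(sum(c) / len(dog_features) for c in zip(*dog_features))
    near_cat = dist(centroid, CAT_CLUSTER) < dist(centroid, DOG_CLUSTER)
    return "cat" if near_cat else "dog"

for n_poison in (0, 50, 300):
    print(n_poison, "->", learned_dog_concept(n_clean=200, n_poison=n_poison))
```

With no poison the learned "dog" concept stays near the dog cluster; with enough poison it drifts until it sits closer to the cat cluster, the toy analogue of a dog prompt producing cats.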
Generative AI models represent related words and concepts as nearby points in an internal "embedding" space. This proximity lets Nightshade's effect bleed beyond the poisoned concept: a model poisoned on "dog" can also be misled into generating cats in response to related prompts such as "husky," "puppy," and "wolf."
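The bleed-through follows from how similarity works in embedding space, which a toy example can show. The vectors below are hand-picked three-dimensional stand-ins, not real model embeddings; real embeddings have hundreds of dimensions, but the cosine-similarity logic is the same:

```python
import math

# Hypothetical toy embeddings: related animal words point in similar
# directions, an unrelated word ("car") does not.
EMB = {
    "dog":   (0.90, 0.10, 0.00),
    "husky": (0.85, 0.15, 0.05),
    "puppy": (0.88, 0.12, 0.02),
    "wolf":  (0.80, 0.20, 0.10),
    "car":   (0.00, 0.10, 0.95),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

for word in ("husky", "puppy", "wolf", "car"):
    print(f"{word}: {cosine(EMB['dog'], EMB[word]):.3f}")
```

Because "husky," "puppy," and "wolf" sit almost on top of "dog" in this space while "car" is nearly orthogonal, a shift in what the model generates for "dog" drags the nearby prompts along with it and leaves distant concepts untouched.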
One notable challenge posed by Nightshade's data-poisoning technique is its subtlety. The poisoned pixels are virtually imperceptible to the human eye and difficult for data-scraping software to detect automatically. Identifying and removing poisoned images from AI training datasets is therefore a significant undertaking, and models that have already been trained on such images may need to be retrained.
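The scale of the perturbation is the key point, and a small sketch makes it concrete. Nightshade's real perturbation is carefully optimized rather than random; the random noise below only illustrates the kind of per-pixel budget that stays invisible, and the budget value is an assumption for illustration:

```python
import random

random.seed(1)

W = H = 8  # tiny grayscale "image" as a grid of 0-255 values
clean = [[random.randint(0, 255) for _ in range(W)] for _ in range(H)]

# Assumed per-pixel budget: each value moves by at most 3 out of 255,
# roughly 1% of the range, far below what the eye notices.
BUDGET = 3

poisoned = [[min(255, max(0, p + random.randint(-BUDGET, BUDGET)))
             for p in row] for row in clean]

max_diff = max(abs(a - b) for ra, rb in zip(clean, poisoned)
               for a, b in zip(ra, rb))
print("largest per-pixel change:", max_diff)
```

A scraper that simply inspects pixel values has nothing obvious to flag: every pixel of the poisoned copy is within a few counts of the original, yet (in the real attack) those tiny coordinated shifts are what mislead the model.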
While acknowledging the potential for misuse, the researchers' primary aim is to empower artists and shift the balance of power from AI companies back to creators. Nightshade is intended as a deterrent against infringing on artists' copyright and intellectual property. The researchers have submitted their work on Nightshade for peer review at the computer security conference USENIX Security.
In the ongoing struggle between AI companies and artists over creative ownership and the use of artists' work, Nightshade emerges as a promising tool, giving creators a means to assert control over their art in the age of generative AI.