A recent study by researchers at Carnegie Mellon University tackles the thorny issue of copyright and compensation for generative AI models that create new images.
A team in the School of Computer Science's Generative Intelligence Lab, in collaboration with Adobe Research and the University of California, Berkeley, has developed two algorithms that help generative AI models take important steps toward addressing these problems. The first algorithm prevents these models from producing copyrighted material, and the second compensates human authors when the models use their work to produce images.
Image generation models such as DALL-E 2, Midjourney, and Stable Diffusion are powerful tools for creating realistic visual content from simple text descriptions. These models are trained on millions to billions of internet images, some of which may include copyrighted material, licensed images, and personal photos.
“As researchers in this field, we have a responsibility to address the social issues that come with it,” said Jun-Yan Zhu, an assistant professor in the Robotics Institute and director of the Generative Intelligence Lab, which works to solve ethical and social issues related to generative AI. “Developing technology to address these issues is just one aspect; more work is needed both in law and in how AI is regulated.”
The research team will present both papers at the International Conference on Computer Vision (ICCV) 2023 this October.
The first paper, “Ablating Concepts in Text-to-Image Diffusion Models,” helps generative AI models avoid creating specific copyrighted images and styles.
For example, if you ask an AI program for a painting in the style of a living artist, it will generate an image that closely mimics that artist's style. The algorithm proposed by the CMU researchers aims to prevent this, instead steering the model toward a generic painting.
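The idea of redirecting a protected concept to a generic "anchor" concept can be illustrated with a toy sketch. This is not the paper's implementation: the prompts, embedding table, and `ablate_concept` helper below are all hypothetical stand-ins for a real diffusion model's text conditioning.

```python
import numpy as np

# Toy sketch: after ablation, conditioning on the protected concept
# ("a painting by Artist X") behaves like conditioning on a generic
# anchor concept ("a painting"), so the style is no longer accessible.

rng = np.random.default_rng(0)
EMB_DIM = 8

# Hypothetical frozen text-encoder embeddings for a few prompts.
embeddings = {
    "a painting": rng.normal(size=EMB_DIM),
    "a painting by Artist X": rng.normal(size=EMB_DIM),
    "a photo of a dog": rng.normal(size=EMB_DIM),
}

def ablate_concept(table, target, anchor):
    """Return a new embedding table where `target` maps to `anchor`."""
    ablated = dict(table)
    ablated[target] = table[anchor].copy()
    return ablated

ablated = ablate_concept(embeddings, "a painting by Artist X", "a painting")

# The protected prompt now conditions generation like the anchor prompt,
# while unrelated prompts are untouched.
assert np.allclose(ablated["a painting by Artist X"], embeddings["a painting"])
assert np.allclose(ablated["a photo of a dog"], embeddings["a photo of a dog"])
```

In the actual research setting this redirection happens by fine-tuning the diffusion model itself rather than by editing a lookup table, but the effect for the user is the same: prompts naming the opted-out concept fall back to generic output.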
“This is an option for artists who want to opt out of the AI model at any time,” said Nupur Kumari, a doctoral student in robotics and lead author of the paper. “It gives more control and freedom to people and businesses that don't want their images used.”
The second paper, “Evaluating Data Attribution for Text-to-Image Models,” develops a way to compensate individuals and companies whose data is used to train AI. The algorithm estimates how much each training image contributes to a generated image, which could be used to fairly distribute payments to the owners of images in an AI training database.
For example, if you ask an AI model to generate a watercolor image, the result will be influenced by the watercolor artists in its training data. The new algorithm aims to quantify how much each artist's work contributed to the composite result.
“We are working to answer the question, 'Which set of images influenced the generated image?'” said Sheng-Yu Wang, a doctoral student in robotics and lead author of the paper. “This algorithm could potentially be used to allocate credit to data contributors, with the ultimate goal of fairly rewarding data owners who contribute to the creation of generative AI.”
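One simple way to make "how much did each training image contribute?" concrete is to score training images by how similar they are to the generated image in some feature space, then normalize the scores into weights that sum to one. The sketch below uses cosine similarity and a softmax for this; the `attribution_weights` function, temperature value, and random embeddings are illustrative assumptions, not the paper's method.

```python
import numpy as np

def attribution_weights(gen_emb, train_embs, temperature=0.1):
    """Return normalized attribution weights over training images,
    scoring each by cosine similarity to the generated image."""
    gen = gen_emb / np.linalg.norm(gen_emb)
    train = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)
    sims = train @ gen                   # cosine similarity per image
    scores = np.exp(sims / temperature)  # softmax over similarities
    return scores / scores.sum()

rng = np.random.default_rng(1)
generated = rng.normal(size=16)            # embedding of the generated image
training = rng.normal(size=(5, 16))        # embeddings of 5 training images
training[2] = generated + 0.05 * rng.normal(size=16)  # near-duplicate

weights = attribution_weights(generated, training)
assert np.isclose(weights.sum(), 1.0)  # weights form a distribution
assert weights.argmax() == 2           # the near-duplicate gets most credit
```

Because the weights sum to one, they could in principle be used directly to split a payment among the owners of the training images, which is the compensation scenario the paper is aimed at.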
The authors acknowledge that both algorithms are still in early stages of development and that many questions remain unanswered. For example, it is unclear whether copyrighted content is permanently removed from a model or merely hidden, and how accurately attribution algorithms evaluate the impact of each training image. Further research is needed to answer these questions.
Despite these open questions, the new algorithms pave the way toward addressing copyright issues across generative AI platforms and take a first step toward compensating the individuals and companies whose works contribute to AI-generated images.