The European Union recently introduced the AI Act, a new governance framework that requires organizations to be more transparent about the training data of their AI systems.
Once in force, this legislation could dismantle the defenses that many in Silicon Valley have built against detailed scrutiny of how AI systems are developed and deployed.
Since Microsoft-backed OpenAI released ChatGPT to the public 18 months ago, interest and investment in generative AI have surged. These applications, which can produce text, images, and audio at remarkable speed, have attracted enormous attention. But the boom raises a pressing question: how are AI developers actually obtaining the data they need to train their models? Are they using copyrighted material without permission?
Enforcing the AI Act
The EU's AI Act, due to be phased in over the next two years, aims to address these issues. The gradual rollout gives regulators time to prepare for the new law and businesses time to adjust to their new obligations. Questions remain, however, about how some of the rules will be implemented.
One of the Act's most contentious provisions requires organizations deploying general-purpose AI models, such as ChatGPT, to provide a “detailed summary” of the content used to train them. The newly established AI Office plans to publish a template for these summaries in early 2025, following consultation with stakeholders.
AI companies have vehemently resisted disclosing what their models are trained on, arguing that the information is a trade secret and that making it public would hand competitors an unfair advantage. How detailed these transparency reports must be will have major implications not only for large tech companies such as Google and Meta, which have placed AI at the center of their future businesses, but also for smaller AI startups.
Over the past year, several major tech companies, including Google, OpenAI, and Stability AI, have faced lawsuits from creators who claim their content was used without permission to train AI models. Amid this growing scrutiny, however, some tech companies have over the past two years broken with their habitual secrecy and negotiated content-licensing agreements with individual media outlets and websites. Some creators and lawmakers worry that these measures do not go far enough.
MEPs divided
In Europe, opinions among lawmakers diverge sharply. Dragoș Tudorache, who led the drafting of the AI Act in the European Parliament, argues that AI companies should be required to open-source their datasets, emphasizing that transparency is essential if creators are to determine whether their work has been used to train AI models.
In contrast, the French government, under President Emmanuel Macron, has privately opposed rules that could blunt the competitiveness of European AI startups. French Finance Minister Bruno Le Maire has stressed that Europe must become a global leader in AI, not merely a consumer of American and Chinese products.
The AI Act acknowledges the need to balance the protection of trade secrets with the rights of parties with legitimate interests, including copyright holders. Striking that balance in practice, however, remains a major challenge.
Views differ across the industry. Matthieu Riouff, CEO of the AI-powered image-editing company Photoroom, likens the situation to cooking: there is a secret part of the recipe that even the best chefs will not share. Riouff's stance is just one of many positions companies are staking out. Thomas Wolf, co-founder of the prominent AI startup Hugging Face, counters that while there will always be an appetite for transparency, that does not mean the whole industry will adopt a transparency-first approach.
A series of recent controversies has highlighted just how fraught the issue is. When OpenAI demonstrated the latest version of ChatGPT in a public session, the company was heavily criticized for using a synthetic voice that sounded strikingly similar to that of actress Scarlett Johansson. Such episodes show how AI technology can infringe on personal and intellectual property rights.
During the development of these regulations, there have been heated debates about their potential impact on future innovation and competitiveness in the world of AI. The French government in particular has argued that the starting point should be innovation, not regulation, given the dangers of regulating aspects that are not well understood.
How the EU regulates AI transparency could have profound effects on technology companies, digital creators, and the digital landscape as a whole. Policymakers face the challenge of fostering innovation in a fast-moving AI industry while steering it toward safe, ethical decisions and preventing the infringement of intellectual property rights.
In short, the adoption of the EU AI Act marks a significant step toward greater transparency in AI development. How these rules will be implemented in practice, and what they will mean for the industry, remains to be seen. At the dawn of this new regulatory paradigm, the balance between innovation, ethical AI development, and intellectual property protection will remain a central and contentious issue for all stakeholders.