The UK has proposed introducing an exception to copyright law for AI training for commercial purposes.
The UK government launched a 10-week consultation yesterday (17 December) with the hopes of providing clarity over how copyright-protected material can be used to train AI models.
The proposal was launched in a bid to “drive growth” in both the country’s creative industries and its AI sector, according to the government, by ensuring protection and payment for rights holders while supporting AI developers to “innovate responsibly”.
According to the UK Intellectual Property Office, the Department for Science, Innovation and Technology and the Department for Culture, Media and Sport, the consultation will focus on boosting trust and transparency between the sectors, exploring how copyrighted material can be licensed and how data owners can be remunerated, while strengthening access to “high-quality data” for AI developers.
“Uncertainty about how copyright law applies to AI is holding back both sectors,” said the government departments. This uncertainty makes it difficult for data owners to seek payment for the use of their work, while creating legal risks for AI developers.
To address this, the consultation proposes an exception to copyright law for AI training for commercial purposes, while allowing rights holders to reserve their rights.
This, the departments said, would give data owners more certainty and control, and support them in striking licensing deals with AI companies. Meanwhile, this would also give AI developers more certainty about what material they can and cannot use.
The consultation also proposes new transparency requirements for AI developers, who would have to provide more information about the content used to train their models.
However, the government’s proposals have drawn criticism and concern from book publishers and news organisations. Dan Conway, the CEO of the Publishers Association, said the proposed measures “are as yet untested and unevidenced”.
“There has been no objective case made for a new copyright exception, nor has a water-tight rights-reservation process been outlined anywhere around the world.”
Owen Meredith, the chief executive of the News Media Association, said that the government’s consultation, along with its preferred policy, “fails to address the real issue”.
“News publishers deserve control over when and how their content is used and, crucially, fair remuneration for its use. Instead of proposing unworkable systems such as the ‘rights reservations’ (or ‘opt-out’) regime, the government should focus on implementing transparency requirements within the existing copyright framework.”
Meanwhile, Gill Dennis, a senior lawyer with Pinsent Masons, said: “Putting the onus of action on content creators to opt out is highly controversial and, as the government itself acknowledges, faces significant current technical barriers for giving effect to in practice.”
AI developers such as OpenAI have opaque policies around the data they use to train their models. Although this technology is built on pre-existing copyrighted data, large language models often do not regurgitate the data they process, making it difficult for copyright holders to prove infringement.
Recently, the Southern District of New York dismissed a lawsuit brought by two news media outlets against OpenAI for allegedly violating copyright law by scraping news articles to train its AI models, stating that the outlets were unable to prove “concrete injury”.
Last month, a number of Canadian news publishers launched a lawsuit against OpenAI for copyright infringement, demanding damages in the billions. Meanwhile, the New York Times has been gathering evidence for its own legal battle against the company, claiming that ChatGPT was trained on millions of articles published by the outlet.