
Corporate and Commercial: A new regime to protect copyright owners against unlawful scraping?

16 December 2024


The Bill and the amendments

The Data (Use and Access) Bill (the Bill) proposes several reforms to how personal and business data can be used. Despite the obvious overlap between data and copyright law in the context of AI models, it is not a stated aim of the Bill to address this intersection. Nonetheless, at this key stage in the Bill’s reading, several amendments have been tabled by the crossbench peer Baroness Kidron that would require the government to implement a new regime that would apply to operators of internet scrapers and general-purpose AI models. Legislation created under the Bill (if enacted) would require such operators to:

  1. Comply with UK copyright law and abide by a set of procedures;
  2. Be transparent about the identity and purpose of their crawlers; operate distinct crawlers for different purposes; and not penalise copyright holders who refuse scraping for AI, whether by downranking their content in a search engine or removing it from one; and
  3. Be transparent about the copyrighted works they have scraped, allowing copyright holders to understand when their work has been scraped.

Practical implications and possible consequences

The amendments go much further than anything we have seen to date in protecting rightsholders from unlawful scraping and web crawling. Regulations created under the Bill would apply at each stage at which AI models may be fed data, namely pre-training, fine-tuning, grounding and retrieval-augmented generation. For operators and developers of AI models, the proposed amendments will undoubtedly be seen as restrictive and anti-innovation.

It is noteworthy that any regulations made under the first proposed amendment may be applied extra-territorially, with any operator that markets its AI models in the UK being in scope. This amendment is also interesting in light of any clarification that may come from civil court decisions on the application of the Copyright, Designs and Patents Act 1988 (CDPA). The government would be able to act quickly and make provisions clarifying the steps that operators of web crawlers and AI models must take to comply with UK copyright law. This would be particularly relevant if the claim brought by Getty Images against Stability AI were to be decided in Getty’s favour.

Also of note, the last proposed amendment would in effect require operators of AI models to maintain a register of source data that must be accessible to the copyright holder upon request. This amendment is surely an attempt to address the evidential difficulties that currently exist for rightsholders when it comes to establishing copyright infringement against the operators and developers of AI models.

The Lords may ultimately decide that the Bill is not the proper legislative vehicle in which to include such bold measures. The government has promised an AI Bill, which would seem the more obvious fit. However, there is a feeling that too much weight has been placed on the AI Bill, which may come too late for rightsholders. The fact that the technology is outpacing legislation may be a motivation behind the proposed amendments. In any event, these are noteworthy proposals demonstrating the influence of the rightsholders’ lobby in the debate.

What comes next?

The proposed amendments will be debated in the Lords at either the 16 December or the 18 December 2024 sitting. After this committee stage in the Lords, there will be one further reading before the Bill transfers to the Commons. We expect a lively debate and will be monitoring the Bill through its passage.

Tancred Campbell

Solicitor
