The Irish supervisory authority asked the European Data Protection Board (EDPB) to issue an Opinion under Article 64(2) of the General Data Protection Regulation (GDPR) on the processing of personal data in the context of AI models (the Opinion). The Opinion sets out the EDPB's answers to the four questions the Irish supervisory authority put to it.
1. When and how AI models can be considered ‘anonymous’
The EDPB considers that AI models trained with personal data cannot, in all cases, be considered anonymous. Anonymity must be assessed on a case-by-case basis. For an AI model to be considered anonymous, both (1) the likelihood of direct (including probabilistic) extraction of personal data regarding individuals whose personal data were used to develop the model and (2) the likelihood of obtaining, intentionally or not, such personal data from queries must be insignificant, taking into account 'all the means reasonably likely to be used' by the controller or another person. The Opinion does not refer to the discussion paper from the Hamburg Commissioner for Data Protection and Freedom of Information, which took the view that LLMs do not store personal data. The EDPB has flagged that it plans to issue guidelines on anonymisation, pseudonymisation, and data scraping in the context of generative AI.
2. and 3. Demonstrating the appropriateness of legitimate interests as a legal basis for (2) developing and (3) deploying an AI model
The Opinion highlights several considerations relevant to both questions, including factors that will affect the balancing test.
4. If an AI Model has been found to have been created, updated or developed using unlawfully processed personal data, what is the impact of this, if any, on the lawfulness of the continued or subsequent processing or operation of the AI model?
The EDPB noted that where a model retains personal data, a controller deploying that model will need to carry out an appropriate assessment to ensure the AI model was not developed by unlawfully processing personal data. The level of detail of that assessment should be proportionate to the risks raised in the deployment phase, and the assessment should look at the source of the personal data and whether the processing in the development phase was subject to a finding of infringement.
The finding on the fourth question will be important for controllers using AI models, particularly generative AI. The EDPB has highlighted that controllers deploying AI models may not be able to comply with their GDPR obligations if the model was not developed lawfully – even where the provider is a third-party supplier. Whether the deployer can use the model lawfully will be assessed on a case-by-case basis, taking account of the assessment carried out by the deployer.
Organisations need to ensure they have an AI governance programme in place and that, under that programme, all AI models and AI systems are assessed appropriately before deployment. They should also make sure the assessment process covers all relevant changes – for example, new AI functionalities added to existing tools. Policies should also be in place to address shadow IT, as the promise of productivity gains may tempt staff into trying tools that have not been assessed at all.
These assessments will also be important to ensure no prohibited use of AI is being made, as the AI Act's prohibitions apply from 2 February 2025 (with fines of up to 7% of worldwide annual turnover for non-compliance), and to catch any other AI Act obligations. But the GDPR obligations covered in the Opinion apply now – the EDPB is confirming the position under obligations that are already in place.