On October 11, 2023, the French data protection authority, the Commission nationale de l'informatique et des libertés (CNIL), published its first AI guidance in a series of nine practical how-to sheets. The guidance addresses the creation of datasets for the development of artificial intelligence systems and aims to reconcile innovation with respect for individual rights.
Other how-to sheets are due to be published before year-end, covering topics including the legal basis of legitimate interests, the management of data subject rights and the provision of information to data subjects during the development phase.
The key takeaway is that CNIL considers the development and deployment phases of AI systems to be compatible with the protection of privacy and personal data. This was expected, but such confirmation is always welcome.
The how-to sheets [published here] concern AI systems based on the collection and use of personal data, covering both machine learning systems and systems based on logic and knowledge. The guidance confirms that the development and deployment phases of an AI system constitute separate processing operations of personal data, and that the legal regime applicable to each processing must be determined separately.
Practically speaking, CNIL considers that "if the purpose of the production phase is itself defined, explicit and legitimate, the purpose of the development phase is also defined, explicit and legitimate". This approach applies in particular to AI tools with a specific, defined purpose, but not to general-purpose AI systems, where the two phases can be decoupled.
The creation of a dataset containing personal data must have a purpose that is specified, explicit and legitimate. CNIL expressly rules out a purpose defined merely as the "development of a generative AI model".
The purpose is considered specified, explicit and legitimate only if it is sufficiently precise. To achieve that precision, CNIL requires that the purpose refer both to the type of AI system developed (which must be stated in a clear and intelligible manner) and to its technically feasible functionalities and capacities (the data controller must list the functionalities that are reasonably foreseeable).
To achieve transparency, CNIL recommends that the following also be specified:
- the foreseeable capacities that are most at risk;
- the functionalities excluded by design;
- as far as possible, the conditions of use of the AI system.
As regards the legal basis for processing carried out to develop an AI system, CNIL lists the grounds it considers most relevant:
- consent of the data subject, though CNIL insists that such consent must be freely given, stating that "this means ensuring the possibility for data subjects to give their consent on a case-by-case basis (granularly) where the intended purposes are distinct";
- legitimate interest; CNIL considers that "more often than not, creating a training dataset whose use is itself lawful can be considered legitimate" but the question of proportionality remains paramount;
- task carried out in the public interest, which is of course limited in its scope;
- contract.
CNIL also recalls that the data controller must carry out a data protection impact assessment (DPIA) where necessary, considering in particular the following risks:
- misuse of the data contained in the training dataset;
- automated discrimination caused by the AI system, introduced during its development and resulting in lower performance of the system for certain categories of people;
- producing false content on a real person (especially for generative AI systems);
- automated decision-making caused by an automation or confirmation bias;
- users losing control over data they have published and made freely accessible online;
- known attacks specific to AI systems such as attacks by data poisoning, insertion of a backdoor, or model inversion;
- breach of confidentiality of data that could be extracted from the AI system.
A public consultation on these documents is open until December 15, 2023, so the how-to sheets are expected to evolve in the near future.
CNIL aims to publish its definitive guidance in early 2024.