Is pseudonymised data personal data (Part 2)
An analysis of the Court's verdict in EDPS v SRB
TL;DR
This newsletter is about a decision by the Court of Justice of the European Union (CJEU) in EDPS v SRB. It looks at the Court's view on how the GDPR applies to pseudonymised data and the implications this has for certain data processing activities.
Here are the key takeaways:
Pseudonymisation refers to techniques that reduce the identifiability of personal data. It involves taking original data and applying a pseudonymisation technique to it, which produces pseudonymised data as the output.
EDPS v SRB is essentially a case about the circumstances in which pseudonymised data may be regarded as personal data under the GDPR. It considers whether it is correct to apply a strict approach to this question, whereby pseudonymised data should always be considered personal data.
The CJEU upheld a relative approach to this issue. This means that, under the GDPR, pseudonymised data is not always personal data.
If an entity receives pseudonymised data from a controller without the additional information required to link the data to a data subject, and also does not have any other reasonable means to identify the data subject, then that entity has not received personal data. The position of the receiving entity in this case is therefore pertinent to the question of whether the information is indeed personal data.
The CJEU's decision in EDPS v SRB could have interesting implications for certain types of data processing activities. This includes on-device processing and encryption, as well as the personal data potentially stored in deployed LLMs.
The issues at play
Back in February this year, I took a look at an Advocate General opinion in the case of EDPS v SRB before the Court of Justice of the European Union (CJEU). This case tests the presumption that pseudonymised data is always personal data under the GDPR.
Pseudonymisation refers to techniques that reduce the identifiability of personal data. It involves taking original data and applying a pseudonymisation technique to it, which produces pseudonymised data as the output. It therefore consists of three elements:
The original data
The pseudonymisation technique applied
The pseudonymised data (which itself consists of the output pseudonym and the additional information that can be used to elucidate the original data)
An example of a pseudonymisation technique is encryption:
The original data is the plaintext that is being encrypted
Applied to this plaintext is the cryptographic protocol
The output of this application is the cipher text and the cryptographic keys needed to decrypt the cipher text and turn it back into the plaintext
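The encryption example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a toy one-time pad (XOR with a random key of the same length) so that it runs with the standard library alone, and the function names and the sample comment are hypothetical. A real system would use a vetted cryptographic library rather than this.

```python
import secrets


def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Toy one-time pad: returns the cipher text and the key that reverses it."""
    key = secrets.token_bytes(len(plaintext))  # the "additional information"
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """XOR with the same key turns the cipher text back into the plaintext."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


# 1. The original data (hypothetical sample)
plaintext = "Comment submitted by Jane Doe".encode("utf-8")

# 2. The technique applied, and 3. its output: cipher text plus key
ciphertext, key = encrypt(plaintext)

# Whoever holds both parts can recover the original data
recovered = decrypt(ciphertext, key)
```

The point of the sketch is the split in the output: the cipher text on its own reveals nothing readable, while the key is the additional information that makes the original data recoverable.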
A full summary of the facts of the case can be found in my previous post, but to quickly recap:
The Single Resolution Board (SRB) is the resolution authority for the European Banking Union.
In June 2017, SRB hired Deloitte to carry out some analysis that involved comments that SRB had collected from shareholders of a bank.
SRB therefore shared with Deloitte copies of the comments along with alphanumeric codes generated for each comment.
SRB did not share with Deloitte any other data SRB had initially collected from the shareholders.
Those shareholders complained to the European Data Protection Supervisor (EDPS) that SRB had not informed them that their personal data would be shared with Deloitte.
SRB argued that the information it shared with Deloitte was not personal data since the comments and alphanumeric codes could not be used by Deloitte to identify individuals.
The EDPS disagreed with SRB's view and found that the information shared with Deloitte constituted personal data.
SRB brought proceedings against the EDPS before the General Court of the CJEU, which ruled in SRB's favour, holding that the information shared with Deloitte was not personal data.
The EDPS appealed this verdict by the General Court, which is the subject of this latest case.
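To make the data flow in these facts concrete, here is a minimal Python sketch of a controller replacing names with alphanumeric codes and sharing only the coded comments. Everything in it is an assumption for illustration: the function names, the code length, the random generation method and the sample submissions are hypothetical, and nothing here describes how the SRB actually generated its codes.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def new_code(length: int = 12) -> str:
    """Generate a random alphanumeric code (length is an arbitrary choice)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def pseudonymise(submissions):
    """Split submissions into records to share and a lookup table to keep."""
    shared = []   # what the recipient receives: code + comment only
    lookup = {}   # what the controller keeps: code -> identity
    for name, comment in submissions:
        code = new_code()
        while code in lookup:  # avoid reusing a code, however unlikely
            code = new_code()
        lookup[code] = name
        shared.append({"code": code, "comment": comment})
    return shared, lookup


# Hypothetical sample data
submissions = [
    ("Shareholder A", "The valuation understates the bank's assets."),
    ("Shareholder B", "I was not given enough time to respond."),
]

shared_with_recipient, kept_by_controller = pseudonymise(submissions)
```

In this sketch the recipient only ever sees `shared_with_recipient`. The lookup table stays with the controller, which is the separation on which the rest of the case turns.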
AG Spielmann's opinion on the matter was that the information shared with Deloitte was not personal data. The reasoning was threefold:
As per Recital (16), the identifiability of a data subject could be achieved by a data controller 'or by another person.' The reasonable likelihood of identifiability needs to be considered, taking into account the cost and time required to do so and the technology available to achieve identifiability.1
CJEU caselaw has previously set out the parameters for such identifiability. In particular, the Court has held that, in certain circumstances, information could still be considered personal data even if "dissociated from the identification data held by someone else."2
Pseudonymised data may not be considered personal data if "the risk of identification is non-existent or insignificant."3
Therefore:
According to the AG, this reasoning supports a relative, rather than a strict, approach to the concept of personal data and pseudonymisation. Taking a relative approach to the current case, the AG suggests that the key question for the Court to consider is whether "Deloitte had reasonable means to identify [the data subjects]."4 Only if Deloitte did have such means should the comments and alphanumeric codes it received from the SRB be considered personal data.
The case is ultimately about the following question:
If entity A collects personal data, pseudonymises it and shares only the pseudonymised data with entity B, with only entity A possessing the additional information required to use the pseudonymised data to identify specific individuals, is the pseudonymised data shared with entity B personal data under the GDPR?
The CJEU's verdict
In simple terms, the CJEU upheld the relative approach to pseudonymised data - pseudonymised data is not always personal data.
In coming to this conclusion, the Court addressed two main matters:
The meaning of personal data
The concept of pseudonymisation
The meaning of personal data
The SRB is regulated by a version of the GDPR that applies specifically to EU institutions (Regulation (EU) 2018/1725). But the definition of 'personal data' is the same under both the GDPR and Regulation (EU) 2018/1725:
...any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.
The Court first pointed out that, given the phrase 'any information', the notion of 'personal data' under EU data protection law is a wide one encompassing both objective and subjective information (including opinions and assessments).5 Information relates to a data subject (i.e., an identifiable person) when its "content, purpose or effect links it to an identifiable person."6
In this case, the CJEU found that the comments along with their respective alphanumeric codes were personal data. The comments constituted opinions or views, and such types of information are "an expression of a person's thinking" and therefore "are necessarily closely linked to that person."7 This flows from the Court's decision in the Nowak case in which it held that an examination script can contain personal data belonging to the person taking the exam, including the candidate's answers and the examiner's comments contained therein.
The concept of pseudonymisation
The key question for the Court was whether the information shared with Deloitte (i.e., the comments and their respective alphanumeric codes) still constituted personal data when it received it from the SRB. The EDPS had argued that such information was in fact personal data because the additional information needed to link the comments to individuals merely existed, even though such additional information was not accessible to Deloitte and only accessible to the SRB.
In its judgment, the Court referenced the definition of 'pseudonymisation' under the GDPR8 and Regulation (EU) 2018/1725:9
...the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.
The Court therefore acknowledged that the purpose of pseudonymisation is to reduce the identifiability of a data subject.10 This is also mentioned in Recital (17) of Regulation (EU) 2018/1725.
Furthermore, the Court acknowledged that, if additional information exists that can be used to link pseudonymised data to an identifiable person, then that data is indeed pseudonymised data rather than anonymised data.11 So when a cryptographic protocol is applied to plaintext and generates both the cipher text and the cryptographic keys, the existence of those keys means that it is possible to transform the cipher text back into plaintext. Accordingly, the cipher text must be regarded as pseudonymised data, and not anonymised data.
However, the Court also acknowledged that for effective pseudonymisation, the additional information that makes identifiability possible needs to be kept separately.12 This is in fact explicitly mentioned in the definition of pseudonymisation: "provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person." (Emphasis added)
Therefore, if technical and organisational measures are put in place to keep the additional information separate, then this could have an impact on whether that information is ultimately personal data.
But the perspective of the controller and the recipient of the pseudonymised data is important in this regard. If the controller maintains access to the additional information, then the pseudonymised data, from the controller's perspective, is personal data. This is because the controller has the means to identify data subjects using the pseudonymised data along with the additional information it holds.
However, this may differ with respect to an entity which only possesses the pseudonymised data and not the additional information. From the recipient's perspective (like Deloitte in this case) the pseudonymised data may not be regarded as personal data if in particular:13
The recipient cannot "lift" the technical and organisational measures to prevent re-identification when it is processing the data
The recipient cannot otherwise perform re-identification through some other means, "such as cross checking with other factors."
Therefore, as provided in Recital (16) of Regulation (EU) 2018/1725, to determine whether pseudonymised data itself constitutes personal data, "account should be taken of all the means reasonably likely to be used" by the recipient of that pseudonymised data. This includes considering "all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments."14 For example, it could be possible to perform re-identification by using the pseudonymised data to find additional information on the internet which together could be used to identify a specific individual.15
However, the Court has also held previously that "means of identifying the data subject is not reasonably likely to be used where the risk of identification appears in reality to be insignificant, in that the identification of that data subject is prohibited by law or impossible in practice."16
These findings led the Court to conclude, as AG Spielmann did, that pseudonymised data are not always personal data.17 It depends on the circumstances, and therefore a relative approach should be taken when determining whether pseudonymised data are personal data in the given context:
...pseudonymised data must not be regarded as constituting, in all cases and for every person, personal data for the purposes of the application of Regulation 2018/1725, in so far as pseudonymisation may, depending on the circumstances of the case, effectively prevent persons other than the controller from identifying the data subject in such a way that, for them, the data subject is not or is no longer identifiable.18
Accordingly, the pseudonymised data that the SRB shared with Deloitte (i.e., the comments with their respective alphanumeric codes) may not be personal data. It depends on whether Deloitte possessed 'reasonably likely' means to identify the shareholders that submitted the comments.
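The relative approach can be loosely modelled as a check that depends on who holds the data. The Python sketch below is a deliberate simplification and not a statement of the legal test: the class, field and function names are hypothetical, the three flags compress what is in practice a fact-specific assessment of the means reasonably likely to be used, and the values given to the two example holders are assumptions rather than findings about the SRB or Deloitte.

```python
from dataclasses import dataclass


@dataclass
class Holder:
    name: str
    has_additional_information: bool  # e.g. the code-to-name table or the keys
    can_lift_safeguards: bool         # able to get around the separation measures
    can_cross_check: bool             # other reasonably likely means of identification


def is_personal_data_for(holder: Holder) -> bool:
    """Same dataset, different answer depending on the holder's position."""
    return (
        holder.has_additional_information
        or holder.can_lift_safeguards
        or holder.can_cross_check
    )


# Hypothetical holders of the same pseudonymised dataset
controller = Holder("controller", True, False, False)
recipient = Holder("recipient", False, False, False)

print(is_personal_data_for(controller))  # True: it holds the additional information
print(is_personal_data_for(recipient))   # False under these assumed facts
```

If any one of the recipient's flags flips, for example because cross-checking against other sources becomes realistic, the same dataset would count as personal data in its hands too.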
SRB's obligations with respect to the pseudonymised data
Another key finding from the CJEU's judgment concerned the obligations that the SRB was subject to with respect to the pseudonymised data it shared with Deloitte.
On this, the Court held that, since the pseudonymised data in the hands of the SRB was personal data collected from shareholders, the SRB's obligation to inform shareholders of potential recipients of that data, like Deloitte, still applied.
The fact that the information shared with Deloitte may not be personal data in the circumstances is immaterial. This is because the obligation to provide data subjects with information about how their data may be processed applies at the point of collection:
...the purposes of the obligation to provide the data subject – at the time of collection of the personal data linked to him or her – with information relating to the potential recipients of those data is to enable that data subject to decide, in full knowledge of the facts, whether to provide or, on the contrary, refuse to provide the personal data being collected from him or her.19
Therefore:
...the obligation to provide information laid down in Article 15(1)(d) of Regulation 2018/1725 is part of the legal relationship between the data subject and the controller and, therefore, it concerns the information in relation to that data subject as it was transmitted to that controller, thus before any potential transfer to a third party.20
Following this, the SRB was required to provide information to shareholders about the fact that their personal data may be shared with other third parties like Deloitte. This is the case "irrespective of whether or not those data were personal data, from Deloitte’s point of view, after any potential pseudonymisation."21
Thoughts on the case
In my previous post on EDPS v SRB, I looked at the implication of a relative approach to pseudonymised data for on-device processing and encryption. If, in a client-server type of architecture, the service provider receives encrypted data from a client device with no means to decrypt that data (i.e., it does not have its own copy of the cryptographic keys), then it has not received personal data as per the GDPR.
However, if homomorphic encryption becomes more feasible to deploy at scale for these types of services, whereby operations can be performed on encrypted data without decrypting it, would that encrypted data still be considered non-personal data? Perhaps not.
Another interesting implication of the CJEU's decision in EDPS v SRB is in relation to whether deployed LLMs 'contain' personal data. These models are usually trained on personal data, and during their training there is the possibility that they memorise some of that personal data to the extent that, using the right prompt, that very data can be included in a model output.
Using the relative approach from EDPS v SRB, would it be the case that such information is only personal data to the extent that the user of the model is able to work out the prompt to extract it from the model (akin to someone managing to regenerate the cryptographic keys to decrypt encrypted data)? This might depend on two key questions though:
Is it reasonably likely that somebody will use the LLM with a prompt that results in the generation of personal data that is not included in the prompt itself?
Is it reasonably likely that those with access to these outputs will be able to identify the data subject concerned?
Case C-413/23 P, European Data Protection Supervisor v Single Resolution Board (6 February 2025), paras. 54-55.
Case C-413/23 P, European Data Protection Supervisor v Single Resolution Board (6 February 2025), para. 56.
Case C-413/23 P, European Data Protection Supervisor v Single Resolution Board (6 February 2025), para. 57.
Case C-413/23 P, European Data Protection Supervisor v Single Resolution Board (6 February 2025), para. 59.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 54.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 55.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 58.
Article 3.6.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 72.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 73.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 74.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 77.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 79.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 81.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 82.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 82.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 86.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 108.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 110.
Case C‑413/23 P, EDPS v SRB (4 September 2025), para. 112.