Simple principles for good data protection
Some truths and some myths about protecting personal data
1. Data has value and risk
Data is the lifeblood of the digital world. It fuels many of the digital products and services that you use and the activities that support their development.
Without people's data, Big Tech cannot extract the behavioural surplus needed to fuel their surveillance capitalist machines that turn quantified experiences into revenue.
But data also comes with risk. Negative consequences can certainly follow from improper uses of data.
Data that are of poor quality, biased or non-representative, and that are then fed into machine learning algorithms, can produce models that are equally substandard. And if these models are incorporated into systems used in highly consequential contexts, such as making medical diagnoses, then you've got serious problems.
So data is not just "the new oil" or "an asset." It is something that requires careful and considered use given its ability to be either a boon or a complication, depending on who is using it and how.
2. Privacy and data protection are not the same thing
Privacy regulates who gets access to your data. Data protection regulates how that data can be used once accessed.
The protection of your data does not stop once you hand it over. Data protection is concerned with the whole data lifecycle, to ensure that the data are used fairly and responsibly.
Data protection can therefore be a means to ensure privacy. But data protection can also be a means to protect other rights and interests:
For example, the GDPR protects people from being subject to decisions based solely on automated processing that produce significant effects on them, with some exceptions. Even where those exceptions apply, individuals have the right to obtain meaningful information about the logic involved in the automated processing and to contest such a decision.
These rights become highly relevant when, for instance, a company algorithmically manages the shifts of its gig economy workers. Such workers have the right to know exactly how this automated processing works, especially when it determines their job opportunities.
This is not really about privacy, since the employer would need to know the availability of its workers to assign shifts for the delivery of its service. But how this is done is important, for this still ought to be done in a manner that does not lead to unfair outcomes.
3. Only take what you need
Data minimisation requires that you only collect and use the data that are necessary for fulfilling the intended purpose.
This means that the type, nature and amount of data should be of relevance and utility to the processing purpose. If the processing purpose is not clearly articulated, data minimisation is not possible.
When you buy a product from an online retailer to be delivered to your home, you only need to provide them with your contact details, payment details and home address. Nothing more is required for that processing activity.
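The delivery example can be sketched in code as an allow-list of fields per processing purpose. This is only an illustrative sketch: the purpose name `home_delivery`, the field names and the `minimise` function are assumptions made for the example, not part of any standard or library.

```python
# Hypothetical allow-list mapping each processing purpose to the fields it needs.
# The purpose and field names are illustrative assumptions.
REQUIRED_FIELDS = {
    "home_delivery": {"contact_details", "payment_details", "home_address"},
}


def minimise(submitted: dict, purpose: str) -> dict:
    """Keep only the fields that the stated purpose requires."""
    if purpose not in REQUIRED_FIELDS:
        # Without an articulated purpose there is no way to decide what is necessary.
        raise ValueError(f"no articulated purpose: {purpose!r}")
    allowed = REQUIRED_FIELDS[purpose]
    return {field: value for field, value in submitted.items() if field in allowed}
```

Anything the customer submits beyond the allow-list, such as a date of birth, is simply dropped before storage, and an unknown purpose is rejected outright rather than defaulting to "keep everything".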
4. Don’t keep it “just in case”
Excess data can be messy and risky.
Maintaining multiple datasets across multiple systems is not easy. Plus, the more data one holds, the higher the possibility that it gets misused and the more severe a breach becomes if a malicious third party attacking your systems gains access to it.
So do not keep any more data than you actually need. Again, articulating the processing purpose is essential for this - once you know the purpose of its collection, you know whether you need to keep it and for how long.
For example, why keep CVs of previously rejected job applicants? Those CVs are likely to get outdated over time, and in any case the data are from people who ended up not joining the organisation. What is the point of keeping such records?
If you do not need to keep the data, then do not keep that data.
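One way to put this into practice is a retention check tied to the purpose of collection. The sketch below is a minimal illustration under stated assumptions: the purpose name `recruitment_rejected`, the 180-day period and the record layout are all hypothetical, and a real retention period depends on the organisation's own legal obligations.

```python
from datetime import date, timedelta

# Hypothetical retention periods per purpose. The 180-day figure is an
# assumption for illustration, not a legal requirement.
RETENTION = {
    "recruitment_rejected": timedelta(days=180),
}


def is_expired(collected_on: date, purpose: str, today: date) -> bool:
    """True when a record has been held longer than its purpose allows."""
    return today - collected_on > RETENTION[purpose]


def purge(records: list[dict], today: date) -> list[dict]:
    """Return only the records that are still within their retention period."""
    return [
        r for r in records
        if not is_expired(r["collected_on"], r["purpose"], today)
    ]
```

Run on a schedule, a check like this means rejected applicants' CVs are deleted once the period lapses instead of lingering "just in case".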
5. The use of data must be clear and justifiable to the data subject
If you cannot justify the use of data to the person that it belongs to, then should you be using it at all?
This requires considering things from the perspective of the data subject. If you were a data subject, would you accept the justification for the proposed use of your data and is that justification clear to you?
The context of the data processing and the reasonable expectations of the data subject are therefore key factors here. If you post content on a social media platform for your connections to see, you may not reasonably expect that that content is also used to train an AI model developed by the platform provider. One use of data does not automatically justify another, even if it is remotely related.
And even if the use of data is justifiable, this needs to be made clear to the data subject. Stuffing the justification in a privacy policy which is linked at the bottom of a website is sometimes not enough. Information about how data is used ought to be delivered in a manner convenient for the data subject, namely at the time that the data are collected.
Collecting someone's data through deceptive means does not really suggest that its use is justifiable or fair.
6. You don’t always need consent
Consent is one way of obtaining someone's data, but it is not the only way.
Sometimes asking for consent may not actually be appropriate. This could be the case in an employer-employee relationship; the power imbalance is such that employees will feel pressured to accept the use of their data by their employer. They may often feel that there is not a genuine choice.
Other times, asking for consent simply does not make sense. If processing personal data is necessary to perform a contract with the data subject, you do not need to ask for their consent to that processing. After all, if the data subject refused to give consent, the contract could not be performed at all.
As a data protection contrarian, I always embraced the "data is the new oil" cliché:
- in its raw form it's not much use - you have to process it into something that then has a pretty specific use case;
- the business of actually moving the stuff and storing it is at least as important as being able to get it;
- leaks are costly, result in very bad publicity, pollute the wider environment and are nigh on impossible to totally clean up;
- employees need some training or they're going to do the equivalent of lighting a cigarette at the wrong time.
I guess these days I'd add that the industries are pretty similar - it starts out with the romance of rugged individualism and wildcatters, and ends up with becoming an extension of the state, with all the associated corruption and political sycophancy.
Someone should do a remake of There Will Be Blood, but with Elon Musk as the protagonist.