5. And in real life?
We are now faced with the fact that we cannot meet the requirements of the GDPR by anonymizing the information system at a reasonable price. Should we conclude that it is appropriate to do nothing? Admittedly, anonymization that complies 100% with the GDPR is very complex to implement. But let’s come back to more pragmatic questions. Is it really a problem that people who already have access to production data can trace an anonymized set back to the individual? From my point of view, the risk that the company fails to protect personal data in this way is low. Certain cases, such as the reconciliation of data between competitors or a dishonest subcontractor, could cause concern, but these concerns belong to industrial espionage or unfair competition rather than to the GDPR. In the case of an unscrupulous subcontractor, a piece of advice: switch to a more trustworthy one, even if it is more expensive. You will gain in the end.
Let's go back to the basics of the GDPR. At no point does it say that anonymization is necessary to protect data. Anonymization is a way to take data out of the scope of the GDPR, but by no means the only solution.
6. Data exposure area
The GDPR requires security measures to be proportionate to the risks that data subjects face if their personal data is used without their consent.
Thus, the exposure of this data should be reduced to only those who need it for information processing purposes. Most production environments in companies offer sufficient measures to protect against data theft, but data usage in the IT world today creates much larger and less secure exposure areas than production environments.
Some of these exposure surfaces are handled by cryptography, such as the encryption of communications with web browsers or between sites (HTTPS). Other sources of exposure are mobile phone applications, which are connected to corporate networks and managed by a strict internal IT security policy. Still others are test environments, which are often copies of the production environment without adequate security. In this case, anonymization, even if incomplete (i.e. pseudonymization), drastically reduces the data attack surface, which improves both the overall state of security and compliance with the GDPR. In practice, pseudonymization does not take data out of the scope of the GDPR, but the regulation encourages it and relaxes several of its requirements for the companies that use it.
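To make the test-environment case concrete, here is a minimal sketch of pseudonymization applied while copying rows out of production. It assumes a keyed hash (HMAC-SHA256) as the pseudonymization technique, which is one option among several; the column names, the sample rows and the key are all hypothetical. Because whoever holds the key can recompute the pseudonyms, this is pseudonymization and not anonymization.

```python
import hashlib
import hmac

# Hypothetical secret: in practice it would be stored outside the test environment.
SECRET_KEY = b"replace-with-a-secret-kept-in-production"

# Hypothetical names of the columns that hold direct identifiers.
DIRECT_IDENTIFIERS = ("name", "email")


def pseudonymize_value(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed hash of a value, shortened to 16 hex characters."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def pseudonymize_row(row: dict) -> dict:
    """Copy a row, replacing direct identifiers with their pseudonyms."""
    return {
        column: pseudonymize_value(value) if column in DIRECT_IDENTIFIERS else value
        for column, value in row.items()
    }


# Made-up sample rows standing in for a production extract.
production_rows = [
    {"name": "Alice Martin", "email": "alice@example.com", "city": "Lyon"},
    {"name": "Bob Durand", "email": "bob@example.com", "city": "Paris"},
]
test_rows = [pseudonymize_row(row) for row in production_rows]
```

The same input always yields the same pseudonym, so joins between tables keep working in the test copy, while the test environment itself never sees the key.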
7. It is therefore still relevant to start an anonymization process
Anonymization, even if incomplete, therefore provides solutions that improve the level of compliance with the GDPR and lays the foundation for an ideal solution.
The cost and technical difficulty of a perfect, global anonymization solution make it almost impossible to implement at present. However, projects that reduce risk now and evolve over time remain relevant and in line with current needs.
Most anonymization solutions are constantly evolving and improve the level of anonymity through successive optimizations. Changes in technology will bring their share of solutions but also their share of problems, which is why it is not wise to put off protective measures. On the contrary, it is advisable to anticipate the risk.
The less data is exposed, the less emerging technologies such as “machine learning”, “quantum computers” or “AI” (a term that I hate due to its overuse these days) will be able to have an impact on the lives of people who have trusted us with their data. After all, the GDPR is not there to bother us with strict and unfounded rules to follow but to protect individuals.
We don’t need to strictly anonymize to be compliant with the GDPR. We need to put in place a set of data protection elements that ultimately protect people.
8. My advice
Embarking on a project to completely anonymize the production database is a costly and time-consuming exercise that carries a significant risk of failure. The anonymization is likely to be incomplete, in which case it is only pseudonymization and the data remains within the scope of the GDPR.
However, it can be done right. To do this, we need to distinguish between uses, and compartmentalize our needs so that we have full control over what we use:
Open data or statistics: Identify the relevant scope of data and export it to a separate database that is easier to anonymize, where each useful data item can be processed correctly. Assess the needs precisely for each element, check for noise, and truncate or generalize your data. On controlled sets, anonymization analysis is relevant and possible at a reasonable cost. Bear in mind that this use is very risky in terms of leakage, as such datasets are usually intended for wide communication outside your company. You must therefore be very careful when generating them.
Generation of test samples for development purposes: As the data remains within the company, it generates a lower risk. Again, the sample size to be anonymized should be reduced to minimize the risk of re-identification. By implementing an anonymization process, you will greatly reduce the data attack surface and therefore effectively protect your data from potential leakage. Unless all data generated during your anonymization is altered, it is strongly recommended that you put strong security measures in place on your anonymized databases. In the case of theft, even if the data is not anonymous in the eyes of the law because it can be re-identified by your authorized employees, it will be unusable by the thief.
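The truncation and generalization mentioned for open data and statistics can be sketched as follows. This is an illustration under assumptions, not a complete method: the field names and records are invented, and the check used here (the size of the smallest group of records sharing the same quasi-identifiers, often called k in k-anonymity) is only one possible way to estimate re-identification risk on a controlled set.

```python
from collections import Counter


def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def truncate_postcode(postcode: str, keep: int = 2) -> str:
    """Keep the first characters of a postcode and mask the rest."""
    return postcode[:keep] + "*" * (len(postcode) - keep)


def generalize_record(record: dict) -> dict:
    """Generalize the quasi-identifiers and keep the useful value untouched."""
    return {
        "age": generalize_age(record["age"]),
        "postcode": truncate_postcode(record["postcode"]),
        "diagnosis": record["diagnosis"],
    }


def smallest_group_size(records: list, quasi_identifiers: tuple) -> int:
    """Size of the smallest group of records sharing the same quasi-identifiers."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Invented records standing in for a controlled export.
records = [
    {"age": 34, "postcode": "69001", "diagnosis": "A"},
    {"age": 37, "postcode": "69003", "diagnosis": "B"},
    {"age": 52, "postcode": "75011", "diagnosis": "A"},
    {"age": 58, "postcode": "75002", "diagnosis": "C"},
]
generalized = [generalize_record(r) for r in records]

k_before = smallest_group_size(records, ("age", "postcode"))
k_after = smallest_group_size(generalized, ("age", "postcode"))
```

On this toy set every raw record is unique on age and postcode, while after generalization each record shares its quasi-identifiers with one other. A real dataset would need a larger group size, and a check on the diversity of the sensitive values inside each group.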
And I will end on an optimistic note: don't forget that anonymization is an ongoing project, not only because it will not be perfect the first time, but also because the algorithms that work today may not work tomorrow. You therefore have the time and opportunity to set up a continuous improvement process and eventually achieve a perfect anonymization of all the elements within your information system. In other words, you can start small and take the time to do the job right.