Typically, anonymization is required for test environments.
A good knowledge of the overall scope of the database is important, because it will help in assessing which types of data will need to be anonymized.
It is also important to consider how specific data relate to each other, as some data are inseparable.
To assist the administrator, the discovery of the data eligible for anonymization must be as automated as possible, using algorithms catering for the various types of data.
But in some cases, anonymization is needed for production environments. This is especially the case with the "right to be forgotten", which has been considerably reinforced by the GDPR.
Indeed, anyone residing in the European Union and whose organization holds personal data may take control over his/her data.
But in many cases, simply deleting this data would have a significant impact on other data. In such cases anonymization is therefore a better solution as it renders personal data inaccessible, while preserving the usability of data to allow normal application operation and consistency of results.
Take the example of an online commerce site. When a product is sold, out-of-stock, money-in, or parcel-delivery data are necessary for the the business to operate and cannot be removed. However, the name of the buyer, his address or banking data can be.
The right to be forgotten, whether it results from a specific request or a regulation on the conservation of historical data, is the most common reason for anonymizing a production environment.