Data Masking – Data masking & consistency

In today’s databases most values are stored more than once. For example, a person’s name is stored in the Customer_table, but also in the Billing table. In most cases it can be beneficial to keep both values in sync. To make this possible DATPROF Privacy has the possibility to store a translation of an anonymization in a separate table. This feature can be found in the Function Editor, under the tab “Translation table”. Here you can enable/disable the usage of a translation table. When enabled you can select in which schema and under what name you want to save the table (i.e. TT_FIRST_NAME, TT as in Translation Table).

As said, a translation table keeps a translation consisting of the old value (i.e. the original first name) and the new value (i.e. the shuffled first name) of an anonymization function. It also adds the primary key value(s) of the anonymized table. These keys can be used in other functions to find the right anonymized value in the translation table, so another table can be anonymized in the same manner.

Usage of a translation table

A translation table is often used as input for a value lookup. A translation table makes it possible to anonymize consistently throughout a database or a chain of databases. It is imperative that the key you use must be available in both systems and/or tables. A primary key isn’t always the right key for this, so it is possible in DATPROF Privacy to ‘toggle primary key’. This is a virtual key; the toggled key won’t be created on the database, but it will be added to the translation table. Social security numbers and account numbers are usable data for this, amongst others. As long as the key is unique and present in all affected tables, it will be usable.

Advanced uses

Usage of a translation table can be straightforward, but it is also possible to combine multiple translation tables into one view or table. An example, you have multiple translation tables relating to person data, one for the result of the first name shuffle, one for the last name shuffle, one with the new social security number. All these tables contain an id, in this case the customer_id. Then it will be possible to create a combined translation table, including all old and new person’s data. The great benefit of this, is that it makes it possible to use one function on tables that you want to anonymize based on a table that was anonymized earlier in the process, instead of one function for each value that you want to anonymize.

Security

Your translation table contains the original values. Therefore you should treat the translation tables with a care we often advice clients to treat it as if it were production data. For example, you could place the translation table(s) in a separate schema with a separate privilege scheme. Furthermore, if the anonymized database is on another server, from where distribution of the data to test environments takes place, this reduces the risk even further.

Related

Don't miss anything

Signup for our newsletter