A Sample Data Masker WHERE Clause Options Panel
When masking data using Where Clauses there is an important source of error which must be noted. This issue is associated with the mistake of making implicit assumptions regarding the contents of the field on which the Where Clause operates.
For example, if masking operations were to be applied to the FIRST_NAME column of the
EMPLOYEE table the relevance of the column could be preserved by substituting
female and male names where appropriate. All female first names in the table could be substituted by choosing
the Names, First Names, Female dataset and using a WHERE Clause of WHERE EMP_GENDER='F'.
This would cause only the female employee records to be selected and masked.
A separate rule using the Names, First Names, Male dataset and a second WHERE clause of WHERE EMP_GENDER='M' might be used to perform a similar substitution on the male entries.
But there is potentially a BIG problem with the above solution!!!!
In the example above, two rules with two WHERE Clauses were applied to the target table. It is important to realize that if there are EMP_GENDER values in the target table which are not equal to either 'M' or 'F' then the FIRST_NAME fields associated with these values will NOT be masked. This effect is called a Where Clause Skip and is discussed in more detail in the Data Scrambling Issues white paper. In such cases, a masking rule with a default value is usually applied to the columns in the target table before the update with the WHERE Clause takes place.
A better solution, using the above example, would implement a Substitution rule using the Names, First Names, Female dataset without a Where Clause to mask all of the FIRST_NAME fields with female first names (irregardless of gender) then a second rule with a WHERE Clause of WHERE EMP_GENDER='M' and the Names, First Names, Male dataset could be used to mask just the male names. This two step method, causes some rows (the male ones) to be masked twice but also ensures that all rows of sensitive data in the target table get some form of masking. This avoids the need to specifically target each case with a WHERE Clause and also future proofs the masking rule so that if any new values appear in the EMP_GENDER field in the future, the FIRST_NAME field will still automatically be rendered anonymous without any changes to the masking rules. Note that the execution order of the rules is critical in the two-step process just described. It is not valid to have both rules run simultaneously. To control the execution order use
Rule Blocks and Dependencies.
Where Clause Skips