Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Tips Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
SQL Scripts & Samples
Tips
» Database Forum
» Slideshows
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted January 22, 2018

WEBINAR:
On-Demand

Full Text Search: The Key to Better Natural Language Queries for NoSQL in Node.js


How to Implement Dynamic Data Masking in Azure SQL Database and SQL Server

By Anoop Kumar

In a previous article, we discussed Dynamic Data Masking (DDM) and how DDM helps to secure PII/PHI data from unprivileged users. The DDM feature is available in SQL Server 2016 onwards as well as Azure SQL Database. 

This feature is useful to implement data security for existing data sets and future data. One of the best parts of this feature is that it doesn’t require any change in the application and data doesn’t change at the physical level. The data masking can be achieved by using any available built in functions. We discussed all these functions in previous articles.

In this article, we will discuss how we can use these built in functions to mask the data based on different data patterns, like social security number, credit card number, date of birth, email address, name, etc.

Data Masking Functions

SQL Server provides four built in functions to mask data in SQL tables. These functions are as follows:

  1. partial()
  2. default()
  3. email()
  4. random()

We learned about these functions in a previous article. In this article, we will discuss how we can use these functions to achieve data masking.

Test Data Preparation

First, we will create test tables and insert test data in these tables. Later, we will apply data masking functions and see how the data looks like after data masking.

         Create test tables:

Create table TestDDM 
              (ID Int, 
              PersonName varchar (100), 
              EmailAddress varchar(120), 
              CreditCardNumber varchar(19), 
              SocialSecurityNumber varchar(11))
 
CREATE TABLE TestPerson
             ( ID INT, 
             Name VARCHAR(100),
             DOB DATETIME,
             INCOME NUMERIC)

         Insert data in test tables:

      INSERT INTO TestDDM  Values (1, 'Anoop Kumar','abcdefgh@hotmail.com','1234-5678-4321-8765','123-45-6789')
      INSERT INTO TestPerson VALUES (1, 'John Smith','06-07-1986',10000)

         Retrieve table data:

		SELECT * FROM TestDDM

TestDDM table with Data

Figure 1 TestDDM table with Data

SELECT * FROM TestPerson

TestPerson table with Data
Figure 2 TestPerson table with Data

Currently, no data masking is implemented so data can be visible as it is inserted in tables.

Data Masking Implementation

In this section, we will use all four built in data masking functions and learn the behavior of these functions.

partial() - The partial () function gives the option to customize the masking. The function exposes the first and last letters and adds a custom padding string in the middle.

We have used partial() function to mask the social security number. We masked the initial five digits of the social security number and exposed last four digits.

ALTER TABLE TestDDM
ALTER COLUMN [SocialSecurityNumber] ADD MASKED WITH (FUNCTION = 'partial(0,"XXX-XX-",4)');

email() : The email() function masks the email address.

The command below masks the column EmailAddress in the TestDDM table using the email() function. The function masks the whole string of the email address except the first letter of the email, the special character of the address (@), and the constant suffix of the email address (.com).

ALTER TABLE TestDDM ALTER COLUMN EmailAddress ADD MASKED WITH (FUNCTION = 'email()')

default() : The default() function masks the full value of a column.

This SQL command below masks the CreditCardNumber column in the TestDDM table. The function masks the complete number string and replaces the masked value with ‘X’.

ALTER TABLE TestDDM 
ALTER COLUMN CreditCardNumber varchar(19) MASKED WITH (FUNCTION = 'default()') NULL

Review Results

Now, we will review the behavior of data masking implementation. We will run a Select query and review the data.

SELECT * FROM TestDDM;  

TestDDM table Data - Unmasked
Figure 3 TestDDM table Data - Unmasked

We can see that data is still visible as inserted. There is no change in data behavior and the data doesn’t mask. The reason for this behavior is user permission. In the current scenario, my ID has db_owner permission and has full access to the data.

To understand the behavior of mask functions and masked data, we will create a new database user TestMaskUser (without login) and will grant select permission on the TestDDM table to the newly created database user.

CREATE USER TestMaskUser WITHOUT LOGIN;  
GRANT SELECT ON TestDDM TO TestMaskUser;  

Now, we will change the context of the query execution and review the TestDDM data table.

EXECUTE AS USER = 'TestMaskUser';  
SELECT * FROM TestDDM;  
REVERT;  

TestDDM table Data - Masked
Figure 4 TestDDM table Data - Masked

We can see that after changing the user context, data in the TestDDM table has been masked and complete values are not visible to user TestMaskUser as expected.

  1. SocialSecurityNumber is showing last 4 digits
  2. EmailAddress is visible with first character, special character (@) and suffix (.com) only
  3. CreditCardNumber has masked completely

Now, you can understand how easy it is to mask the data available in your database. In the next example, we will see the use of the random() function and how to mask the date of birth data.

random() - function is used on any numeric type to mask the original value with a random value within a specified range.

This SQL command below uses the random() function to mask the Income column in the TestPerson table. This function will replace the original value with a random number within a given range in the random() function.

ALTER TABLE TestPerson
ALTER COLUMN INCOME NUMERIC MASKED WITH (FUNCTION = 'random(1,100000)') NULL

This SQL command masks the DOB column in the TestPerson table with the default() function. This function will replace the actual DOB with a default value 1900-01-01 00:00:00:000.

ALTER TABLE TestPerson
ALTER COLUMN DOB  DATETIME MASKED WITH (FUNCTION = 'default()') NULL

Now, we will grant select permission to TestMaskUser and change the context of execution to review table TestPerson data.

GRANT SELECT ON TestPerson TO TestMaskUser
EXECUTE AS USER = 'TestMaskUser';  
SELECT * FROM TestPerson;  
REVERT;  

TestPerson table Data - Masked
Figure 5 TestPerson table Data - Masked

We can see that after applying the random() function the value of income was changed and the date of birth has been replace with the default value 1900-01-01 00:00:00:000.

List of Masked Columns

At any point of time, we can find what columns have been masked and what mask functions have been used to implement the data masking in a database.

The SQL query below can help to identify the list of masked columns in a database.

SELECT a.name, b.name as table_name, a.is_masked, a.masking_function  
FROM sys.masked_columns AS
JOIN sys.tables AS b   
    ON a.[object_id] = b.[object_id];  

List of Masked Columns
Figure 6 List of Masked Columns

Grant and Revoke UNMASK Permission

We know that data masking is a feature used to secure data that should not be visible to unprivileged users.However, there are methods to control the access of data. We can use the query below to grant permission to UNMASK the user. After this, the unmasked user can see the actual stored data in a table rather than masked data.

GRANT UNMASK TO TestMaskUser;

Grant UNMASK to DB User
Figure 7 Grant UNMASK to DB User

Once we Grant UNMASK to TestMaskUser, the user can see actual data rather than masked data.

Similarly, if we Revoke UNMASK permission to TestMastUser, the user can see only masked data.

REVOKE UNMASK TO TestMaskUser;

Revoke UNMSK to DB User
Figure 8 Revoke UNMSK to DB User

Summary

Dynamic data masking is a great feature for both on-premise SQL Server and Azure SQL Database as well. This feature can help users to secure their critical data elements without making any change at physical level. All the unprivileged users can only see masked data and don’t have access to actual values. The DDM is one of the methods to de-identify data in SQL Server 2016 onwards and Azure SQL Database.

See all articles by Anoop Kumar



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM