SqlCredit - Part 9: Message Authentication CodesSeptember 27, 2007 Download the files for this article. This is part 9 of a series. If you have not read part 8, you can find it here. The Problem: Searching Encrypted DataOnce we have encrypted the data in our table, we cannot efficiently search that data without making further changes. Remember from part 7 that EncryptByCert, EncryptByKey, EncryptByAsymKey, and EncryptByPassPhrase all create different results (ciphertext) each time they are called. The solution is to use Message Authentication Codes. Note that this is a special use of the general term used for messaging. Hashing is different from encryption in that it always returns the same ciphertext. If I run this code: SELECT HashBytes('MD5', 'secret')
I get this ciphertext: 0x5EBE2294ECD0E0F08EAB7690D2A6EE69 Running it again returns the same result. Running it on a different machine also returns the same result. Different algorithms return different ciphertext. For example, if I run this code: SELECT HashBytes('SHA1', 'secret')
I get this ciphertext: 0xE5E9FA1BA31ECD1AE84F75CAAA474F3A663F05F4 HashBytes always returns varbinary results with the following length:
I created two UDTs (HashResultMD and HashResultSHA) based on the result lengths of the different algorithms. It would be very reasonable to use the maximum length and have only one type. Another option is to use just some of the bytes of the result. I have chosen here to use the complete result. It is important to understand that, even using the full result, there is not a guarantee that you will not have collisions. What About Dictionary Attacks?Because hashing creates consistent ciphertext, it would be possible to run a dictionary attack against a table and find values that match a given string. From the example above, we can see that any entry with ciphertext that matches 0x5EBE2294ECD0E0F08EAB7690D2A6EE69 was created by hashing the string secret using the MD5 algorithm. The way to fix this is by using a salt value. If we can unencrypt a salt value using DecryptByKey (which we have seen is fast) and then pre-pend (or append) the string were trying to store, then a hacker cant just hash a set of values and compare the ciphertext against the result. The salt should be a reasonable length, but I dont think you should expose the exact length in your code. If you create a UDT that allows for the storage of a hashed value prepended with a secret, the UDT should allow for some variance in the secret length. Look-Up by Hashed ValueReference the attached file EncryptByKeyPerfTest_Setup.sql. Given a value (for example, CardNumber or SecureString), we can now perform efficient searches. Since hashing consistently returns the same ciphertext, we can first hash the look-up value and then find a record or records that match that value. Note that SecurityCode1 does not have a corresponding MAC column in CardEncryptByKey. This is because there is no use case for looking up a record by its security code without its card number. If we use a MAC column to allow efficient look-ups of records, the security code column is used only for validation that the proper record has been found. Create MAC columns only for columns that will be searched, and remember that the MAC supports efficient look-up and not exact filtering. A Simple(ish) ExampleReference the attached file SimplestEncrWithMAC.sql. It is easier to show code than explain everything behind it. Here, we build a table to store the salt and another to store data records: CREATE TABLE dbo.SecretSalt ( SaltID tinyint NOT NULL, Salt varbinary(100) NOT NULL CONSTRAINT SecretSalt_PK PRIMARY KEY CLUSTERED (SaltID) ); CREATE TABLE dbo.Hash1 ( RecordID int NOT NULL, EncryptResult varbinary(116) NOT NULL, -- Encrypted HashResult varbinary(20) NOT NULL -- Hashed CONSTRAINT Hash1_PK PRIMARY KEY CLUSTERED (RecordID) ); CREATE NONCLUSTERED INDEX Hash1_HashResult_IDX ON dbo.Hash1 ( HashResult ); First, lets add the salt record.
INSERT INTO dbo.SecretSalt (
SaltID,
Salt
)
SELECT
1,
EncryptByKey(
Key_GUID('TestKey1'),
'd=s_4ZeG3me4E#U4ru6ag&@ru@A2Afr7'
);
Next, add records to Hash1. We run this code within a loop to create n records.
SELECT @string1 = 'Simple' + CONVERT(varchar(11), @ct);
INSERT INTO dbo.Hash1 (
RecordID,
EncryptResult,
HashResult
)
VALUES (
@ct,
EncryptByKey(Key_GUID (N'TestKey1'), @string1),
HashBytes('SHA1', @string1)
);
The code will add records that look like this: RecordID EncryptResult HashResult -------- -------------------------- ------------------- 1 0x00EC5555DE58...FFEDCA8C7 0xF71C033...4C307B5 2 0x00EC5555DE58...D73349886 0x7A66A82...A6B4DBC Now, we can look up a single record that matches an expected string. We pick the string 5 which should match the fifth record and return a RecordID of 5. DECLARE @lookUp int; SELECT @lookUp = 5; -- Pick some record to search for SELECT h.RecordID, CONVERT(int, CONVERT(varchar(11), DecryptByKey(h.EncryptResult))) AS 'Decrypted Value' FROM dbo.Hash1 AS h CROSS JOIN dbo.SecretSalt AS s WHERE s.SaltID = 1 AND h.HashResult = HashBytes ( 'SHA1', CONVERT(varchar(63), DecryptByKey(s.Salt)) + CONVERT(varchar(11), @lookUp) ) -- This uses the index to find the record AND CONVERT(int, CONVERT(varchar(11), DecryptByKey(h.EncryptResult))) = @lookUp; The WHERE clause has three parts: 1. Even though there is only one record in the SecretSalt table, tell it which one to use. 2. Allow the optimizer to use the index on the varbinary hash column. 3. Validate that the correct record or records are returned. All these parts are required for an efficient and reliable look-up. This returns the exact record we expected. What do you think? If you agree, fine. If not, provide feedback in the forum. I have added a new thread called MAC Code. This code is complex. If you find a more straightforward way of doing this, please share. Limitations and OptionsMessage authentication codes work only for finding an exact match. It does not support range queries or wildcards. Using a message authentication code column allows a hacker to determine values that are the same. As an example, if we salted and then hashed two credit card numbers that were the same, the ciphertext would be exactly the same. You may choose to not use the full ciphertext returned. For instance, the varbinary result (16 or 20 bytes) can be converted to an integer (4 bytes) or bigint (8 bytes). Whether you use the full ciphertext or a portion of it, you still have to validate that the encrypted value matches the value you are searching for because of the possibility of hash collisions. An option is to SELECT records that match the hashed value and only validate the encrypted data if the SELECT returns more than one record. For Next TimeNext month, I will discuss the performance results of the combination of encryption and hashing. I will also update the SqlCredit code to include all these changes. References
Download the files for this article. Discuss this article in the SQLCredit Forum. » See All Articles by Columnist Rob Garrison SqlCredit - Developing a Complete SQL Server OLTP Database Project
Performance Testing SQL 2008's Transparent Data Encryption
SQL Server 2008's Change Data Capture - Tracking the Moving Parts Performance Testing - SQL Server 2008 versus SQL Server 2005 Exploring SQL Server's Index INCLUDEs Exploring SQL 2005's Ranking Functions - NTILE() and ROW_NUMBER() SqlCredit - Part 19: Exploring SQL 2005's Ranking Functions - RANK() and DENSE_RANK() SqlCredit, Part 18: Exploring the Performance of SQL 2005's OUTPUT Clause SqlCredit - Part 17: Exploring SQL 2005's OUTPUT Clause SqlCredit - Part 16: The Cost of Bloat SqlCredit - Part 15: The Cost of Distribution SqlCredit - Part 14: The Cost of Translation SqlCredit - Part 13: More on Indexed Persisted Computed Columns SqlCredit - Part 12: Exploring Indexed Persisted Computed Columns SqlCredit - Part 11: Change Tracking Using History Records SqlCredit - Part 10: MAC Performance and Updating SqlCredit SqlCredit - Part 9: Message Authentication Codes SqlCredit Part 8: Comparing Encrypt/DecryptByCert and Encrypt/DecryptByKey SqlCredit Part 7: Performance Impact of EncryptByCert and DecryptByCert SqlCredit Part 6: Exploring EncryptByCert and DecryptByCert SqlCredit - Part 5: Adding Card, Vendor, and Purchase, Plus Much Refactoring SqlCredit - Part 4: Schema and Procedure Security SqlCredit - Part 3: Schema/Proc Updates and Automated Unit Testing SqlCredit - Part 2: Creating the Database, Tables, CRUD Procedures SqlCredit - Developing a Complete SQL Server OLTP Database Project |