Wednesday, August 20, 2014

Create Hash of Multiple Columns

Josh,

So, you need to compare several columns of one table to several columns of another.  Perhaps you are checking to see if data has changed before updating a row, or you need to track if multiple columns have changed.  Besides doing a bunch of AND statements, a nifty feature of SQL is the HASHBYTES function.

This function will hash an input based on the algorithm selected.  The algorithms can be SHA1, MD5, etc.  Care should be used when using this function to compare multiple columns.  A simple concatenation could lead to some misidentified matches.

DECLARE @tbl TABLE (
   
Id INT IDENTITY(1,1),
   
Col1 VARCHAR(5) NULL,
   
Col2 VARCHAR(5) NULL,
   
Col3 VARCHAR(5) NULL
);
INSERT INTO @tbl (Col1Col2Col3)VALUES ('ABC''456','DEF'),
       (
'AB','C456','DEF'),
       (
'AB','C456',NULL),
       (
'ABC','456',NULL),
       (
'ABC','456',NULL)
SELECT
   
t.*,
   
HASHBYTES('md5'Col1 Col2 Col3AS Md5_Concatenation--BAD
   
HASHBYTES('md5''Col1=' ISNULL(Col1,'') + 'Col2=' ISNULL(Col2,'') + 'Col3=' ISNULL(Col3,'')) AS Md5_Concatenation2--GOOD
   
HASHBYTES('md5', (SELECT t2.Col1t2.Col2t2.Col3 FROM @tbl t2 WHERE t.Id t2.Id FOR XML RAW)) AS Md5_Xml--GOOD
   
(SELECT t2.Col1t2.Col2t2.Col3 FROM @tbl t2 WHERE t.Id t2.Id FOR XML RAWAS XmlVal
FROM
   
@tbl t
  

As shown in the sample results below, line Id 1 and 2 show the exact same hash based on simple concatenation.  Instead, the columns should be identified as in the "Md5_Concatenation2" example or using the FOR XML RAW syntax as shown in the "Md5_Xml" example.


No comments:

Post a Comment

Please only include comments that add to the substance of the Blog Post:
- Question or request for clarification
- Idea for improvement
- Identifying an issue

Please refrain from promotion.