Dealing with Comma Delimited Strings
September 29, 2006
When dealing with data you come across many different situations. In this article I will discuss how to deal with a few situations that involve working with comma separated data. Comma separated data can come in many forms. It can be input, a text string stored in a column, or a number of other situations. This article will deal with two different comma separated data situations.
Displaying Multiple Records from a Single Record
In this situation there is a column in a table that holds a series of values that are separated by a comma. For each record, the comma separated column needs to be parsed apart and returned as separate row. So the final output record set will contain multiple records for a given single record stored in a SQL Server table.
To demonstrate this I will run the following code:
set nocount on -- Create Example1 Table and Populate with Data create table Example1 (Id int, TypeOfValues varchar(20), ColumnOfValues char(30)) CREATE UNIQUE CLUSTERED INDEX ID_ind ON Example1(Id) WITH IGNORE_DUP_KEY insert into Example1 values(1, 'Colors','Red,Green,Blue,Black,White') insert into Example1 values(2, 'Models','Normal,Deluxe,Super Deluxe') insert into Example1 values(3, 'Years','2004,2005,2006') -- Create Number Table SELECT IDENTITY(INT) AS Number INTO Numbers FROM sysobjects s1 CROSS JOIN sysobjects s2 CREATE UNIQUE CLUSTERED INDEX Number_ind ON Numbers(number) WITH IGNORE_DUP_KEY SELECT Id, TypeOfValues, SUBSTRING( ColumnOfValues, Number, CHARINDEX( ',', ColumnofValues + ',', Number ) - Number ) as Value FROM Example1 INNER JOIN Numbers ON SUBSTRING( ',' + ColumnOfValues, Number, 1 ) = ',' where Number <= Len(ColumnOfValues) + 1 drop table Example1 drop table Numbers
When I run this code on my server I get the following results:
Id TypeOfValues Value ----------- -------------------- ------------------------------ 1 Colors Red 1 Colors Green 1 Colors Blue 1 Colors Black 1 Colors White 2 Models Normal 2 Models Deluxe 2 Models Super Deluxe 3 Years 2004 3 Years 2005 3 Years 2006
Here you can see that for each Id there are multiple rows. Each row has only one value from the comma delimited column ColumnOfValues. To understand how this was accomplished lets review my code.
First I create the table Example1, and then populate it with three different records. Each record contains a type column (TypeOfValues) and a value column (ColumnOfValues) which contains a comma delimited string of with different values. The ColumnOfValues column will be the column that that is parsed apart to create multiple records for each record in the Example1 table.
Next I create a Numbers table, by joining sysobjects to itself. This table will contain a series of sequential numbers starting from 1.
Finally the SELECT statement parses apart the ColumnOfValue column into multiple records. It does this by using the Numbers table to identify the offset of each comma. This is done by joining the Numbers table to a single character substring of the ColumnOfValues column, starting with the first character, then second, and so on. Whenever the join condition finds a comma, it uses the Number value to identify the starting point of the SUBSTRING and CHARINDEX functions, so these functions can extract a single character string of value out of the ColumnOfValues columns. To increase the performance of this statement a WHERE clause is added to reduce the number of rows from the Numbers table that needs to be joined to the Example1 table.
Displaying a Single Record with a Comma Separated Column from Multiple Records
Some times you might have a table that contains a series of records that contain a key and a value column. In your table there might be many different values for a given key. This example will show you how to collapse all those key value pairs into a single record. That single record will contain a unique key followed by a comma separate string composed of all the values associated with the key.
Here is an example of the table I will be using that contains a key (id_no) and a value (item):
id_no item ----------- -------------------- 1 Skiing 1 Diving 2 Diving 2 Skiing 2 Hunting 2 Fishing 4 Sailing 4 Skiing 5 Skiing
In this table for each id_no there is one or more items identified. Each record contains a single item value. I will use the code listed below to populate the above table:
-- create example table CREATE TABLE Example2(id_no int not null, item varchar(20) not null) -- populate the example table INSERT INTO Example2 VALUES (1, 'Skiing') INSERT INTO Example2 VALUES (1, 'Diving') INSERT INTO Example2 VALUES (2, 'Diving') INSERT INTO Example2 VALUES (2, 'Skiing') INSERT INTO Example2 VALUES (2, 'Hunting') INSERT INTO Example2 VALUES (2, 'Fishing') INSERT INTO Example2 VALUES (4, 'Sailing') INSERT INTO Example2 VALUES (4, 'Skiing') INSERT INTO Example2 VALUES (5, 'Skiing')
The next code snippet returns a record set that contains a single record for each id_no, followed by a comma delimited string that concatenates each item value together into a single column value:
-- declare local variables declare @p varchar(1000) declare @i char(5) declare @sm int declare @m int -- Print Report Heading print 'id_no' + ' items' print '----- ' + '------------------------------------------' set @p = '' -- set @m to the first id number select top 1 @m = id_no from Example2 order by id_no set @sm = 0 -- Process each id_no until no more items while @m <> @sm begin set @sm = @m -- string together all items with a comma between select @i = id_no, @p = case when @p = '' then item else @p + ', ' + item end from Example2 a where id_no = @m -- print id_no, and comma delimited string print @i + ' ' + @p -- increment id number select top 1 @m = id_no from Example2 where id_no > @sm order by id_no set @p = '' end -- remove example table drop table Example2
When I run this code against my Example2 table I get the following output:
id_no items ----- ------------------------------------------ 1 Skiing, Diving 2 Diving, Skiing, Hunting, Fishing 4 Sailing, Skiing 5 Skiing
Let me explain how this code works. This code iteratively process each id_no value using a WHILE loop. Each pass through the WHILE loop strings together all the item values for a given id_no. The variable @m contains the value of the id_no for the records being collapsed into a single record. The following SELECT statement does all the work to collapse all the records for a given id_no value into a single row in the output:
select @i = id_no, @p = case when @p = '' then item else @p + ', ' + item end from Example2 a where id_no = @m
This code concatenates a comma with the value of the item column and adds it to the variable @p. This method allows you a way to summarize a character string, in this case the value of the item column followed by a comma. After this command has completed execution the variable @i contains the id_no, and the @p variable contains a comma delimited string of item column values for the give id_no.
The PRINT statement is used to display each row of comma delimited values for a given id_no. The last SELECT statement in the WHILE loop set the @m variable to the next id_no to be processed. This WHILE loop continues to creating comma delimited strings for each item column processing one id_no at a time until all id_no records have been processed.
This article showed you only two examples of how to deal with comma separated data. One example showed you how to break apart comma separated data, where as the other one showed you how to join multiple records into a single record where the data was separated by commas. Hopefully next time you have to deal with comma separated data these examples will give you a jump start on writing your T-SQL code to work with comma separated data.