Scenario
I have been asked on numerous occasions to provide a data feed that provides data for a customer, or plant or some other entity with a catch, can you take a one to many relationship and make one column of data out of it. That answer, of course, is “Sure, I can do anything!” but then I get into the how and I start to build a stored procedure that uses temp tables, involves a WHILE loop of some kind and I have to manually code the correct answer. The process is nothing complicated but it definitely involved some extra steps before providing back the results.
My thinking though is that it would be great if I could do that in a SQL statement and let SQL Server do all of the work. Low and behold, I ran across a method that turned my original process into a single query statement, which obviously is much simpler!
Example Data Set
To illustrate, I have the following scenario:
- Please provide a Customer list by State and include all of the Customer Regions in one column on each state’s row.
Simple request but the issue is that within a state, I have multiple regions depending on a postal code of the address of the customer office.
In my sample database, my data is broken out into three tables: (for the sake of being able to create this locally as an example, I have done everything in tempdb)
create table #tmpCustomer (CustomerId int identity, CustomerName varchar(max))
create table #tmpCustomerAddress (AddressId int identity, CustomerId int, StateAbbreviation varchar(5), PostalCode varchar(10))
create table #tmpPostalCode (PostalCodeId int identity, PostalCode varchar(20), PostalRegion varchar(30))
The relationships are:
- A customer can have multiple addresses
- For a state, you can have multiple regions based upon the postal code
- The postal code belongs to a postal region
These relationships imply that a customer can have multiple regions per state as well, where the postal code of the customer address will map to the region.
For sample data, the following inserts are executed:
insert into #tmpCustomer (CustomerName)
select 'Customer A'
insert into #tmpCustomerAddress(CustomerId, StateAbbreviation, PostalCode)
select 1, 'PA', '19111'
union
select 1, 'PA', '19103'
union
select 1, 'NJ', '08034'
insert into #tmpPostalCode (PostalCode, PostalRegion)
select '19111', 'NE Philly'
union
select '19103', 'Center City'
union
select '08034', 'South Jersey'
Query Solution
Now, if we queried the data using straight joins we would get the following:
select
c.CustomerName,
ca.StateAbbreviation,
pc.PostalRegion
from
#tmpCustomer c
inner join #tmpCustomerAddress ca
on c.CustomerId = ca.CustomerId
inner join #tmpPostalCode pc
on ca.PostalCode = pc.PostalCode
| CustomerName | StateAbbreviation | PostalRegion |
| Customer A | NJ | South Jersey |
| Customer A | PA | Center City |
| Customer A | PA | NE Philly |
However, the above dataset does not provide the solution, the request is to roll up the State Regions for the customer into one row per state.
The following query gives us the answer:
select distinct
c.CustomerName,
ca.StateAbbreviation,
substring(
(select '; ' + ipc.PostalRegion
from #tmpPostalCode ipc
inner join #tmpCustomerAddress ica
on ipc.PostalCode = ica.PostalCode
where
ica.CustomerId = ca.CustomerId
and ica.StateAbbreviation = ca.StateAbbreviation
for xml path('')), 3, 1000) as Region
from
#tmpCustomer c
inner join #tmpCustomerAddress ca
on c.CustomerId = ca.CustomerId
inner join #tmpPostalCode pc
on ca.PostalCode = pc.PostalCode
Here are the results:
| CustomerName | StateAbbreviation | Region |
| Customer A | NJ | South Jersey |
| Customer A | PA | Center City; NE Philly |
The key of the query is using the for xml path('') syntax as a subquery in the select portion of the field list, as well as using the distinct keyword to remove duplicate entries.
To note, the for xml path('') statement would put in xml tags if the column was named in the subquery but since we left the column name empty when we concatenated the semicolon and the postal region, the xml nodes are empty. The extra substring is just to parse off the preceding delimiter that we start all records with.
The above example could then be used as an inner query tied to other tables within the database for more fields that are also at a single Customer State level.
Conclusion
As the above example showed, the ability to roll up a one to many relationship into a single row set can be be very useful and it would have saved me time in the past if I had come across this method earlier. In addition, having the ability to do this using one T-SQL statement as opposed to having to generate a loop and a procedure to build your dataset is definitely a simpler and in my opinion a better answer.





