Looping through tables in SQL might seem counterintuitive at first.
After all, SQL is designed for set-based operations, not procedural programming. However, there are scenarios where row-by-row processing is necessary, or simply more practical than contorting the logic into a single statement.
Let’s explore various techniques to loop through tables in SQL, complete with practical examples and real-world use cases.
Cursors are the most straightforward way to loop through a table in SQL. They allow you to process one row at a time. Here’s a basic example:
DECLARE @id INT, @name VARCHAR(50);

DECLARE cur CURSOR FOR
    SELECT id, name FROM employees;

OPEN cur;
FETCH NEXT FROM cur INTO @id, @name;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Process each row
    PRINT 'ID: ' + CAST(@id AS VARCHAR) + ', Name: ' + @name;
    FETCH NEXT FROM cur INTO @id, @name;
END

CLOSE cur;
DEALLOCATE cur;
This code declares a cursor over the employees table, fetches each row, and prints the ID and name. Cursors are simple to write, but they fetch one row at a time, and that per-row overhead is why they are often frowned upon for large datasets.
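If you do need a cursor, declare it with lighter-weight options. For read-only, forward-only iteration in SQL Server, LOCAL FAST_FORWARD is the usual choice:

-- A read-only, forward-only cursor with less locking and management overhead
DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT id, name FROM employees;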
Another option that avoids cursor syntax entirely is a WHILE loop driven by ROW_NUMBER():
DECLARE @id INT, @name VARCHAR(50), @rownum INT = 1;

WHILE EXISTS (SELECT 1
              FROM (SELECT ROW_NUMBER() OVER (ORDER BY id) AS rownum FROM employees) t
              WHERE t.rownum = @rownum)
BEGIN
    -- Re-number the rows and pick the one matching the current counter
    SELECT @id = id, @name = name
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY id) AS rownum, id, name FROM employees) t
    WHERE t.rownum = @rownum;

    -- Process each row
    PRINT 'ID: ' + CAST(@id AS VARCHAR) + ', Name: ' + @name;

    SET @rownum = @rownum + 1;
END
This avoids explicit cursor management, but note the cost: both the EXISTS check and the row lookup re-number the entire table on every iteration, so on larger tables it is often slower than a cursor unless id is indexed.
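When the table has a usable key, a seek-based WHILE loop is usually the stronger pattern. A minimal sketch, assuming id is an indexed, positive integer key:

DECLARE @id INT = 0, @name VARCHAR(50);  -- @id = 0 assumes all ids are positive

WHILE 1 = 1
BEGIN
    -- Seek the next row past the last key we processed
    SELECT TOP (1) @id = id, @name = name
    FROM employees
    WHERE id > @id
    ORDER BY id;

    IF @@ROWCOUNT = 0 BREAK;  -- no more rows

    PRINT 'ID: ' + CAST(@id AS VARCHAR) + ', Name: ' + @name;
END

Each pass is a single index seek instead of a full re-numbering, so total work stays near-linear instead of quadratic.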
Tally tables can be incredibly useful for looping. Here’s how you might use one:
-- Create a tally table
WITH
E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS E1(N)),
E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 * 10 = 100 rows
E4(N) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 100 * 100 = 10,000 rows
Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4),
NumberedEmployees AS (
    SELECT ROW_NUMBER() OVER (ORDER BY id) AS rn, id, name
    FROM employees
)
SELECT e.id, e.name
FROM NumberedEmployees e
INNER JOIN Tally t ON t.N = e.rn
ORDER BY e.id;
This generates a number table on the fly and joins each number to the row with the matching ordinal, giving you ordered, position-based access to every row without any procedural code.
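The same trick replaces many classic loops outright. For example, here is a sketch that generates one row per day of January 2024 from a smaller, 100-row tally (the dates are purely illustrative):

WITH
E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS E1(N)),
E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 * 10 = 100 rows
Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E2)
SELECT DATEADD(day, N - 1, CAST('2024-01-01' AS DATE)) AS day_of_month
FROM Tally
WHERE N <= 31
ORDER BY N;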
Let’s apply these looping techniques to a real-world scenario. Imagine you need to update customer statuses based on their recent order history:
DECLARE @customerId INT, @lastOrderDate DATE, @newStatus VARCHAR(20);
DECLARE @today DATE = GETDATE();

DECLARE cur CURSOR FOR
    SELECT c.customer_id, MAX(o.order_date) AS last_order_date
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id;

OPEN cur;
FETCH NEXT FROM cur INTO @customerId, @lastOrderDate;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Classify the customer by how recently they ordered
    SET @newStatus =
        CASE
            WHEN @lastOrderDate IS NULL THEN 'New'
            WHEN DATEDIFF(day, @lastOrderDate, @today) <= 30 THEN 'Active'
            WHEN DATEDIFF(day, @lastOrderDate, @today) <= 90 THEN 'At Risk'
            ELSE 'Inactive'
        END;

    UPDATE customers
    SET status = @newStatus
    WHERE customer_id = @customerId;

    FETCH NEXT FROM cur INTO @customerId, @lastOrderDate;
END

CLOSE cur;
DEALLOCATE cur;
This script loops through customers, calculates their last order date, and updates their status accordingly. While this could be done with a single UPDATE statement, the looping approach allows for more complex logic and potential additional processing per customer.
Another common use case for looping is data migration. Suppose you’re moving data from an old schema to a new one, with some transformations along the way:
DECLARE @oldId INT, @oldName VARCHAR(50), @oldEmail VARCHAR(100);
DECLARE @newId INT;

DECLARE cur CURSOR FOR
    SELECT id, name, email FROM old_customers;

OPEN cur;
FETCH NEXT FROM cur INTO @oldId, @oldName, @oldEmail;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Insert into the new customers table and capture the new key
    INSERT INTO new_customers (name, email)
    VALUES (@oldName, @oldEmail);

    SET @newId = SCOPE_IDENTITY();

    -- Migrate related data under the new key
    INSERT INTO new_orders (new_customer_id, order_date, amount)
    SELECT @newId, order_date, amount
    FROM old_orders
    WHERE old_customer_id = @oldId;

    FETCH NEXT FROM cur INTO @oldId, @oldName, @oldEmail;
END

CLOSE cur;
DEALLOCATE cur;
This script migrates customer data from an old schema to a new one, including related order data. The loop allows for complex transformations and handling of related data that might be difficult to achieve in a single set-based operation.
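When the per-row logic exists only to carry keys from the old rows to the new ones, the same migration can usually be done set-based. A sketch, assuming SQL Server and the table names above, using MERGE with an OUTPUT clause to capture the old-to-new id mapping in one pass:

-- Capture old id -> new id pairs while inserting
CREATE TABLE #IdMap (old_id INT, new_id INT);

MERGE new_customers AS tgt
USING old_customers AS src
    ON 1 = 0  -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (name, email) VALUES (src.name, src.email)
OUTPUT src.id, inserted.id INTO #IdMap (old_id, new_id);

-- Migrate related orders in one statement using the mapping
INSERT INTO new_orders (new_customer_id, order_date, amount)
SELECT m.new_id, o.order_date, o.amount
FROM old_orders o
INNER JOIN #IdMap m ON m.old_id = o.old_customer_id;

DROP TABLE #IdMap;

MERGE is used here only because, unlike a plain INSERT ... SELECT, its OUTPUT clause can reference source columns alongside the inserted identity values.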
When dealing with large datasets, using temporary tables can improve performance:
-- Create a temp table with the data to process
SELECT id, name, email
INTO #CustomersToProcess
FROM customers
WHERE last_update < DATEADD(day, -30, GETDATE());

DECLARE @id INT, @name VARCHAR(50), @email VARCHAR(100);

WHILE EXISTS (SELECT 1 FROM #CustomersToProcess)
BEGIN
    -- Get one remaining row (no ORDER BY, so processing order is arbitrary)
    SELECT TOP 1 @id = id, @name = name, @email = email
    FROM #CustomersToProcess;

    -- Process the customer (e.g., send an email)
    EXEC sp_SendEmail @email, 'Welcome Back!', 'We miss you...';

    -- Remove the processed row
    DELETE FROM #CustomersToProcess WHERE id = @id;
END

DROP TABLE #CustomersToProcess;
This approach is particularly useful when you need to process a subset of a large table efficiently.
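A further refinement when per-row overhead adds up: claim rows in batches instead of one at a time. A sketch using DELETE TOP ... OUTPUT against the same #CustomersToProcess table (the batch size of 500 is an arbitrary choice):

DECLARE @batch TABLE (id INT, name VARCHAR(50), email VARCHAR(100));

WHILE 1 = 1
BEGIN
    -- Claim up to 500 rows and remove them from the work queue in one operation
    DELETE TOP (500) FROM #CustomersToProcess
    OUTPUT deleted.id, deleted.name, deleted.email INTO @batch (id, name, email);

    IF @@ROWCOUNT = 0 BREAK;  -- nothing left to process

    -- Process the whole batch set-based here ...

    DELETE FROM @batch;  -- clear the buffer for the next iteration
END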
Before resorting to loops, consider if a set-based approach might work. Here’s an example of updating customer statuses without a loop:
WITH CustomerLastOrder AS (
    SELECT c.customer_id, MAX(o.order_date) AS last_order_date
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id
)
UPDATE c
SET c.status =
    CASE
        WHEN clo.last_order_date IS NULL THEN 'New'
        WHEN DATEDIFF(day, clo.last_order_date, GETDATE()) <= 30 THEN 'Active'
        WHEN DATEDIFF(day, clo.last_order_date, GETDATE()) <= 90 THEN 'At Risk'
        ELSE 'Inactive'
    END
FROM customers c
INNER JOIN CustomerLastOrder clo ON c.customer_id = clo.customer_id;
This set-based approach is often more efficient than looping, especially for large datasets.
When looping through tables, it’s crucial to handle potential errors gracefully. Here’s an example using TRY…CATCH:
DECLARE @customerId INT;

DECLARE cur CURSOR FOR
    SELECT customer_id FROM customers;

OPEN cur;
FETCH NEXT FROM cur INTO @customerId;

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        -- Attempt an operation that might fail
        EXEC sp_UpdateCustomerDetails @customerId;

        -- If successful, log it
        INSERT INTO operation_log (customer_id, operation, status)
        VALUES (@customerId, 'UpdateDetails', 'Success');
    END TRY
    BEGIN CATCH
        -- Log the error and keep going
        INSERT INTO error_log (customer_id, error_message)
        VALUES (@customerId, ERROR_MESSAGE());
    END CATCH

    FETCH NEXT FROM cur INTO @customerId;
END

CLOSE cur;
DEALLOCATE cur;
This approach ensures that an error in processing one row doesn’t stop the entire operation.
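If the per-row operation writes to several tables, consider giving each row its own transaction so a failure rolls back that row's partial work. A sketch of the same TRY block, assuming sp_UpdateCustomerDetails touches multiple tables:

BEGIN TRY
    BEGIN TRANSACTION;
    EXEC sp_UpdateCustomerDetails @customerId;
    COMMIT TRANSACTION;

    INSERT INTO operation_log (customer_id, operation, status)
    VALUES (@customerId, 'UpdateDetails', 'Success');
END TRY
BEGIN CATCH
    -- Undo this row's partial work; the loop continues with the next row
    IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;

    INSERT INTO error_log (customer_id, error_message)
    VALUES (@customerId, ERROR_MESSAGE());
END CATCH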
While SQL is optimized for set-based operations, looping through tables has its place in a SQL developer’s toolkit. We’ve covered:
- cursors, the most direct row-by-row tool
- WHILE loops driven by ROW_NUMBER() or a key seek
- tally tables for loop-like logic without procedural code
- temp-table work queues for processing subsets of large tables
- set-based alternatives that often make the loop unnecessary
- error handling with TRY...CATCH so one bad row doesn’t stop the run
Remember, the best approach depends on your specific use case, data volume, and performance requirements. While loops can solve complex problems, always consider if a set-based solution might be more efficient.