Problem Statement
We are given a Logs
table structured as:
Column Name | Type |
---|---|
id | int |
num | varchar |
The id
column is the primary key and is an autoincrement column.
Our task is to write an SQL query to find all numbers that appear at least three times consecutively. We can return the result table in any order.
Example:
Input:
id | num |
---|---|
1 | 1 |
2 | 1 |
3 | 1 |
4 | 2 |
5 | 1 |
6 | 2 |
7 | 2 |
Output:
ConsecutiveNums |
---|
1 |
Here, '1' is the only number that appears consecutively for at least three times.
Solution Approaches
We will walk through five different solutions to this problem, highlighting their differences, strengths, and weaknesses.
Approach 1: Using LEAD, LAG with CTE
The first approach uses LEAD
and LAG
window functions with a Common Table Expression (CTE) to create a virtual table that includes the current number, its previous two numbers, and its next two numbers. It then identifies the distinct numbers that appear consecutively three times.
WITH lead_lag AS (
SELECT
num,
LAG(num, 1) OVER (ORDER BY id) [prev_num],
LAG(num, 2) OVER (ORDER BY id) [prev2_num],
LEAD(num, 1) OVER (ORDER BY id) [next_num],
LEAD(num, 2) OVER (ORDER BY id) [next2_num]
FROM Logs
)
SELECT DISTINCT num [ConsecutiveNums]
FROM lead_lag
WHERE
(num = prev_num AND prev_num = prev2_num)
OR (prev_num = num AND num = next_num)
OR (num = next_num AND next_num = next2_num)
This method uses explicit window functions for looking ahead and behind. Its runtime is 1302ms, beating 8.96% of LeetCode submissions.
Approach 2: Using LEAD, LAG with Subquery
The second approach is similar to the first one, but it implements the window functions inside a subquery instead of a CTE.
SELECT DISTINCT num [ConsecutiveNums]
FROM
(
SELECT
num,
LAG(num, 1) OVER (ORDER BY id) [prev_num],
LAG(num, 2) OVER (ORDER BY id) [prev2_num],
LEAD(num, 1) OVER (ORDER BY id) [next_num],
LEAD(num, 2) OVER (ORDER BY id) [next2_num]
FROM Logs
) [l]
WHERE
(num = prev_num AND prev_num = prev2_num)
OR (prev_num = num AND num = next_num)
OR (num = next_num AND next_num = next2_num)
Despite the similarity, this method is slightly less performant with a runtime of 1539ms, beating 5.11% of LeetCode submissions.
Approach 3: Using CASE with CTE
The third approach simplifies the comparison logic by implementing a CASE
statement that assigns a value to a flag (is_valid
) indicating whether a number appears consecutively three times or not.
WITH consecutive_group AS (
SELECT
num,
CASE
WHEN
LAG(num, 2) OVER (ORDER BY id) = LAG(num, 1) OVER (ORDER BY id)
AND LAG(num, 1) OVER (ORDER BY id) = num
THEN 1
WHEN
LAG(num, 1) OVER (ORDER BY id) = num
AND num = LEAD(num, 1) OVER (ORDER BY id)
THEN 1
WHEN
num = LEAD(num, 1) OVER (ORDER BY id)
AND LEAD(num, 1) OVER (ORDER BY id) = LEAD(num, 2) OVER (ORDER BY id)
THEN 1
ELSE 0
END [is_valid]
FROM Logs
)
SELECT DISTINCT num [ConsecutiveNums]
FROM consecutive_group
WHERE is_valid = 1
This method simplifies the final query and improves performance. The runtime is 914ms, beating 60.10% of LeetCode submissions.
Approach 4: Using CASE with Subquery
Similar to the comparison between approaches 1 and 2, this approach is the subquery version of approach 3.
SELECT DISTINCT num [ConsecutiveNums]
FROM
(
SELECT
num,
CASE
WHEN
LAG(num, 2) OVER (ORDER BY id) = LAG(num, 1) OVER (ORDER BY id)
AND LAG(num, 1) OVER (ORDER BY id) = num
THEN 1
WHEN
LAG(num, 1) OVER (ORDER BY id) = num
AND num = LEAD(num, 1) OVER (ORDER BY id)
THEN 1
WHEN
num = LEAD(num, 1) OVER (ORDER BY id)
AND LEAD(num, 1) OVER (ORDER BY id) = LEAD(num, 2) OVER (ORDER BY id)
THEN 1
ELSE 0
END [is_valid]
FROM Logs
) [c]
WHERE is_valid = 1
While being slightly less performant, the runtime is 988ms, beating 42.41% of LeetCode submissions.
Approach 5: Simplified Approach with LEAD
In the final approach, we simplify the SQL query by focusing on the consecutive numbers in the 'future' and ignoring the 'past'. This is achieved by only using the LEAD
window function.
SELECT DISTINCT num [ConsecutiveNums]
FROM (
SELECT
num,
LEAD(num, 1) OVER (ORDER BY id) AS next_num,
LEAD(num, 2) OVER (ORDER BY id) AS next2_num
FROM Logs
) [t]
WHERE num = next_num AND num = next2_num
This method provides a cleaner syntax while maintaining a reasonable performance. The runtime is 1049ms, beating 30.79% of LeetCode submissions.
Conclusion
We have explored five different methods to solve this problem, each with their unique strengths and trade-offs. Each method has shown a different use of window functions, subqueries, and common table expressions.
In terms of performance on LeetCode:
- Approach 3 is the most performant (914ms runtime, beats 60.10%).
- Approach 4 follows closely (988ms runtime, beats 42.41%).
- Approach 5, although simplified, takes third place (1049ms runtime, beats 30.79%).
- Approach 1, while detailed, comes fourth (1302ms runtime, beats 8.96%).
- Approach 2, which uses a subquery, is the least performant (1539ms runtime, beats 5.11%).
Note that these rankings may vary based on the actual real-world RDBMS and data distribution.
The original problem can be found on LeetCode.
For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.
Top comments (1)
It does not work for :
Input
output is
but expected
We need to check the rows IDs:
UPDATED