The Problem
In a given Stadium
table, we are tasked with finding records that follow a specific pattern. The table schema is as follows:
Column Name | Type |
---|---|
id | int |
visit_date | date |
people | int |
visit_date
is the primary key for this table. Each row of this table contains the visit date, id, and the number of people during the visit. There are no duplicate visit_dates, and as the id increases, the dates increase as well.
Our goal is to write an SQL query that will display the records with three or more rows having consecutive ids, with the number of people greater than or equal to 100 for each of these records. The result table should be ordered by visit_date
in ascending order.
Here is an example to better illustrate this problem:
Input:
id | visit_date | people |
---|---|---|
1 | 2017-01-01 | 10 |
2 | 2017-01-02 | 109 |
3 | 2017-01-03 | 150 |
4 | 2017-01-04 | 99 |
5 | 2017-01-05 | 145 |
6 | 2017-01-06 | 1455 |
7 | 2017-01-07 | 199 |
8 | 2017-01-09 | 188 |
Output:
id | visit_date | people |
---|---|---|
5 | 2017-01-05 | 145 |
6 | 2017-01-06 | 1455 |
7 | 2017-01-07 | 199 |
8 | 2017-01-09 | 188 |
The four rows with ids 5, 6, 7, and 8 have consecutive ids and each of them has >= 100 people attended. Note that row 8 was included even though the visit_date was not the next day after row 7.
The Solution
We are presenting two solutions that achieve the same goal using different SQL constructs. Both approaches use window functions (LEAD
, LAG
), which provide access to rows at a given physical offset.
Source Code 1
The first approach uses a Common Table Expression (CTE) to create a temporary view with added columns that represent the values of people
at one and two positions behind and ahead of the current row.
The main query then checks if there are three consecutive rows (prev2_people
, prev_people
, people
or prev_people
, people
, next_people
or people
, next_people
, next2_people
) where each people
is greater than or equal to 100.
WITH lead_lag AS (
SELECT *,
LAG(people, 1) OVER (ORDER BY id) [prev_people],
LAG(people, 2) OVER (ORDER BY id) [prev2_people],
LEAD(people, 1) OVER (ORDER BY id) [next_people],
LEAD(people, 2) OVER (ORDER BY id) [next2_people]
FROM Stadium
)
SELECT
id,
visit_date,
people
FROM lead_lag
WHERE
(people >= 100 AND prev_people >= 100 AND prev2_people >= 100)
OR (people >= 100 AND prev_people >= 100 AND next_people >= 100)
OR (people >= 100 AND next_people >= 100 AND next2_people >= 100)
ORDER BY visit_date
This code took 551ms to run and beat 44.95% of other SQL solutions in LeetCode.
Source Code 2
The second approach, instead of adding additional columns to the existing table as the first solution does, directly calculates and assigns a flag, isValid
, to each row in the Stadium
table based on whether the row and its adjacent rows satisfy the given conditions.
We create a CTE named ConsecutiveGroups
where each row is flagged with a 1
if it is part of a valid group of three consecutive records (including itself) each having people
>= 100. These groups can either be in the pattern: LAG(people, 2), LAG(people, 1), people
or LAG(people, 1), people, LEAD(people, 1)
or people, LEAD(people, 1), LEAD(people, 2)
.
If a row does not satisfy any of these conditions, it is flagged with a 0
.
Finally, the main query selects all the records from the ConsecutiveGroups
CTE where isValid = 1
and orders the result by visit_date
.
This approach utilizes the CASE
statement to conditionally assign the isValid
flag to each record in the Stadium
table. The LAG
and LEAD
window functions are used to look at the people
attribute of the previous and next records, respectively.
WITH ConsecutiveGroups AS (
SELECT
id,
visit_date,
people,
CASE
WHEN people >= 100
AND LAG(people, 1) OVER (ORDER BY visit_date) >= 100
AND LAG(people, 2) OVER (ORDER BY visit_date) >= 100 THEN 1
WHEN people >= 100
AND LAG(people, 1) OVER (ORDER BY visit_date) >= 100
AND LEAD(people, 1) OVER (ORDER BY visit_date) >= 100 THEN 1
WHEN people >= 100
AND LEAD(people, 1) OVER (ORDER BY visit_date) >= 100
AND LEAD(people, 2) OVER (ORDER BY visit_date) >= 100 THEN 1
ELSE 0
END AS isValid
FROM
Stadium
)
SELECT
id,
visit_date,
people
FROM ConsecutiveGroups
WHERE isValid = 1
ORDER BY visit_date
This code took 521ms to run and beat 54.29% of other SQL solutions in LeetCode.
Conclusion
Both solutions fulfill the task but differ in terms of their performance on LeetCode's platform. The second solution is slightly faster due to performing fewer computations, leading to a higher ranking. However, it's important to note that actual runtime may vary on different RDBMS due to differences in how they handle SQL constructs and optimizations.
In terms of rankings, if we consider the execution speed as the primary metric, then Source Code 2 is better followed by Source Code 1.
Remember, the perfect solution is not always the fastest one but the one that best fits your needs and constraints!
You can find the original problem at LeetCode.
For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.
Top comments (0)