The Problem
We are given three tables: Students
, Subjects
, and Examinations
.
Students
table structure:
Column Name | Type |
---|---|
student_id | int |
student_name | varchar |
The student_id
is the primary key for this table. Each row contains the ID and the name of one student in the school.
Subjects
table structure:
Column Name | Type |
---|---|
subject_name | varchar |
The subject_name
is the primary key for this table. Each row contains the name of one subject in the school.
Examinations
table structure:
Column Name | Type |
---|---|
student_id | int |
subject_name | varchar |
This table doesn't have a primary key and may contain duplicates. Each row indicates that a student with the student_id
attended the exam of subject_name
.
The task is to write an SQL query to find the number of times each student attended each exam. Return the result table ordered by student_id
and subject_name
.
Explanation
For example:
Students
table:
student_id | student_name |
---|---|
1 | Alice |
2 | Bob |
13 | John |
6 | Alex |
Subjects
table:
subject_name |
---|
Math |
Physics |
Programming |
Examinations
table:
student_id | subject_name |
---|---|
1 | Math |
1 | Physics |
1 | Programming |
2 | Programming |
1 | Physics |
1 | Math |
13 | Math |
13 | Programming |
13 | Physics |
2 | Math |
1 | Math |
Expected Output:
student_id | student_name | subject_name | attended_exams |
---|---|---|---|
1 | Alice | Math | 3 |
1 | Alice | Physics | 2 |
1 | Alice | Programming | 1 |
2 | Bob | Math | 1 |
2 | Bob | Physics | 0 |
2 | Bob | Programming | 1 |
6 | Alex | Math | 0 |
6 | Alex | Physics | 0 |
6 | Alex | Programming | 0 |
13 | John | Math | 1 |
13 | John | Physics | 1 |
13 | John | Programming | 1 |
Alice attended the Math exam 3 times, the Physics exam 2 times, and the Programming exam 1 time. Bob attended the Math exam 1 time, the Programming exam 1 time, and didn't attend the Physics exam. Alex didn't attend any exams. John attended the Math exam 1 time, the Physics exam 1 time, and the Programming exam 1 time.
The Solution
We'll discuss three SQL solutions with varying degrees of complexity. The primary differences among these solutions lie in the SQL features they use, such as conditional statements, CROSS JOIN, and RIGHT JOIN.
Source Code 1
This query starts by joining the Students
and Subjects
tables on a constant condition (1=1), effectively generating all possible combinations of students and subjects. Next, a subquery is used to calculate the attendance count for each student and subject from the Examinations
table. This subquery result is then left joined with the combination of students and subjects on matching student_id and subject_name. A CASE statement is used to handle scenarios where no examination record exists for a student-subject pair, replacing NULL with 0.
SELECT
s.student_id,
s.student_name,
u.subject_name,
CASE
WHEN e.counter IS NULL THEN 0
ELSE e.counter
END [attended_exams]
FROM Students s JOIN Subjects u ON 1=1
LEFT JOIN (
SELECT
*,
COUNT(student_id) [counter]
FROM Examinations
GROUP BY
student_id,
subject_name
) e ON e.student_id = s.student_id AND e.subject_name = u.subject_name
ORDER BY
s.student_id,
u.subject_name
The runtime for this solution is 715ms, beating 53.70% of submissions on LeetCode.
Source Code 2
This query also generates all possible combinations of students and subjects, but it uses the CROSS JOIN operation instead of joining on a constant condition. It then LEFT JOINs the Examinations
table on matching student_id and subject_name. The attendance count for each student-subject pair is calculated using the COUNT function within an IIF statement to handle nulls.
SELECT
s.student_id,
s.student_name,
sb.subject_name,
IIF(COUNT(e.student_id) IS NULL, 0, COUNT(e.student_id)) [attended_exams]
FROM
Students s
CROSS JOIN
Subjects sb
LEFT JOIN
Examinations e ON s.student_id = e.student_id AND sb.subject_name = e.subject_name
GROUP BY
s.student_id,
s.student_name,
sb.subject_name
ORDER BY
s.student_id,
sb.subject_name
The runtime for this solution is 874ms, beating 24.33% of submissions on LeetCode.
Source Code 3
In this solution, we first generate a subquery from Examinations
that calculates the attendance count for each student and subject. We also generate a subquery from the CROSS JOIN of Students
and Subjects
tables. Then, we RIGHT JOIN the two subqueries on matching student_id and subject_name. The ISNULL function is used to handle scenarios where no examination record exists for a student-subject pair, replacing NULL with 0.
SELECT
s_u.student_id,
s_u.student_name,
s_u.subject_name,
ISNULL(e.attended_exams, 0) [attended_exams]
FROM (
SELECT
student_id,
subject_name,
COUNT(*) [attended_exams]
FROM Examinations
GROUP BY
student_id,
subject_name
) e
RIGHT JOIN (SELECT * FROM Students CROSS JOIN Subjects) s_u
ON s_u.student_id = e.student_id AND s_u.subject_name = e.subject_name
ORDER BY
s_u.student_id,
s_u.subject_name
The runtime for this solution is 732ms, beating 49.77% of submissions on LeetCode.
Conclusion
All three solutions solve the problem effectively but with different performance results. The solution using CROSS JOIN (Source Code 2) performs the slowest due to the overhead of generating all combinations of students and subjects. Both Source Code 1 and 3 are faster due to their more efficient handling of the attendance count calculations.
Based on LeetCode performance metrics, Source Code 1 is the fastest solution, followed by Source Code 3, and then Source Code 2. However, in real-world scenarios, the performance may vary based on the RDBMS in use and the specific database structure and data volume.
You can find the original problem at LeetCode.
For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.
Top comments (0)