The Problem
In this problem, we are given two tables: Sales
and Product
:
Sales Table:
Column Name | Type |
---|---|
sale_id | int |
product_id | int |
year | int |
quantity | int |
price | int |
Product Table:
Column Name | Type |
---|---|
product_id | int |
product_name | varchar |
Our task is to write an SQL query that reports the product_name
, year
, and price
for each sale_id
in the Sales
table. We must return the resulting table in any order.
For example, given the following tables:
Sales:
sale_id | product_id | year | quantity | price |
---|---|---|---|---|
1 | 100 | 2008 | 10 | 5000 |
2 | 100 | 2009 | 12 | 5000 |
7 | 200 | 2011 | 15 | 9000 |
Product:
product_id | product_name |
---|---|
100 | Nokia |
200 | Apple |
300 | Samsung |
The output would be:
product_name | year | price |
---|---|---|
Nokia | 2008 | 5000 |
Nokia | 2009 | 5000 |
Apple | 2011 | 9000 |
The Solution
To solve this problem, we will join the Sales
and Product
tables using different types of joins and see which one yields the best performance. The type of join will affect the performance and the output depending on the underlying dataset and the SQL database management system.
Source Code 1
In this solution, we use an INNER JOIN
to combine rows from Sales
and Product
where the product_id
matches. The INNER JOIN
keyword selects records that have matching values in both tables.
SELECT
p.product_name,
s.year,
s.price
FROM
Sales s JOIN Product p ON s.product_id = p.product_id
This solution has a runtime of 5853ms and it's faster than 5.4% of all SQL online submissions for this problem.
Source Code 2
In this solution, we use a LEFT JOIN
to combine rows from Sales
and Product
where the product_id
matches. The LEFT JOIN
keyword returns all records from the left table (Sales
), and the matched records from the right table (Product
).
SELECT
p.product_name,
s.year,
s.price
FROM
Sales s LEFT JOIN Product p ON s.product_id = p.product_id
This solution has a runtime of 5759ms and it's faster than 8.69% of all SQL online submissions for this problem.
Source Code 3
In this solution, we use the same INNER JOIN
as in Source Code 1, but we put the join operation in a subquery. A subquery, also known as an inner query or nested query, is a query within another SQL query and embedded within the WHERE clause.
SELECT a.*
FROM (
SELECT
p.product_name,
s.year,
s.price
FROM
Sales s JOIN Product p ON s.product_id = p.product_id
) a
This solution has a runtime of 5329ms and it's faster than 46.12% of all SQL online submissions for this problem.
Source Code 4
In this solution, we use a RIGHT JOIN
to combine rows from Sales
and Product
where the product_id
matches. The RIGHT JOIN
keyword returns all records from the right table (Sales
), and the matched records from the left table (Product
). The DISTINCT
keyword is used to eliminate duplicate records from the result set. If there is no match, the result is NULL
on the left side. This code will return each unique combination of product_name
, year
, and price
where the product_id
in Sales
matches the product_id
in Product
.
SELECT DISTINCT
p.product_name,
s.year,
s.price
FROM
Product p RIGHT JOIN Sales s ON s.product_id = p.product_id
This solution has a runtime of 5274ms and it's faster than 51.8% of all SQL online submissions for this problem.
Source Code 5
This solution is identical to Source Code 1, but with the addition of the DISTINCT
keyword. DISTINCT
is used to remove duplicate rows from the result set.
SELECT DISTINCT
p.product_name,
s.year,
s.price
FROM
Sales s JOIN Product p ON s.product_id = p.product_id
This solution has a runtime of 5624ms and it's faster than 18.97% of all SQL online submissions for this problem.
Conclusion
All the solutions presented above yield the correct result, but with different performances. Based on the runtime, the rank of the solutions from fastest to slowest is:
- Source Code 4 (5274ms)
- Source Code 3 (5329ms)
- Source Code 5 (5624ms)
- Source Code 2 (5759ms)
- Source Code 1 (5853ms)
It's interesting to note that using a RIGHT JOIN
(Source Code 4) results in the best performance on LeetCode. However, in real-world RDBMS, the performance might vary depending on the specific dataset and indexing strategies.
Always bear in mind that understanding the problem thoroughly and choosing the right approach based on the context is the key to writing efficient SQL queries.
You can find the original problem at LeetCode.
For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.
Top comments (0)