The Problem
Consider the following table in our database:
Table: Views
Column Name | Type |
---|---|
article_id | int |
author_id | int |
viewer_id | int |
view_date | date |
There is no primary key for this table, so it may have duplicate rows. Each row indicates that a viewer has viewed an article (written by some author) on a certain date. Note that the same author_id
and viewer_id
indicate the same person.
The challenge is to write an SQL query to find all the authors that viewed at least one of their own articles. The result should be returned sorted by author_id
in ascending order.
Explanation
For example, consider the following data:
Views table:
article_id | author_id | viewer_id | view_date |
---|---|---|---|
1 | 3 | 5 | 2019-08-01 |
1 | 3 | 6 | 2019-08-02 |
2 | 7 | 7 | 2019-08-01 |
2 | 7 | 6 | 2019-08-02 |
4 | 7 | 1 | 2019-07-22 |
3 | 4 | 4 | 2019-07-21 |
3 | 4 | 4 | 2019-07-21 |
In this case, the output should be:
id |
---|
4 |
7 |
Both authors with author_id
4 and 7 have viewed their own articles.
The Solution
There are two main approaches to solving this problem, and we'll discuss both of them in this post.
Source Code 1: Using WHERE Clause
The first solution is a simple SELECT
statement with a WHERE
clause that checks if the author_id
equals the viewer_id
, which means the author has viewed their own article. The DISTINCT
keyword is used to avoid duplicate author_id
s in the output.
SELECT DISTINCT author_id [id]
FROM Views
WHERE author_id = viewer_id
This code runs in 1594ms and beats 50.31% of other submissions.
Source Code 2: Adding ORDER BY Clause
The second solution is similar to the first one but with an additional ORDER BY
clause. The ORDER BY
clause sorts the results in ascending order by author_id
.
SELECT DISTINCT author_id [id]
FROM Views
WHERE author_id = viewer_id
ORDER BY author_id
This code runs in 1898ms and beats 19.91% of other submissions.
Conclusion
Both of these methods achieve the desired result. However, the addition of an ORDER BY
clause in the second solution slows down the query slightly, as sorting results comes with a performance cost.
The first solution performs better in terms of runtime on LeetCode, but performance may vary depending on the specific
RDBMS and data distribution in real-world scenarios. Therefore, it's essential to consider the characteristics of your own dataset and the capabilities of your RDBMS when choosing a solution.
You can find the original problem at LeetCode.
For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.
Top comments (0)