Retiago Drago

Posted on May 21, 2023

Contest Leaderboard | HackerRank | MSSQL

#codenewbie #beginners #hackerrank #mssql

The Problem

In this problem, we need to calculate the total score of each hacker, where the total score is defined as the sum of their maximum scores across all challenges. We then need to print the hacker_id, name, and total score of the hackers, sorted by descending total score and ascending hacker_id for ties. We need to exclude all hackers with a total score of 0.

The Input

The input consists of two tables:

Hackers Table: The hacker_id is the id of the hacker, and name is the name of the hacker.

Submissions Table: The submission_id is the id of the submission, hacker_id is the id of the hacker who made the submission, challenge_id is the id of the challenge for which the submission belongs to, and score is the score of the submission.

Sample input and output are available for more in-depth understanding of the problem.

Sample Input
Hackers Table:

Submissions Table:

The Output

Sample Output:

4071 Rose 191
74842 Lisa 174
84072 Bonnie 100
4806 Angela 89
26071 Frank 85
80305 Kimberly 67
49438 Patrick 43

Explanation

Hacker 4071 submitted solutions for challenges 19797 and 49593, so the total score =95+max(43,96)=191.
Hacker 74842 submitted solutions for challenges 19797 and 63132, so the total score =max(98,5)+76=174.
Hacker 84072 submitted solutions for challenges 49593 and 63132, so the total score =100+0=100.
The total scores for hackers 4806, 26071, 80305, and 49438 can be similarly calculated.

The Solution

We'll discuss three SQL solutions, each with different strategies and trade-offs.

Source Code 1

The first source code creates a total_score Common Table Expression (CTE) that joins the Hackers and Submissions tables and calculates the maximum score per challenge for each hacker. It then sums these max scores per hacker and filters out hackers with a total score of 0. It orders the result by total score in descending order and hacker_id in ascending order for ties. This solution uses the OVER clause with PARTITION BY to calculate the max score per challenge per hacker, which simplifies the subsequent aggregation.

WITH total_score AS (
    SELECT DISTINCT
        h.hacker_id,
        h.name,
        s.challenge_id,
        MAX(s.score) OVER (PARTITION BY h.hacker_id, s.challenge_id) AS max_score_per_subm
    FROM
        Hackers h JOIN Submissions s ON h.hacker_id = s.hacker_id
)
SELECT
    hacker_id,
    name,
    SUM(max_score_per_subm) AS total
FROM total_score
GROUP BY
    hacker_id,
    name
HAVING SUM(max_score_per_subm) > 0
ORDER BY
    total DESC,
    hacker_id

Source Code 2

The second solution differs in that it first calculates the max score per challenge per hacker using a GROUP BY clause in the total_score CTE, then sums these max scores per hacker in the score_per_hacker CTE. This involves two separate groupings, which may increase execution time. However, it also separates concerns and can be easier to understand.

WITH total_score AS (
    SELECT 
        s.hacker_id,
        MAX(s.score) AS max_score
    FROM 
        Submissions s
    GROUP BY
        s.hacker_id,
        s.challenge_id
),
score_per_hacker AS (
    SELECT
        ts.hacker_id,
        SUM(ts.max_score) AS total_score
    FROM
        total_score ts
    GROUP BY
        ts.hacker_id
)
SELECT
    sp.hacker_id,
    h.name,
    sp.total_score
FROM
    score_per_hacker sp
JOIN
    Hackers h ON sp.hacker_id = h.hacker_id
WHERE
    sp.total_score > 0
ORDER BY
    sp.total_score DESC,
    sp.hacker_id

Source Code 3

The third solution uses the ROW_NUMBER() function to assign a unique rank to each submission by each hacker for each challenge. It then only includes the highest-ranked (i.e., highest-scoring) submission for each challenge in the total score. This strategy avoids the need to use DISTINCT in the CTE or to group by challenge_id, potentially improving performance.

SELECT 
    hacker_id, 
    name, 
    SUM(CASE WHEN rn = 1 THEN score ELSE 0 END) AS total_score
FROM (
    SELECT 
        h.hacker_id,
        h.name,
        s.score,
        ROW_NUMBER() OVER(PARTITION BY s.hacker_id, s.challenge_id ORDER BY s.score DESC) as rn
    FROM 
        Hackers h 
    JOIN 
        Submissions s ON h.hacker_id = s.hacker_id
) t
GROUP BY 
    hacker_id, 
    name
HAVING 
    SUM(CASE WHEN rn = 1 THEN score ELSE 0 END) > 0
ORDER BY 
    total_score DESC, 
    hacker_id

Conclusion

All three solutions achieve the same result but use different techniques to calculate the total score of each hacker. The first solution is straightforward but might not be as efficient due to its use of DISTINCT. The second solution separates concerns by creating a CTE for each step, which could improve readability at the cost of execution time. The third solution is likely the most efficient due to its use of the ROW_NUMBER() function to select the highest score per challenge per hacker directly.

In summary, while different SQL queries can achieve the same result, their performance can vary significantly based on the SQL features and functions used. Therefore, it's crucial to consider different approaches and understand their trade-offs.

You can find the original problem at HackerRank.

For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.

ranggakd - Link in Bio & Creator Tools | Beacons

@ranggakd | center details summary summary Oh hello there I m a an Programmer AI Tech Writer Data Practitioner Statistics Math Addict Open Source Contributor Quantum Computing Enthusiast details center.

beacons.ai

DEV Community

Contest Leaderboard | HackerRank | MSSQL

The Problem

The Input

The Output

Explanation

The Solution

Source Code 1

Source Code 2

Source Code 3

Conclusion

ranggakd - Link in Bio & Creator Tools | Beacons

Top comments (0)

Read next

Beginners Guide To CDN

Software Dev Roles and Salary ranges in the Philippines (2024)

Key Components of a VPC: Detailed Breakdown

Why Seeing Data Beats Reading It: The Case for Data Visualization