DEV Community

Cover image for Customer Who Visited but Did Not Make Any Transactions | LeetCode | MSSQL
Retiago Drago
Retiago Drago

Posted on • Edited on

Customer Who Visited but Did Not Make Any Transactions | LeetCode | MSSQL

The Problem

The task at hand is to analyze two tables: Visits and Transactions.

Visits contains information about customers who visited a mall. The structure of the Visits table is as follows:

Column Name Type
visit_id int
customer_id int

visit_id is the primary key for this table.

Transactions contains information about transactions made during a particular visit. The structure of the Transactions table is:

Column Name Type
transaction_id int
visit_id int
amount int

transaction_id is the primary key for this table.

The objective is to write a SQL query to find the customer IDs of users who visited the mall without making any transactions, as well as the number of times they made such visits. The result should be returned in any order.

For instance, consider the following input:

Visits

visit_id customer_id
1 23
2 9
4 30
5 54
6 96
7 54
8 54

Transactions

transaction_id visit_id amount
2 5 310
3 5 300
9 5 200
12 1 910
13 2 970

The expected output is:

customer_id count_no_trans
54 2
30 1
96 1

The explanation for this output is as follows:

  • Customer 23 visited the mall once and made one transaction during visit 1.
  • Customer 9 visited the mall once and made one transaction during visit 2.
  • Customer 30 visited the mall once but did not make any transactions.
  • Customer 54 visited the mall three times. During two of these visits, they did not make any transactions, but during one visit, they made three transactions.
  • Customer 96 visited the mall once but did not make any transactions.

Hence, customers 30 and 96 visited the mall once without making any transactions. Customer 54 visited twice without making any transactions.

The Solution

We have six source codes to address this problem. All six solutions use different SQL join types and methods for filtering records. The solutions use SQL functions such as LEFT JOIN, RIGHT JOIN, DISTINCT, COUNT, GROUP BY, and NOT EXISTS in different combinations. The different strategies employed by these codes include joining tables and filtering the result to only include visits without transactions, using the NOT IN clause to exclude visits that have corresponding transactions, and using the NOT EXISTS clause to exclude visits that have corresponding transactions.

Source Code 1

This code uses a LEFT JOIN to combine the Visits and Transactions tables and then filters the result to only include visits that don't have a corresponding transaction. The COUNT function is then used to find the number of such visits for each customer.

SELECT
    v.customer_id,
    COUNT(v.visit_id) [count_no_trans]
FROM Visits v LEFT JOIN Transactions t ON v.visit_id = t.visit_id
WHERE t.visit_id IS NULL
GROUP BY v.customer_id
Enter fullscreen mode Exit fullscreen mode

This code has a runtime of 2934ms, beating 13.82% of submissions.
s1

Source Code 2

This code is similar to Source Code 1, but it uses a RIGHT JOIN instead of a LEFT JOIN.

SELECT
    v.customer_id,
    COUNT(v.visit_id) [count_no_trans]
FROM Transactions t RIGHT JOIN Visits v ON v.visit_id = t.visit_id
WHERE t.visit_id IS NULL
GROUP BY v.customer_id
Enter fullscreen mode Exit fullscreen mode

This code has a runtime of 2931ms, also beating 13.82% of submissions.
s2

Source Code 3

Source Code 3 uses a subquery in the WHERE clause to filter out visits that have corresponding transactions. The NOT IN clause excludes visit IDs that are in the Transactions table.

SELECT
    customer_id,
    COUNT(visit_id) [count_no_trans]
FROM Visits 
WHERE visit_id NOT IN (
    SELECT DISTINCT visit_id
    FROM Transactions
)
GROUP BY customer_id
Enter fullscreen mode Exit fullscreen mode

This code has a runtime of 2848ms, beating 16.79% of submissions.
s3

Source Code 4

This code is similar to Source Code 3, but it also uses the DISTINCT keyword in the main query, not just in the subquery. Additionally, it uses the OVER clause with the PARTITION BY clause to compute the count of visits for each customer.

SELECT DISTINCT
    customer_id,
    COUNT(visit_id) OVER(PARTITION BY customer_id) [count_no_trans]
FROM Visits 
WHERE visit_id NOT IN (
    SELECT DISTINCT visit_id
    FROM Transactions
)
Enter fullscreen mode Exit fullscreen mode

This code has a runtime of 2444ms, beating 39.85% of submissions.
s4

Source Code 5

Source Code 5 uses the NOT EXISTS clause with a subquery to exclude visits that have corresponding transactions. It also uses the OVER clause with the PARTITION BY clause to compute the count of visits for each customer.

SELECT DISTINCT
    customer_id,
    COUNT(visit_id) OVER(PARTITION BY customer_id) [count_no_trans]
FROM Visits v
WHERE NOT EXISTS (
    SELECT 1
    FROM Transactions t
    WHERE t.visit_id = v.visit_id
)
Enter fullscreen mode Exit fullscreen mode

This code has a runtime of 2464ms, beating 38.63% of submissions.
s5

Source Code 6

Source Code 6 is similar to Source Code 5, but it uses the GROUP BY clause instead of the OVER clause with the PARTITION BY clause to compute the count of visits for each customer.

SELECT
    customer_id,
    COUNT(visit_id) [count_no_trans]
FROM Visits v
WHERE NOT EXISTS (
    SELECT 1
    FROM Transactions t
    WHERE t.visit_id = v.visit_id
)
GROUP BY customer_id
Enter fullscreen mode Exit fullscreen mode

This code has a runtime of 3669ms, beating 5.2% of submissions.
s6

Conclusion

These solutions demonstrate different ways to solve the problem using SQL. Some methods are more efficient than others in terms of runtime. Based on these examples, it appears that using NOT EXISTS with a subquery, and using the OVER clause with the PARTITION BY clause for counting, provides the most efficient solution. However, the efficiency of a solution can depend on the specifics of the database management system being used, as well as the data distribution. Hence, different solutions may be more efficient in different circumstances.

Ranking the solutions from best to worst in terms of LeetCode performance, we get:

  1. Source Code 4
  2. Source Code 5
  3. Source Code 3
  4. Source Code 1 and 2
  5. Source Code 6

Keep in mind that performance on LeetCode might not always translate directly to performance in a real-world RDBMS. LeetCode measures how quickly a solution runs in its specific environment, which may differ from the environment of a production database. Factors like indexing, query optimization by the database engine, the hardware the database is running on, and more can all affect the real-world performance of a solution.

You can find the original problem at LeetCode.

For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.

ranggakd - Link in Bio & Creator Tools | Beacons

@ranggakd | center details summary summary Oh hello there I m a an Programmer AI Tech Writer Data Practitioner Statistics Math Addict Open Source Contributor Quantum Computing Enthusiast details center.

favicon beacons.ai

Top comments (0)