DEV Community

Cover image for Search for the Closest Matching Record within the Group — From SQL to SPL #9
Judith-Data-Processing-Hacks
Judith-Data-Processing-Hacks

Posted on • Edited on

2 1 1 1 1

Search for the Closest Matching Record within the Group — From SQL to SPL #9

Problem description & analysis:

The table mytable in the MS SQL database has one ConfirmationStarted and multiple Closed statuses for each ID.

source table

Task: Now we need to find the record closest to ConfirmationStarted among all the Closed records before ConfirmationStarted in each ID, and retrieve the ID and time fields of the record.

expected table

Code comparisons:

SQL solution

WITH cte AS (
    SELECT ID, CreatedAt, NewStatus,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedAt DESC) AS rn
    FROM mytable
    WHERE NewStatus = 'Closed'
    AND CreatedAt < (
        SELECT CreatedAt FROM mytable AS sub
        WHERE sub.ID = mytable.ID AND sub.NewStatus = 'ConfirmationStarted'
    )
)
SELECT ID, CreatedAt as xdate
FROM cte
WHERE rn = 1
ORDER BY ID;
With cte AS (
    SELECT ID, CreatedAt, NewStatus,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedAt DESC) AS rn
    FROM mytable
    WHERE NewStatus = 'Closed'
    AND CreatedAt < (
        SELECT CreatedAt FROM mytable AS sub
        WHERE sub.ID = mytable.ID AND sub.NewStatus = 'ConfirmationStarted'
    )
)
SELECT ID, CreatedAt as xdate
FROM cte
WHERE rn = 1
ORDER BY ID;
Enter fullscreen mode Exit fullscreen mode

SQL does not have natural sequence numbers, so it needs to generate sequence numbers using window functions first. After SQL grouping, it must aggregate immediately and records within the group cannot be filtered. It can only be solved in a roundabout way by filtering repeatedly using multi-level subqueries. The overall code is a bit cumbersome and difficult to understand.

SPL solution:

SPL has natural sequence numbers and provides rich position related calculations. SPL grouping can retain subsets after grouping, making it easier to process data within the group. 👉🏻 try.DEMO

spl code

A1: Load data and sort it by time.

A2: Group by ID, but do not aggregate.

A3: Filter each group of data, first find the records before ConfirmationStarted, and then filter out Closed from them, getting the last one. The select function is used for conditional filtering, which supports position related calculations during filtering, @c represents starting from the first record that makes the condition true and stopping when encountering a record that makes the condition false. @1 represents getting the first piece of the results, and @z represents filtering from back to front.

A2-A4 can be combined into one statement: =A1.group(ID;~.select@c(NewStatus!=”ConfirmationStarted”).select@z1(NewStatus==”Closed”).CreatedAt:xdate)


esProc SPL is now Free to Download, and please feel free to give it a try on your own: esProc SPL FREE Download

API Trace View

Struggling with slow API calls?

Dan Mindru walks through how he used Sentry's new Trace View feature to shave off 22.3 seconds from an API call.

Get a practical walkthrough of how to identify bottlenecks, split tasks into multiple parallel tasks, identify slow AI model calls, and more.

Read more →

Top comments (1)

Collapse
 
judith677 profile image
Judith-Data-Processing-Hacks

Feel free to download esProc SPL and share your thoughts with us!

🌼Discord
🌼Reddit

AWS Security LIVE!

Join us for AWS Security LIVE!

Discover the future of cloud security. Tune in live for trends, tips, and solutions from AWS and AWS Partners.

Learn More

👋 Kindness is contagious

Engage with a wealth of insights in this thoughtful article, valued within the supportive DEV Community. Coders of every background are welcome to join in and add to our collective wisdom.

A sincere "thank you" often brightens someone’s day. Share your gratitude in the comments below!

On DEV, the act of sharing knowledge eases our journey and fortifies our community ties. Found value in this? A quick thank you to the author can make a significant impact.

Okay