In a previous lesson, we learned that we can apply ROW_NUMBER() to a data set and output the row number of the data set based on the order we specify. This means that if we have an alphabetized list, we could have ROW_NUMBER() return that list in an ascending or descending manner based on the alphabet. What may be a drawback here is that we have possible duplicates in the table, or we have possible multiples in the table and we want these multiples to be the same value rather than being different row numbers. Think of an example where we have millions of home sales organized by state and we want the states ranked by a number, rather than having the rows returned with the Nth value. In the video, SQL Basics: How To Use DENSE_RANK() and Why, we look at using DENSE_RANK() to solve a similar problem to this.
Some questions that are answered in the video:
- You'll notice that the table being used has merchants by names and product ids associated by the merchant. As a note, we would normally use an associative table here, but for the sake of this video, we are using this contrived example, as it helps to illustrate how we can use DENSE_RANK().
- In the video, both ROW_NUMBER() and DENSE_RANK() are compared side-by-side. Using merchant AAA as an example, how do these two return different values?
- Suppose we had 100 home sales by three states of Oklahoma, Kentucky and North Carolina - what would be the maximum DENSE_RANK() value if we ranked by state? What would be the maximum ROW_NUMBER()?
- In considering "groups" and "ordering" what would we use for "ordering" and what would we use for "grouping" when comparing ROW_NUMBER() or DENSE_RANK()?
Notice what happens when we add PARTITION BY with ROW_NUMBER() and then look at the data set with DENSE_RANK() in another column: we see the unique grouping (from DENSE_RANK()) along with the ordered value of that unique grouping (ROW_NUMBER() with PARTITION BY). This is extremely useful when we want to organize and order related data.
Ordering and grouping data sets help us solve numerous challenges with data. As previously mentioned, a business application of this might be home sales by state (or another area type that we want to group by). Often before we have our "final data product" we'll use this type of functionality to create our data sets and tables. We rarely will have a final data product that looks like the table in the video example: usually we'll have an associative table between these and that associative table can be created from the DENSE_RANK() functionality (provided the grouping remains the same).
Top comments (0)