DEV Community

Cong Li
Cong Li

Posted on

Introduction to the Pivot Function in GBase 8s Database

In this article, we will take a look at the Pivot function in GBase 8s, which transforms row data into column data. Whether for reporting or data analysis, the Pivot function can help us view data from different perspectives.

Overview of the Pivot Function in GBase 8s

Pivot, or row-to-column transformation, is a process of converting row data in a table into column data. By performing a Pivot operation, we can transform one or more columns of data into multiple columns, allowing for better horizontal statistics and analysis.

Unpivot, or column-to-row transformation, is the opposite of Pivot and converts column data into row data, which is better suited for vertical statistics and analysis.

Pivot and Unpivot are two very useful functions in data transformation. They can alter the presentation of data as needed, making it easier to understand and analyze. These functions are widely used in database queries, data analysis, and report generation. We will introduce the Unpivot function in the next article.

Detailed Syntax of GBase 8s Pivot

The process of Pivot transformation: multiple rows are aggregated and then transformed into columns, with each column representing a different range of aggregated data. The syntax is as follows:

SELECT ...         
FROM ...   
PIVOT
( 
    pivot_clause      --- The value of the column after the row-to-column transformation, an aggregate value that supports multiple columns.
    pivot_for_clause  --- The field that needs to be transformed into columns, turning one column into multiple columns.
    pivot_in_clause   --- A filter for the pivot_for_clause field, specifying the values to be transformed into columns.
)   
WHERE ...
Enter fullscreen mode Exit fullscreen mode

Pivot follows directly after the FROM clause and is placed between FROM and WHERE.

In simple terms, the syntax is as follows:

PIVOT (aggregate_function1 AS alias1, aggregate_function2 AS alias2 ...) 
FOR column_to_transform 
IN ((value1_for_column1, value2_for_column2) AS alias1, (value1_for_column2, value2_for_column2) AS alias2...)
Enter fullscreen mode Exit fullscreen mode

Example of GBase 8s Pivot

Note: The following examples use the GBase8sV8.8_TL_3.5.1_x86_64 database version.

Here are some simple demonstrations of Pivot syntax using the emp table. Before the transformation, let’s take a look at the base data:

create table emp(empno int,ename varchar(10),job varchar(9),sal1 DECIMAL(10,2),sal2 DECIMAL(10,2), deptno int);
-- Insert 10 rows of data:
INSERT INTO emp  VALUES (1, 'smith', 'clerk', 800.00,910,10);
INSERT INTO emp  VALUES (2, 'allen', 'salesman', 1600.00,5491,30);
INSERT INTO emp  VALUES (3, 'ward', 'salesman', 1250.00,3429,30);
INSERT INTO emp  VALUES (4, 'jones', 'manager', 2975.00,1298,20);
INSERT INTO emp  VALUES (5, 'martin', 'salesman',1250.00,1256.34,20);
INSERT INTO emp  VALUES (6, 'blake', 'manager', 2850.00,1234.34,20);
INSERT INTO emp  VALUES (7, 'clark', 'manager', 2450.00,1000,10);
INSERT INTO emp  VALUES (8, 'scott', 'clerk', 3000.00,2000,20);
INSERT INTO emp  VALUES (9, 'king', 'president',5000.00,4500,10);
INSERT INTO emp  VALUES (10, 'turner', 'salesman', 1500.00,1000,10);
> select job,deptno,sum(sal1) as sum_sal1 from emp group by job,deptno order by job,deptno;
job            deptno         sum_sal1
clerk              10           800.00
clerk              20          3000.00
manager            10          2450.00
manager            20          5825.00
president          10          5000.00
salesman           10          1500.00
salesman           20          1250.00
salesman           30          2850.00
8 row(s) retrieved.
ORDER BY job, deptno;
Enter fullscreen mode Exit fullscreen mode

Case 1: Single Column Transformation

Here, we summarize sal1 by deptno and job, then convert the department numbers into their respective columns:

Aggregated column: sal1

Column to transform: deptno

Values to filter for transformation: 10, 20, 30, 40

select * from (SELECT deptno,job, sal1 FROM emp)a               
PIVOT
(                                                                                 
SUM(sal1) --pivot_clause
FOR deptno --pivot_for_clause
IN (10, 20,30,40) --pivot_in_cluase                                   
) ;
Result:
job                  10               20               30               40
president           5000.00
manager             2450.00          5825.00
salesman            1500.00          1250.00          2850.00
clerk               800.00          3000.00
4 row(s) retrieved.
Enter fullscreen mode Exit fullscreen mode

Points to Note About This Example:

The pivot operation actually performs an implicit GROUP BY using the columns not appearing in the pivot_clause (in this case, job and deptno). Most transformation queries are executed on specific subsets of columns. As with all aggregate queries, the presence of additional columns affects the grouping. In this example, all columns except sal1 become the grouping set, with deptno being the transformed column. If other columns from the base data, such as ename, are included, they would affect the grouping. Therefore, you can use subqueries to define the set of base columns, as in the example where SELECT deptno, job, sal1 FROM emp defines the base set of columns.

Case 2: Effect of Involving All Columns

SELECT * FROM emp
PIVOT (
SUM(sal1)
FOR deptno
IN (10, 20, 30)
 );
Result:

empno  2
ename  allen
job    salesman
sal2   5491.00
10     
20     
30     1600.00

empno  7
ename  clark
job    manager
sal2   1000.00
10     2450.00
20     
30     

empno  3
ename  ward
job    salesman
sal2   3429.00
10     
20     
30     1250.00

empno  8
ename  scott
job    clerk
sal2   2000.00
10     
20     3000.00
30     

empno  1
ename  smith
job    clerk
sal2   910.00
10     800.00
20     
30     

empno  9
ename  king
job    president
sal2   4500.00
10     5000.00
20     
30     

empno  5
ename  martin
job    salesman
sal2   1256.34
10     
20     1250.00
30     

empno  4
ename  jones
job    manager
sal2   1298.00
10     
20     2975.00
30     

empno  10
ename  turner
job    salesman
sal2   1000.00
10     1500.00
20     
30     

empno  6
ename  blake
job    manager
sal2   1234.34
10     
20     2850.00
30     

10 row(s) retrieved.
Enter fullscreen mode Exit fullscreen mode

In this example, all columns except sal1 become the grouping set, with deptno being the transformed column. This transformation doesn't make much sense because it includes all columns in the grouping, leading to results that may not be meaningful.

Case 3: Using WITH AS

Below is an example using a WITH AS subquery, achieving the same result as the above inline view.

WITH pivot_data AS (SELECT deptno, job, sal1 FROM emp)
SELECT * FROM PIVOT_data
PIVOT
(
SUM(sal1) 
FOR deptno 
IN (10, 20, 30,40)
);
Result:

job                10               20             30               40

president          5000.00
manager          2450.00         5825.00
salesman          1500.00         1250.00          2850.00
clerk              800.00          3000.00

4 row(s) retrieved.
Enter fullscreen mode Exit fullscreen mode

Case 4: Multiple Column Transformation

select * from (SELECT deptno,job, sal1,sal2 FROM emp)a 
PIVOT
(
SUM(sal1) AS sum_sal1,
SUM(sal2) AS sum_sal2
FOR deptno
IN (10,20)
);
Result:
job            10_sum_sal1      20_sum_sal1      10_sum_sal2      20_sum_sal2

president         5000.00                         4500.00
manager         2450.00         5825.00          1000.00         2532.34
salesman         1500.00         1250.00          1000.00         1256.34
clerk             800.00          3000.00           910.00          2000.00
4 row(s) retrieved.
Enter fullscreen mode Exit fullscreen mode

In the above example, two aggregates are defined in the pivot_clause, resulting in double the number of transformed columns (since the number of aggregates is doubled).

  • Number of transformed columns = Number of aggregate columns × Number of values in pivot_in_clause (2×2=4).
  • The number of transformed columns cannot exceed 1024.

Case 5: Multiple Columns and Multiple Values Transformation

SELECT * FROM (SELECT deptno,job, sal1,sal2 FROM emp)a 
PIVOT
(
SUM(sal1) AS sum_sal1,
SUM(sal2) AS sum_sal2
FOR (deptno, job)
IN ((10, 'clerk') as a,(20, 'manager') as b)
);
Result:
     a_sum_sal1       b_sum_sal1       a_sum_sal2       b_sum_sal2
           800.00          5825.00           910.00          2532.34
1 row(s) retrieved.
Enter fullscreen mode Exit fullscreen mode

Notes on GBase 8s Pivot Transformation Queries

  • The column used in pivot_for_clause (e.g., deptno in the examples) cannot appear in the SELECT projection list. The columns in pivot_for_clause are grouped based on the range of values provided in pivot_in_clause. Attempting to query the deptno column will result in an error because deptno is completely removed from the projection.

Example:

> SELECT deptno
FROM emp
PIVOT
(
SUM(sal1)
FOR deptno
IN(10, 20, 30, 40)
);
   217: Column (deptno) not found in any table in the query (or SLV is undefined).
Error in line 2
Near character position 0
Enter fullscreen mode Exit fullscreen mode
  • Similarly, you cannot query any columns that appear in the pivot_clause. For example, attempting to query the sal1 column will also result in an error.

  • Columns in the pivot_clause must use aggregate functions. Not using an aggregate function will cause an error.

Example:

SELECT * FROM emp PIVOT(sal1 FOR deptno IN (10, 20, 30, 40));
201: A syntax error has occurred.
Enter fullscreen mode Exit fullscreen mode
  • In the pivot_in_clause, if an alias is specified, use the alias; if not, use the value as the column name. In the pivot_clause, if an alias is specified, it is appended to the corresponding transformed column name from pivot_in_clause; if not, it is omitted. When there are multiple aggregate functions in the pivot_clause, GBase 8s allows you not to set aliases; it will automatically append _1, _2, etc., in sequence. Aliases do not support single quotes.

Example:

> SELECT * FROM (SELECT deptno,job, sal1,sal2 FROM emp)a
PIVOT(SUM(sal1),SUM(sal2)  FOR (deptno, job) IN ((10, 'clerk'),(20, 'manager'))); 
Result      
10_clerk       20_manager       10_clerk_1     20_manager_1
   800.00          5825.00           910.00          2532.34
1 row(s) retrieved.
Enter fullscreen mode Exit fullscreen mode

Through this introduction, you should now have a comprehensive understanding of the Pivot function in GBase 8s. It's not just a simple data transformation tool but a powerful asset for data analysis. In the next article, we will discuss the Unpivot function. Thank you for reading!

Top comments (0)