Introduction
Currently there is no built-in pivot/unpivot function in HANA. In this blog you will find a workaround how to implement this in SQLScript.
Pivot/Unpivot
Pivoting is the transformation from the rows into the columns.
Unpivoting is the transformation from the columns into the rows.
Idea & Solution
Even if there does no built-in function in HANA exists, there are workarounds how to do this in SQL. For the pivot-function it is possible to create new columns with CASE/WHEN and filter for particular values.
Pivot
First of all, we will create a test table with three columns. Let’s call the table TEST_A and the columns PROD, DATE_YEAR and SALES.
CREATE TABLE TEST_A(
PROD VARCHAR(1),
DATE_YEAR INT,
SALES INT);
Insert some data into the table and display the values.
INSERT INTO TEST_A VALUES ('A',2015,123456);
INSERT INTO TEST_A VALUES ('A',2016,234567);
INSERT INTO TEST_A VALUES ('A',2017,345678);
INSERT INTO TEST_A VALUES ('A',2018,456789);
INSERT INTO TEST_A VALUES ('A',2019,567890);
PROD | DATE_YEAR | SALES |
A | 2015 | 123456 |
A | 2016 | 234567 |
A | 2017 | 345678 |
A | 2018 | 456789 |
A | 2019 | 567890 |
Now we can pivot the values with CASE/WHEN
SELECT
PROD,
SUM(CASE WHEN DATE_YEAR = 2015 THEN SALES END) AS YEAR_2015,
SUM(CASE WHEN DATE_YEAR = 2016 THEN SALES END) AS YEAR_2016,
SUM(CASE WHEN DATE_YEAR = 2017 THEN SALES END) AS YEAR_2017,
SUM(CASE WHEN DATE_YEAR = 2018 THEN SALES END) AS YEAR_2018,
SUM(CASE WHEN DATE_YEAR = 2019 THEN SALES END) AS YEAR_2019
FROM TEST_A
GROUP BY PROD;
The result of the query is:
PROD | YEAR_2015 | YEAR_2016 | YEAR_2017 | YEAR_2018 | YEAR_2019 |
A | 123456 | 234567 | 345678 | 456789 | 567890 |
Well, you don’t want to write this code every time, if you need to pivot a table. Because of this, we can use SQLScript to generate this code and execute it directly. If you have trouble to execute this code, you can also use instead of CREATE OR REPLACE just CREATE.
CREATE OR REPLACE PROCEDURE P_PIVOT(
IN table_name VARCHAR(127),
IN schema_name VARCHAR(127) ,
IN select_columns VARCHAR(2147483647),
IN agg_typ VARCHAR(20),
IN agg_column VARCHAR(127),
IN pivot_column VARCHAR(2147483647),
IN pivot_values VARCHAR(2147483647))
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER
READS SQL DATA AS
BEGIN
USING SQLSCRIPT_STRING AS lib;
DECLARE v_table_name VARCHAR(127) = table_name;
DECLARE v_schema_name VARCHAR(127) = schema_name;
DECLARE v_select_columns VARCHAR(2147483647) = select_columns;
DECLARE v_agg_typ VARCHAR(20) = agg_typ;
DECLARE v_agg_column VARCHAR(127) = agg_column;
DECLARE v_pivot_column VARCHAR(2147483647) = pivot_column;
DECLARE v_pivot_values VARCHAR(2147483647) = pivot_values;
DECLARE v_count INT= 0;
DECLARE v_idx INT;
DECLARE v_statement VARCHAR(2147483647);
DECLARE a_pivot_values VARCHAR(127) ARRAY;
/*
if all columns needed, use * to get the column names except v_agg_column,v_pivot_column
*/
IF v_select_columns = '*'
THEN
SELECT string_agg(column_name,',' ORDER BY position) INTO v_select_columns FROM sys.columns
WHERE table_name = v_table_name
AND schema_name = v_schema_name
AND column_name NOT IN (v_agg_column,v_pivot_column);
END IF;
/*
if all values in the pivot column should use for pivoting
*/
IF v_pivot_values = '*'
THEN
EXECUTE IMMEDIATE ('select string_agg('||:v_pivot_column ||', '','' order by ' || :v_pivot_column || ') from (select distinct ' || :v_pivot_column || ' from '|| :v_table_name || ')') into v_pivot_values;
END IF;
a_pivot_values := LIB:split_to_array( :v_pivot_values, ',' );
v_count = cardinality(:a_pivot_values);
v_statement = 'select ' || v_select_columns;
/*
generate the statement
*/
FOR v_idx IN 1.. v_count DO
v_statement = v_statement || ', ' || v_agg_typ || '(case when ' || v_pivot_column || ' = ' || :a_pivot_values[:v_idx] || ' then ' || v_agg_column || ' end ) as ' || v_pivot_column || '_' || :a_pivot_values[:v_idx];
END FOR;
v_statement = v_statement || ' from ' || v_table_name || ' group by ' || v_select_columns;
/*
execute the statement
*/
EXECUTE IMMEDIATE v_statement;
END;
You can call this procedure with list of parameters as listed below:
Parameter name | Description |
table_name | Name of table, which should pivot |
schema_name | Name of schema, where the table is created |
select_columns | List of columns, which should display or * |
agg_typ | Type of aggregation like sum, count, min, max etc. |
agg_column | Column, which should aggregate |
pivot_column | Column, which should pivot |
pivot_values | List of values in the pivot_column, which should generate as separate columns or *. Please consider the maximum number of columns. |
For example, the procedure can be called like this:
CALL P_PIVOT('TEST_A', '', 'PROD','SUM','SALES','DATE_YEAR','*');
CALL P_PIVOT('TEST_A', '', 'PROD','SUM','SALES','DATE_YEAR','2015,2016');
CALL P_PIVOT('TEST_A', 'SYSTEM', '*','SUM','SALES','DATE_YEAR','*');
Unpivot
The unpivot functionality can be realized by using the MAP function and SERIES_GENERATE_INTEGER. As the picture above describes, unpivot means generate rows out of columns. The function SERIES_GENERATE_INTEGER can be utilized to generate an integer table, which can be used for joining with the source table to generate multiple rows.
We create now a table with product and several sales years and call it TEST_B.
CREATE TABLE TEST_B(
PROD VARCHAR(1),
SALES_YEAR_2015 INT,
SALES_YEAR_2016 INT,
SALES_YEAR_2017 INT,
SALES_YEAR_2018 INT,
SALES_YEAR_2019 INT);
Insert data into the table and display.
INSERT INTO TEST_B VALUES('A', 123456, 234567, 345678, 456789, 567890);
INSERT INTO TEST_B VALUES('B', 123456, 234567, 345678, 456789, null);
PROD | YEAR_2015 | YEAR_2016 | YEAR_2017 | YEAR_2018 | YEAR_2019 |
A | 123456 | 234567 | 345678 | 456789 | 567890 |
B | 123456 | 234567 | 345678 | 456789 |
Now we can use the functions mentioned above to generate the unpivot table:
SELECT
PROD,
MAP(element_number,
1, '2015',
2, '2016',
3, '2017',
4, '2018',
5, '2019') AS "DATE_YEAR",
MAP(element_number ,
1, SALES_YEAR_2015,
2, SALES_YEAR_2016,
3, SALES_YEAR_2017,
4, SALES_YEAR_2018,
5, SALES_YEAR_2019) AS "SALES"
FROM TEST_B,
SERIES_GENERATE_INTEGER(1, 1, 6)
ORDER BY element_number;
The result of the query is:
PROD | DATE_YEAR | SALES |
A | 2015 | 123.456 |
B | 2015 | 123.456 |
A | 2016 | 234.567 |
B | 2016 | 234.567 |
A | 2017 | 345.678 |
B | 2017 | 345.678 |
A | 2018 | 456.789 |
B | 2018 | 456.789 |
A | 2019 | 567.890 |
B | 2019 | ? |
To automatically generate this code, the following procedure can be used:
CREATE OR REPLACE PROCEDURE P_UNPIVOT(
IN table_name VARCHAR(127),
IN schema_name VARCHAR(127) ,
IN select_columns VARCHAR(2147483647),
IN unpivot_val_name VARCHAR(127),
IN unpivot_col_name VARCHAR(127),
IN unpivot_columns VARCHAR(2147483647),
IN unpivot_column_values VARCHAR(2147483647),
IN include_nulls BOOLEAN DEFAULT TRUE)
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER
READS SQL DATA AS
BEGIN
USING SQLSCRIPT_STRING AS lib;
DECLARE v_table_name VARCHAR(127) = table_name;
DECLARE v_schema_name VARCHAR(127) = schema_name;
DECLARE v_select_columns VARCHAR(2147483647) = select_columns;
DECLARE v_unpivot_val_name VARCHAR(127) = unpivot_val_name;
DECLARE v_unpivot_col_name VARCHAR(127) = unpivot_col_name;
DECLARE v_unpivot_columns VARCHAR(2147483647) = unpivot_columns;
DECLARE v_unpivot_column_values VARCHAR(2147483647) = unpivot_column_values;
DECLARE v_count INT= 0;
DECLARE v_idx INT;
DECLARE v_statement VARCHAR(2147483647);
DECLARE v_statement_map1 VARCHAR(2147483647) = '';
DECLARE v_statement_map2 VARCHAR(2147483647) = '';
DECLARE v_include_nulls BOOLEAN = include_nulls;
DECLARE a_unpivot_columns VARCHAR(127) array;
DECLARE a_unpivot_column_values VARCHAR(100) array;
a_unpivot_columns = LIB:split_to_array( :v_unpivot_columns, ',' );
a_unpivot_column_values = LIB:split_to_array( :v_unpivot_column_values, ',' );
v_count = cardinality(:a_unpivot_columns);
tbl_pivot_columns = UNNEST(:a_unpivot_columns) AS ("EXCLUDE_COLUMNS");
/*
if all columns needed, use * to get the column names except tbl_pivot_columns
*/
IF v_select_columns = '*'
THEN
SELECT string_agg(column_name,',' ORDER BY position) INTO v_select_columns FROM sys.columns
WHERE table_name = v_table_name
AND schema_name = v_schema_name
AND column_name NOT IN (SELECT EXCLUDE_COLUMNS FROM :tbl_pivot_columns);
end IF;
v_statement = 'select ' || v_select_columns || ', ' || 'map(element_number';
/*
generate the statement
*/
FOR v_idx IN 1.. v_count DO
v_statement_map1 = v_statement_map1 || ',' || v_idx || ',''' || :a_unpivot_column_values[:v_idx] || '''' ;
v_statement_map2 = v_statement_map2 || ',' || v_idx || ',' || :a_unpivot_columns[:v_idx] ;
END FOR;
v_statement = v_statement || v_statement_map1 || ') as "'|| v_unpivot_col_name || '", map(element_number ' || v_statement_map2 || ') as "' || v_unpivot_val_name || '" from ' || v_table_name || ', SERIES_GENERATE_INTEGER(1,1,' || v_count+1 || ') order by element_number';
IF v_include_nulls = FALSE
THEN
v_statement = 'select * from (' || v_statement || ' ) where ' || v_unpivot_val_name || ' is not null';
END IF;
EXECUTE IMMEDIATE v_statement;
END;
The input parameters are:
Parameter name | Description |
table_name | Name of table, which should unpivot |
schema_name | Name of schema, where the table is created |
select_columns | List of columns, which should display or * |
unpivot_val_name | Name of column for unpivot value columns |
unpivot_col_name | Name of column for unpivot columns |
unpivot_columns | List of unpivot columns |
unpivot_column_values | Values for unpivot columns |
include_nulls | If the null values should display, default TRUE. |
For example, the procedure can be called like this:
CALL P_UNPIVOT('TEST_B','SYSTEM','*','VAL','JAHR','SALES_YEAR_2015,SALES_YEAR_2016,SALES_YEAR_2017,SALES_YEAR_2018,SALES_YEAR_2019','2015,2016,2017,2018,2019',FALSE);
CALL P_UNPIVOT('TEST_B','SYSTEM','*','VAL','JAHR','SALES_YEAR_2015,SALES_YEAR_2016,SALES_YEAR_2017,SALES_YEAR_2018,SALES_YEAR_2019','2015,2016,2017,2018,2019');
Further steps
Because the created table structure is not known during compile time, no table typed output parameter can be used. To store the result for later usage, you can add the following code to the procedures to store the result into a dynamically generated table. In addition, you need to define the variable v_new_table_name, which defines the table to be used for storage. Please note, that the usage of dynamic executed DDL statements is not recommended:
if exists(select 1 from tables where table_name = v_new_table_name and schema_name = v_schema_name)
then
execute immediate 'drop table ' || v_new_table_name;
end if;
execute immediate 'create table ' || v_new_table_name || ' as ( ' || v_statement || ')';
The procedures using dynamic SQL to execute the generated SELECT statement.