Quantcast
Channel: SAP HANA Central
Viewing all articles
Browse latest Browse all 711

How to pivot/unpivot in SAP HANA

$
0
0

Introduction


Currently there is no built-in pivot/unpivot function in HANA. In this blog you will find a workaround how to implement this in SQLScript.

Pivot/Unpivot


Pivoting is the transformation from the rows into the columns.
SAP HANA, SAP HANA Certifications, SAP HANA Tutorials and Materials, SAP HANA Guides

Unpivoting is the transformation from the columns into the rows.

Idea & Solution


Even if there does no built-in function in HANA exists, there are workarounds how to do this in SQL. For the pivot-function it is possible to create new columns with CASE/WHEN and filter for particular values.

Pivot


First of all, we will create a test table with three columns. Let’s call the table TEST_A and the columns PROD, DATE_YEAR and SALES.

CREATE TABLE TEST_A(
PROD VARCHAR(1),
DATE_YEAR INT,
SALES INT);

Insert some data into the table and display the values. 

INSERT INTO TEST_A VALUES ('A',2015,123456);
INSERT INTO TEST_A VALUES ('A',2016,234567);
INSERT INTO TEST_A VALUES ('A',2017,345678);
INSERT INTO TEST_A VALUES ('A',2018,456789);
INSERT INTO TEST_A VALUES ('A',2019,567890);

PRODDATE_YEARSALES
2015 123456 
2016 234567 
2017 345678 
2018 456789 
2019 567890 

Now we can pivot the values with CASE/WHEN

SELECT
PROD,
SUM(CASE WHEN DATE_YEAR = 2015 THEN SALES END) AS YEAR_2015,
SUM(CASE WHEN DATE_YEAR = 2016 THEN SALES END) AS YEAR_2016,
SUM(CASE WHEN DATE_YEAR = 2017 THEN SALES END) AS YEAR_2017,
SUM(CASE WHEN DATE_YEAR = 2018 THEN SALES END) AS YEAR_2018,
SUM(CASE WHEN DATE_YEAR = 2019 THEN SALES END) AS YEAR_2019 
FROM TEST_A 
GROUP BY PROD;

The result of the query is:

PRODYEAR_2015YEAR_2016YEAR_2017 YEAR_2018 YEAR_2019
123456234567345678456789 567890 

Well, you don’t want to write this code every time, if you need to pivot a table. Because of this, we can use SQLScript to generate this code and execute it directly. If you have trouble to execute this code, you can also use instead of CREATE OR REPLACE just CREATE.

CREATE OR REPLACE PROCEDURE P_PIVOT(
IN table_name VARCHAR(127),
IN schema_name VARCHAR(127) , 
IN select_columns VARCHAR(2147483647),
IN agg_typ VARCHAR(20),
IN agg_column VARCHAR(127),
IN pivot_column VARCHAR(2147483647),
IN pivot_values VARCHAR(2147483647))
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER
READS SQL DATA AS
BEGIN
USING SQLSCRIPT_STRING AS lib;
DECLARE v_table_name VARCHAR(127) = table_name;
DECLARE v_schema_name VARCHAR(127) = schema_name;
DECLARE v_select_columns VARCHAR(2147483647) = select_columns;
DECLARE v_agg_typ VARCHAR(20) = agg_typ;
DECLARE v_agg_column VARCHAR(127) = agg_column;
DECLARE v_pivot_column VARCHAR(2147483647) = pivot_column;
DECLARE v_pivot_values VARCHAR(2147483647) = pivot_values;

DECLARE v_count INT= 0;
DECLARE v_idx INT;
DECLARE v_statement VARCHAR(2147483647);
DECLARE a_pivot_values VARCHAR(127) ARRAY;

/*
if all columns needed, use * to get the column names except v_agg_column,v_pivot_column
*/
IF v_select_columns = '*'
THEN
SELECT string_agg(column_name,',' ORDER BY position) INTO v_select_columns FROM sys.columns 
WHERE table_name = v_table_name
AND schema_name = v_schema_name 
AND column_name NOT IN (v_agg_column,v_pivot_column);
END IF;
/*
if all values in the pivot column should use for pivoting
*/
IF v_pivot_values = '*'
THEN
EXECUTE IMMEDIATE ('select string_agg('||:v_pivot_column ||', '','' order by ' || :v_pivot_column || ') from (select distinct ' || :v_pivot_column || ' from '|| :v_table_name || ')') into v_pivot_values;
END IF;

a_pivot_values := LIB:split_to_array( :v_pivot_values, ',' );
v_count = cardinality(:a_pivot_values);
v_statement = 'select ' || v_select_columns;

/*
generate the statement
*/
FOR v_idx IN 1.. v_count DO
v_statement = v_statement || ', ' || v_agg_typ || '(case when ' || v_pivot_column || ' = ' || :a_pivot_values[:v_idx] || ' then ' || v_agg_column || ' end ) as ' || v_pivot_column || '_' || :a_pivot_values[:v_idx];
END FOR;
v_statement = v_statement || ' from ' || v_table_name || ' group by ' || v_select_columns;
/*
execute the statement
*/
EXECUTE IMMEDIATE v_statement;
END;

You can call this procedure with list of parameters as listed below:

Parameter nameDescription 
table_nameName of table, which should pivot
schema_name Name of schema, where the table is created 
select_columns List of columns, which should display or * 
agg_typ Type of aggregation like sum, count, min, max etc. 
agg_column Column, which should aggregate 
pivot_column  Column, which should pivot 
pivot_values List of values in the pivot_column, which should generate as separate columns or *. Please consider the maximum number of columns. 

For example, the procedure can be called like this:

CALL P_PIVOT('TEST_A', '', 'PROD','SUM','SALES','DATE_YEAR','*');
CALL P_PIVOT('TEST_A', '', 'PROD','SUM','SALES','DATE_YEAR','2015,2016');
CALL P_PIVOT('TEST_A', 'SYSTEM', '*','SUM','SALES','DATE_YEAR','*');

Unpivot


The unpivot functionality can be realized by using the MAP function and SERIES_GENERATE_INTEGER. As the picture above describes, unpivot means generate rows out of columns. The function SERIES_GENERATE_INTEGER can be utilized to generate an integer table, which can be used for joining with the source table to generate multiple rows.

We create now a table with product and several sales years and call it TEST_B.

CREATE TABLE TEST_B(
PROD VARCHAR(1),
SALES_YEAR_2015 INT,
SALES_YEAR_2016 INT,
SALES_YEAR_2017 INT,
SALES_YEAR_2018 INT,
SALES_YEAR_2019 INT);

Insert data into the table and display.

INSERT INTO TEST_B VALUES('A', 123456, 234567, 345678, 456789, 567890);
INSERT INTO TEST_B VALUES('B', 123456, 234567, 345678, 456789, null);

PRODYEAR_2015YEAR_2016YEAR_2017 YEAR_2018 YEAR_2019
123456234567345678456789 567890 
B123456234567345678456789 

Now we can use the functions mentioned above to generate the unpivot table:

SELECT
PROD,
MAP(element_number,
1, '2015',
2, '2016',
3, '2017',
4, '2018',
5, '2019') AS "DATE_YEAR",
MAP(element_number ,
1, SALES_YEAR_2015,
2, SALES_YEAR_2016,
3, SALES_YEAR_2017,
4, SALES_YEAR_2018,
5, SALES_YEAR_2019) AS "SALES" 
FROM TEST_B,
SERIES_GENERATE_INTEGER(1, 1, 6) 
ORDER BY element_number;

The result of the query is:

PRODDATE_YEAR  SALES 
A2015123.456
2015 123.456 
2016 234.567
2016 234.567 
2017 345.678 
2017 345.678 
2018 456.789 
2018 456.789 
2019 567.890 
2019 

To automatically generate this code, the following procedure can be used:

CREATE OR REPLACE PROCEDURE P_UNPIVOT(
IN table_name VARCHAR(127),
IN schema_name VARCHAR(127) , 
IN select_columns VARCHAR(2147483647),
IN unpivot_val_name VARCHAR(127),
IN unpivot_col_name VARCHAR(127),
IN unpivot_columns VARCHAR(2147483647),
IN unpivot_column_values VARCHAR(2147483647),
IN include_nulls BOOLEAN DEFAULT TRUE)
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER
READS SQL DATA AS
BEGIN
USING SQLSCRIPT_STRING AS lib;
DECLARE v_table_name VARCHAR(127) = table_name;
DECLARE v_schema_name VARCHAR(127) = schema_name;
DECLARE v_select_columns VARCHAR(2147483647) = select_columns;
DECLARE v_unpivot_val_name VARCHAR(127) = unpivot_val_name;
DECLARE v_unpivot_col_name VARCHAR(127) = unpivot_col_name;
DECLARE v_unpivot_columns VARCHAR(2147483647) = unpivot_columns;
DECLARE v_unpivot_column_values VARCHAR(2147483647) = unpivot_column_values;
DECLARE v_count INT= 0;
DECLARE v_idx INT;
DECLARE v_statement VARCHAR(2147483647);
DECLARE v_statement_map1 VARCHAR(2147483647) = '';
DECLARE v_statement_map2 VARCHAR(2147483647) = '';
DECLARE v_include_nulls BOOLEAN = include_nulls;
DECLARE a_unpivot_columns VARCHAR(127) array;
DECLARE a_unpivot_column_values VARCHAR(100) array;

a_unpivot_columns = LIB:split_to_array( :v_unpivot_columns, ',' );
a_unpivot_column_values = LIB:split_to_array( :v_unpivot_column_values, ',' );
v_count = cardinality(:a_unpivot_columns);
tbl_pivot_columns = UNNEST(:a_unpivot_columns) AS ("EXCLUDE_COLUMNS");

/*
if all columns needed, use * to get the column names except tbl_pivot_columns
*/
IF v_select_columns = '*'
THEN
SELECT string_agg(column_name,',' ORDER BY position) INTO v_select_columns FROM sys.columns 
WHERE table_name = v_table_name
AND schema_name = v_schema_name 
AND column_name NOT IN (SELECT EXCLUDE_COLUMNS FROM :tbl_pivot_columns);
end IF;

v_statement = 'select ' || v_select_columns || ', ' || 'map(element_number';

/*
generate the statement
*/
FOR v_idx IN 1.. v_count DO
v_statement_map1 = v_statement_map1 || ',' || v_idx  || ',''' || :a_unpivot_column_values[:v_idx] || '''' ;
v_statement_map2 = v_statement_map2 || ',' || v_idx  || ',' || :a_unpivot_columns[:v_idx]  ;
END FOR;
v_statement = v_statement || v_statement_map1 || ') as "'|| v_unpivot_col_name || '", map(element_number ' || v_statement_map2 || ') as "' || v_unpivot_val_name || '" from ' || v_table_name || ', SERIES_GENERATE_INTEGER(1,1,' || v_count+1 || ') order by element_number';

IF v_include_nulls = FALSE
THEN
v_statement = 'select * from (' || v_statement || ' ) where ' || v_unpivot_val_name || ' is not null';
END IF;

EXECUTE IMMEDIATE v_statement;

END;

The input parameters are:

Parameter nameDescription 
table_nameName of table, which should unpivot
schema_name  Name of schema, where the table is created 
select_columns  List of columns, which should display or * 
unpivot_val_name  Name of column for unpivot value columns 
unpivot_col_name  Name of column for unpivot columns 
unpivot_columnsList of unpivot columns 
unpivot_column_values Values for unpivot columns 
include_nullsIf the null values should display, default TRUE. 

For example, the procedure can be called like this:

CALL P_UNPIVOT('TEST_B','SYSTEM','*','VAL','JAHR','SALES_YEAR_2015,SALES_YEAR_2016,SALES_YEAR_2017,SALES_YEAR_2018,SALES_YEAR_2019','2015,2016,2017,2018,2019',FALSE);
CALL P_UNPIVOT('TEST_B','SYSTEM','*','VAL','JAHR','SALES_YEAR_2015,SALES_YEAR_2016,SALES_YEAR_2017,SALES_YEAR_2018,SALES_YEAR_2019','2015,2016,2017,2018,2019');

Further steps


Because the created table structure is not known during compile time, no table typed output parameter can be used. To store the result for later usage, you can add the following code to the procedures to store the result into a dynamically generated table. In addition, you need to define the variable v_new_table_name, which defines the table to be used for storage. Please note, that the usage of dynamic executed DDL statements is not recommended:

if exists(select 1 from tables where table_name = v_new_table_name and schema_name = v_schema_name)
then
execute immediate 'drop table ' || v_new_table_name;
end if;

execute immediate 'create table ' || v_new_table_name || ' as ( ' || v_statement || ')';

The procedures using dynamic SQL to execute the generated SELECT statement.

Viewing all articles
Browse latest Browse all 711

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>