Introduction
There are different types of joins available in the database. SAP HANA offers all of these joins.
The different types of joins in SAP HANA
◈ INNER
◈ RIGHT OUTER
◈ LEFT OUTER
◈ FULL OUTER
◈ TEXT
◈ SPATIAL
◈ REFERENTIAL
◈ TEMPORAL
◈ DYNAMIC
In this blog post we will be exploring the “Dynamic Join” properties in HANA to create a single view and show revenue data based on different attributes.
Definition –
Whenever a join is defined as dynamic, then the modeler dynamically defines the join condition columns based on the columns requested by Dynamic joins improves the join execution process and helps reduce the number of records that join node process at run-time. Dynamic joins reduce the number of records processed by the join view node at run-time, which helps improve the join execution process.
Dynamic Joins VS Static Join
Dynamic Joins | Static Join |
The join condition changes with fields requested in query. The query gives run-time error if the client query to the join doesn’t request a join column. | The join condition doesn’t change with fields requested in query. |
Dynamic join enforces aggregation before executing the join. This means that, if a join column is not requested by the client query, its value is first aggregated, and later the join condition is executed based on columns requested in the client query. | Static joins the aggregation happens after the join. |
Prerequisite
At least one of the fields involved in the join condition is part of the client query. If you define a join as dynamic, the engine dynamically defines the join fields based on the fields requested by the client query. But, if the field is not part of the client query, it results in query run-time error. We can use dynamic join when we have a composite join(More than one field in join condition).
Business Scenario –
We have Global system in HANA where we have multiple region.e.g-APAC ,EMEA, LAO and NA.
We have a requirement where we have to generate a report which will give the revenue numbers as per region,company and product . But the business does not want multiple HANA VDM created for this.
In the below steps we will be creating one single VDM where we will use the dynamic join to accomplish business requirement which is to show the gross revenue and revenue ratio of the product across the region.
We basically do 2 type of reporting for the same query.
1-Revenue ratio based on Region, Product and Company
2- Revenue ratio only based on Product and Company
If we take static join the Gross Revenue doesn’t show correct value.
The below example is part of my experience working with HANA in my current assignment.
Source Table-
REGION | PRODUCT | COMPANY | REVENUE |
APAC | WIPER | C1 | 20 |
APAC | WIPER | C2 | 20 |
APAC | NAPKIN | C1 | 20 |
APAC | NAPKIN | C2 | 10 |
APAC | DIAPER | C1 | 30 |
APAC | DIAPER | C2 | 20 |
EMEA | WIPER | C1 | 20 |
EMEA | WIPER | C2 | 10 |
EMEA | WIPER | C3 | 30 |
EMEA | NAPKIN | C1 | 10 |
EMEA | NAPKIN | C2 | 30 |
EMEA | NAPKIN | C3 | 20 |
EMEA | DIAPER | C1 | 20 |
EMEA | DIAPER | C2 | 20 |
EMEA | DIAPER | C3 | 10 |
In above example companies C1, C2, and C3 have business share in the Region EMEA but for APAC we have only business share for company C1 and C2.
To implement this in SAP HANA, we need to create a calculation view and use the same table in two “aggregation” nodes. Create the first aggregation node as below:
Steps-
1. Let us start by creating the source table where the revenue data is being stored-
Create Table <SCHEMA>.”REVENUE_MARKET”( “REGION” VARCHAR(10), “PRODUCT” VARCHAR(15),”COMPANY” VARCHAR(10),”SALES” INT);
2. Now, let us create a Calculation view. In this calculation view we will be using the same table which we have just created in two different aggregation nodes.
i. In the first aggregation node, I am taking the table and using the column to get the revenue details based in region, product and company.
ii.In the second aggregation node, I am taking the table and using the column to get the revenue details based on Product and company across the region.
First aggregation node –
3. We want Gross Revenue by Company so on second aggregation node we take same table but will not enable “COMPANY” because we want to gross revenue by company. When adding the aggregated measure Revenue from this node, rename it as Gross Revenue.
4. Now we join these two aggregation node with an inner join on “REGION” and “PRODUCT” and we need to derive Revenue Ratio. So we create Calculated Column “Revenue Ratio”
Formula- Revenue Ratio = Revenue/Total Revenue
Issue Faced-
Now when we do data preview-
We reconcile the source table data with HANA data. When we check Gross Revenue of a particular Product with respect to Region we find he correct values.
Issue arrives when we check the Revenue Ratio of a company for all the regions. We want Gross revenue to be shown across both APAC and EMEA.
Since Company C3 has no business share in APAC, the total Revenue for C3 will be only for EMEA.
So the required output is not reflecting as want the Gross Revenue to be calculated across the region(APAC & EMEA).
Solution –
To achieve the required result, we need to enable Dynamic Join between the two aggregation nodes. The dynamic join comes in picture for the field we select in output . Since we are not selecting “REGION” in the query, HANA does not execute the join on “REGION”.This allows the gross revenue to remain “constant” across all companies.
We can see that we are getting same data as static join if we take all fields in report output. (Which was correct).
Now we exclude the Region from the query output-
In this way by using “dynamic join” we can achieve the required result. Now we can report the Gross Revenue across the region e.g ( APAC and EMEA)